Archive for October, 2007

Lock Collection

Memory management is a standard problem for programs written in C and C++. It is important to free memory when it is no longer needed, because otherwise you will run out. Unfortunately, it is complex to keep track of which memory is no longer needed. The symptoms of memory management problems are typically either running out of memory, or obscure and difficult to debug memory corruption.

Any program can do memory management successfully with a disciplined approach. However, the problems are pervasive enough that there is a standard solution for them: garbage collection. Java, for example, always uses garbage collection, and therefore largely eliminates this class of problem. The same is true of Lisp, and of most interpreted languages. Garbage collection decreases performance and increases memory usage. Still, many people find it to be a valuable technique.
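To make this concrete, here is a minimal C++ sketch (a toy example, not code from any real program) of the kind of bug that manual management invites, next to reference counting with shared_ptr, one simple form of automatic reclamation:

    #include <iostream>
    #include <memory>
    #include <string>

    // Manual management: the caller must remember to delete, and must
    // never touch the pointer after anything has deleted it.
    std::string *make_name_manual() {
        return new std::string("example");
    }

    // Reference counting: the string is freed automatically when the
    // last shared_ptr referring to it goes away.
    std::shared_ptr<std::string> make_name_counted() {
        return std::make_shared<std::string>("example");
    }

    int main() {
        std::string *s = make_name_manual();
        std::cout << *s << '\n';
        delete s;
        // std::cout << *s << '\n';  // use-after-free: compiles, corrupts memory
        // (forgetting the delete entirely would instead be a leak)

        std::shared_ptr<std::string> t = make_name_counted();
        std::cout << *t << '\n';
        // no delete needed; the string is reclaimed when t goes out of scope
        return 0;
    }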

Is there any comparable approach which will work for parallel programming? The problem in parallel programming is race conditions. The standard solution is mutexes and condition variables, and variants thereof. These can work with a disciplined approach. However, in practice, the problems are pervasive. Is there something like garbage collection which can provide a complete solution?
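Here is what the standard solution looks like in practice, again as a toy sketch rather than code from any real program: two threads increment a shared counter, and a mutex is what keeps the increments from racing.

    #include <iostream>
    #include <mutex>
    #include <thread>

    long counter = 0;
    std::mutex counter_lock;

    // ++counter is really a read, an add, and a write; without the lock,
    // interleaved threads lose updates.
    void increment_many() {
        for (int i = 0; i < 1000000; ++i) {
            std::lock_guard<std::mutex> guard(counter_lock);
            ++counter;
        }
    }

    int main() {
        std::thread t1(increment_many);
        std::thread t2(increment_many);
        t1.join();
        t2.join();
        // With the lock this reliably prints 2000000; remove the lock_guard
        // and the total is usually smaller and varies from run to run.
        std::cout << counter << '\n';
        return 0;
    }

The discipline lies in remembering to take the right lock around every access; forget it in one place and the race comes back.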

There has been a lot of recent work on transactional memory. This is a technique which tracks all reads and writes in a locked section, and rolls back the writes if two different threads turn out to conflict in the section. I don’t personally see the point. Transactional memory still requires the programmer to place locks correctly. Its advantage is that it can deliver good performance even with coarse-grained locks. However, the good performance only comes if lock contention is low. When lock contention is high, the bookkeeping overhead of transactional memory is high. And the coarser the locks, the higher the lock contention. So it doesn’t seem like a significant change to the current systems.
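Real transactional memory needs compiler or hardware support, but the optimistic execute-then-commit-or-retry cycle behind it can be suggested with a plain compare-and-swap. This is only an analogy I am sketching here, not a transactional memory implementation:

    #include <atomic>
    #include <iostream>
    #include <thread>

    std::atomic<long> balance(100);

    // Optimistically read the current value, compute a new one, and try
    // to commit it; if another thread changed the value in the meantime,
    // the compare_exchange fails and the work is redone, much as a
    // conflicting transaction would be rolled back and retried.
    void add(long amount) {
        long old_value = balance.load();
        while (!balance.compare_exchange_weak(old_value, old_value + amount)) {
            // old_value now holds the value someone else committed; retry.
        }
    }

    int main() {
        std::thread t1([] { for (int i = 0; i < 100000; ++i) add(1); });
        std::thread t2([] { for (int i = 0; i < 100000; ++i) add(1); });
        t1.join();
        t2.join();
        std::cout << balance << '\n';  // 200100
        return 0;
    }

The retries are a small version of the overhead I am complaining about: the more the threads contend, the more work gets thrown away.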

What we need is some way for multiple threads of control to share access to data structures in memory without requiring locks. And we need some way to guarantee data structure invariants, so that while the structure is being updated by one thread other threads do not see an inconsistent state. It’s difficult to see how we can maintain invariants without annotations from the programmer.

The goal I am setting is a high one: to be able to write code in a relatively straightforward manner, and to permit that code to use multiple threads of control. I don’t see a solution.

A fallback position might be the one I discussed earlier: require the separate threads to operate in separate address spaces, and provide library calls to move data back and forth. This is a more difficult programming model, so it doesn’t seem like the perfect solution. But I don’t see a good one.
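As a rough sketch of that fallback model (a toy example using POSIX pipes and fork, with error handling omitted), the two sides run in separate address spaces and data moves between them only through explicit reads and writes:

    #include <iostream>
    #include <sys/wait.h>
    #include <unistd.h>

    int main() {
        int to_child[2], from_child[2];
        pipe(to_child);
        pipe(from_child);

        if (fork() == 0) {
            // Child: a separate address space. Read a request, compute,
            // write the reply, exit.
            long n;
            read(to_child[0], &n, sizeof n);
            long square = n * n;
            write(from_child[1], &square, sizeof square);
            _exit(0);
        }

        // Parent: send a request and wait for the reply, much like
        // making a function call.
        long request = 12, reply;
        write(to_child[1], &request, sizeof request);
        read(from_child[0], &reply, sizeof reply);
        std::cout << reply << '\n';  // 144

        wait(nullptr);
        return 0;
    }

The cost is that every structure crossing the boundary has to be serialized and copied, which is part of what makes this model more awkward than simply sharing memory.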

Comments (3)

Just More War

Nathan made a long series of comments on my earlier post on terrorism and Just War theory. I’m going to try to reset a bit.

In that post I argued, I think correctly, that terrorism is not justified under Just War theory. But: is Just War theory applicable today? Do we believe that everybody should follow it?

The first question is whether war can have rules at all. There is a long history of rules of warfare, most recently the Geneva conventions. There is also a long history of people ignoring those rules. Obviously nobody in Iraq today is scrupulously following the Geneva conventions (though I hope that most of the U.S. military is trying to do so).

One way to see whether the rules mean anything at all is to look at what happens when the war is over. Are people who violated the rules punished? A relatively recent development is international tribunals which attempt to punish people for violating the rules–not so much the Geneva conventions as the most extreme violations, the so-called crimes against humanity. These tribunals can be an effective way to punish the leaders in a war, though in general they can only punish the leaders of the side that lost.

Another way to see whether the rules mean anything is to see who follows them. Until the Iraq war, the U.S. was generally careful to follow the Geneva conventions, and other agreed upon laws of war. Will the decision by the Bush administration to ignore the conventions have lasting bad effects on the U.S.? I tend to think that it will, but it’s obviously difficult to prove. Russia, for example, ignored the Geneva conventions in their battles in Chechnya; no bad effects are evident to date.

If there are no rules, then does it make sense to speak of terrorism as violating the rules? In some cases we can say that there is no state of war, in which case the terrorist attack is simply a horrible crime. I would say that this is true of Timothy McVeigh’s attack in Oklahoma City. McVeigh claimed to be seeking revenge for Waco and Ruby Ridge, but there were other means of redress. It was correct for the U.S. government to charge him with a crime and convict him (though I am personally opposed to the death penalty which he received).

What about 9/11? I think Al Qaeda would say that there was a state of war between them and the U.S., although many in the U.S. were unaware of it. Al Qaeda believed that the U.S. had been attacking people in the Muslim world by putting armed forces in Saudi Arabia and by supporting Israel. To observe that their argument is rational is not to support 9/11, nor does it mean that I actually agree with the argument. If Al Qaeda was at war with the U.S., and if there are no rules in war, then do we have any grounds for arguing that their attack was unethical?

I think there is an important observation to make here. Whether or not there are rules of war, the U.S. (and, for that matter, Israel) does generally try to avoid accidental deaths. The U.S. has made many mistakes in Iraq, causing many accidental deaths, but in general they really were mistakes. In some cases, soldiers have been punished for them. In almost all cases, the deaths were considered to be regrettable. Al Qaeda’s 9/11 attack was, of course, in no way a mistake.

Is that a key difference between terrorism and other forms of warfare? That terrorism intentionally kills people at random, while non-terrorism kills people by accident? It is a subtle distinction, but not, I think, an irrelevant one.

However, we must then also consider the ethics of creating a situation in which it is likely that people will be killed by accident. The unprovoked invasion of Iraq clearly created such a situation.

It is odd to observe that the Bush administration has on the one hand called for idealistic efforts to create worldwide democracy, while on the other hand it has adopted a very pragmatic and “ends justify the means” approach to actually (attempting to) create it.

Comments (2)

Financial Complexity

The current news about bad management of credit risks by banks reminds me of one of my ongoing concerns about the modern financial markets. Modern finance is very large and very complex. Once upon a time financial markets were relatively straightforward: you bought stocks, bonds, futures, options. These are not simple instruments, but it’s fairly easy to understand the risk. These days, people have constructed a broad range of so-called derivatives, which are basically bets on a formula. The inputs to the formula are more-or-less real things with market prices. The formula itself, however, is almost arbitrarily complex.

There is nothing wrong with derivatives per se. However, it can be very difficult to understand exactly what the risks are. When the inputs to the formula are combined, it takes some thought to understand how changes in the inputs affect the output. It then takes a lot of thought to understand what would be likely to change the inputs, and how the inputs might move together or separately. Poorly understood inputs caused the collapse of Long-Term Capital Management and Enron (Enron was a fraudulent company which would have eventually collapsed in any case, but the specific collapse was triggered by some bad bets they made–although arguably those bets were themselves fraudulent actions by the CFO).
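A toy example (entirely invented, not any real instrument) of why the combination of inputs matters: consider a bet that pays off on the average of two prices. The same small moves in the inputs can do nothing when they offset each other, and swing the payoff sharply when they happen together.

    #include <algorithm>
    #include <iostream>

    // An invented derivative: pays the amount by which the average of
    // two prices exceeds 100, and nothing otherwise.
    double payoff(double price1, double price2) {
        return std::max(0.0, (price1 + price2) / 2.0 - 100.0);
    }

    int main() {
        std::cout << payoff(100, 100) << '\n';  // 0: right at the threshold
        std::cout << payoff(105, 95) << '\n';   // 0: the inputs moved, but offset
        std::cout << payoff(105, 105) << '\n';  // 5: the same moves, together
        return 0;
    }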

There is also nothing wrong with wealthy people taking poorly-understood risks with their money. My concern is that many different organizations are making the same bets without realizing it. That means that many organizations could run into trouble at the same time. And that would be big trouble for the global financial system.

I think we’re seeing a small taste of that now, though in a simpler way. Many organizations invested in mortgage securities. Almost none of them properly understood the risks. A long series of banks is now taking big write-downs because of their bad investments. The damage is not too large–most people do pay their mortgages in the end, and even a foreclosure retains some value–so the mortgage securities are mostly not entirely worthless. But still we see that many different organizations, with different policies, all made the same sort of mistake.

If that happens in a big way–e.g., many people bet that some currency will not collapse, but it collapses anyhow–we could see some serious pain in the financial system. And that could be very bad for the economy as a whole.

Comments (4)

Blog Spam

One of the weirder aspects of writing this blog is that I get regular spam comments. The comments usually have just a few words, along the lines of “great post, this will really help me with my essay,” together with a link to some commercial web site. At the moment I’m running about 40 of these per day. I simply delete them all, which fortunately is fast and easy. I expect it will be even easier when I get around to upgrading to the newer version of WordPress.

I can see two possible motivations for this spam. The first is that it is aimed at Google’s well known PageRank algorithm. Nobody reading some random blog is likely to be in need of cheap airline tickets or whatever they are selling (here I judge by the name of the link). And given the rate at which these arrive, any blog which doesn’t delete the spam comments will simply be inundated with them, so no person would ever read the comments anyhow. But if these links appear on a bunch of random sites, then it seems at least possible that Google’s search engine would misjudge the link target as a popular and important web site. The thing is, Google isn’t really that dumb. The spam comments are so regular that even a simple computer program could detect a spam blog. So it’s hard to believe that these spam comments really increase the rating of the site in a search engine. If anything I would think that they would decrease it. (I should say that although I work at Google, I do not work on search, and I don’t know any details about the actual algorithms which are used. In this paragraph, I am simply guessing as to how it probably works.)
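To give a sense of how little it would take (this is a toy heuristic I am making up, not anything Google or WordPress actually does), a few lines are enough to flag the pattern of short generic praise plus a commercial link:

    #include <iostream>
    #include <sstream>
    #include <string>

    // A deliberately crude heuristic: a very short comment containing a
    // link and stock praise phrases looks like the spam described above.
    bool looks_like_spam(const std::string &comment) {
        int words = 0;
        std::istringstream in(comment);
        for (std::string w; in >> w; )
            ++words;
        bool has_link = comment.find("http://") != std::string::npos;
        bool has_praise = comment.find("great post") != std::string::npos ||
                          comment.find("really help") != std::string::npos;
        return has_link && has_praise && words < 25;
    }

    int main() {
        std::cout << looks_like_spam("great post, this will really help "
                                     "me with my essay http://example.com")
                  << '\n';  // 1
        std::cout << looks_like_spam("I disagree about transactional memory.")
                  << '\n';  // 0
        return 0;
    }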

The second motivation I can see is that the links might be traps. If somebody running an unpatched Internet Explorer clicks on the link, they might get infected and have their computer incorporated into a botnet. That would seem like a good reason to put the links out there. Again, though, no human would be fooled by these links. I know that a lot of people don’t really understand how bad Microsoft’s browser security model is. Still, the links look like obvious spam; why would anybody click on them?

I haven’t been able to think of any other reasons, and these reasons are very weak. So why do the spammers bother? Why do they bother putting comments on my blog, when none of the comments ever show up on the actual web site? The comments are obviously created by a computer program, so they are cheap. Is it just a case of getting a benefit from one comment in ten million, so you might as well try?

I haven’t bothered to check, but I hope that organizations like Project Honeypot run some simple blogs to collect comment spam. That would help them map botnets. Unfortunately we then need some approach to actually fixing botnets. As far as I know, we don’t have one.

Comments (1)

Again, Parallel Programming

Yesterday’s post was on the incoherent side, even for me. Let’s take it again, a little more slowly.

Our brains are inherently parallel processors. Our selves are composed of many impulses, actions, and proto-thoughts. However, our logical thought, the inner voice we are forced to use when writing programs (or, for that matter, when writing blog entries), is a sequential series of thoughts. The ideas that inject themselves into our thoughts come from all parts of our brain, but the thoughts themselves proceed linearly. At least, that is how it is for me, and it seems it is much the same for everybody.

When our speech and our thoughts refer to an object which changes, our future references to that object refer to the changed object, not the original, unchanged object. Referring to the original object requires awkward turns of phrase. Consider a rose. The rose is red. All of a sudden, it turns blue. What do we have here? A blue rose. Notice how easy it is for me to refer to the blue rose. Compare that to thinking about the rose before it turned blue. Is that the red rose? Well, no; there is no red rose. It is the rose which was red before it turned blue.

Now, watch the rhetorical trick: I’m going to use an ad hoc just-so story with no actual evidence. It seems to me that these facts about our thought condition our programming languages. Parallel languages don’t catch on because people find it difficult to think about many things happening at once. Functional languages don’t catch on because people naturally think in terms of objects which change over time–side-effects–and find it hard to express ideas not in terms of objects which change, but in terms of layers of modifications around existing objects which do not change.
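The rose in code, as a toy contrast (not a claim about how any particular functional language works): the imperative habit changes the object in place, while the functional habit leaves the old object alone and builds a modified copy.

    #include <iostream>
    #include <string>

    struct Rose {
        std::string color;
    };

    // The functional habit: do not change the rose; build a recolored copy.
    Rose recolored(const Rose &r, const std::string &new_color) {
        Rose copy = r;
        copy.color = new_color;
        return copy;
    }

    int main() {
        // The imperative habit: "the rose" is one thing that changes over time.
        Rose rose{"red"};
        rose.color = "blue";              // the red rose is simply gone
        std::cout << rose.color << '\n';  // blue

        // The functional habit: both versions still exist, so "the rose
        // which was red before it turned blue" is easy to refer to.
        Rose old_rose{"red"};
        Rose new_rose = recolored(old_rose, "blue");
        std::cout << old_rose.color << ' ' << new_rose.color << '\n';  // red blue
        return 0;
    }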

We have imperative languages because the way we have to think to program in those languages matches the way we think anyhow.

Unfortunately the technical evolution of our processors is such that, if we want to continue to take advantage of the steadily increasing performance to which we have become accustomed, we need to learn how to program them in a different way.

It seems likely to me that this different way is going to be whatever most naturally adapts imperative languages to a parallel programming model. For example, perhaps we should just start adapting the client-server model to operate within a single machine. A connection between a client and a server is easy to conceptualize as a function call. On the server side, many function calls can be handled in parallel, but this is not confusing, as they have no significant interaction. Perhaps the client and server could even be different threads within the same process, but they would use a rigid API to transfer data, rather than casually using the same address space as threads currently do (a sharing which in practice is error-prone). A DMA transfer of data between different threads in the same process could be extremely efficient.
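Here is a rough sketch of what that might look like inside one process (my own toy example, not an existing API): the client and the server are threads, but they touch no common data directly; requests and replies move only through a small channel, so the round trip reads like an ordinary function call on the client side and each request stands alone on the server side.

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <thread>

    // A one-direction channel: the only point of contact between threads.
    template <typename T>
    class Channel {
    public:
        void send(T value) {
            std::lock_guard<std::mutex> guard(lock_);
            queue_.push(value);
            ready_.notify_one();
        }
        T receive() {
            std::unique_lock<std::mutex> guard(lock_);
            ready_.wait(guard, [this] { return !queue_.empty(); });
            T value = queue_.front();
            queue_.pop();
            return value;
        }
    private:
        std::mutex lock_;
        std::condition_variable ready_;
        std::queue<T> queue_;
    };

    Channel<long> requests;
    Channel<long> replies;

    // The "server" thread: handle each request on its own.
    void server() {
        for (;;) {
            long n = requests.receive();
            if (n < 0)
                return;               // a negative request means shut down
            replies.send(n * n);
        }
    }

    // To the "client", the round trip looks like a function call.
    long call_square(long n) {
        requests.send(n);
        return replies.receive();
    }

    int main() {
        std::thread t(server);
        std::cout << call_square(12) << '\n';  // 144
        requests.send(-1);                     // stop the server
        t.join();
        return 0;
    }

This still copies data rather than sharing it, but that is the point: the rigid API is what keeps the threads from casually reaching into each other's state.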

Comments (9)
