Again, Parallel Programming

Yesterday’s post was on the incoherent side, even for me. Let’s take it again, a little more slowly.

Our brains are inherently parallel processors. Our selves are composed of many impulses, actions, and proto-thoughts. However, our logical thought, the inner voice we are forced to use when writing programs (or, for that matter, when writing blog entries), is a sequential series of thoughts. The ideas that inject themselves into our thoughts come from all parts of our brain, but the thoughts themselves proceed linearly. At least, that is how it is for me, and it seems to be much the same for everybody.

When our speech and our thoughts refer to an object which changes, our future references to that object refer to the changed object, not the original, unchanged object. Referring to the original object requires awkward turns of phrase. Consider a rose. The rose is red. All of a sudden, it turns blue. What do we have here? A blue rose. Notice how easy it is for me to refer to the blue rose. Compare that to thinking about the rose before it turned blue. Is that the red rose? Well, no; there is no red rose. It is the rose which was red before it turned blue.

Now, watch the rhetorical trick: I’m going to use an ad hoc just-so story with no actual evidence. It seems to me that these facts about our thought condition our programming languages. Parallel languages don’t catch on because people find it difficult to think about many things happening at once. Functional languages don’t catch on because people naturally think in terms of objects which change over time–side-effects–and find it hard to express ideas not in terms of objects which change, but in terms of layers of modifications around existing objects which do not change.

We have imperative languages because the way we have to think to program in those languages matches the way we think anyhow.

Unfortunately, the technical evolution of our processors is such that, if we want to continue to take advantage of the steadily increasing performance to which we have become accustomed, we need to learn how to program them in a different way.

It seems likely to me that this different way is going to be whatever most naturally adapts imperative languages to a parallel programming model. For example, perhaps we should just start adapting the client-server model to operate within a single machine. A connection between a client and a server is easy to conceptualize as a function call. On the server side, many function calls can be handled in parallel, but this is not confusing, as they have no significant interaction. Perhaps the client and server could even be different threads within the same process, but they would use a rigid API to transfer data, rather than casually using the same address space as threads do today (a sharing which in practice is error-prone). A DMA transfer of data between different threads in the same process could be extremely efficient.
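
To make that concrete, here is a minimal sketch in C++. The names (Channel, Request) are invented for illustration, and a plain mutex-backed queue stands in for whatever efficient transfer mechanism (DMA or otherwise) a real implementation would use; the point is only that the client and server threads share nothing except the narrow send/receive API, so a call across the channel reads like a function call.

    #include <condition_variable>
    #include <future>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <thread>

    // The only connection between client and server: a queue of requests,
    // each carrying a promise through which the reply comes back.
    struct Request {
        int arg;
        std::promise<int> reply;
    };

    class Channel {
    public:
        void send(Request r) {
            std::lock_guard<std::mutex> lock(mu_);
            q_.push(std::move(r));
            cv_.notify_one();
        }
        Request receive() {
            std::unique_lock<std::mutex> lock(mu_);
            cv_.wait(lock, [this] { return !q_.empty(); });
            Request r = std::move(q_.front());
            q_.pop();
            return r;
        }
    private:
        std::mutex mu_;
        std::condition_variable cv_;
        std::queue<Request> q_;
    };

    int main() {
        Channel chan;

        // The "server": handles each request independently; it could just as
        // easily hand the requests to a pool of worker threads.
        std::thread server([&chan] {
            for (;;) {
                Request r = chan.receive();
                if (r.arg < 0) break;              // shutdown signal
                r.reply.set_value(r.arg * r.arg);  // the "remote" computation
            }
        });

        // The "client": a call across the channel looks like a function call.
        Request req;
        req.arg = 7;
        std::future<int> result = req.reply.get_future();
        chan.send(std::move(req));
        std::cout << "7 squared is " << result.get() << "\n";

        Request quit;
        quit.arg = -1;
        chan.send(std::move(quit));
        server.join();
        return 0;
    }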



9 responses to “Again, Parallel Programming”

  1. fche

    > […] but the thoughts themselves proceed linearly […]

    That is conventional wisdom, but I wonder how true it is. Clearly, when we communicate, we have to serialize our ideas. But when I’m just quietly thinking about stuff, I can sometimes feel many thoughts whirling around in there at once. Some of them time-slice; some of them fleet; some of them linger. A few of them get expressed.

  2. Ian Lance Taylor

    I completely agree. We are full of all sorts of thoughts. But I think that when we focus specifically on writing code, those thoughts become linear. We may sit back and try to figure out where to go next, and then we may think of anything. But when we start writing, we are focusing on one thing.

    For example, try to think about a multi-threaded program by considering what all the threads are doing simultaneously. It’s very very difficult, at least for me. It’s much easier for me to think about one thread at a time, and to consider the synchronization points separately. Unfortunately, that is an error-prone approach, even more error-prone than most programming.

  3. Sed

    There is http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.html to feed your reflection on parallel programming.

  4. ncm

    I know a few examples of successfully abstracted parallelism. The best is what happens in VSIPL++, a matrix-, signal-, and image-processing library. Programs using VSIPL++ effectively look single-threaded, and express ordinary textbook matrix operations as C++ statements with ordinary named variables, operators, and function-call notation. Under the hood, though, the library is furiously calling up MPI parallel operations on parts of matrices, and parceling the work out to available processors. The only serious leakage to the main-program level is that the (matrix) variables are declared as to how they should be partitioned (e.g. row-wise, column-wise, tiled), and must sometimes be copied to another variable with a different layout.
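
    Not VSIPL++'s actual interface, but a toy sketch of the general shape: the caller writes an ordinary-looking matrix expression, and the library parcels the element-wise work out to threads behind the scenes.

        // Toy sketch (not VSIPL++'s real API): the addition below reads like
        // textbook math, but the rows are parceled out to worker threads.
        #include <thread>
        #include <vector>

        class Matrix {
        public:
            Matrix(int rows, int cols, double init = 0.0)
                : rows_(rows), cols_(cols), data_(rows * cols, init) {}

            double& at(int r, int c) { return data_[r * cols_ + c]; }
            double at(int r, int c) const { return data_[r * cols_ + c]; }

            // Element-wise addition; sequential to the caller, parallel inside.
            friend Matrix operator+(const Matrix& a, const Matrix& b) {
                Matrix out(a.rows_, a.cols_);
                unsigned n = std::thread::hardware_concurrency();
                if (n == 0) n = 2;
                std::vector<std::thread> workers;
                for (unsigned t = 0; t < n; ++t)
                    workers.emplace_back([&, t] {
                        for (int r = t; r < a.rows_; r += n)
                            for (int c = 0; c < a.cols_; ++c)
                                out.at(r, c) = a.at(r, c) + b.at(r, c);
                    });
                for (auto& w : workers) w.join();
                return out;
            }

        private:
            int rows_, cols_;
            std::vector<double> data_;
        };

        int main() {
            Matrix a(1000, 1000, 1.0), b(1000, 1000, 2.0);
            Matrix c = a + b;   // looks single-threaded; is not
            return c.at(0, 0) == 3.0 ? 0 : 1;
        }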

    But I agree that the most successful uses of parallelism today are client-server and peer-to-peer interactions. The hardest parts of parallel programming are those involving locked access to shared memory. Eliminate (obviously) shared memory, and most of what is hard vanishes.

    Once you eliminate shared memory, you are left with the Unix process model, which has also been very, very successful.

  5. Ian Lance Taylor

    That’s a good point. I don’t know VSIPL++, but I agree that it is possible to write a highly parallel matrix library. That’s kind of a special case, though. The program looks entirely sequential. It just happens to be very easy to mechanically dig out a lot of parallelism, because it manipulates matrices.

    Perhaps the trick is to figure out how to eliminate shared memory from the programming model while retaining it in the implementation, so that you keep the speed advantage while losing the mental complexity.

  6. Ian Lance Taylor

    Sed: thanks for the pointer.

  7. tromey

    I like to think that programmers prefer imperative languages because we’re bossy people at heart.

    I tend not to think about what code will run in parallel, or about synchronization (aside from designing a general locking strategy — which is analogous to designing a general memory management strategy), but instead about dependencies. I dress this up by saying it is inspiration from the declarative or functional worlds, but really I just wrote one too many Makefiles.

    Generally I don’t worry about parallel programming being too hard. I remember in the late 80s reading articles about the new, confusing world of event-driven programming and how awfully counter-intuitive it was. Nobody says that now, though. I imagine we’ll see a lot of experimentation, some truly awful designs, and then slow convergence on some set of “best practices” (a phrase I dislike but still use).

    Maybe that is naive. After all, I haven’t tried to scale a program to 100 processors.

  8. Ian Lance Taylor

    Perhaps you are right.

    In the currently popular threads programming model, I think you’ve swept the difficult stuff under “designing a general locking strategy”. I’m not sure how well that works over time as different programmers work on the code.

    Your comparison to memory management is interesting. Memory management is sufficiently complex that many programmers have moved to garbage-collected languages like Java. Is there an equivalent to garbage collection for locks?

  9. tromey

    Yeah, I guess I did sweep stuff under the rug.

    I’ve thought a bit about how I think about locking strategies. In recent memory I’ve done this for gcjx and also for my multi-threading work on gcc itself. These are somewhat similar cases, I suppose, both being compiler front ends. I think there are three different things I consider.

    First, there are dependencies. This helps me pick what work can meaningfully be done in parallel. For instance, in gcjx, I noticed that there are no dependencies when parsing a file. So, all the parsers can be run in parallel.

    Next there is handling of locks. Even though there are no dependencies when parsing a source file, there might be multiple threads asking for a particular class to be read. So, some kind of locking is needed here.

    I use a couple of heuristics here. One is to try to limit locking to collections, and have the elements in the collection be immutable or otherwise provably single-threaded.
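
    A minimal sketch of that first heuristic, with invented names: the only mutex guards the collection itself, and because the elements it hands out are immutable, a thread can keep reading them afterwards without taking any lock.

        #include <map>
        #include <memory>
        #include <mutex>
        #include <string>

        // Immutable once constructed, so freely shareable across threads.
        class ParsedClass {
        public:
            explicit ParsedClass(std::string name) : name_(std::move(name)) {}
            const std::string& name() const { return name_; }
        private:
            const std::string name_;
        };

        class ClassTable {
        public:
            // The mutex protects only the map; it is never held while a caller
            // is actually using a ParsedClass.  (A real front end would also
            // avoid parsing while the lock is held.)
            std::shared_ptr<const ParsedClass> lookup_or_parse(const std::string& name) {
                std::lock_guard<std::mutex> lock(mu_);
                auto it = classes_.find(name);
                if (it == classes_.end())
                    it = classes_.emplace(name, std::make_shared<const ParsedClass>(name)).first;
                return it->second;
            }
        private:
            std::mutex mu_;
            std::map<std::string, std::shared_ptr<const ParsedClass>> classes_;
        };

        int main() {
            ClassTable table;
            return table.lookup_or_parse("java.lang.Object")->name().empty() ? 1 : 0;
        }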

    Another heuristic I use is to think of locking as analogous to scoping. Dynamic scope is yucky and hard to use; lexical scoping is much simpler to reason about. Similarly for locking: if a hierarchy of locks is needed, I try to make it simple to prove that the correct order is always used. So, I try to reduce locks to the smallest scope possible — for instance it is better to have a file-scope lock whose use is constrained to a single module than to have a lock which is a field on some object and which conceptually could be acquired by any function anywhere. And, if you do need the latter, it is better for the methods of the class to acquire the lock than to have the callers of the methods acquire the lock, etc.

    Speaking of which… one thing I like about Java is that locking is in the language as a block-like control structure. This makes it harder to make mistakes here. In C++ I use RAII, of course, but I always find this sort of thing a bit uglier. Maybe I’m a lisper at heart. (BTW another thing I like about Java is that it uses recursive locks. I know many folks don’t like these but I think it helps with abstraction and hiding.)
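
    A sketch of the same block-shaped locking in C++ via RAII, with the Java form shown in a comment; following the scoping heuristic above, the mutex is file-scope in a single module and is acquired inside the function rather than by its callers. (std::recursive_mutex gives the recursive behaviour Java monitors have.)

        #include <map>
        #include <mutex>
        #include <string>

        // In Java:  synchronized (table) { table.put(key, value); }
        // In C++ the lock's lifetime is the enclosing block, and it is
        // released on every exit path, including exceptions.
        namespace {
        std::mutex table_mu;                 // file-scope, used only in this module
        std::map<std::string, int> table;
        }

        void put(const std::string& key, int value) {
            std::lock_guard<std::mutex> lock(table_mu);  // acquired here
            table[key] = value;
        }                                                // released here

        int main() {
            put("answer", 42);
            return 0;
        }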

    Another part of the job is documentation. For instance, in GCC’s garbage collector, some functions must be run with the GC lock held, and some functions must be run without the lock held. Here I add comments explaining the locking strategy. I probably ought to document this with assertions as well. The analogy to memory management comes into play here — I try to document locking requirements the same way that I would document how to destroy the object returned by some C function.
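
    One way to turn that comment into an assertion, sketched with an invented DebugMutex wrapper that remembers its owner; real code might instead use an error-checking pthread mutex, or simply keep the comment.

        #include <atomic>
        #include <cassert>
        #include <mutex>
        #include <thread>

        // A mutex that remembers which thread holds it, so "must be called
        // with the lock held" can be checked rather than just documented.
        class DebugMutex {
        public:
            void lock() { mu_.lock(); owner_.store(std::this_thread::get_id()); }
            void unlock() { owner_.store(std::thread::id()); mu_.unlock(); }
            bool held_by_me() const { return owner_.load() == std::this_thread::get_id(); }
        private:
            std::mutex mu_;
            std::atomic<std::thread::id> owner_{};
        };

        DebugMutex gc_mutex;

        // Must be called with gc_mutex held.
        void expand_heap() {
            assert(gc_mutex.held_by_me());
            // ... grow the heap ...
        }

        int main() {
            gc_mutex.lock();
            expand_heap();
            gc_mutex.unlock();
            return 0;
        }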

    When I look back at this I think the ideas are basically specializations of general programming techniques. And, again analogous to memory management in C, tools can help — for instance the data-race-checking valgrind skin (which, I must admit, I have not used).
