Lock Collection

Memory management is a standard problem for programs written in C and C++. It is important to free memory when it is no longer needed, because otherwise the program will eventually run out. Unfortunately, it is difficult to keep track of which memory is no longer needed. The symptoms of memory management problems are typically either running out of memory, or obscure and difficult-to-debug memory corruption.

Any program can do memory management successfully with a disciplined approach. However, the problems are pervasive enough that there is a standard solution for them: garbage collection. Java, for example, always uses garbage collection, and therefore largely avoids the memory management problem. The same is true of Lisp, and of most interpreted languages. Garbage collection decreases performance and increases memory usage. Still, many people find it to be a valuable technique.

Is there any comparable approach which will work for parallel programming? The problem in parallel programming is race conditions. The standard solution is mutexes and condition variables, and variants thereof. These can work with a disciplined approach. However, in practice, the problems are pervasive. Is there something like garbage collection which can provide a complete solution?
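
The standard recipe the post mentions can be sketched concretely. A minimal Java example, assuming nothing beyond `java.util.concurrent.locks` (the class and method names here are illustrative): a counter guarded by a mutex, with a condition variable that lets another thread wait for a target value. This is the disciplined-but-manual style that the rest of the post is looking for an alternative to.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition reached = lock.newCondition();
    private long count = 0;

    public void increment() {
        lock.lock();                 // mutual exclusion: one thread at a time
        try {
            count++;
            if (count == 200_000) {
                reached.signalAll(); // wake any thread waiting on the condition
            }
        } finally {
            lock.unlock();
        }
    }

    public long awaitTotal(long target) throws InterruptedException {
        lock.lock();
        try {
            while (count < target) { // loop guards against spurious wakeups
                reached.await();
            }
            return count;
        } finally {
            lock.unlock();
        }
    }

    public static long run() throws InterruptedException {
        LockedCounter c = new LockedCounter();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                c.increment();
            }
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start();
        t2.start();
        long total = c.awaitTotal(200_000);
        t1.join();
        t2.join();
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints 200000
    }
}
```

Note how much of this is ceremony — the `try`/`finally` pairing, the wakeup loop — that the programmer must get right every time; forgetting any piece reintroduces the race conditions the locks were meant to prevent.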

There has been a lot of recent work on transactional memory. This is a technique which tracks all reads and writes in a locked section, and rolls back the writes if two threads turn out to have executed the section simultaneously with conflicting accesses. I don’t personally see the point. Transactional memory still requires the programmer to place locks correctly. The advantage it provides is that it can provide good performance even with coarse-grained locks. However, the good performance only comes if lock contention is low. When lock contention is high, the bookkeeping overhead of transactional memory is high. And the coarser the locks, the higher the lock contention. So it doesn’t seem like a significant change to the current systems.
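
Transactional memory is not available in standard Java, but the optimistic read–compute–commit–retry shape it automates can be sketched by hand with an atomic compare-and-set loop. This sketch (the class is illustrative, not any real STM library) also shows where the contention cost comes from: every conflicting commit forces a retry, which is exactly the bookkeeping overhead described above.

```java
import java.util.concurrent.atomic.AtomicReference;

public class OptimisticCounter {
    private final AtomicReference<Long> value = new AtomicReference<>(0L);

    public void add(long delta) {
        while (true) {
            Long snapshot = value.get();     // "begin transaction": read a snapshot
            Long updated = snapshot + delta; // compute against the snapshot
            if (value.compareAndSet(snapshot, updated)) {
                return;                      // "commit": no one interfered
            }
            // Another thread committed first; discard our work and retry.
            // Under high contention, this retry loop is where time goes.
        }
    }

    public long get() {
        return value.get();
    }

    public static long run() throws InterruptedException {
        OptimisticCounter c = new OptimisticCounter();
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) c.add(1);
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return c.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints 200000
    }
}
```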

What we need is some way for multiple threads of control to share access to data structures in memory without requiring locks. And we need some way to guarantee data structure invariants, so that while the structure is being updated by one thread other threads do not see an inconsistent state. It’s difficult to see how we can maintain invariants without annotations from the programmer.

The goal I am setting is a high one: to be able to write code in a relatively straightforward manner, and to permit that code to use multiple threads of control. I don’t see a solution.

A fallback position might be the one I discussed earlier: require the separate threads to operate in separate address spaces, and provide library calls to move data back and forth. This is a more difficult programming model, so it doesn’t seem like the perfect solution. But I don’t see a good one.
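
The message-passing fallback can be sketched even within one address space. A minimal example, assuming only `java.util.concurrent` (Java threads do share memory, so this merely imitates the discipline that separate address spaces would enforce): threads never touch shared structures directly and communicate only through a channel.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class MessagePassing {
    public static long run() throws InterruptedException {
        BlockingQueue<Long> channel = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            try {
                for (long i = 1; i <= 100; i++) {
                    channel.put(i); // blocks if the channel is full
                }
                channel.put(-1L);   // sentinel: no more messages
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        long sum = 0;
        while (true) {
            long msg = channel.take(); // blocks until a message arrives
            if (msg == -1L) break;
            sum += msg;
        }
        producer.join();
        return sum; // 1 + 2 + ... + 100
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints 5050
    }
}
```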



3 responses to “Lock Collection”

  1. baruch

    One way that can help is to use immutable objects and use copy-on-write on the object collections to provide a consistent view on the objects. While this is a nice idea I still don’t know how to implement it. There is a blog by Wesner Moise that discussed such issues a while ago, but that was in a dot-net environment.
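
The idea baruch describes does have a library implementation in Java, which makes a concrete sketch possible: `CopyOnWriteArrayList` copies its backing array on every mutation, so an iterator taken before a write keeps seeing the old, consistent snapshot.

```java
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

public class CowSnapshot {
    public static int run() {
        CopyOnWriteArrayList<String> names = new CopyOnWriteArrayList<>();
        names.add("a");
        names.add("b");

        Iterator<String> snapshot = names.iterator(); // pins the current copy
        names.add("c");                               // writer copies the array

        int seen = 0;
        while (snapshot.hasNext()) { // still sees only "a" and "b"
            snapshot.next();
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 2
    }
}
```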

  2. ncm

    The technique baruch describes is implemented as RCU in the Linux kernel. In the form it’s used there, it’s patented by IBM, and freely licensed only for use in GPL code. It turns out the hard part is knowing when you can free the old versions kept around for the benefit of threads that might be looking at them. The patent is on one solution, involving noticing when every processor has experienced a context switch, after which any code that looks must be using the new version, so the old one can safely be freed.
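
The publication half of the pattern ncm describes can be sketched in Java (this is only the read-copy-update shape, not the kernel's RCU machinery): writers copy the structure, update the copy, and publish it with one atomic pointer swap, while readers just load the pointer. Notably, the hard part ncm mentions — knowing when the old version can be freed — is exactly what a garbage collector answers automatically, which ties this comment back to the post's theme.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

public class RcuStyle {
    private final AtomicReference<List<Integer>> data =
            new AtomicReference<>(Collections.emptyList());

    public List<Integer> read() {
        return data.get(); // lock-free read of a consistent snapshot
    }

    public void append(int value) {
        while (true) {
            List<Integer> old = data.get();
            List<Integer> copy = new ArrayList<>(old); // copy...
            copy.add(value);                           // ...update the copy...
            List<Integer> published = Collections.unmodifiableList(copy);
            if (data.compareAndSet(old, published)) {  // ...publish atomically
                return; // the old version is reclaimed by GC once unreferenced
            }
        }
    }

    public static int run() {
        RcuStyle r = new RcuStyle();
        List<Integer> before = r.read(); // reader holds the old snapshot
        r.append(1);
        r.append(2);
        return before.size() + r.read().size(); // 0 + 2
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 2
    }
}
```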

  3. tromey

    Immutable data structures are quite nice when you can use them. These days whenever I see a language with a mutable String type, I immediately think “yuck”, because I’ve gotten so much mileage over the years from having an immutable String in Java. Immutable objects can be memoized and this is helpful in a number of situations.
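
The memoization point can be made concrete with a small sketch (the method names are illustrative): because an immutable key like Java's `String` can never change after it is hashed, a cached result stays valid and can be shared between threads.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class Memoized {
    private static final ConcurrentHashMap<String, Integer> cache =
            new ConcurrentHashMap<>();
    private static final AtomicInteger computations = new AtomicInteger();

    // String is immutable in Java, so the key cannot change under us.
    static int expensiveLength(String s) {
        return cache.computeIfAbsent(s, key -> {
            computations.incrementAndGet(); // count real computations
            return key.length();            // stand-in for expensive work
        });
    }

    public static int run() {
        expensiveLength("hello");
        expensiveLength("hello"); // served from the cache
        expensiveLength("hello");
        return computations.get(); // only one real computation happened
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1
    }
}
```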

    In general it is hard to think about ways to make this invisible to the programmer. But, GC isn’t really invisible either… you pay a performance cost, you sacrifice some low-level control and determinacy (though this is complicated), and you do still have to remember to null out references in order to avoid memory leaks. So, maybe it is worth considering what cost you would pay for automated multi-threading. E.g., in gold you avoid fine-grained locks because of the performance cost — but perhaps an automated approach would require a sacrifice here.

    I suppose the message-passing approach would end up looking like Erlang (yet another language I know of but don’t actually know). Note that separate address spaces are only required if you are writing in a low-level, unconstrained language like C. A different language may provide guarantees that eliminate this need. (I don’t know whether this is true for Erlang.)

    Languages can help in other ways, for instance Java’s “synchronized” block structure… this eliminates some flexibility, but, IMNSHO, at a very low cost.
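
A minimal illustration of the `synchronized` block structure mentioned here (the account class is made up for the example): the language ties lock acquire and release to block scope, so an early return or an exception cannot leak a held lock — the flexibility lost is precisely the ability to hold a lock across scopes.

```java
public class SyncBlock {
    private final Object lock = new Object();
    private int balance = 0;

    public void deposit(int amount) {
        synchronized (lock) { // lock acquired here...
            balance += amount;
        }                     // ...and released here, even on an exception
    }

    public int balance() {
        synchronized (lock) {
            return balance;
        }
    }

    public static int run() throws InterruptedException {
        SyncBlock account = new SyncBlock();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) account.deposit(1);
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return account.balance();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints 20000
    }
}
```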

    Like ncm’s earlier VSIPL++ example, I think there is a lot to be said for libraries which do much of the difficult work. Java has various kinds of containers: some are thread-safe but lock-free (due to atomic operations), others are explicitly unsynchronized, but there is a corresponding library facility to create synchronized variants — very useful. This latter feature is done in an ad hoc way but perhaps a language (I’m looking at you, C++) could provide straightforward ways to wrap classes to add locking via a generic mechanism. Or, maybe a language could provide ways to assert that, say, a container may only contain objects of an immutable type.
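
The two container flavors contrasted here both exist in Java's standard libraries, so a short sketch is possible: a lock-free queue built on atomic operations, and a plain unsynchronized list made thread-safe after the fact by the generic wrapping facility mentioned.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ContainerFlavors {
    public static int run() {
        // Thread-safe without locks: a CAS-based concurrent queue.
        ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
        queue.add(1);
        queue.add(2);

        // An explicitly unsynchronized container, wrapped by a library
        // facility that synchronizes every method -- the ad hoc mechanism
        // the comment suggests a language could generalize.
        List<Integer> safeList =
                Collections.synchronizedList(new ArrayList<>());
        safeList.add(3);
        safeList.add(4);

        return queue.size() + safeList.size();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 4
    }
}
```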

    I suppose each library, like VSIPL++, is a sort of special case. And, maybe you lose some flexibility this way. But, that is ok. A lesson of the Java class libraries and the STL is that, in the majority of situations, it doesn’t matter.

    Finally, I wonder if you’ve looked at: OpenMP, C[], UPC, or the new parallel STL stuff that Benjamin checked in recently. I haven’t found the time to look at any of these yet 🙁
