Multithreaded Garbage Collection

Garbage collection is the traditional solution to the problem of managing memory. Multithreaded programming is the current wave of the future. I’ve written about the difficulties of multithreaded programming before, but people are going to do it regardless. In which case: how do we garbage collection in a multithreaded program?

Let’s assume that we don’t want to halt the whole program during garbage collection, because that is expensive. In that case, it’s not too hard to understand how it can be done if you can 1) halt the whole program (other than the garbage collection thread) for a brief period; 2) any change to a heap object will put the object on a list of changed objects; and 3) you can assume that all pointer loads and stores are atomic with respect to each other. Then the garbage collection thread can halt the program while it scans the roots, let the program run while it does a mark pass, halt the program again and scan the changed objects, and let the program while it does a sweep. (This has to be an in-place garbage collector, not one that moves the valid objects).

It’s possible to implement those requirements for an interpreted language like the traditional setting for Java. You can still JIT code that uses the heap, and it will help to do some escape analysis to see whether a heap pointer can possibly escape the function.

I don’t really see how to implement those requirements for a native code language like C++. In particular tracking the changed objects seems somewhat painful. There was a garbage collection proposal for the next C++ standard, though I believe that it may have been withdrawn. But I don’t see how to implement garbage collection efficiently in a multi-threaded programming. I did some web searches, but the most helpful sounding ideas I could find were all in academic papers which weren’t online. I wonder if there are any actual implementations which try to implement my suggested requirements.


Posted

in

by

Tags:

Comments

3 responses to “Multithreaded Garbage Collection”

  1. Jonathan Avatar
    Jonathan

    I noticed this paper yesterday – it may (or may not) be of interest : Cell GC: Using the Cell Synergistic Processor as a Garbage Collection Coprocessor http://www.research.ibm.com/cell/papers/2008_vee_cellgc.pdf

  2. ncm Avatar

    The GC proposal for C++ was withdrawn because it was dreadfully immature. There was lots of interest, in principle, but it was clear that there was no way it could be ready in time for C++0x. There was also lots of opposition, not helped by advocates’ dismissive attitude toward problems, and spontaneously spurious statistics (90% and 99% figured prominently).

    Part of the problem, in my opinion, is that people who advocate built-in GC are chasing two opposed goals. One, the basis for the business of the proposal author, is rescuing buggy, leaky programs. The other is sound resource management. Everybody agrees, on paper, that GC is no substitute for running destructors, but a program that runs destructors gets vanishingly little benefit from built-in GC. A program that fails to run destructors leaks memory, but that’s likely to be the least of its problems.

  3. Ian Lance Taylor Avatar

    Jonathan: thanks for the link. They seem to be talking about concurrent garbage collection of interpreted Java.

    ncm: A program like gcc uses GC not because it doesn’t use destructors, but because the lifetime of memory objects is uncertain, and there can be multiple pointers to a single memory object.

Leave a Reply