Race Conditions

The Go language is generally safe, in that errors in your program do not lead to unpredictable crashes (unless you use the facilities in the unsafe package). There is one classic cause of problems, however, which Go does not protect you from: race conditions. A race condition occurs when one goroutine modifies a variable and another reads it or modifies it without any synchronization.

Go makes correct synchronization easy, following the slogan of “do not communicate by sharing memory; instead, share memory by communicating.” Race conditions only occur when sharing memory. However, although correct synchronization is easy, the language does not enforce it. Race conditions are possible and I have seen them in real Go programs.

Is it possible for a language to be safe with regard to race conditions? In a language that supports multiple threads of execution, it’s always going to be possible to construct race conditions by doing things like having multiple threads write to the same offset in the file. Let’s rule out that somewhat exotic variety and concentrate on race conditions which don’t involve external constructs. Let’s also not worry about other threading problems like deadlock or starvation; they are real problems, but they are usually easier to understand and avoid.

Race conditions in memory always involve global variables, in which I include memory which can be accessed via a pointer. When more than one thread can access a global variable, the variable can become a nonlocal side effect: something which can affect program behaviour while changing in a way that is not obvious from reading the program code. When that variable can change without any synchronization, you may have a race condition.

One approach, then, is to prohibit variable modifications. When no variables can be modified, no race conditions are possible. This may sound odd to programmers used to conventional imperative languages like C++, Java, or Go, but this is the approach used by strict functional languages. There are few truly strict functional languages outside of academia, but languages like Haskell approach the goal by clearly delineating the non-strict set of operations. Functional languages generally do not have explicit threading models, but the lack of side effects makes it possible to automatically parallelize them. E.g., when a function is called with two arguments, both arguments can be evaluated in parallel; since there are no global variables, there is no way that the evaluation of one argument can affect the evaluation of the other.

While functional languages have some clear benefits, I think they face a problem of history. They’ve been around for a long time, but they are not widely used. It’s possible that this is merely a matter of education, but I think it’s a more fundamental issue. The real world is full of global state and non-local side effects. I think it’s hard for programmers to adapt to languages which don’t support them. Operations that are conceptually very simple, like sorting a list of numbers, become harder to understand in a functional language. Functional languages will always have adherents and will always have good uses in specific domains, but I think widespread adoption is unlikely. Go is obviously not a functional language.

A somewhat less strict approach for avoiding race conditions is for different threads of execution to not share memory. Without shared memory, all thread communication must use a limited set of communication operations which can avoid races. The Erlang language follows this model. Erlang is a functional language, though somewhat less strict than Haskell. Erlang has an explicit thread model. However, threads do not share memory, and can only communicate via message passing. In effect this enforces Go’s slogan: all sharing must be done by communication. This model also has a nice advantage in that threads can run on the same machine or a different one with no change in the program, but only a change in efficiency.

Go did not adopt this model because of efficiency considerations. Threads communicating in shared memory are far more efficient that threads communicating via message passing. In shared memory you can pass around pointers to large data structures. Without shared memory, large data structures must be sent as a whole. Real implementations of Erlang have a variety of implementation optimizations to reduce the cost of sending large data structures, but that means that programmers have to be aware of the available optimizations when writing programs.

One could argue that correct program behaviour is always more important than efficiency, and that the efficiency advantages of shared memory should be disregarded in the name of correctness. However, that argument does not hold up in the real world, where efficiency does matter. Erlang made the right choice for an environment in which threads can run on different machines and can even move from one machine to another. Go may not be the right choice for that environment. Of course, it is also not the environment in which most programs run.

I don’t know of any other approaches to avoiding race conditions in a programming language, though I’m sure there are some. One possibility would be for the language to maintain a global synchronization state. Each operation on a global variable would then be annotated with the current state. If two operations were done using an incompatible state, meaning operations done by different threads with no explicit synchronization, you would get an error.

If this were done purely dynamically, the result would be similar to Thread Sanitizer or helgrind. With language support, it could be constructed to avoid all race conditions. However, since race conditions would only be detected dynamically, it would mean that your program could still crash occasionally at runtime. This would be much better than the current situation, in which your program can behave unpredictably at runtime. The dynamic information would also provide a lot of the information required to debug the problem.

This approach could also be used statically, but I suspect it would require fairly extensive annotations on functions to indicate their synchronization status. It might be possible to deduce some of that statically using whole program analysis, but in a language like Go with channels I think that would be difficult in practice and impossible in theory (i.e., some programs would provably require annotations, and most programs would in practice require annotations).

So what would the dynamic approach look like? It would basically mean instantiating the Go memory model in each operation, tracking the happens before relationships. For example, give every goroutine a ticker. When a goroutine sends a value on a channel or acquires or releases a lock, increment the ticker. When sending a value on a channel or releasing a lock, attach the current goroutine and its tick. When receiving a value on a channel or acquiring a lock, record a happens-before relationship from the attached goroutine/ticker to the current goroutine/ticker. When modifying any global variable, including any shared memory, annotate the variable with the goroutine and its ticker. When reading any global variable, look at the current annotation, and require that there be a happens-before relationship between that annotation and the executing goroutine.

What’s nice about this approach is that it doesn’t require any change to the language or to your program. What’s not nice is that you have to pay a heavy execution cost, and you only catch all actual race conditions, you don’t catch all possible race conditions. It would be nice if the heavy execution cost could be mitigated by static analysis, but in general I don’t think it can. Escape analysis can prove that some variable reads don’t require checking annotations, but I think that even simple programs will defy such analysis for most reads.

On the other hand, with some caching, perhaps the execution cost would not be prohibitive. Most reads are going to be of values written by the current goroutine. That requires an extra read and comparison for each memory read, which would be bad but perhaps supportable. Since one would presumably want to track individual memory words, memory usage would double. Alternatively, since the Go memory model is written in terms of variables, one could instead have an extra word per variable. Then pointer accesses would require identifying the appropriate variable, even if the pointer were to the middle of the variable.

I doubt this could ever be the default way to run Go, but it would certainly be interesting to try an implementation and find the real performance effects.

15 Comments »

  1. rwild said,

    February 11, 2011 @ 12:17 am

    While pure happens-before analysis is safe in the sense that it does not
    report false positives, it is not sufficient to catch all errors. See:
    http://code.google.com/p/data-race-test/wiki/ThreadSanitizerAlgorithm

    My knowledge about Go is not very deep, but I wouldn’t see why that would not apply there.

  2. Ian Lance Taylor said,

    February 11, 2011 @ 9:54 am

    Thanks for the comment. Not catching all errors is what I meant when I wrote “you only catch all actual race conditions, you don’t catch all possible race conditions.” So it’s not perfect for fixing your program, but I believe it is sufficient to ensure that a Go program behaves in a safe manner: any actual race condition will cause a panic. This is similar to the way an out of range array reference will cause a panic in Go. That doesn’t help you fix your invalid program, but it does at least prevent your program from doing something unsafe.

  3. jyasskin said,

    February 11, 2011 @ 7:46 pm

    It’s also possible to take a middle path between Haskell and Erlang: allow immutable data to be shared between threads while mutable data can’t be passed across a channel. This is approximately the technique Rust is trying to use, although their garbage collector isn’t strong enough to actually share the immutable data between threads.

  4. Dmitry Vyukov said,

    February 12, 2011 @ 7:43 am

    Hi Ian,

    > I don’t know of any other approaches to avoiding race conditions in a programming language, though I’m sure there are some.

    Another approach is Unique Types. In short, a language’s type system enforces that there is at most one reference to an object at any given moment in time, so it’s safe to pass a pointer to a mutable object between threads. It allows to hand-off and object from one thread to another, which is indispensable for producer-consumer/pipeline patterns. For more info refer to:
    http://en.wikipedia.org/wiki/Uniqueness_type
    http://bartoszmilewski.wordpress.com/2009/11/30/unique-objects
    http://bartoszmilewski.wordpress.com/2009/09/22/ownership-systems-against-data-races

  5. Dmitry Vyukov said,

    February 12, 2011 @ 8:04 am

    > For example, give every goroutine a ticker…

    It’s definitely a viable approach. However note that the ticker must be a vector clock which records timestamps for *every* goroutine in a system (perhaps with the exception of zero entries which can be implicit). So that lightweight concurrency with thousands of goroutines can turn out badly.

  6. Dmitry Vyukov said,

    February 12, 2011 @ 8:21 am

    > While pure happens-before analysis is safe in the sense that it does not
    report false positives, it is not sufficient to catch all errors.

    It’s not that way. Pure happens-before analysis does catch all races that actually happen. And there is nothing you can do above that in a dynamic verification system, because it does not have any information about a program other than it observes as it happens.
    ThreadSanitizer’s Hybrid mode do not catch any more races than pure happens-before mode. It just makes additional guesses which can be real races or false positives, which is generally a bad thing.
    A way to catch all possible races is to proactively model and investigate all possible thread/process/goroutine interleavings (and memory model relaxations if a language supports them) on all possible inputs. It’s sort of undecidable problem (hard NP-hard problem). However from practical point of view it gives very good results – even if we can’t examine all executions, examination of millions of different executions is invaluable (it won’t work in interactive mode, though).

  7. Ian Lance Taylor said,

    February 14, 2011 @ 1:56 pm

    Thanks for the comments.

    Unique types are an interesting idea. I think that using them for a language like Go would require getting into some sort of distinction between mutable and immutable data. I’ll have to see if I can think of a way to make it work.

    I don’t see why the ticker has to be a vector clock, as long as all ticker values are associated with a goroutine. Or, to put it another way, if we make the goroutine part of the ticker value, then we can efficiently increment the ticker.

  8. Dmitry Vyukov said,

    February 15, 2011 @ 6:24 am

    > would require getting into some sort of distinction between mutable and immutable data

    Note that unique types are mutable.

    > I don’t see why the ticker has to be a vector clock

    I’m not sure how you can handle happens-before without vector clocks. Do you mean that goroutines and mutexes (synchronization primitives) contain only single tick value? May you outline how that will work?

  9. Ian Lance Taylor said,

    February 16, 2011 @ 7:30 am

    Unique types are mutable, yes, but in practice there are two kinds of pointers programs send on channels: pointers which transfer ownership of mutable data, and pointers which point to shared immutable data. It would not be appropriate for Go to treat all cases where a pointer is sent on a channel as a unique type. So it would require some way to distinguish the cases where ownership is transferred from the cases where it is not.

    I’m suggesting that goroutines have a single ticker value. Mutexes and channel sends have {goroutine,ticker} pairs. Locking a mutex or receiving a value from a channel creates a happens-before relationship from the {goroutine,ticker} pair attached to the channel/mutex to the unlocking/receiving {goroutine,ticker} pair. Those happens-before relationships have to be stored in some shared data structure which can represent transitive relationsihps. But I don’t see why individual goroutines need anything more complex than a single ticker.

    Obviously the shared data structure becomes a point of contention, but I’m suggesting (second to last paragraph of blog post) that caching may reduce the cost sufficiently to make it possible to use. And of course the shared data structure would only be accessed on a mutex or channel operation; other operations would consult the cached list of {goroutine,ticker} values known to happen before the currently running goroutine.

  10. Dmitry Vyukov said,

    February 17, 2011 @ 11:29 am

    > other operations would consult the cached list of {goroutine,ticker}

    Isn’t it a vector clock? 😉

    As for sending, since happens-before is a transitive relation, a goroutine needs to send whole own vector clock in explicit or implicit form. So it can send only {goroutine,ticker} and other state will be implicitly passed via that global shared structure. But for me it looks no more than a copy-on-write optimization. That is, instead of copying vector clock on send, a goroutine basically sends a reference to it (which is coded as {goroutine,ticker}), then it can’t mutate the clock, so it will create a new copy on modification (increment). Since sending and increment are always combined, it does not seem to buy much.
    Do I get your idea correctly?

  11. Ian Lance Taylor said,

    February 17, 2011 @ 8:44 pm

    I think you are describing my thoughts correctly. I think it makes a difference because sending a value on a channel does not change the global shared structure, only receiving a value does. Similarly unlocking a mutex does not change the global shared structure, only locking does. So it’s useful to be able to increment the ticker without needing any locks.

    I do agree that conceptually they are the same.

  12. Sharkey said,

    February 20, 2011 @ 8:57 pm

    What you describe sounds a bit like the transactional locking scheme described by Dice et al, http://home.comcast.net/~pjbishop/Dave/GVTL-TL2-Disc06-060711-Camera.pdf

    As with most STM implementations, a solid type system is much more important that the transactional part, but Go might be able to make it work…

  13. jnwhiteh said,

    February 24, 2011 @ 1:23 pm

    I have to say that I ran into a situation where I really did need to pass and relinquish ownership of a mutable data structure and after much looking decided that uniqueness types in the form of an explicit annotation would be an interesting addition to Go.

    Unfortunately it winds up looking a little like const creep for precisely the reason you mention above: it’s not appropriate for Go to do the same thing with any value sent on a channel.

    I don’t have a solution, but I’m definitely interested in doing work on uniqueness types in Go.. if you’re looking for a guinea pig!

  14. viswanathan said,

    March 24, 2011 @ 10:43 pm

    Race Condition
    ________________
    PROTOTYPE
    ————-

    Golobal variable Declaration

    ThreadA
    {
    condition check based on the global variable
    {
    modify the global variable
    }
    }

    ThreadB
    {
    condition check based on the global variable
    {
    modify the global variable
    }
    }

    Way to solve the race condition and capture who is tring to Access the resource which is already in use
    —————————————————————
    PROTOTYPE

    ThreadA
    {
    Check for the resource lock?
    1. if it is not free print the threadname. By this way you capture who is trying to access the resource which is already occupied
    2. if the resource is free proceed further
    Lock the resource // Using Semaphore–> count, binary, mutex
    Condition check based on the global variable
    {
    modify the global variable
    }
    Release the resource
    }

    ThreadB
    {
    Check for the resource lock?
    1. if it is not free print the threadname. By this way you capture who is trying to access the resource which is already occupied
    2. if the resource is free proceed further
    Lock the resource // Using Semaphore
    Condition check based on the global variable
    {
    modify the global variable
    }
    Release the resource
    }

    If we want to check whose is accessing the globally value current you can print the curent thread id.

  15. Somejan said,

    August 20, 2011 @ 1:04 pm

    I realize I’m a bit late commenting on this, but if you’re interested in other parallel processing approaches, you should really have a look at Clojure. If you already know about that, ignore this, but reading the post and comments, you may not.

    Clojure is a non-pure functional language that implements Software Transactional Memory that Sharkey above also mentioned. The idea is simple: shared variables can only be modified while in a transaction, and modifications only become visible once a transacton ends, just like with databases. If a collision happens, one of the transactions is stopped and later restarted automatically. This requires that the code within a transaction can safely be re-executed, but since most code is functional that’s not a big problem in Clojure.

    http://clojure.org/refs

RSS feed for comments on this post · TrackBack URI

You must be logged in to post a comment.