gccgo panic/recover

Back in March Go picked up a dynamic exception mechanism. Previously, when code called the panic function, it simply aborted your program. Now, it walks up the stack to the top of the currently running goroutine, running all deferred functions. More interestingly, if a deferred function calls the new recover function, the panic is interrupted, and stops walking the stack at that point. Execution then continues normally from the point where recover was called. The recover function returns the argument passed to panic, or nil if there is no panic in effect.

I just completed the implementation of this in gccgo. It turned out to be fairly complex, so I’m writing some notes here on how it works.

The language requires that panic runs the deferred functions before unwinding the stack. This means that if the deferred function calls runtime.Callers (which doesn’t work in gccgo, but never mind, it will eventually) it gets a full backtrace of where the call to panic occurred. If the language did not work that way, it would be difficult to use recover as a general error handling mechanism, because there would be no good way to dump a stack trace. Building up a stack trace through each deferred function call would be inefficient.

The language also requires that recover only return a value when it is called directly from a function run by a defer statement. Otherwise it would be difficult for a deferred function to call a function which uses panic and recover for error handling; the recover might pick up the panic for its caller, which would be confusing.

As a general gccgo principle I wanted to avoid requiring new gcc backend features. That raised some difficulty in implementing these Go language requirements. How can the recover function know whether it is being invoked directly by a function started by defer? In 6g, walking up the stack is efficient. The panic function can record its stack position, and the recover function can verify that it is at the correct distance below. In gccgo, there is no mechanism for reliably walking up the stack other than exception stack unwinding, which does not provide a helpful API. Even if it did, gccgo’s split stack code can introduce random additional stack frames which are painful to account for. And there is no good way for panic to mark the stack in gccgo.

What I did instead was have the defer statement check whether the function is it deferring might call recover (e.g., it definitely calls recover, or it is a function pointer so we don’t know). In that case, the defer statement arranges to have the deferred thunk record the return address of the deferred function at the top of the defer stack. This value is obtained via gcc’s address-of-label extension, so no new feature was required. This gives us a value which a function which calls recover can check, because a function can always reliably determine its own return address via gcc’s __builtin_return_address function.

However, if the stack is split, then __builtin_return_address will return the address of the stack splitting cleanup code rather than the real caller. To avoid that problem, a function which calls recover is split into two parts. The first part is a small thunk which is marked to not permit its stack to be split. This thunk gets its return address and checks whether it is being invoked directly from defer. It passes this as a new boolean parameter to the real function, which does permit a split stack. The real function checks the new parameter before calling recover; if it is false, it just produces a nil rather than calling recover. The real function is marked uninlinable, to ensure that it is not inlined into its only call site, which could blow out the stack.

That is sufficient to let us know whether recover should return a panic value if there is one, at the cost of having an extra thunk for every function which calls recover. Now we can look at the panic function. It walks up the defer stack, calling functions as it goes. When a function sucessfully calls recover, the panic stack is marked. This stops the calls to the deferred functions, and starts a stack unwind phase. The stack unwinding is done exactly the way that g++ handles exceptions. The g++ exception mechanism is general and cross-language, so this part was relatively easy. This means that every function that calls recover has an exception handler. The exception handlers are all the same: if this is the function in which recover returned a value, then simply return from the current function, effectively stopping the stack unwind. If this is not the function in which recover returned a value, then resume the stack unwinding, just as though the exception were rethrown in C++.

This system is somewhat baroque but it appears to be working. Everything is reasonably efficient except for a call to recover which does not return nil; that is as expensive as a C++ exception. Perhaps I will think of ways to simplify it over time.

Comments

Leave a Reply