Archive for Programming

Signed or Unsigned

C has always permitted comparisons between any integer type, and C++ follows its lead. Comparing signed types to signed types is straightforward: you sign extend the smaller type. Likewise, when comparing unsigned types to unsigned types, you zero extend. When comparing signed and unsigned types, the rules are less clear.

The C standard specifies a type ordering: long long > long > int > short > char. If the unsigned type appears in that ordering before the signed type, then the signed value is converted to the unsigned type. Note that this happens even if the types are the same size (e.g., either long long and long or long and int are often the same size). Otherwise, if the signed type is larger than the unsigned type, in the sense of having more bits, then the unsigned value is converted to the signed type. Otherwise both values are converted to the unsigned type which corresponds to the signed type.

Pre-standard K&R C used a different rule, but that is old enough now that we no longer have to worry about it.

What this rule means is that if you write portable code, such that you don’t know the sizes of types, you can not predict whether the comparison will be done as a signed comparison or an unsigned comparison. Therefore, the gcc compiler has an option -Wsign-compare. However, this option is sufficiently awkward to avoid that it is not part of -Wall, though it is part of -Wextra (the difference between -Wall and -Wextra is that the former gives warnings for which false positives are easy to avoid through simple code changes; the latter gives warnings which are generally useful but for which false positives are harder to avoid).

There are good reasons to use signed types: they don’t have odd behaviour around zero, so you can write i < limit - 1 without worrying about the case limit == 0. There are good reasons to use unsigned types for things like the number of elements in a container: you get the full range of sizes, rather than limiting yourself to only the positive half. In particular, the C++ standard containers use unsigned types as their size. Combining these two rules gets you in trouble with portable code. The only reasonable answer I can see for portable code is to use -Wsign-compare and work around the many false positive warnings.

Go avoids these problems in two ways. First, there are no implicit conversions, so you can never be surprised by having a comparison become unsigned when you expected signed. You have to explicitly say which type of conversion you mean. Second, Go intentionally discards half of memory, and takes the philosophy that if you want a container which can hold more values than fit in a signed int, you should write a special purpose large container.

Comments (2)

Thread Sanitizer

I recently ran the gold linker under Thread Sanitizer. It’s a nice plugin for Valgrind which looks for race conditions in multi-threaded programs. To describe it briefly, it builds Happens-Before relationships based on mutex operations and warns when it notices a write and a read/write to the same memory location without a Happens-Before relationship. This approach can yield false positives to be sure, but it does a very nice job of identifying real problems.

It was able to identify one real bug in gold, one problem that led to less efficient link time, and several cases where several threads would set a shared memory location to the same value. The latter cases are not a problem on x86 architectures, though they could be on other processors. In order to get clean Thread Sanitizer results in the future, I fixed all of the cases so that I could get a clean run of gold, at least with the default settings.

The real bug that it found was a typical multi-threaded bug: the code looked fine, but it had a well-hidden error. Gold uses a workqueue of tasks to execute, with a pool of worker threads. Many of the tasks are run using a blocker token. The blocker token is set to the number of tasks that precede it. As each task completes, it decrements the blocker count. When the count goes to zero, the next set of tasks can start. This is a simple way to parallelize linker operations, in which one set of operations (e.g., process the symbol tables) must be run before the next set (e.g., process the relocations) can begin. Naturally I paid close attention to the blocker behaviour when a task completed, and there were no problems there. The problem arose in setting the blocker count when the tasks started. The code was doing a loop of “increment the blocker count” and then “queue the task.” What I forgot was that the process of queuing the task actually lets another thread in the pool start working on it immediately. When the task completed, it decremented the blocker count with a lock. But if the task completed fast enough, the initial code was still running the loop queuing up new tasks, and thus incrementing the blocker count. I didn’t think that I needed to lock the increment, since I wasn’t expecting any task to actually complete before I started all of the tasks. A dumb mistake—just the kind of mistake one makes in multi-threaded programming.

Gold is written in C++. In Go I would of course have each task communicate its completion on a channel. The locking would be handled by the runtime, and there would be no chance for me to make the same sort of error. If you write multi-threaded code, and you can’t use Go, you should definitely check out Thread Sanitizer.

Comments (4)

Go Linkage Names

All Go code lives in a package. Every Go source file starts with a package declaration which names the package that it lives in. A package name is a simple identifier; besides appearing in a package clause, package names are also used when referring to names imported from another package. That poses the problem of what to do when one program links together two different packages which use the same package name. We can’t expect the author of a large program to be aware of every package that the program uses. However, since Go compiles straight to object files, it’s natural to use the package name in the generated symbol names. How can we avoid multiple definition errors?

The gc compiler comes with its own Go specific linker. That linker now supports automatic symbol renaming at link time based on the name used to import the package. That name is presumed to be unique. This means that all imports of the same package must use the same name to import it; otherwise you might get multiple definitions of a global variable in the package. In the future there may be some need to adjust packages which are distributed without their source code, to ensure that they don’t accidentally alias locally compiled package names.

For the gccgo compiler I have so far avoided using a specific linker, or rather linker wrapper. For large programs gccgo now requires a new option, -fgo-prefix=PREFIX to be used when compiling a package. The PREFIX should be a string unique to that package; for example, in a typical installation, it could be the directory where the package is installed. This gives a unique name used in the compiled code. If the -fgo-prefix option is not used, everything will still work as long as there are not, in fact, two packages with the same name.

Comments (9)

Protected Symbols

Now for something really controversial: what’s wrong with protected symbols?

In an ELF shared library, an ordinary global symbol may be overridden if a symbol of the same name is defined in the executable or in a shared library which appears earlier in the runtime search path. This is called symbol interposition. It is often used with functions such as malloc. A shared library can define malloc and it can have code which calls malloc. If the executable linked with the shared library defines malloc itself, then the version in the executable will be used rather than the version in the shared library. This permits the executable to control the memory allocation done by the shared library, perhaps for debugging or logging purposes. In this regard, shared libraries act much as static archives do.

This has a few consequences. One of them is that within a shared library, all references to a global symbol must use the GOT and PLT, to make the overriding possible. That means that all function calls and variable accesses are slightly slower. Also, some compiler optimizations are forbidden: the compiler can not inline a call to a global symbol, since that symbol might be overridden at run time.

When building a shared library, you can provide a version script which indicates that some symbols are actually not global. That can eliminate the GOT and PLT accesses, but it does not permit the compiler optimizations, and you do have to write that version script and keep it up to date.

When compiling code that goes into a shared library, you can set the visibility of symbols. You can use hidden visibility, which means that the symbol is not visible outside the shared library. You can use internal visibility, which is a lot like hidden—I’ll skip the difference here. Or you can use protected visibility. Protected visibility means that the symbol is visible outside of the shared library, and can be accessed as usual. However, all references from within the shared library will use the definition in the shared library. In other words, the symbol acts more or less as usual, but it can not be overridden. This means that accesses to the symbol avoid the GOT and PLT, and it permits compiler optimizations.

So, what’s wrong with them? It turns out that protected symbols are slower at dynamic link time, which means that programs which use the shared library start up slower. This happens because of the C rule that two pointers to the same function must compare as equal. Since protected symbols are globally visible, you can get a pointer to a protected function in the main executable. You can also get a pointer to that same function in the shared library, of course. Those pointers have to be equal, or the C rule will break.

As noted, the access to the function in the shared library will not use the GOT or PLT. The access in the main executable obviously will use the PLT. How can we make those function pointers equal? We can’t. The executable will have a direct reference to the PLT. The shared library will have a direct reference to the function itself. In neither case will there be a relocation for the reference. So there is no way to make the results equal. (This can work for some targets, but not for ones with simple function references like the x86 targets.)

So, I must have lied. The lie was that there is a case where you need to use the GOT for a protected symbol: when compiling position independent code for a shared library, and taking the address of a protected function, you need to use the GOT. Unfortunately, gcc for the x86_64 target, surely the most widely used gcc target today, gets this wrong: http://gcc.gnu.org/PR19520. This generally reveals itself as an error report when you go to create a shared library: relocation R_X86_64_PC32 against protected symbol `NAME' can not be used when making a shared object.

In any case, when the compiler gets it right, the dynamic linker has to fill in that GOT entry. In order to make the function pointers compare as equal, it has to fill in the entry with the address of the PLT in the executable (or the earlier shared library). But remember, this is a protected symbol, and protected symbols don’t support symbol interposition. So the dynamic linker must only use the PLT of the executable if the reference in the executable refers to the definition in the shared library. That means that when the dynamic linker sees a reloc against a protected symbol in a shared library, it has to do another walk through the executable and earlier shared libraries to see if any of them have a definition for the symbol, in which case the GOT entry must not be set to that earlier PLT entry but must instead be set to the address of the symbol in the shared library itself. This check has to be done for every symbol in the shared library.

Those extra symbol resolution passes means a slow down for every program which uses the shared library, and that is what is wrong with protected symbols.

So how do you get the compiler and linker speedups available by avoiding symbol interpositioning? Unfortunately, you have to give your symbols hidden visibility, which means that they can not be accessed from other modules. Assuming you do want them to be accessed, you need to define symbol aliases for the ones which should be publicly visible. That means that you need to use different names for the hidden symbols. This is awkward at best. Unfortunately I have nothing better to offer. ELF is designed to support symbol interpositioning, and there is no very good way to avoid that without causing other consequences.

Comments (2)

Version Scripts

I recently spent some time sorting through linker version script issues, so I’m going to document what I discovered.

Linker symbol versioning was invented at Sun. The Solaris linker lets you use a version script when you create a shared library. This script assigns versions to specific named symbols, and defines a version hierarchy. When an executable is linked against the shared library, the versions that it uses are recorded in the executable. If you later try to dynamically link the executable with a shared library which does not provide the required versions, you get a sensible error message.

Sun’s scheme (as I understand it) only permits you to add new versions and new symbols. Once a symbol has been defined at a specific version, you can not change that in later releases. if you change the behaviour of a symbol, you don’t change the version of the symbol itself, instead you add a new version to the library even if it does not define any symbols. That is sufficient to ensure that an executable will not be dynamically linked against a version of the shared library which is too old.

Eric Youngdale and Ulrich Drepper introduced a more sophisticated symbol versioning scheme in the GNU linker and the GNU/Linux dynamic linker. The GNU linker permits symbols to have multiple versions, of which only one is the default. These versions are specified in the object files linked together to form the shared library. The assembler .symver directive is used to assign a version to a symbol (the version is simply encoded in the name of the symbol). This scheme permits using symbol versioning to actually change the behaviour of a symbol; older executables will continue to use the old version. This also permits deleting symbols, by removing the default version. The older versions of the symbol remain but are inaccessible.

That is all fine. The problems come in with the extensions to the version script language. First, the GNU linker permits wildcards in version scripts. Second, the GNU linker permits symbols to match against demangled names, again typically using wildcards. Third, the GNU linker permits the version script to hide symbols which have explicit versions in input object files.

Every symbol can only have one version. When the linker asks for the version of a symbol, there can only be one answer. The support for wildcards and matching of demangled names in the GNU linker script means that there may not be a unique answer for the version to use for a given name. The fact that the GNU linker permits version scripts to hide symbols with explicit versions means that in some cases you absolutely must list a symbol two times in a version script (because you might have a local: *; entry which must not match your symbol with an old version). This potential confusion means that using linker scripts correctly with wildcards requires a clear understanding of exactly how the linker parses a version script.

Unfortunately, this was never documented. Until now. Here are the rules which the GNU linker uses to parse version scripts, as of 2010-01-11.

The GNU linker walks through the version tags in the order in which they appear in the version script. For each tag, it first walks through the global patterns for that tag, then the local patterns. When looking at a single pattern, it first applies any language specific demangling as specified for the pattern, and then matches the resulting symbol name to the pattern. If it finds an exact match for a literal pattern (a pattern enclosed in quotes or with no wildcard characters), then that is the match that it uses. If finds a match with a wildcard pattern, then it saves it and continues searching. Wildcard patterns that are exactly “*” are saved separately.

If no exact match with a literal pattern is ever found, then if a wildcard match with a global pattern was found it is used, otherwise if a wildcard match with a local pattern was found it is used.

This is the result:

  • If there is an exact match, then we use the first tag in the version script where it matches.
    • If the exact match in that tag is global, it is used.
    • Otherwise the exact match in that tag is local, and is used.
  • Otherwise, if there is any match with a global wildcard pattern:
    • If there is any match with a wildcard pattern which is not “*”, then we use the tag in which the last such pattern appears.
    • Otherwise, we matched “*”. If there is no match with a local wildcard pattern which is not “*”, then we use the last match with a global “*”. Otherwise, continue.
  • Otherwise, if there is any match with a local wildcard pattern:
    • If there is any match with a wildcard pattern which is not “*”, then we use the tag in which the last such pattern appears.
    • Otherwise, we matched “*”, and we use the tag in which the last such match occurred.

As mentioned above, there is an additional wrinkle. When the GNU linker finds a symbol with a version defined in an object file due to a .symver directive, it looks up that symbol name in that version tag. If it finds it, it matches the symbol name against the patterns for that version. If there is no match with a global pattern, but there is a match with a local pattern, then the GNU linker marks the symbol as local.

I want gold to be compatible, but I also want gold to be efficient. I’ve introduced a hash table in gold to do fast lookups for exact matches. That makes it impossible for gold to follow the exact rules when matching demangled names. Currently gold does not do the final lookup to see if a symbol with an explicit version should be forced local; I don’t understand why that is useful. It is possible that I will be forced to add that to gold at some later date.

Here are the current rules for gold:

  • If there is an exact match for the mangled name, we use it.

    • If there is more than one exact match, we give a warning, and we use the first tag in the script which matches.
    • If a symbol has an exact match as both global and local for the same version tag, we give an error.
  • Otherwise, we look for an extern C++ or an extern Java exact match. If we find an exact match, we use it.
    • If there is more than one exact match, we give a warning, and we use the first tag in the script which matches.
    • If a symbol has an exact match as both global and local for the same version tag, we give an error.
  • Otherwise, we look through the wildcard patterns, ignoring “*” patterns. We look through the version tags in reverse order. For each version tag, we look through the global patterns and then the local patterns. We use the first match we find (i.e., the last matching version tag in the file).
  • Otherwise, we use the “*” pattern if there is one. We give a warning if there are multiple “*” patterns.

I hope for your sake that this information never actually matters to you.

Comments

« Previous entries Next Page » Next Page »