Linkers part 5

Shared Libraries Redux

Yesterday I talked about how shared libraries work. I realized that I should say something about how linkers implement shared libraries. This discussion will again be ELF specific.

When the program linker puts position dependent code into a shared library, it has to copy more of the relocations from the object file into the shared library. They will become dynamic relocations computed by the dynamic linker at runtime. Some relocations do not have to be copied; for example, a PC relative relocation to a symbol which is local to the shared library can be fully resolved by the program linker, and does not require a dynamic reloc. However, note that a PC relative relocation to a global symbol does require a dynamic relocation; otherwise, the main executable would not be able to override the symbol. Some relocations have to exist in the shared library, but do not need to be actual copies of the relocations in the object file; for example, a relocation which computes the absolute address of a symbol which is local to the shared library can often be replaced with a RELATIVE reloc, which simply directs the dynamic linker to add the difference between the shared library’s load address and its base address. The advantage of using a RELATIVE reloc is that the dynamic linker can compute it quickly at runtime, because it does not require determining the value of a symbol.
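
To make the cost argument concrete, here is a minimal sketch of what processing RELATIVE relocs amounts to. The struct and function names are invented for illustration and are not the real ELF types or any real dynamic linker's code. It assumes the i386 style of relocation, where the addend is stored in the word being patched.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a RELATIVE dynamic reloc:
   just the offset of the word to patch, relative to the base address. */
struct relative_reloc {
    uint32_t offset;
};

/* Apply each reloc to an image that has been loaded into memory.
   load_bias is the difference between the actual load address and the
   base address the program linker assumed.  No symbol lookup happens:
   each word simply has load_bias added to it. */
static void apply_relative_relocs(uint8_t *image, uint32_t load_bias,
                                  const struct relative_reloc *relocs,
                                  size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t word;
        /* memcpy avoids unaligned or type-punned access. */
        memcpy(&word, image + relocs[i].offset, sizeof word);
        word += load_bias;
        memcpy(image + relocs[i].offset, &word, sizeof word);
    }
}
```

The loop body is a load, an add, and a store, which is why a RELATIVE reloc is so much cheaper than one that needs a symbol table search.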

For position independent code, the program linker has a harder job. The compiler and assembler will cooperate to generate special relocs for position independent code. Although details differ among processors, there will typically be a PLT reloc and a GOT reloc. These relocs will direct the program linker to add an entry to the PLT or the GOT, as well as performing some computation. For example, on the i386 a function call in position independent code will generate a R_386_PLT32 reloc. This reloc will refer to a symbol as usual. It will direct the program linker to add a PLT entry for that symbol, if one does not already exist. The computation of the reloc is then a PC-relative reference to the PLT entry. (The 32 in the name of the reloc refers to the size of the reference, which is 32 bits.) Yesterday I described how on the i386 every PLT entry also has a corresponding GOT entry, so the R_386_PLT32 reloc actually directs the program linker to create both a PLT entry and a GOT entry.
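
The computation part of a PLT32-style reloc can be sketched in a few lines. The i386 processor supplement gives the calculation as L + A - P, where L is the address of the PLT entry, A is the addend, and P is the address of the field being relocated. The function name and the addresses used to exercise it are made up for illustration.

```c
#include <stdint.h>

/* Value to store for a 32-bit PC-relative reference to a PLT entry:
   L + A - P.  Unsigned arithmetic wraps modulo 2^32, which matches
   what a 32-bit field can hold. */
static uint32_t plt32_value(uint32_t plt_entry_addr, /* L */
                            int32_t addend,          /* A */
                            uint32_t place)          /* P */
{
    return plt_entry_addr + (uint32_t)addend - place;
}
```

On the i386 the addend for a call instruction is normally -4, because the processor measures the displacement from the end of the 4-byte operand rather than from its start.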

When the program linker creates an entry in the PLT or the GOT, it must also generate a dynamic reloc to tell the dynamic linker about the entry. This will typically be a JMP_SLOT or GLOB_DAT relocation.

This all means that the program linker must keep track of the PLT entry and the GOT entry for each symbol. Initially, of course, there will be no such entries. When the linker sees a PLT or GOT reloc, it must check whether the symbol referenced by the reloc already has a PLT or GOT entry, and create one if it does not. Note that it is possible for a single symbol to have both a PLT entry and a GOT entry; this will happen for position independent code which both calls a function and also takes its address.
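
The bookkeeping described here might look something like the following sketch. All of the names are hypothetical; a real linker keeps this state in its symbol table entries, and on the i386 creating a PLT entry would also reserve the GOT slot that goes with it, which this sketch leaves out.

```c
#define NO_ENTRY (-1)

/* Hypothetical per-symbol record kept by the program linker. */
struct link_symbol {
    const char *name;
    int plt_index;  /* NO_ENTRY until a PLT reloc is seen */
    int got_index;  /* NO_ENTRY until a GOT reloc is seen */
};

/* Hypothetical counters for the PLT and GOT being built. */
struct link_state {
    int plt_count;
    int got_count;
};

/* Called for each PLT reloc: allocate an entry only on first use. */
static int ensure_plt_entry(struct link_state *st, struct link_symbol *sym)
{
    if (sym->plt_index == NO_ENTRY)
        sym->plt_index = st->plt_count++;
    return sym->plt_index;
}

/* Called for each GOT reloc: allocate an entry only on first use. */
static int ensure_got_entry(struct link_state *st, struct link_symbol *sym)
{
    if (sym->got_index == NO_ENTRY)
        sym->got_index = st->got_count++;
    return sym->got_index;
}
```

A symbol that is both called and has its address taken ends up with both indexes set, which is the case mentioned above.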

The dynamic linker’s job for the PLT and GOT tables is to simply compute the JMP_SLOT and GLOB_DAT relocs at runtime. The main complexity here is the lazy evaluation of PLT entries which I described yesterday.

The fact that C permits taking the address of a function introduces an interesting wrinkle. In C you are permitted to take the address of a function, and you are permitted to compare that address to another function address. The problem is that if you take the address of a function in a shared library, the natural result would be to get the address of the PLT entry. After all, that is the address to which a call to the function will jump. However, each shared library has its own PLT, and thus the address of a particular function would differ in each shared library. That means that function pointers generated in different shared libraries may compare unequal when they should compare equal. This is not a purely hypothetical problem; when I did a port which got it wrong, before I fixed the bug I saw failures in the Tcl shared library when it compared function pointers.

The fix for this bug on most processors is a special marking for a symbol which has a PLT entry but is not defined. Typically the symbol will be marked as undefined, but with a non-zero value: the value will be set to the address of the PLT entry. When the dynamic linker is searching for the value of a symbol to use for a reloc other than a JMP_SLOT reloc, if it finds such a specially marked symbol, it will use the non-zero value. This will ensure that all references to the symbol which are not function calls will use the same value. To make this work, the compiler and assembler must make sure that any reference to a function which does not involve calling it will not carry a standard PLT reloc. This special handling of function addresses needs to be implemented in both the program linker and the dynamic linker.
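
The lookup rule can be modelled with a small sketch. This is a toy, not how any real dynamic linker is written: the types, the linear search, and the example addresses are all invented, and it assumes modules are searched in order with the executable first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified dynamic symbol. */
struct dyn_symbol {
    const char *name;
    int defined;     /* nonzero if this module defines the symbol */
    uint32_t value;  /* address if defined; PLT entry address if
                        undefined but specially marked; otherwise 0 */
};

struct module {
    const struct dyn_symbol *syms;
    size_t nsyms;
};

enum reloc_kind { RELOC_JMP_SLOT, RELOC_OTHER };

/* Search the modules in order.  Returns 0 if nothing is found. */
static uint32_t lookup(const struct module *mods, size_t nmods,
                       const char *name, enum reloc_kind kind)
{
    for (size_t m = 0; m < nmods; m++) {
        for (size_t i = 0; i < mods[m].nsyms; i++) {
            const struct dyn_symbol *s = &mods[m].syms[i];
            if (strcmp(s->name, name) != 0)
                continue;
            if (s->defined)
                return s->value;
            /* Undefined with a non-zero value: the address of a PLT
               entry.  Use it for anything that is not a call. */
            if (s->value != 0 && kind != RELOC_JMP_SLOT)
                return s->value;
        }
    }
    return 0;
}
```

With an executable that carries a marked undefined symbol and a library that defines it, a JMP_SLOT lookup skips the marked symbol and finds the real definition, while every other lookup gets the same PLT address.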

ELF Symbols

OK, enough about shared libraries. Let’s go over ELF symbols in more detail. I’m not going to lay out the exact data structures; go to the ELF ABI for that. I’m going to talk about the different fields and what they mean. Many of the different types of ELF symbols are also used by other object file formats, but I won’t cover that.

An entry in an ELF symbol table has eight pieces of information: a name, a value, a size, a section, a binding, a type, a visibility, and undefined additional information (currently there are six undefined bits, though more may be added). An ELF symbol defined in a shared object may also have an associated version name.

The name is obvious.

For an ordinary defined symbol, the section is some section in the file (specifically, the symbol table entry holds an index into the section table). For an object file the value is relative to the start of the section. For an executable the value is an absolute address. For a shared library the value is relative to the base address.

For an undefined reference symbol, the section index is the special value SHN_UNDEF which has the value 0. A section index of SHN_ABS (0xfff1) indicates that the value of the symbol is an absolute value, not relative to any section.
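
As a compact summary of the last two paragraphs, here is a sketch of how a tool might decide what a symbol's value means. SHN_UNDEF and SHN_ABS are the real ABI names and values; the enums and the function are hypothetical.

```c
#include <stdint.h>

/* Special section indexes from the ELF ABI. */
#define SHN_UNDEF 0
#define SHN_ABS   0xfff1

/* Hypothetical classification of the containing file. */
enum file_kind { FILE_OBJECT, FILE_EXECUTABLE, FILE_SHARED };

/* Hypothetical classification of what the value field means. */
enum value_kind {
    VALUE_UNDEFINED,         /* undefined reference */
    VALUE_ABSOLUTE,          /* absolute value or address */
    VALUE_SECTION_RELATIVE,  /* offset from the start of the section */
    VALUE_BASE_RELATIVE      /* offset from the library's base address */
};

static enum value_kind classify_value(enum file_kind kind, uint16_t shndx)
{
    if (shndx == SHN_UNDEF)
        return VALUE_UNDEFINED;
    if (shndx == SHN_ABS)
        return VALUE_ABSOLUTE;
    switch (kind) {
    case FILE_OBJECT:     return VALUE_SECTION_RELATIVE;
    case FILE_EXECUTABLE: return VALUE_ABSOLUTE;
    case FILE_SHARED:     return VALUE_BASE_RELATIVE;
    }
    return VALUE_UNDEFINED;
}
```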

A section index of SHN_COMMON (0xfff2) indicates a common symbol. Common symbols were invented to handle Fortran common blocks, and they are also often used for uninitialized global variables in C. A common symbol has unusual semantics. Common symbols have a value of zero, but set the size field to the desired size. If one object file has a common symbol and another has a definition, the common symbol is treated as an undefined reference. If there is no definition for a common symbol, the program linker acts as though it saw a definition initialized to zero of the appropriate size. Two object files may have common symbols of different sizes, in which case the program linker will use the largest size. Implementing common symbol semantics across shared libraries is a touchy subject, somewhat helped by the recent introduction of a type for common symbols as well as a special section index (see the discussion of symbol types below).
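
The merging rules for common symbols can be sketched as a small state machine. The types and function are hypothetical, and the sketch only covers a single program link: it does not report multiple definition errors and it ignores shared libraries entirely.

```c
#include <stdint.h>

enum sym_kind { SYM_UNDEFINED, SYM_COMMON, SYM_DEFINED };

/* Hypothetical running result for one symbol name. */
struct merged_symbol {
    enum sym_kind kind;
    uint32_t size;
};

/* Fold one more occurrence of the symbol into the running result. */
static void merge_symbol(struct merged_symbol *acc,
                         enum sym_kind kind, uint32_t size)
{
    if (acc->kind == SYM_DEFINED)
        return;  /* a real definition wins; later commons act as references */

    if (kind == SYM_DEFINED) {
        acc->kind = SYM_DEFINED;
        acc->size = size;
    } else if (kind == SYM_COMMON) {
        if (acc->kind != SYM_COMMON) {
            acc->kind = SYM_COMMON;
            acc->size = size;
        } else if (size > acc->size) {
            acc->size = size;  /* the largest common size is used */
        }
    }
}
```

If the result is still SYM_COMMON once every input file has been seen, the linker would allocate that many bytes of zero-initialized space for the symbol.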

The size of an ELF symbol, other than a common symbol, is the size of the variable or function. This is mainly used for debugging purposes.

The binding of an ELF symbol is global, local, or weak. A global symbol is globally visible. A local symbol is only locally visible (e.g., a static function). Weak symbols come in two flavors. A weak undefined reference is like an ordinary undefined reference, except that it is not an error if a relocation refers to a weak undefined reference symbol which has no defining symbol. Instead, the relocation is computed as though the symbol had the value zero.

A weak defined symbol is permitted to be linked with a non-weak defined symbol of the same name without causing a multiple definition error. Historically there are two ways for the program linker to handle a weak defined symbol. On SVR4 if the program linker sees a weak defined symbol followed by a non-weak defined symbol with the same name, it will issue a multiple definition error. However, a non-weak defined symbol followed by a weak defined symbol will not cause an error. On Solaris, a weak defined symbol followed by a non-weak defined symbol is handled by causing all references to attach to the non-weak defined symbol, with no error. This difference in behaviour is due to an ambiguity in the ELF ABI which was read differently by different people. The GNU linker follows the Solaris behaviour.
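
The two historical behaviours can be written down as a small decision function. The names are hypothetical. The weak-followed-by-weak case is not covered by the description above; keeping the first definition there is an assumption of this sketch.

```c
enum binding { BIND_GLOBAL, BIND_WEAK };
enum policy { POLICY_SVR4, POLICY_SOLARIS };
enum outcome { KEEP_FIRST, USE_SECOND, MULTIPLE_DEFINITION };

/* Decide what happens when a second definition of a name is seen
   after a first definition of the same name. */
static enum outcome resolve_defs(enum policy p,
                                 enum binding first, enum binding second)
{
    if (first == BIND_GLOBAL && second == BIND_GLOBAL)
        return MULTIPLE_DEFINITION;

    /* A later weak definition never displaces an earlier definition.
       (For weak followed by weak this is an assumption.) */
    if (second == BIND_WEAK)
        return KEEP_FIRST;

    /* Weak followed by non-weak: this is where the readings differ. */
    return p == POLICY_SVR4 ? MULTIPLE_DEFINITION : USE_SECOND;
}
```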

The type of an ELF symbol is one of the following:

  • STT_NOTYPE: no particular type.
  • STT_OBJECT: a data object, such as a variable.
  • STT_FUNC: a function.
  • STT_SECTION: a local symbol associated with a section. This type of symbol is used to reduce the number of local symbols required, by changing all relocations against local symbols in a specific section to use the STT_SECTION symbol instead.
  • STT_FILE: a special symbol whose name is the name of the source file which produced the object file.
  • STT_COMMON: a common symbol. This is the same as setting the section index to SHN_COMMON, except in a shared object. The program linker will normally have allocated space for the common symbol in the shared object, so it will have a real section index. The STT_COMMON type tells the dynamic linker that although the symbol has a regular definition, it is a common symbol.
  • STT_TLS: a symbol in the Thread Local Storage area. I will describe this in more detail some other day.
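
Although I am not laying out the full data structures, it may help to see how the binding and the type share one byte of the symbol table entry. The constant values below are the ones in the ELF ABI; the helper function names are my own, standing in for the ELF32_ST_BIND, ELF32_ST_TYPE, and ELF32_ST_INFO macros that the ABI defines.

```c
#include <stdint.h>

/* Binding values from the ELF ABI. */
#define STB_LOCAL  0
#define STB_GLOBAL 1
#define STB_WEAK   2

/* Type values from the ELF ABI. */
#define STT_NOTYPE  0
#define STT_OBJECT  1
#define STT_FUNC    2
#define STT_SECTION 3
#define STT_FILE    4
#define STT_COMMON  5
#define STT_TLS     6

/* The binding lives in the high four bits of the info byte. */
static unsigned st_bind(uint8_t info) { return info >> 4; }

/* The type lives in the low four bits. */
static unsigned st_type(uint8_t info) { return info & 0xf; }

/* Pack a binding and a type back into one byte. */
static uint8_t st_info(unsigned bind, unsigned type)
{
    return (uint8_t)((bind << 4) | (type & 0xf));
}
```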

ELF symbol visibility was invented to provide more control over which symbols were accessible outside a shared library. The basic idea is that a symbol may be global within a shared library, but local outside the shared library.

  • STV_DEFAULT: the usual visibility rules apply: global symbols are visible everywhere.
  • STV_INTERNAL: the symbol is not accessible outside the current executable or shared library.
  • STV_HIDDEN: the symbol is not visible outside the current executable or shared library, but it may be accessed indirectly, probably because some code took its address.
  • STV_PROTECTED: the symbol is visible outside the current executable or shared object, but it may not be overridden. That is, if a protected symbol in a shared library is referenced by other code in the shared library, that other code will always reference the symbol in the shared library, even if the executable defines a symbol with the same name.
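
The four visibilities boil down to two yes-or-no questions about a global symbol, which the following sketch spells out. The STV values and the two-bit mask are from the ELF ABI; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Visibility values from the ELF ABI. */
#define STV_DEFAULT   0
#define STV_INTERNAL  1
#define STV_HIDDEN    2
#define STV_PROTECTED 3

/* The visibility is the low two bits of the "other" byte. */
static unsigned st_visibility(uint8_t other) { return other & 0x3; }

/* Can code outside this executable or shared library bind to the
   symbol by name? */
static bool visible_outside(unsigned vis)
{
    return vis == STV_DEFAULT || vis == STV_PROTECTED;
}

/* Can a definition elsewhere override references made from inside
   this executable or shared library? */
static bool may_be_overridden(unsigned vis)
{
    return vis == STV_DEFAULT;
}
```

Internal and hidden symbols give the same answers to both questions, so the difference between them does not show up here.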

I’ll describe symbol versions later.

More tomorrow.

11 Comments »

  1. christian schorn » Blog Archive » links for 2007-08-30 said,

    August 30, 2007 @ 1:22 pm

    [...] Airs – Ian Lance Taylor » Linkers part 5 (tags: programming basics) [...]

  2. lev said,

    September 19, 2007 @ 9:48 pm

    I’m finding this series of posts on linkers very interesting, and mostly very clear. I have a couple of questions regarding this one, though.

    1) I understand why it’s wrong giving the address of the PLT entry when C code takes the address of a function. But you say that the way around this is to specially mark such uses of a function, with a special symbol that has the value of the address of the PLT entry. Isn’t this the same thing? I’m obviously not following that paragraph properly.

    2) What possible difference can it make to the linker whether a symbol is marked STV_INTERNAL vs STV_HIDDEN? I can understand that the compiler might be able to do some optimizations if it knows that the function will never be called from outside the executable/shared lib — maybe can avoid loading the PIC register since you know it’s already done by the caller. But that’s the compiler: why would the linker need to know the difference between internal and hidden?

    Thanks for some interesting articles.

  3. Ian Lance Taylor said,

    September 19, 2007 @ 11:15 pm

    Thanks for the note.

    It’s OK to make the address of the function be the address of the PLT entry; what matters is that every reference to the function, no matter where it occurs, gets that same address. So there is a two-step process. First, the program linker marks the dynamic symbol in a special way by giving it a non-zero value A, and it also uses A for any relocations which reference the function. The dynamic linker then makes sure to use A for any reference to the function other than actually calling it. That is, in the main executable, A is used for any reloc other than a PLT reloc. And likewise in a shared library (typically the only other reloc would be a GOT reloc). This ensures that every reference to the function, other than calling it, gets the value A. Thus comparisons of the function address for equality will work.

    Note that this is not an issue for a function defined in the executable. In that case the dynamic linker will always use the address in the executable for all references to the function. This is also not an issue for a function reference from a shared library. The shared library will naturally have dynamic relocations for the function, and the usual dynamic linker algorithm will ensure that all those relocations refer to the same value.

    The problem only arises for a function reference from the main executable. In this case there may not be a dynamic relocation for all references to the function, since the program linker will be able to resolve those relocations to the PLT address. So the special non-zero value in the dynamic symbol table records that there was a reference to the function other than calling it, and it tells the dynamic linker that it must use that address when resolving dynamic relocations in shared libraries other than calling the function.

    I hope that makes some sense.

    Finally, you’re right, both the program linker and the dynamic linker should treat internal and hidden symbols exactly the same. Explicitly recording both types in the ELF symbol visibility field is just for information. Actually I don’t know of any systems which actually treat internal symbols differently from hidden symbols in any way, though no doubt there are some.

  4. lev said,

    September 25, 2007 @ 7:53 am

    Thanks for the further explanation. That was the clue I needed. It took me a while looking at the assembler that gets generated, but I figured it out. I had the wrong idea about how the relocations worked for the main executable.
    This stuff is certainly confusing, at least if you’re not used to the proper way of thinking.

    As for the difference between internal and hidden, I found this discussion:
    http://groups.google.com/group/generic-abi/browse_thread/thread/1a84adc15666164
    where Jim Dehnert, apparently the SGI representative who originally requested the addition of STV_INTERNAL to the gABI, posts here:
    http://groups.google.com/group/generic-abi/browse_thread/thread/2c3c04f556d9b84d
    He can’t remember exactly why they needed it, but thinks it was only relevant to link-time (interprocedural) optimization. Everyone else in that discussion (8 authors) says that they treat STV_HIDDEN and STV_INTERNAL identically.

    Finally, if you’re not fed up with answering questions about visibility…. In Ulrich Drepper’s DSO how-to:
    http://people.redhat.com/drepper/dsohowto.pdf
    Drepper says that protected visibility sounds nice but is even more expensive than default visibility. I can’t see why this would be. I see that it would be very tricky if you were allowed to use protected function addresses in a non-call way in the DSO. But the GNU toolchain specifically forbids this. E.g.:

    cmt:~/dso> cat w.c
    void prot(void) __attribute__ (( visibility ("protected") ));
    int f(void (*p)(void) )
    {
    return p==prot;
    }

    void prot(void)
    {
    /*nothing*/
    }

    cmt:~/dso> gcc -fpic -o w.so -shared w.c
    /usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/bin/ld: /tmp/ccsNpSI0.o: relocation R_386_GOTOFF against protected function `prot' can not be used when making a shared object
    /usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/bin/ld: final link failed: Bad value
    collect2: ld returned 1 exit status

    As long as one can deal with this restriction, shouldn’t protected visibility be an optimal solution for both intra-DSO calls (bypassing the dynamic linker and the PLT) and calls and non-call references from outside the DSO (which just use the same mechanisms as they would with default visibility)? Seems like it has just the same effect as Drepper’s suggested:

    void __attribute__(( visibility("default") )) prot(void)
    {
    }
    extern __typeof(prot) prot_int __attribute__ (( alias("prot"), visibility("hidden") ));

    …where you then have to remember to use prot_int when referring to the function from within the DSO. The toolchain does allow taking the address of prot_int with this method, but you do have to be careful since it won’t be the same as the address of prot.

    On the other hand, I’m reluctant to assume that I know any better than Ulrich Drepper about this stuff — he generally seems to know what he’s talking about :-) so… any thoughts? (I spotted some tricky-looking code in glibc’s elf/dl-lookup.c maybe relating to this, but I don’t really follow it).

    I’m done reading up to part 17 of your series, and none of the other sections have puzzled me as much as this whole thing about the meaning of symbol visibilities.

  5. Ian Lance Taylor said,

    September 25, 2007 @ 11:43 pm

    Thanks for the comment.

    Ulrich is saying that a protected function symbol is expensive because if a shared library references it without calling it, and if the application also references it without calling it, then both references have to return the same address. I personally don’t think this is worth worrying about, as the dynamic linker can tell, based on the relocation, whether the function is being called or referenced. This means that a reference rather than a call in a shared library is not optimally efficient. But I don’t immediately see why it has to be any more expensive than an ordinary reference to a function in a shared library. In any case, references to functions are not the normal case.

    The GNU linker’s restriction on using a GOTOFF reloc for a protected function symbol seems to be an attempt to avoid a bug in getting the address of the function. But it seems to be the wrong approach. It should really be marking the GOT entry with an appropriate reloc so that the dynamic linker can resolve it. I don’t see any reason that that can not work.

    So, yes, I think protected function symbols should work fine, and I don’t see any reason to avoid them (modulo toolchain bugs). But I also don’t see them as an optimal solution in general. Making a symbol protected changes the semantics: the symbol can no longer be overridden from outside the shared library. If that is what you want, then fine. But if you want the default semantics, then protected visibility is not helpful.

  6. lev said,

    September 26, 2007 @ 5:26 am

    Thanks for responding.

    I think the GNU linker’s restriction on using GOTOFF for a protected function symbol is because it would be impossible for the dynamic linker to get it right in all possible cases. When the executable references a function of the same name as the protected symbol, there are possibilities the dynamic linker has to distinguish between: 1) the executable’s reference will resolve to the protected function (in which case the reference in the DSO has to be resolved to the executable’s PLT address, just as in the default visibility case); 2) the executable’s reference will resolve to a different function (in which case the reference can be resolved, for example, to the protected function’s load address). Unfortunately at the time of resolving the GOTOFF reference in the DSO, the dynamic linker has no way of choosing between these two possibilities (in particular, it might change in the future, in the presence of dlopen(…,RTLD_DEEPBIND) and so on). So, it seems necessary to disallow references to protected symbols.

    As for changing the semantics and preventing the symbol from being overridden, this is desirable in the case that Ulrich describes — he’s trying to minimize the number of dynamic relocations needed, in order to speed startup of large applications with many libraries. His suggested solution using an internal hidden version of the symbol has the same semantics and also cannot be overridden. I guess I’ll ask Ulrich what his concern was with protected symbols.

  7. quietdragon said,

    November 23, 2008 @ 9:09 am

    I think it is also worthy of reference when distinguishing weak from non-weak symbols that the TIS ELF specification says:

    > When the link editor searches archive libraries, it extracts archive
    > members that contain definitions of undefined global symbols. The
    > member’s definition may be either a global or a weak symbol. The
    > link editor does not extract archive members to resolve undefined
    > weak symbols. Unresolved weak symbols have a zero value.

    The penultimate sentence is key.

  8. ELF Special Sections | Ben.ZH said,

    September 27, 2010 @ 4:58 pm

    [...] st_other: currently holds 0. GNU use it  to mark the visibility of the symbole to other compments. Its value are ‘DFAULT’ ‘HIDDEN’ ‘INTERNAL’ and ‘PROTECTED’. ‘DFAULT’ means the symbol is visible anywhere. Other three discribed in “GNUAssembler Directives“. One googled blog talk about it too, http://www.airs.com/blog/archives/42 [...]

  9. Ma.Jiang said,

    May 19, 2011 @ 1:05 am

    Thank you, Taylor. This is a really useful article.
    But I have a question: why should the address of a function defined in a shared lib be the address of its PLT entry (not the real virtual address of the function)?
    I think using the real address is OK, as long as all the places use the same value. And I note that on the x86 architecture, if the executable file was compiled with -fpic, the addresses of the functions in libs were their real addresses.

  10. Ian Lance Taylor said,

    May 19, 2011 @ 9:19 pm

    You’re right: using the real address would be OK if all places used the same value. The problem is the executable. The executable is usually not compiled with -fPIC. That means that the references to the function in the code will be compiled to refer to some absolute value that the linker must fill in. Using a dynamic relocation for the code would not be a good idea, as then the code could not be shared. The linker has to use some address that will work as the address of the function, but since the function is (for this example) defined in a shared library, the real address is not known at link time. So the linker uses the address of the PLT entry in the executable. The dynamic linker then has to do the same thing, so that all references use the same value. Hope that makes sense.

  11. Ma.Jiang said,

    May 19, 2011 @ 10:59 pm

    Thank you for the answer.
    I think I’ve got your idea: the key problem is that the executable might be compiled without -fPIC.
    Thank you again!
