Kernel Linker Features

As I continue trying to build the kernel with gold, I’ve had to copy several features from the GNU linker to gold.

Historically, the GNU linker implemented the -R option to mean that it should only use the symbols found in the named object; the object should not actually be included in the output file. This feature is convenient for some types of embedded system work: you link a loadable module against an object or executable which represents the operating system. This gives you a simple-minded system call interface, in which you don’t have to specify an ABI, you just have to rebuild all your programs whenever the OS changes. This is also more or less how SVR3 shared libraries are used.

Historically, ELF linkers used -R to add a directory to the runtime search path. That is, the historic GNU linker -R was equivalent to --just-symbols, and the historic ELF linker -R was equivalent to -rpath. This caused a natural conflict when ELF support was added to the GNU linker. The compiler wanted to pass -R to the linker in some cases (in those days; I don’t think it does anymore), but -R was already implemented to mean something different.

Back in 1994 I resolved this issue in the GNU linker in a simple way: I made -R do different things based on whether the argument was a file or a directory. Given a file, it acts like --just-symbols; given a directory, it acts like -rpath. In retrospect, I should have decided that one of those choices was going to be the new default, and issued a warning whenever the other one was used. That would have encouraged people to change, and by now the ambiguity could have been removed. Admittedly, at the time it was not so clear that ELF was going to become the dominant object file format.
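The file-or-directory rule can be sketched in a few lines of Python. This is only an illustration of the decision described above, not the code in either linker; the function name `interpret_dash_r` is made up for the example.

```python
import os


def interpret_dash_r(argument):
    """Return the option that a -R argument is treated as.

    Hypothetical helper, not taken from GNU ld or gold: a directory
    is handled like -rpath, anything else is assumed to be a file
    and handled like --just-symbols.
    """
    if os.path.isdir(argument):
        return ("-rpath", argument)
    return ("--just-symbols", argument)
```

The check depends on what exists on disk when the linker runs, so the same command line can mean different things on different machines. That is the ambiguity a warning would have helped retire.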

In gold I originally made -R equivalent to -rpath, as is appropriate for an ELF linker. Unfortunately, I’ve now discovered that the Linux build uses -R to mean --just-symbols. This is part of the vsyscall implementation. They build a shared library and a relocatable object in the same way using the same linker script. Then they refer to the relocatable object using -R. The effect is that they can make direct calls to the symbols defined in the shared library without having the linker introduce any of the usual PLT overhead. This of course assumes that the shared library will always be loaded at the specified address, which they ensure by other means.

So I’ve changed gold to make -R ambiguous in the same way as the GNU linker. I also had to implement --just-symbols. Maybe I should introduce that -R warning now, so that 14 years later the ambiguity can be removed.

Now, as it happens, that same linker script includes version specifications, and those version specifications force some symbols to be local. That is what the kernel wants for the shared library, but not for the relocatable object. The relocatable object needs to keep a symbol accessible even though it is local in the shared library. This works with the GNU linker because the GNU linker simply ignores version specs when doing a relocatable link. Version specs do make sense for a relocatable link, in particular the ability to force symbols to be local. So gold used to honor them more or less as one would expect; in fact this happened by default. Unfortunately I have now had to change that, and gold now ignores version scripts for relocatable links. Perhaps I will add a new option to enable them.

A minor feature used by the kernel build is to create an empty output file (not literally empty, but an ELF file with no symbols and no sections) when there are no input files. The GNU linker rejects being invoked with no input arguments. However, it is possible to pass it an empty archive, and it will proceed to generate an empty output file. Implementing this in gold was more complicated than one might expect, because up until now gold had no notion of the default target. gold always generated an output file in the same format as the input files. I had to introduce a default target based on the configuration options. I hope it will not be necessary to support the full power of the --oformat option, which essentially uses magic names to pick which type of ELF file to generate.
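Here is a rough Python sketch of the fallback, under assumptions of my own: the target names are just example strings, `DEFAULT_TARGET` stands in for whatever the configuration options select, and letting the first input object decide the format is a simplification for illustration rather than a description of gold’s actual logic.

```python
# Hypothetical stand-in for a value fixed when the linker is configured.
DEFAULT_TARGET = "elf64-x86-64"


def choose_output_target(input_targets, default_target=DEFAULT_TARGET):
    """Pick the output format from the inputs, else fall back to a default.

    input_targets holds the target name of each object actually read.
    An empty archive contributes no objects, so the list may be empty.
    """
    for target in input_targets:
        # Assumption for this sketch: the first input object decides.
        return target
    return default_target
```

With an empty archive as the only input, `input_targets` is empty and the configured default is the only thing left to go on.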

However, it did turn out to be necessary to support a special case of the --oformat option: --oformat binary. This directs the GNU linker to generate a binary file rather than a regular object file. It only makes sense for an executable whose load addresses are all at or near zero. The kernel build uses this feature to build a boot sector. It turned out to be fairly easy to implement in gold. gold uses mmap to build the output file, and already supported using an anonymous map and writing it to the output when complete. This was introduced to support -o /dev/null while still generating all linker warnings. It also permits writing the output file to standard output, which is slightly cool though useless. To implement binary output I just generated the regular ELF file to a memory buffer, and then copied the contents to the real output file using the load addresses.
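The copying step can be pictured with a short Python sketch. It is not gold’s code: the `(load_address, data)` pairs are a made-up stand-in for the loadable contents of the in-memory ELF image, and starting the output at the lowest load address with zero-filled gaps is an assumption made for the example.

```python
def flatten_to_binary(segments):
    """Lay out (load_address, data) pairs as one flat binary image.

    Assumptions for this sketch: the image starts at the lowest load
    address, and gaps between pieces are filled with zero bytes.
    """
    if not segments:
        return b""
    base = min(addr for addr, _ in segments)
    end = max(addr + len(data) for addr, data in segments)
    image = bytearray(end - base)  # zero-initialized buffer
    for addr, data in segments:
        offset = addr - base
        image[offset:offset + len(data)] = data
    return bytes(image)
```

The sketch also shows why load addresses near zero matter: every byte of address space between the pieces becomes a byte of padding in the file, so widely scattered addresses would produce a huge and mostly empty output.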

The next feature I have to implement is --format binary, in which an input file is not an ELF file, but is treated as a binary blob. With luck that will be the last new linker feature the kernel requires.

2 Comments

  1. ncm said,

    February 7, 2008 @ 1:14 pm

    I think you’re selling short the Linux maintainers’ willingness to make improvements. They may be enthusiastic enough about abandoning GNU ld to provide some forward compatibility. After all, your linker only needs to work with current kernels, not historical ones. My guess is that the limit to such changes is reached only at the point where they can’t link with GNU ld any more.

  2. Ian Lance Taylor said,

    February 7, 2008 @ 6:13 pm

    I may resort to that if I have to, but so far I’ve only spent a couple of days on this part of the effort. Some support for linker scripts would have been required in any case.
