Archive for Programming

Relational Databases Considered Harmful

My first real job, at the actual company called AIRS and the original owner of the domain, was working on what was then called a fourth generation language system. Nowadays we would simply call it a database. It was pretty powerful, and we had customers, but we only ran on a system called the Alpha Micro, which ran a proprietary operating system called AMOS. So that’s why you’ve never heard of AIRS, or, if not the only reason, at least a sufficient one.

Anyhow, what I wanted to say here is, we had a database, but it was not a relational database. It was what was then called a network database. Nowadays, we would simply say “huh?” But in fact I thought then, and continue to think now, that relational databases are basically a mistake.

The idea behind relational databases, as invented back in the ’70s by E.F. Codd at IBM, is that you can access them via a structured query language. We now know that language as SQL. A relational database is conceptually just a set of tables. SQL lets you express a database search by conceptually joining tables together in a few different ways (the most common being matching some index field) and then describing some condition to be satisfied on the joined table. SQL is nice because you can compose complex queries in a fairly logical manner.

The problem is that data in the real world doesn’t typically come in table format. In order to put it into table format you have to insert various key fields. That’s not so bad, actually, but when you have one-to-many or many-to-many relationships, SQL requires that you describe them as table joins. When the tables get large, actually joining them becomes expensive–prohibitively expensive. So instead the database has to figure out an efficient way to do the query which doesn’t involve doing an actual join, but instead involves traversing the tables in some manner while collecting the desired results. When the query is at all complicated, that becomes a hard problem. So for practical use relational database implementations require you to define various indexes which they can use, as well as more sophisticated solutions which I won’t go into.

The end result is that when you anticipate a user’s query and create the appropriate indexes, the query is fast. When the user does something you didn’t anticipate, the query is slow–for a large database, prohibitively slow. The user who doesn’t know which indexes you created has no idea which queries will be fast and which will be slow. So SQL gives you this logically powerful query language which is supposed to be simple to use, but in actual practice has traps for the unwary user.
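The index-dependence described above is easy to demonstrate with Python’s built-in sqlite3 module (the table and column names here are invented for illustration): the same query runs as a full table scan or as a keyed search depending solely on whether the right index happens to exist.

```python
import sqlite3

# Build a small table; names are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, i % 100, float(i)) for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN reports how sqlite will execute the query.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT total FROM orders WHERE customer_id = 42"
before = plan(query)   # no index: a full table scan
conn.execute("CREATE INDEX idx_customer ON orders (customer_id)")
after = plan(query)    # with the index: a keyed search

print(before)
print(after)
```

The query text never changes; only the hidden index makes the difference between scanning every row and jumping straight to the matches, which is exactly the trap an unwary user falls into.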

In a network database, you have to express your one-to-many and many-to-many relationships explicitly. And when you do a query, you have to follow the links defined by the database–there are no joins. The result is that an efficient query is easy to write, and an inefficient query is hard to write. This makes it harder to do data mining, but it’s actually not that bad. You can still support a structured query language, even SQL itself; you just use a system which rewrites it into something which can be implemented before it gets to the database rather than after. (At AIRS we had a programming language for queries which we called Logic, but most people used a visual query creator, with saved standard queries, etc.)
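A toy sketch of the network model, with invented record types, shows the point: relationships are explicit links stored with the records, and a query is just a walk along those links, so its cost is visible in the code rather than hidden behind a join.

```python
# Invented record types for illustration; a real network database would
# store these links on disk, not as Python object references.

class Customer:
    def __init__(self, name):
        self.name = name
        self.orders = []          # explicit one-to-many link set

class Order:
    def __init__(self, customer, total):
        self.customer = customer  # explicit back link to the owner record
        self.total = total
        customer.orders.append(self)

alice = Customer("Alice")
Order(alice, 10.0)
Order(alice, 25.0)

# An efficient query is easy to write: just follow the links.
total = sum(o.total for o in alice.orders)
print(total)  # 35.0
```

An inefficient query, by contrast, would force you to write the expensive traversal out by hand, so you can see exactly what you are paying for.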

I can’t argue that SQL isn’t really nice to use. But I can and do argue that relational databases are not the right way to organize data for efficient access.

Comments (3)

Archive Alignment

gold normally simply mmaps input files and reads the data directly from the mapped memory. In general this requires that all the data structures be properly aligned in the file, which is guaranteed by the ELF standard. The x86, of course, does not require proper alignment for memory loads and stores, so proper alignment was never actually tested. When David Miller tested gold on a SPARC host, he saw crashes.

It turns out that the problem was object files in archives. Archives only guarantee an alignment to 16-bit boundaries. ELF data structures require 32-bit or 64-bit boundaries (the latter for 64-bit ELF). Thus it is possible for an object file stored in an archive to be misaligned with respect to mmap.

I fixed this in gold in the obvious way. But it suggests that changing the archive format to enforce 64-bit alignment could give better linker performance. Unfortunately the archive format, which dates back to the old a.out format, is quite simple. The size of the object file is stored, but not the size that it takes up in the archive. And there is no global table of contents pointing to each object file. So a general archive file reader must walk through the archive file one object file at a time. Archive file readers must align to 16-bit boundaries as they go. There is no provision for recording the amount of required alignment.
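The member-by-member walk, and how a member can land on an odd 32-bit offset, can be sketched in a few lines of Python. This is a hand-built miniature of the traditional ar format (the member names and sizes are invented), not a production reader.

```python
# Sketch of the traditional ar archive layout: a global "!<arch>\n"
# magic, then for each member a 60-byte text header followed by the
# data, padded with "\n" only to an even (16-bit) boundary.

def make_member(name, data):
    # Fields: name(16) mtime(12) uid(6) gid(6) mode(8) size(10) magic(2).
    hdr = "{:<16}{:<12}{:<6}{:<6}{:<8}{:<10}`\n".format(
        name, 0, 0, 0, 644, len(data)).encode()
    pad = b"\n" if len(data) % 2 else b""
    return hdr + data + pad

archive = (b"!<arch>\n"
           + make_member("a.o/", b"12")      # 2-byte member
           + make_member("b.o/", b"4567"))   # next member lands misaligned

def walk(buf):
    # No table of contents and only the member size is stored, so the
    # reader steps member by member, rounding each offset up to an even
    # boundary -- nothing guarantees 32- or 64-bit alignment of the data.
    off = 8
    while off < len(buf):
        hdr = buf[off:off + 60]
        size = int(hdr[48:58].decode())
        yield hdr[:16].rstrip().decode(), off + 60, size
        off += 60 + size + (size % 2)

for name, data_off, size in walk(archive):
    print(name, data_off, size, data_off % 4)
```

Here the second member’s data starts at offset 130, which is 16-bit aligned but not 32-bit aligned, so an ELF object stored there cannot be read directly from mapped memory on a strict-alignment host.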

The only way I see to get optimal performance would be to actually define a new archive format, with a new magic string. It’s not clear that the compatibility hassles would be worth it.

Comments (9)

GCC Exception Frames

When an exception is thrown in C++ and caught by one of the calling functions, the supporting libraries need to unwind the stack. With gcc this is done using a variant of DWARF debugging information. The unwind information is loaded at runtime, but is not read unless an exception is thrown. That means that the unwind library needs to have some way of finding the appropriate unwind information at runtime.

On some systems, this is done by registering the exception frame information when the program starts. The registration is done with a variant of the handling of C++ constructors. This becomes interesting when one shared library can throw an exception which is caught by another shared library. It is possible for such a case to arise when the executable itself never throws exceptions and therefore has no frames to register. Obviously the unwinder needs to be able to find the unwind information for both shared libraries, which means that both shared libraries need to use the same registration functions. With gcc this is normally ensured by putting the unwind code in a shared library, libgcc_s.so. Each shared library, and sometimes the executable, will use libgcc_s.so. That ensures a single copy of the registration and unwind functions, so the library will be able to reliably unwind across shared libraries. With gcc the use of libgcc_s.so can be controlled with the -shared-libgcc and -static-libgcc options. Normally the right thing will happen by default.

That approach has a cost: there is an extra shared library, and there is a small cost of registering the unwind information at program startup or library load time (and unregistering it if a shared library is unloaded via dlclose). There is now a better way, which requires linker support.

Both gold and the GNU linker support the command line option --eh-frame-hdr. With this option, when the linker sees the .eh_frame sections used to hold the unwind information, it automatically builds a header. This header is a sorted array mapping program counter addresses to unwind information. The header is recorded as a program segment of type PT_GNU_EH_FRAME. (This is a little bit ugly since the .eh_frame sections are recognized only by name; ideally they should have a special section type.)

At runtime, the unwind library can use the dl_iterate_phdr function to find the program segments of the executable and all currently loaded shared libraries. It can use that to find the PT_GNU_EH_FRAME segments, and use the sorted array in those segments to quickly find the unwind information.
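Conceptually, the PT_GNU_EH_FRAME header is a sorted array mapping program counter addresses to unwind information, and the lookup is a binary search over it. A small Python sketch with invented addresses and offsets:

```python
import bisect

# Invented (function start address, unwind info offset) pairs standing
# in for the sorted table in a PT_GNU_EH_FRAME segment.
table = [(0x1000, 0x10), (0x1040, 0x28), (0x10a0, 0x44)]
starts = [pc for pc, _ in table]

def find_unwind_info(pc):
    # Find the last entry whose start address is <= pc, as the unwind
    # library does when an exception propagates through this frame.
    i = bisect.bisect_right(starts, pc) - 1
    return table[i][1] if i >= 0 else None

print(hex(find_unwind_info(0x1042)))  # 0x28: pc falls in the second entry
```

Because the table is sorted at link time, the runtime pays only a logarithmic search per frame instead of registering anything at startup.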

This approach means that no registration functions are required. It also means that it is not necessary to have a single shared library, since dl_iterate_phdr is available no matter which shared library throws the exception.

This all only works if you have a linker which supports generating the PT_GNU_EH_FRAME segment, if all the shared libraries and the executable are linked by such a linker, and if you have a working dl_iterate_phdr function in your C library or dynamic linker. I think that pretty much restricts this approach to GNU/Linux and possibly other free operating systems. For those scenarios, I hope that gcc will soon be able to stop using the registration functions by default.

Comments (2)

Concurrent linking

There is still work to do on gold. But once that is done, what is the next big step? In the long term we need an incremental linker, but I’m also interested in an idea which I call concurrent linking. A concurrent linker runs at the same time as the compiler. As each object file is compiled, the linker is notified, and the linker adds the object file to the executable that it is building. When the last compilation is completed, the linker writes out the fully linked executable.

The idea is to keep the linker from being a serializing step in a compilation. It is normally easy to run many compilations in parallel, as each compilation is independent. Traditionally, however, the linker cannot start until all the compilations are complete, and it must read all the input files at that time.

A concurrent linker can instead run at the same time as the compilations. There is no long wait after the compilations are complete. Also, the newly generated object file should be in the disk cache, and so the linker will have to do less actual disk I/O. I expect that this would only be a noticeable improvement for large programs, but then those are the cases where the linker really is a bottleneck.

The key to making this work will be careful control of the symbol table, and efficient tracking of relocations. In general, relocations for an object file may only be resolved when they refer to symbols defined in objects which appear earlier on the link command line. In other cases we must remember the relocation. If the symbol is already known to be defined in an object which appears later on the link line, it would probably be appropriate to resolve the relocation to that symbol, but to record it in case the symbol definition changes later.
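The bookkeeping described above can be sketched as a toy symbol table (all names invented): as each object arrives, resolve what we can, and remember relocations against symbols that have not been defined yet.

```python
from collections import defaultdict

class ConcurrentLinker:
    # A toy model of the symbol-table side of concurrent linking;
    # it ignores addresses, sections, and everything else a real
    # linker must track.
    def __init__(self):
        self.defs = {}                    # symbol -> defining object
        self.pending = defaultdict(list)  # symbol -> objects waiting on it
        self.resolved = []                # (object, symbol, defined_in)

    def add_object(self, name, defines, references):
        for sym in defines:
            self.defs[sym] = name
            # Resolve relocations that were waiting on this definition.
            for obj in self.pending.pop(sym, []):
                self.resolved.append((obj, sym, name))
        for sym in references:
            if sym in self.defs:
                self.resolved.append((name, sym, self.defs[sym]))
            else:
                self.pending[sym].append(name)

linker = ConcurrentLinker()
linker.add_object("main.o", defines=["main"], references=["helper"])
linker.add_object("util.o", defines=["helper"], references=[])
print(linker.resolved)       # main.o's reference resolved once util.o arrived
print(dict(linker.pending))  # nothing left outstanding
```

When the last compilation finishes, anything still in the pending table is an undefined symbol, and everything in the resolved list can be patched into the output file.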

In general an executable linked in this way may have additional text and data segments, and they won’t be as tightly packed. Thus the resulting executable will most likely have slightly worse paging behaviour and will run slightly slower. So this technique is only appropriate during the development cycle, not during a release build.

Comments (15)

Gold Released

I have finally released gold, the new ELF linker I’ve been working on, to the free software world. It is now part of the GNU binutils. I sent an announcement to the binutils mailing list.

Comments (6)
