Cross-compilation

GCC fully supports cross-compilation: building a program on one machine in order to run it on another. However, it’s quite painful to actually build a full cross-toolchain with compiler, tools, and libraries. Many people new to the process have the same reaction: it simply can’t be this hard. But it is.

The basic problem is that a cross-toolchain has several different components, and those components are different projects with different release cycles, different maintainers, and different goals. Similar issues arise on a much larger scale with a complete operating system. There, the operating system, or distro, is run as a separate project, one which doesn’t actually do anything except coordinate many different projects and ensure that they work together. No such coordinating project exists for cross-toolchains. (I should say that you can purchase them from various companies, but there are no free projects.)

That’s the first thing to understand if you want to build a cross-toolchain: it really is painful and ugly. Normally when things fail to fit together in a reasonable way, you assume you’re on the wrong path. That’s a false indicator in the cross-toolchain world. Of course, you might still be on the wrong path. But don’t assume that you are doing the wrong thing just because nothing quite works.

The second thing to understand is terminology. The host system is the one where the compiler and other tools run. The target system is the one where the program that you build runs. For extra-advanced use, the build system is the one where you actually build the compiler. For GCC and friends, you specify these systems using the --host, --target, and --build options to the configure script. When the host and target system are the same, you have a native system.
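
As a rough illustration of how the three settings combine, here is a small shell sketch that names the kind of build implied by a given set of systems. The function name toolchain_kind and the example triplets are made up for this post. The labels follow the terms used in the GCC installation documentation, and "crossback" in particular is only a proposed name there.

```shell
# Name the kind of toolchain build implied by the three system triplets.
# Usage: toolchain_kind BUILD HOST TARGET
toolchain_kind() {
    build=$1 host=$2 target=$3
    if [ "$build" = "$host" ] && [ "$host" = "$target" ]; then
        echo "native"
    elif [ "$build" = "$host" ]; then
        echo "cross"
    elif [ "$host" = "$target" ]; then
        echo "crossed native"
    elif [ "$build" = "$target" ]; then
        echo "crossback"
    else
        echo "canadian cross"
    fi
}

# A compiler built and run on x86_64 GNU/Linux that generates ARM code:
toolchain_kind x86_64-pc-linux-gnu x86_64-pc-linux-gnu arm-linux-gnueabi
# prints: cross
```

In practice you rarely pass all three options. If you leave out --build the configure script guesses it, --host defaults to the build system, and --target defaults to the host.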

There are many different kinds of target systems. They all require some sort of system library, commonly called a libc. GCC does not provide a system library. You have to figure out what libc is appropriate for your target system.

  • If your target runs GNU/Linux, you probably want glibc or uClibc, though Android systems use bionic, and there are other variants. glibc is used on most normal native systems; uClibc is designed for embedded systems, and has better documentation for cross-building.
  • If your target runs Windows, you want Mingw or Cygwin. These are fairly easy to cross-build.
  • If your target runs some other full-featured operating system, such as Solaris, you will have to copy the libc and all header files from an existing system.
  • If your target runs an embedded operating system such as RTEMS or eCos, it will usually come with some documentation on how to build the system.
  • Finally, for a barebones embedded system, GCC is often used in conjunction with newlib. Newlib requires some sort of board support package, which handles I/O specific to the system. Some examples are in the libgloss directory.

Different choices here imply different approaches to building, and I’m not going to provide a complete recipe for any of them. Any complete recipe changes over time anyhow.

There are two different basic approaches: the one-tree build and the separate-tree build. The one-tree build was developed at Cygnus by people like K. Richard Pixley and David Zuhn. The idea there is to mix all the source code together in a single directory. You can combine gcc, the GNU binutils, gdb, newlib, and/or cygwin/mingw into a single directory by simply merging the source trees. They all share the same top-level configure script and Makefile. However, in order to do this you must use sources from the same date, as there are various shared directories (include, libiberty), and they have to be the same. If you try to mix source trees from different dates, such as from different official releases, you have to be prepared to address complex failures. I cannot recommend doing that.
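
One way to do the merge is with symbolic links rather than copies. The sketch below is a minimal illustration, not necessarily the procedure used at Cygnus: merge_trees is a made-up helper, and the directory names in the usage comment are placeholders. When two trees both contain an entry such as include or libiberty, it keeps the one from the tree listed first, which is only safe when the shared directories really are identical.

```shell
# merge_trees DEST SRC...
# Symlink every top-level entry of each SRC tree into DEST.
# If an entry already exists in DEST (for example include or libiberty),
# the one from the tree listed first is kept.
# Note: the * glob skips dotfiles such as .gitignore.
merge_trees() {
    dest=$1
    shift
    mkdir -p "$dest"
    for src in "$@"; do
        # Use an absolute path so the links resolve from any directory.
        src=$(cd "$src" && pwd) || return 1
        for entry in "$src"/*; do
            [ -e "$entry" ] || continue    # empty tree: the glob did not match
            name=$(basename "$entry")
            [ -e "$dest/$name" ] || ln -s "$entry" "$dest/$name"
        done
    done
}

# Hypothetical usage, with placeholder directory names:
#   merge_trees combined gcc-src binutils-src newlib-src
#   mkdir build && cd build && ../combined/configure --target=arm-none-eabi
```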

The one-tree build does not work to build glibc. For that you need a separate-tree build. Although I was the release manager at Cygnus for a time nearly 20 years ago, these days I normally do a separate-tree build. For a separate-tree build, you just build the different projects separately. When doing this it’s essential to provide exactly the same configure options for each project. The usual procedure is this:

  1. Configure, build and install the GNU binutils.
  2. Configure GCC. Run “make all-gcc”. This builds just the compiler, not the supporting libraries. Install it using “make install-gcc”.
  3. Configure, build and install the library.
    • If you are copying the library from an existing system, then there is nothing to do there. Instead, copy the library and header files into some directory and use --with-sysroot to point to that directory when configuring the other tools.
    • Building newlib or the cygwin library is fairly straightforward here, as they expect to be built by a cross-compiler.
    • If building glibc, you have to be careful about the configure script checks, as you are using a compiler which cannot build a complete program. I am unfortunately not very familiar with the problems that can arise here. Look at the crosstool-ng project for more issues and helpful scripts.
  4. Now build and install the rest of GCC, namely the supporting libraries.
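
The four steps can be sketched as a shell script. This is a dry run under stated assumptions, not a recipe: the target (arm-none-eabi), the prefix (/opt/cross), the choice of newlib as the library, and the source directory names binutils, gcc and newlib are all placeholders, and the run helper only prints each command unless DRY_RUN=0 is set. Setting the SYSROOT variable (also a name invented here) models the case where the library was copied from an existing system.

```shell
#!/bin/sh
# Dry-run sketch of the separate-tree procedure. TARGET, PREFIX, SYSROOT and
# the source directory names are illustrative assumptions.
TARGET=${TARGET:-arm-none-eabi}
PREFIX=${PREFIX:-/opt/cross}

# Later steps must be able to find the tools installed by earlier ones.
PATH="$PREFIX/bin:$PATH"
export PATH

# Every project gets exactly the same configure options.
COMMON_OPTS="--target=$TARGET --prefix=$PREFIX"
# If the library was copied from an existing system, point at the copy.
if [ -n "${SYSROOT:-}" ]; then
    COMMON_OPTS="$COMMON_OPTS --with-sysroot=$SYSROOT"
fi

# Print the command by default; with DRY_RUN=0, run it and stop on failure.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@" || exit 1
    fi
}

build_plan() {
    # 1. Configure, build and install the GNU binutils.
    run mkdir -p build-binutils
    run cd build-binutils
    run ../binutils/configure $COMMON_OPTS   # unquoted: one word per option
    run make
    run make install
    run cd ..

    # 2. Build and install just the compiler, not the supporting libraries.
    run mkdir -p build-gcc
    run cd build-gcc
    run ../gcc/configure $COMMON_OPTS
    run make all-gcc
    run make install-gcc
    run cd ..

    # 3. Build and install the library, unless it was copied into SYSROOT.
    if [ -z "${SYSROOT:-}" ]; then
        run mkdir -p build-newlib
        run cd build-newlib
        run ../newlib/configure $COMMON_OPTS
        run make
        run make install
        run cd ..
    fi

    # 4. Build and install the rest of GCC, namely the supporting libraries.
    run cd build-gcc
    run make
    run make install
    run cd ..
}

build_plan
```

Each project is configured in its own build directory, outside the source tree. A real build will usually need more configure options than the two shown here, and a glibc build needs considerably more care in step 3.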

At this point you probably think it’s not so bad. Here’s the catch. For any specific tools you are trying to build for a specific host and target, there’s a small but real chance that you are the first person to try that specific combination. There is a very good chance that none of the maintainers of any of these projects have tried the specific combination you are trying. What this means is that you are likely to encounter some bizarre problem somewhere along the way: some project will fail to build.

When I was the release manager at Cygnus we built a set of new releases every three months. My full-time job was fixing all the bizarre problems that occurred. It was good training in working with these tools. You might think that it would have gotten better over time, but it just hasn’t, because there is still no project dedicated to making it work better. Projects regularly change in incompatible ways that cause obscure combinations to break, and nobody notices.

This is an area that is ripe for somebody to come in and clean up, but it’s hard. It means continual testing and continual patching. It means finding problems fast and harassing maintainers regularly until they get fixed. It means working across releases, or figuring out a way to reliably use unreleased code. It means understanding all of these different programs, understanding their goals and their internals. It’s not a completely thankless job, but after all most programmers do not do cross-development, and most programmers who do cross-development get complete cross-toolchains from some vendor, so there aren’t really all that many people who will thank you. Still, it would be good for the GCC world if somebody took this on.

9 Comments »

  1. ratmice said,

    March 22, 2011 @ 6:36 am

    Hey Ian,

    I had a while back tried to see how the single tree compile could work in combination with modern distributed version control

    basically using git’s ‘sparse checkout’ feature, and checking out the files from multiple repositories into a single working tree

    then frobbing the GIT_DIR variable to switch which repository that git would work on.

    I found this useful for a number of things,
    like getting diffs between shared directories in the src and gcc repositories, and maintaining vendor patches in a branch rather than having to do any pre patching.
    which makes the build scripts simpler.

    at the time though, I had to maintain my own git conversion of the binutils cvs (which was a pain), since none of the existing conversions contained sim, it seems as though recently gdb has added that…

    anyhow, from what I did it showed it was possible, if not exactly graceful.
    this sort of multi-project single tree building is something which I wish was supported better. but the current status makes you think these tools are written for working on monolithic code bases! :D

  2. gumby said,

    March 22, 2011 @ 10:28 am

    This seems like the kind of problem that would be amenable to automated nightly builds, especially with scriptable simulators. In fact with so many public “cloud” compute cycles available I’m surprised there aren’t nightly build farms for sourceforge and the like in general.

  3. jlunz said,

    March 25, 2011 @ 12:07 pm

    Have a look at this project:

    http://freshmeat.net/projects/crosstool-ng/
    http://ymorin.is-a-geek.org/projects/crosstool


    crosstool-NG is a versatile toolchain generator, aiming at being highly configurable. It supports multiple target architectures, different components (glibc/uClibc…) and versions. crosstool-NG also features debugging utilities (DUMA, strace…) and generation tools (sstrip…).

    This great tool helped me a lot with building various cross-toolchains for x86 and ARM, and provides easy handling and reproducible builds.

  4. zumbi said,

    March 27, 2011 @ 4:19 pm

    Hello Ian,

    I have been building cross toolchains for multiple architectures (more than 11) since gcc-2.95 as part of a 100% free and open source project: http://www.emdebian.org.

    The builds are fine, but when multilibs come into the game and you are trying to bootstrap the libraries from scratch then you get all the fun! And surely, you are right, nobody will thank you for the hard work and long time looking at compiler messages while bootstrapping.

    Thanks for all the support you do. I think it is great for the community.

    Best regards,
    — Hector Oron

  5. wiz said,

    March 29, 2011 @ 2:27 am

    I just have to mention this, since it is so useful:
    NetBSD comes with a shell script that automatically builds a cross-toolchain for the platform you’re targeting and uses it to compile all of NetBSD.
    This is the default build method, even if you are on the same architecture.
    The script usually works on Solaris, Linux, *BSD and some other platforms.

    Very useful and easy to use, just “./build.sh -m sparc64” to get a full NetBSD build for sparc64.

  6. namhyung said,

    April 1, 2011 @ 8:21 pm

    Nice article!

    Do you mind if I translated it into Korean on my blog? I hope to share this info with more Korean guys :)

  7. Ian Lance Taylor said,

    April 2, 2011 @ 10:44 am

    I don’t mind.

  8. kanaka said,

    April 11, 2011 @ 10:45 am

    I think you should probably clarify your explanation of the GNU canonical specifiers because IMO your description will confuse people.

    Here is an explanation (not so much for Lance as for others reading the blog):

    --target only applies to toolchain programs (binutils and gcc). When you are actually building something using that toolchain, --build is the system the toolchain is running on and --host is the system where the built program will run.

    build == host => natively compile a program
    build != host => cross-compile a program

    build == host == target => build a native toolchain program
    build == host != target => build a cross-compile toolchain program
    build != host != target => build a canadian cross-compile toolchain program
    build != host == target => cross-compile a native toolchain program (e.g. cross-compile binutils to run natively on your DD-WRT)

    One of the reasons that libc/newlib/glibc is confusing is that it is often considered (and installed as) part of the toolchain but it does not really have a target in the same sense. It’s really the first dependency that is cross-compiled. It is usually kept as part of the toolchain because everything depends on it, but it can also be stored in the new cross-compiled ROOT/DEST instead.

  9. chenwj said,

    July 20, 2012 @ 4:28 am

    Hi Ian,

    The crosstool-ng website is now moved to
    http://crosstool-ng.org/
