Archive for March, 2011

Tcl

Around 1997 I did quite a bit of programming in the Tcl language, as part of an IDE project at Cygnus. The project failed, for several reasons, but here I’m going to write about Tcl.

I don’t hear much about Tcl these days, although it was fairly popular in its day. Tcl is a highly dynamic interpreted language. It is easy to embed the Tcl interpreter into a C program. The Tk system, which is scripted from Tcl but is logically independent of it, is a very clever way to easily write a GUI program. Unfortunately, despite these advantages, Tcl is a mess of a language.

In Tcl, everything is a string. Tcl has list and associative array operations, but they all operate on strings with a specific syntax. The program itself is a string: you can construct new procedures on the fly, and attach traces to variables. This makes the language very powerful. It also makes it very confusing.
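
Here is a small sketch of what that means in practice; the snippet is my own illustration, not code from the IDE project. A procedure body is just a string, and a list is just a string with a particular syntax.

    set body {return [expr {$x * $x}]}
    proc square {x} $body          ;# define a procedure from a string
    puts [square 7]                ;# prints 49

    set l {a b c}
    puts [string length $l]        ;# 5 -- the value viewed as the string "a b c"
    puts [llength $l]              ;# 3 -- the same value viewed as a list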

When everything is a string, type checking is completely impossible. It’s very easy to have a list and accidentally use a string operator on it. Everything will work fine, except that you won’t be able to pull out the list values as expected. You might think that is not so bad; after all, in Lisp everything is a list, and that works well. But in Lisp, atoms are not lists, and atoms have a type: you can’t apply a string operator to a number. In Tcl the atomic unit is the character, and that doesn’t have a type. Applying a string operator to a number works fine, because a number is just a string of characters which happen to be digits. So when I say that type checking is impossible, I don’t just mean the ordinary lack of static type checking for a dynamic language. I mean that even at runtime there is no type checking.
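
A couple of hedged examples of what this looks like; the values are made up.

    set n 1234
    puts [string index $n 2]       ;# 3 -- indexing into the "number"
    puts [string reverse $n]       ;# 4321
    puts [expr {$n + 1}]           ;# 1235 -- the same value used as a number

    # The classic trap: building a "list" by string interpolation.
    set name "John Smith"
    set names "$name Mary"
    puts [llength $names]          ;# 3, not 2 -- the space inside $name splits it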

Because everything is a string, quoting becomes essential. Tcl has various quoting mechanisms: double quotes create a simple string, square brackets create a string formatted as a list which is then normally executed as code, curly braces create a string formatted as a list which is not executed. Within a string, square bracketed lists are executed and variables with a leading dollar sign are interpolated. I’m sure I’m forgetting some aspects. Within such an environment, a detailed understanding of quoting is required. Unfortunately, the only comprehensible way to quote correctly involves using square brackets to invoke the list command. In other words, although everything is a string and it should be very easy to stick things together when that is what you want to do, in any scenario which is even slightly complex you have to start writing list operations in order to build your strings. Even so, I recently wrote some Tcl code with seven consecutive backslashes, admittedly in a complex use case. That’s too much for easy reasoning, and in practice requires trial and error to get right.
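
A short sketch of those rules, with made-up values; the last part shows the [list] trick just mentioned.

    set x 5
    puts "x is $x, twice is [expr {$x * 2}]"   ;# substitution happens inside double quotes
    puts {x is $x, twice is [expr {$x * 2}]}   ;# braces: nothing is substituted

    # Interpolating a value into a string breaks as soon as the value contains
    # spaces or special characters; building the command with [list] adds
    # whatever quoting is needed.
    set msg {hello "quoted" world}
    set cmd [list puts $msg]
    eval $cmd                      ;# prints: hello "quoted" world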

I believe that Tcl has namespaces these days, but when I used it, it did not. That made it a poor choice for programming in the large, because all functions and variables lived in the same namespace. Because everything is fully dynamic, redefining a procedure causes no error. You don’t discover a namespace collision when you load the program; you only discover it when trying to figure out what went wrong.
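
For instance (an illustration of my own):

    proc connect {} { return "opening a network connection" }
    proc connect {} { return "attaching to the database" }   ;# silently replaces the first
    puts [connect]   ;# whichever definition was loaded last wins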

The Tk system, as I mentioned, is a very clever way to write a GUI. It lets you write very simple code with windows and buttons and text input and so forth. These days most people would write some HTML and run it in a web browser, but Tk is a more powerful environment, and you can write everything in the same language, unlike the browser’s mix of HTML, JavaScript, and server-side PHP or whatever. Tk is cross-platform.
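
As a rough illustration of how little code a Tk GUI takes (the widget names and text are invented):

    package require Tk
    # A complete little GUI: a label, a text entry, and a quit button.
    label  .greeting -text "Hello from Tk"
    entry  .name
    button .quit -text "Quit" -command exit
    pack .greeting .name .quit -padx 10 -pady 5
    # Run under wish, which enters the Tk event loop after the script ends.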

Unfortunately, Tk too has a fatal flaw: because your GUI is produced by a program, only your programmers can change your GUI. When a UI designer wants to move a button, the only way to do it is to change the program. And Tk’s layout procedure does a lot for you automatically, which in practice means that it’s a pain to do anything else. The effect is that a Tk program always looks OK but never looks good. This effect is exacerbated by the fact that a Tk program looks kind of the same on any platform, which means that it looks unusual on every platform. For our IDE work we spent a fair amount of time building Tk interfaces to standard Windows objects, so that the programs would look sort of OK on Windows. In other words, Tk winds up being a great prototyping system, but a terrible system to use for your final program.

Despite all these awful characteristics, I have to say that the actual Tcl implementation is great. It’s platform independent, it has a nice event loop, and the code is easy to read and easy to modify. The core library provides system-independent facilities which are well designed and well implemented. Unfortunately a good implementation cannot overcome a poorly designed language.

So there you have it: a brief overview of language design gone wrong.

Comments (8)

Cross-compilation

GCC fully supports cross-compilation: building a program on one machine in order to run it on another. However, it’s quite painful to actually build a full cross-toolchain with compiler, tools, and libraries. Many people new to the process have the same reaction: it simply can’t be this hard. But it is.

The basic problem is that a cross-toolchain has several different components, and those components are different projects with different release cycles, different maintainers, and different goals. Similar issues arise on a much larger scale with a complete operating system. There, the operating system, or distro, is run as a separate project, one which doesn’t actually do anything except coordinate many different projects and ensure that they work together. No such coordinating project exists for cross-toolchains. (I should say that you can purchase cross-toolchains from various companies, but there is no comparable free project.)

That’s the first thing to understand if you want to build a cross-toolchain: it really is painful and ugly. Normally when things fail to fit together in a reasonable way, you assume you’re on the wrong path. That’s a false indicator in the cross-toolchain world. Of course, you might still be on the wrong path. But don’t assume that because nothing quite works you must be doing the wrong thing.

The second thing to understand is terminology. The host system is the one where the compiler and other tools run. The target system is the one where the program that you build runs. For extra-advanced use, the build system is the one where you actually build the compiler. For GCC and friends, you specify these systems using the --host, --target, and --build options to the configure script. When the host and target systems are the same, you have a native toolchain.
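
For example, a configure invocation for a cross-compiler might look like this; the triplets and the install prefix here are illustrative assumptions, not a recipe.

    # build  : the machine doing the compiling
    # host   : the machine the new tools will run on
    # target : the machine the new tools will generate code for
    ../gcc/configure \
      --build=x86_64-pc-linux-gnu \
      --host=x86_64-pc-linux-gnu \
      --target=arm-none-eabi \
      --prefix=/opt/cross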

There are many different kinds of target systems. They all require some sort of system library, commonly called a libc. GCC does not provide a system library. You have to figure out what libc is appropriate for your target system.

  • If your target runs GNU/Linux, you probably want glibc or uClibc, though Android systems use bionic, and there are other variants. glibc is used on most normal native systems; uClibc is designed for embedded systems, and has better documentation for cross-building.
  • If your target runs Windows, you want Mingw or Cygwin. These are fairly easy to cross-build.
  • If your target runs some other full-featured operating system, such as Solaris, you will have to copy the libc and all header files from an existing system.
  • If your target runs an embedded system such as RTEMS or eCos, those projects will usually provide some documentation on how to build the toolchain.
  • Finally, for a barebones embedded system, GCC is often used in conjunction with newlib. Newlib requires some sort of board support package, which handles I/O specific to the system. Some examples are in the libgloss directory.

Different choices here imply different approaches to building, and I’m not going to provide a complete recipe for any of them. Any complete recipe changes over time anyhow.

There are two different basic approaches: the one-tree build and the separate-tree build. The one-tree build was developed at Cygnus by people like K. Richard Pixley and David Zuhn. The idea there is to mix all the source code together in a single directory. You can combine gcc, the GNU binutils, gdb, newlib, and/or cygwin/mingw into a single directory by simply merging the source trees. They all share the same top-level configure script and Makefile. However, in order to do this you must use sources from the same date, as there are various shared directories (include, libiberty), and they have to be the same. If you try to mix source trees from different dates, such as from different official releases, you have to be prepared to address complex failures. I cannot recommend doing that.
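
Very roughly, a one-tree build looks something like the following. The directory names and the target triplet are assumptions of mine; the key constraint is that all the trees come from the same date, so that the shared directories match.

    # Merge the source trees into one combined tree.
    mkdir combined-src
    cp -r binutils-src/. combined-src/
    cp -r gcc-src/.      combined-src/
    cp -r newlib-src/.   combined-src/

    # One configure, one make, for everything.
    mkdir build && cd build
    ../combined-src/configure --target=arm-none-eabi --prefix=/opt/cross
    make all && make install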

The one-tree build does not work to build glibc. For that you need a separate-tree build. Although the one-tree build is what I used when I was the release manager at Cygnus nearly 20 years ago, these days I normally do a separate-tree build. For a separate-tree build, you just build the different projects separately. When doing this it’s essential to provide exactly the same configure options for each project. The usual procedure is this (a rough shell sketch follows the list):

  1. Configure, build and install the GNU binutils.
  2. Configure GCC. Run “make all-gcc”. This builds just the compiler, not the supporting libraries. Install it using “make install-gcc”.
  3. Configure, build and install the library.
    • If you are copying the library from an existing system, then there is no build step here. Instead, copy the library and header files into some directory and use --with-sysroot to point to that directory when configuring the other tools.
    • Building newlib or the cygwin library is fairly straightforward here, as they expect to be built by a cross-compiler.
    • If building glibc, you have to be careful about the configure script checks, as you are using a compiler which can not build a complete program. I am unfortunately not very familiar with the problems that can arise here. Look at the crosstool-ng project for more issues and helpful scripts.
  4. Now build and install the rest of GCC, namely the supporting libraries.
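
Putting the steps above together, a separate-tree build might look roughly like this. It is a hedged sketch, not a complete recipe: the target triplet, the paths, and the choice of newlib are my own assumptions, and a glibc build needs considerably more care.

    TARGET=arm-none-eabi
    PREFIX=/opt/cross
    export PATH=$PREFIX/bin:$PATH

    # 1. Binutils
    mkdir build-binutils && cd build-binutils
    ../binutils-src/configure --target=$TARGET --prefix=$PREFIX
    make && make install
    cd ..

    # 2. The compiler proper, without the supporting libraries
    mkdir build-gcc && cd build-gcc
    ../gcc-src/configure --target=$TARGET --prefix=$PREFIX --with-newlib
    make all-gcc && make install-gcc
    cd ..

    # 3. The target library (newlib in this sketch)
    mkdir build-newlib && cd build-newlib
    ../newlib-src/configure --target=$TARGET --prefix=$PREFIX
    make && make install
    cd ..

    # 4. The rest of GCC: the supporting libraries
    cd build-gcc
    make && make install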

At this point you probably think it’s not so bad. Here’s the catch. For any specific tools you are trying to build for a specific host and target, there’s a small but real chance that you are the first person to try that specific combination. There is a very good chance that none of the maintainers of any of these projects have tried the specific combination you are trying. What this means is that you are likely to encounter some bizarre problem somewhere along the way: some project will fail to build.

When I was the release manager at Cygnus we built a set of new releases every three months. My full-time job was fixing all the bizarre problems that occurred. It was good training in working with these tools. You might think that it would have gotten better over time, but it just hasn’t, because there is still no project dedicated to making it work better. Projects regularly change in incompatible ways that cause obscure combinations to break, and nobody notices.

This is an area that is ripe for somebody to come in and clean up, but it’s hard. It means continual testing and continual patching. It means finding problems fast and harassing maintainers regularly until they get fixed. It means working across releases, or figuring out a way to reliably use unreleased code. It means understanding all of these different programs, understanding their goals and their internals. It’s not a completely thankless job, but after all most programmers do not do cross-development, and most programmers who do cross-development get complete cross-toolchains from some vendor, so there aren’t really all that many people who will thank you. Still, it would be good for the GCC world if somebody took this on.

Comments (9)

8 1/2

My favorite movie has long been Fellini’s 8 1/2. It’s a movie which seems designed to appeal to a computer programmer: it’s self-referential and recursive, a movie about the making of itself. It’s also about the difficulties of the creative process, and that is where it resonates most strongly with me. The director in the movie, Guido, is struggling to create something beautiful, and is winding up with a mish-mash of scenes, some of which mostly succeed and some of which mostly fail. Fellini, the real director, is struggling with the same thing, with the same results.

It’s the same thing I feel when I write a computer program. I start out thinking that this program will be beautiful, will do what it needs to do cleanly and elegantly. In the end there are a few successes and many failures, and the whole thing is deeply compromised and unsatisfactory. I never really like revisiting my old programs, because although there is the occasional moment of appreciation for how clever some short bit was, there is mostly the recollection of how the whole thing never really pulled together the way I wanted.

Fellini, of course, does pull together 8 1/2 at the end, and the movie becomes something beautiful, if not perhaps quite what he or Guido set out to make. But then Fellini is a great artist, and I am not.

Comments (2)

Copying

It’s interesting that the U.S. economy has moved away from manufacturing at the same time as computers have made it very easy to copy digital goods. We see the U.S. pushing China hard to enforce their copyright laws, because much of what the U.S. has to sell is easily copied. The U.S. has developed great skill at creating complex software. That skill itself is not easily copied, based as it is on many years of experience, but the end result of applying that skill is difficult to control. Similarly, the U.S. leads the world in developing entertaining movies, but those too cannot be controlled once they have been distributed. You can enforce all the copyright laws you want, but if a digital product is both expensive and desirable, it will inevitably be copied.

Software developers have reacted by increasingly tying software to some sort of service. That is a significant business advantage of offering cloud computing: your software works without requiring distribution, which means that nobody can easily copy it. If you’re going to sell virtual goods rather than manufactured ones, it’s important not to distribute them as part of using them. In other words, you have to sell a service.

Right now the U.S. is trying to push other countries to honor the agreements it needs to sell virtual goods. I don’t see how that can work in the long run. Better to focus on selling real goods or selling services. A simple service is vulnerable to competition, but there is plenty of space for selling complex services which are difficult to develop. That seems to be the likeliest trend for successful software companies going forward. It’s even a possible path for entertainment companies if you think in terms of games.

Comments (5)