Debt and Taxes

During the Reagan administration, the U.S. reduced tax rates and increased defense spending. The national debt as a percentage of overall GDP increased from 32.5% to 53.1% (Reagan called this increase in debt the “greatest disappointment” of his presidency). During the first Bush administration, it continued to rise, reaching 66.1%. During the Clinton administration, the government raised taxes, the economy grew, and defense spending was reduced somewhat; the debt decreased to 56.4% of GDP. During the second Bush administration, again taxes were reduced and defense spending was increased; the debt increased to 83.4% of GDP.

Today fiscal conservatives are arguing that the high levels of debt require that government spending be reduced. At the same time, the plan put forward by Republican representative Paul Ryan, and strongly supported by the Republican House, calls for more tax cuts and higher defense spending. While it’s understood that his plan will not be adopted, it’s hard to see how it can be a serious proposal for debt reduction.

It’s clear that the U.S. has a high level of debt due largely to past steps of reducing taxes while increasing spending. One can argue details back and forth quite a bit, but it’s also clear that the debt has increased significantly under Republican administrations. Fiscal conservatives now argue that the high level of debt shows that the U.S. can not afford social programs like Social Security and Medicare. But while one can argue about increasing health care costs, history suggests that that simply isn’t true. What is true is that the U.S. can not steadily cut taxes without cutting spending.

It’s perfectly consistent to say that the U.S. should be a low-tax, low-service country. But arguments about debt which don’t mention the possibility of tax increases are not telling the whole truth about how the U.S. got into its current situation. What has happened, intentionally or not, is that tax cuts are being leveraged to reduce spending on social programs.

Incidentally, I think most people agree that governments should use tax money to invest in infrastructure. It’s generally most efficient to let the government build and maintain roads and bridges, as they require a large investment and the payback is indirect. I think one could make a good argument that health care is another form of infrastructural investment, an investment in people, which is most efficiently done by government.

DejaGNU

Sorry for the long delay. Anyhow, I wrote that so that I could write this.

DejaGNU is the test harness used by gcc, gdb, the GNU binutils, and probably other programs as well. Frankly, it’s a disaster. The documentation is weak, the implementation is complex and confusing, it’s slow, it does not support running tests in parallel, it’s hard to use. It has exactly two things in its favor, and they are powerful. The first is that it mostly works. The second is that people have written many different board support packages which let it test cross-compilers on simulators and real hardware.

DejaGNU was initially written at Cygnus by Rob Savoye as a way to test gdb. I don’t recall if there was any gdb testsuite prior to DejaGNU, but if there was it was largely useless. Because gdb was a command-line program, the idea for DejaGNU was to write a test harness which could run gdb, send it commands, and examine the resulting output. That was the first mistake. It meant that all the gdb tests were required to look for syntactic details of the output which were irrelevant to the test. Tests for gdb revolve around making sure that gdb stops in the right place and can print local variables and do backtraces and so forth. That is a lot of output which DejaGNU matches using regexps. I think it would have been smarter to put the effort into adding a test harness internally to gdb itself, so that a program could query gdb’s state. This could have evolved into the MI output format which wound up getting added to gdb anyhow. Or it could have evolved into a library interface for gdb, something which would have been very useful for IDEs and other purposes and still does not really exist.
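To make the regexp problem concrete, here is a minimal Python sketch (not DejaGNU code) of what output matching looks like. It assumes gdb's usual "$N = VALUE" form for printed values; the function name and the sample transcripts are invented for illustration. The test only cares about the value, yet the pattern still has to describe the prompt-era syntax around it.

```python
import re

# Assumption: printed values appear on a line of the form "$N = VALUE".
# The history number N is irrelevant to the test, but the pattern must
# still account for it.
PRINT_RE = re.compile(r"^\$\d+ = (?P<value>.*)$", re.MULTILINE)


def printed_value(transcript):
    """Return the value from the first "$N = VALUE" line, or None."""
    match = PRINT_RE.search(transcript)
    return match.group("value") if match else None
```

Any change to the surrounding output format means revisiting patterns like this one, even when the behavior being tested has not changed.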

Anyhow, once the decision was made to test gdb as a pure command line program, Rob looked for a program which could do that. He came across expect. The tag line of the expect paper from 1990 is “Curing Those Uncontrollable Fits of Interaction.” The expect program does a nice job of that: if you have a program that you can only interact with manually, expect lets you write a program to interact with it instead. So expect is a nice choice when you need to work with a program you don’t control. In our case, we did control gdb; choosing expect was a hack to save time modifying gdb, a hack we are still paying for nearly 20 years later.

Expect uses an embedded Tcl interpreter, so expect programs are Tcl programs. This is a good use of Tcl: it means that expect has a full programming language for writing scripts. Since interacting with other programs is all about strings, it’s perfectly reasonable to use Tcl, which is also all about strings. The consequence for DejaGNU, though, is that DejaGNU is written in Tcl.

Once Cygnus was using DejaGNU as a test harness for gdb, it seemed natural to use it as a test harness for gcc as well. But of course gcc is not an interactive program, so the advantages of using expect no longer applied. The disadvantages of Tcl remained intact.

Cygnus specialized in cross-compilers, so DejaGNU grew the ability to build programs for target boards and run them there, using various different communication mechanisms. None of this had much to do with expect or Tcl, but it was all written in Tcl because that was the mechanism available. All this support is the main reason it is difficult to move away from DejaGNU today.

At least on a native system, it’s natural to want to run tests in parallel. That’s only become more important over the years. Unfortunately, Tcl doesn’t support threads (there is a thread extension available these days, but it is written in such a way that it would have to be integrated into expect before DejaGNU could use it). It’s easy enough to write Tcl code to start gcc a bunch of times, but it’s much harder to write Tcl code to examine those results. The gcc testsuite does now run in parallel, but it does so by manually creating subsets of tests and invoking DejaGNU multiple times in parallel to run those subsets. This works but is hardly optimal.
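The subset approach can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the gcc testsuite's actual mechanism: the function names are hypothetical, and `fake_harness` stands in for one separate invocation of the test harness on a subset.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subsets(tests, n):
    """Deal tests round-robin into at most n non-empty subsets."""
    subsets = [tests[i::n] for i in range(n)]
    return [s for s in subsets if s]


def run_subsets(tests, n, run_one_subset):
    """Run each subset as its own harness invocation and merge the results."""
    subsets = split_into_subsets(tests, n)
    if not subsets:
        return {}
    with ThreadPoolExecutor(max_workers=len(subsets)) as pool:
        per_subset = list(pool.map(run_one_subset, subsets))
    merged = {}
    for result in per_subset:
        merged.update(result)
    return merged


def fake_harness(subset):
    # Hypothetical stand-in for one harness run; reports every test as PASS.
    return {name: "PASS" for name in subset}
```

Note that the split is fixed up front, so one slow subset holds up the whole run. That is part of why this kind of scheme works but is not optimal.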

As a highly dynamic interpreted language, Tcl is relatively slow. The expect program is quite clever and sets up pseudo terminals in order to properly interact with general programs, an effort that is useless when testing a simple program like gcc. A significant amount of the CPU time taken by a gcc testsuite run is for expect, time which is largely wasted.

The DejaGNU code is complex and hard to read. This is not entirely the fault of Tcl, but Tcl is partly to blame as DejaGNU struggles with namespace issues. Function and variable names are constructed at runtime to avoid namespace collisions, which makes it very hard to figure out what code will run. It plays games like having tests return the name of the function to run to report whether the test succeeded (e.g., return “pass” to invoke the function named pass), which sounds almost clever until you realize that there is nothing which prevents you from returning an invalid value.
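For comparison, here is a small Python sketch of dispatching on a returned status name with the validation step included. It is not DejaGNU code; every name in it is hypothetical. The point is only that a fixed table of known outcomes turns a bad return value into an immediate, readable error.

```python
def report_pass(test_name):
    return f"PASS: {test_name}"


def report_fail(test_name):
    return f"FAIL: {test_name}"


# The only outcomes a test is allowed to return.
REPORTERS = {"pass": report_pass, "fail": report_fail}


def report(test_name, outcome):
    """Look up the reporter for an outcome, rejecting unknown names."""
    try:
        reporter = REPORTERS[outcome]
    except KeyError:
        raise ValueError(f"unknown test outcome {outcome!r}") from None
    return reporter(test_name)
```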

I could continue with more specific horrors from DejaGNU, but those are in principle fixable. I hope that the earlier points show that DejaGNU itself is broken by design. We need to move away from it.

Unfortunately DejaGNU’s large knowledge base of how to run programs on embedded systems, a knowledge base which is largely represented in hand-written Tcl code, is very hard to get around. Some of these scripts can be automatically translated into a better test harness, in that they simply set flags for various tools and set a communication mechanism. Many others will require hand conversion.

Unfortunately DejaGNU has more or less blighted the world of free test harnesses. There is CodeSourcery’s qmtest program, but I don’t know how widely that is used. Fortunately, a test harness need not be particularly complex. I don’t think it would be that hard for a thoughtful person to replace DejaGNU for gcc testing, and I think the benefits would be manifold. Replacing it for gdb testing would be harder, as the gdb tests rely more on string matching. As I mentioned above that is in itself a bug, but it means that recreating the tests is hard.

There are various test scripts which are built around DejaGNU’s log files, basically attempts to parse human readable information. Those will have to change for any new test harness.

Although I don’t have a good alternative, I hope I have at least demonstrated that DejaGNU must go. Effort put into working with DejaGNU is effort wasted.

Tcl

Around 1997 I did quite a bit of programming in the Tcl language, as part of an IDE project at Cygnus. The project failed, for several reasons, but here I’m going to write about Tcl.

I don’t hear much about Tcl these days, although it was fairly popular in its day. Tcl is a highly dynamic interpreted language. It is easy to embed the Tcl interpreter into a C program. The Tk system, which is written in Tcl but is logically independent of it, is a very clever way to easily write a GUI program. Unfortunately, despite these advantages, Tcl is a mess of a language.

In Tcl, everything is a string. Tcl has list and associative array operations, but they all operate on strings with a specific syntax. The program itself is a string: you could construct new procedures on the fly, and annotate variables. This makes the language very powerful. It also makes it very confusing.

When everything is a string, type checking is completely impossible. It’s very easy to have a list and accidentally use a string operator on it. Everything will work fine, except you won’t be able to pull out the list values as expected. You might think that is not so bad–after all, in Lisp everything is a list, and that works well. But in Lisp, atoms are not lists, and atoms have a type: you can’t apply a string operator to a number. In Tcl the atomic unit is the character, and that doesn’t have a type. Applying a string operator to a number works fine, because a number is just a string of characters which happen to be digits. So when I say that type checking is impossible, I don’t just mean the ordinary lack of static type checking for a dynamic language. I mean that even at runtime there is no type checking.

Because everything is a string, quoting becomes essential. Tcl has various quoting mechanisms: double quotes create a simple string, square brackets create a string formatted as a list which is then normally executed as code, curly braces create a string formatted as a list which is not executed. Within a string, square bracketed lists are executed and variables with a leading dollar sign are interpolated. I’m sure I’m forgetting some aspects. Within such an environment, a detailed understanding of quoting becomes essential. Unfortunately, the only comprehensible way to quote actually involves using square brackets to invoke the list function. In other words, although everything is a string and it should be very easy to stick things together when that is what you want to do, in any scenario which is even slightly complex you have to start writing list operations in order to build your strings. Even then I recently wrote some Tcl code with seven consecutive backslashes, admittedly in a complex use case. That’s too much for easy reasoning, and in practice requires trial and error to get right.
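The same trap exists in any setting where a list is encoded as a string. The sketch below is an analogy in Python, not Tcl: it uses the standard `shlex` module, whose shell-style quoting rules differ from Tcl's, and the helper names and example words are made up. It shows why gluing words together with spaces breaks as soon as one element contains a space, and why a list-building operation that does the quoting for you is the only approach that stays readable.

```python
import shlex


def naive_join(words):
    """Glue words together with spaces; loses element boundaries."""
    return " ".join(words)


def quoted_join(words):
    """Build the string with a list operation that quotes each element."""
    return shlex.join(words)
```

With `["gcc", "-o", "my program", "main.c"]`, splitting the result of `naive_join` yields five words instead of four, while splitting the result of `quoted_join` returns the original list.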

I believe that Tcl has namespaces these days, but when I used it, it did not. That made it a poor choice for programming in the large, because all functions and variables lived in the same namespace. Because everything is fully dynamic, redefining a procedure causes no error. You don’t discover a namespace collision when you load the program, you only discover it when trying to figure out what went wrong.

The Tk system, as I mentioned, is a very clever way to write a GUI. It lets you write very simple code with windows and buttons and text input and so forth. These days most people would write some HTML and run it in a web browser, but Tk is a more powerful environment, and you can write everything in the same language, unlike the browser’s mix of HTML, Javascript, and server-side PHP or whatever. Tk is cross-platform.

Unfortunately, Tk too has a fatal flaw: because your GUI is produced by a simple program, your programmers have to change your GUI. When a UI designer wants to move a button, the only way to do it is to change the program. And Tk’s layout procedure does a lot for you automatically, which in practice means that it’s a pain to do anything else. The effect is that a Tk program always looks OK but never looks good. This effect is exacerbated by the fact that a Tk program looks kind of the same on any platform, which means that it looks unusual on any platform. For our IDE work we spent a fair amount of time building Tk interfaces to standard Windows objects, so that the programs would look sort of OK on Windows. In other words, Tk winds up being a great prototyping system, but a terrible system to use for your final program.

Despite all these awful characteristics, I have to say that the actual Tcl implementation is great. It’s platform independent, has a nice event loop, the code is easy to read and easy to modify. The core library provides system independent facilities which are well designed and well implemented. Unfortunately a good implementation can not overcome a poorly designed language.

So there you have it: a brief overview of language design gone wrong.

Cross-compilation

GCC fully supports cross-compilation: building a program on one machine in order to run it on another. However, it’s quite painful to actually build a full cross-toolchain with compiler, tools, and libraries. Many people new to the process have the same reaction: it simply can’t be this hard. But it is.

The basic problem is that a cross-toolchain has several different components, and those components are different projects with different release cycles, different maintainers, and different goals. Similar issues arise on a much larger scale with a complete operating system. There, the operating system, or distro, is run as a separate project, one which doesn’t actually do anything except coordinate many different projects and ensure that they work together. No such coordinating project exists for cross-toolchains. (I should say that you can purchase them from various companies, but there are no free projects.)

That’s the first thing to understand if you want to build a cross-toolchain: it really is painful and ugly. Normally when things fail to fit together in a reasonable way, you assume you’re on the wrong path. That’s a false indicator in the cross-toolchain world. Of course, you might still be on the wrong path. But don’t assume that you are doing the wrong thing just because nothing quite works.

The second thing to understand is terminology. The host system is the one where the compiler and other tools run. The target system is the one where the program that you build runs. For extra-advanced use, the build system is the one where you actually build the compiler. For GCC and friends, you specify these systems using the --host, --target, and --build options to the configure script. When the host and target systems are the same, you have a native system.
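The three roles are easy to mix up, so here is a small Python sketch that keeps them together. The class and method names are hypothetical, and the system names used with it (such as arm-none-eabi) are only example values. The three option names are the ones given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Systems:
    build: str   # where the compiler itself is built
    host: str    # where the compiler and other tools run
    target: str  # where the programs they produce run

    def is_native(self):
        """A native system is one where host and target are the same."""
        return self.host == self.target

    def configure_args(self):
        """Spell the three systems as configure options."""
        return [
            f"--build={self.build}",
            f"--host={self.host}",
            f"--target={self.target}",
        ]
```

For example, `Systems("x86_64-pc-linux-gnu", "x86_64-pc-linux-gnu", "arm-none-eabi")` describes a cross-compiler that is built and run on the same machine but produces code for a different one.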

There are many different kinds of target systems. They all require some sort of system library, commonly called a libc. GCC does not provide a system library. You have to figure out what libc is appropriate for your target system.

  • If your target runs GNU/Linux, you probably want glibc or uClibc, though Android systems use bionic, and there are other variants. glibc is used on most normal native systems; uClibc is designed for embedded systems, and has better documentation for cross-building.
  • If your target runs Windows, you want Mingw or Cygwin. These are fairly easy to cross-build.
  • If your target runs some other full-featured operating system, such as Solaris, you will have to copy the libc and all header files from an existing system.
  • If your target runs an embedded system such as RTEMS or eCos, they will usually provide some documentation on how to build the system.
  • Finally, for a barebones embedded system, GCC is often used in conjunction with newlib. Newlib requires some sort of board support package, which handles I/O specific to the system. Some examples are in the libgloss directory.

Different choices here imply different approaches to building, and I’m not going to provide a complete recipe for any of them. Any complete recipe changes over time anyhow.

There are two different basic approaches: the one-tree build and the separate-tree build. The one-tree build was developed at Cygnus by people like K. Richard Pixley and David Zuhn. The idea there is to mix all the source code together in a single directory. You can combine gcc, the GNU binutils, gdb, newlib, and/or cygwin/mingw into a single directory by simply merging the source trees. They all share the same top-level configure script and Makefile. However, in order to do this you must use sources from the same date, as there are various shared directories (include, libiberty), and they have to be the same. If you try to mix source trees from different dates, such as from different official releases, you have to be prepared to address complex failures. I can not recommend doing that.

The one-tree build does not work to build glibc. For that you need a separate-tree build. Although I was the release manager at Cygnus for a time nearly 20 years ago, these days I normally do a separate-tree build. For a separate-tree build, you just build the different projects separately. When doing this it’s essential to provide exactly the same configure options for each project. The usual procedure is this:

  1. Configure, build and install the GNU binutils.
  2. Configure GCC. Run “make all-gcc”. This builds just the compiler, not the supporting libraries. Install it using “make install-gcc”.
  3. Configure, build and install the library.
    • If you are copying the library from an existing system, then there is nothing to do there. Instead, copy the library and header files into some directory and use --with-sysroot to point to that directory when configuring the other tools.
    • Building newlib or the cygwin library is fairly straightforward here, as they expect to be built by a cross-compiler.
    • If building glibc, you have to be careful about the configure script checks, as you are using a compiler which can not build a complete program. I am unfortunately not very familiar with the problems that can arise here. Look at the crosstool-ng project for more issues and helpful scripts.
  4. Now build and install the rest of GCC, namely the supporting libraries.
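The ordering of the steps above can be written down as data. The following Python sketch only produces a plan and executes nothing. It is not a working recipe: the function name, the `./configure` path, the use of `--prefix`, and the choice of a newlib-style library that is configured like the other tools are all assumptions for illustration. What it does show is the two-phase GCC build and the rule that every project gets exactly the same configure options.

```python
def separate_tree_plan(target, prefix, library="newlib"):
    """Return ordered (project, argv) steps for a separate-tree build.

    Nothing is run; the caller decides how and where to execute each step.
    """
    # One shared option list, so every project is configured identically.
    shared = [f"--target={target}", f"--prefix={prefix}"]
    configure = ["./configure"] + shared  # assumed script location

    return [
        # 1. binutils
        ("binutils", configure),
        ("binutils", ["make"]),
        ("binutils", ["make", "install"]),
        # 2. just the compiler, not the supporting libraries
        ("gcc", configure),
        ("gcc", ["make", "all-gcc"]),
        ("gcc", ["make", "install-gcc"]),
        # 3. the library, built by the new cross-compiler
        (library, configure),
        (library, ["make"]),
        (library, ["make", "install"]),
        # 4. the rest of GCC, namely the supporting libraries
        ("gcc", ["make"]),
        ("gcc", ["make", "install"]),
    ]
```

If you copy the library from an existing system instead, step 3 drops out and the shared options would gain --with-sysroot, as described in the list.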

At this point you probably think it’s not so bad. Here’s the catch. For any specific tools you are trying to build for a specific host and target, there’s a small but real chance that you are the first person to try that specific combination. There is a very good chance that none of the maintainers of any of these projects have tried the specific combination you are trying. What this means is that you are likely to encounter some bizarre problem somewhere along the way: some project will fail to build.

When I was the release manager at Cygnus we built a set of new releases every three months. My full-time job was fixing all the bizarre problems that occurred. It was good training in working with these tools. You might think that it would have gotten better over time, but it just hasn’t, because there is still no project dedicated to making it work better. Projects regularly change in incompatible ways that cause obscure combinations to break, and nobody notices.

This is an area that is ripe for somebody to come in and clean up, but it’s hard. It means continual testing and continual patching. It means finding problems fast and harassing maintainers regularly until they get fixed. It means working across releases, or figuring out a way to reliably use unreleased code. It means understanding all of these different programs, understanding their goals and their internals. It’s not a completely thankless job, but after all most programmers do not do cross-development, and most programmers who do cross-development get complete cross-toolchains from some vendor, so there aren’t really all that many people who will thank you. Still, it would be good for the GCC world if somebody took this on.

8 1/2

My favorite movie has long been Fellini’s 8 1/2. It’s a movie which seems designed to appeal to a computer programmer: it’s self-referential and recursive, a movie about the making of itself. It’s also about the difficulties of the creative process, and that is where it resonates most strongly with me. The director in the movie, Guido, is struggling to create something beautiful, and is winding up with a mish-mash of scenes, some of which mostly succeed and some of which mostly fail. Fellini, the real director, is struggling with the same thing, with the same results.

It’s the same thing I feel when I write a computer program. I start out thinking that this program will be beautiful, will do what it needs to do cleanly and elegantly. In the end there are a few successes and many failures, and the whole thing is deeply compromised and unsatisfactory. I never really like revisiting my old programs, because although there is the occasional moment of appreciation for how smart I was for a short bit, there is mostly the recollection of how the whole thing never really pulled together the way I wanted.

Fellini, of course, does pull together 8 1/2 at the end, and the movie becomes something beautiful, if not perhaps quite what he or Guido set out to make. But then Fellini is a great artist, and I am not.
