Sorry for the long delay. Anyhow, I wrote that so that I could write this.
DejaGNU is the test harness used by gcc, gdb, the GNU binutils, and probably other programs as well. Frankly, it’s a disaster. The documentation is weak, the implementation is complex and confusing, it’s slow, it does not support running tests in parallel, it’s hard to use. It has exactly two things in its favor, and they are powerful. The first is that it mostly works. The second is that people have written many different board support packages which let it test cross-compilers on simulators and real hardware.
DejaGNU was initially written at Cygnus by Rob Savoye as a way to test gdb. I don’t recall if there was any gdb testsuite prior to DejaGNU, but if there was it was largely useless. Because gdb was a command-line program, the idea for DejaGNU was to write a test harness which could run gdb, send it commands, and examine the resulting output. That was the first mistake. It meant that all the gdb tests were required to look for syntactic details of the output which were irrelevant to the test. Tests for gdb revolve around making sure that gdb stops in the right place and can print local variables and do backtraces and so forth. That is a lot of output which DejaGNU matches using regexps. I think it would have been smarter to put the effort into adding a test harness internally to gdb itself, so that a program could query gdb’s state. This could have evolved into the MI output format which would up getting added to gdb anyhow. Or it could have evolved into a library interface for gdb, something which would have been very useful for IDEs and other purposes and still does not really exist.
Anyhow, once the decision was made to test gdb as a pure command line program, Rob looked for a program which could do that. He came across expect. The tag line of the expect paper from 1990 is “Curing Those Uncontrollable Fits of Interaction.” The expect program does a nice job of that: if you have a program that you can only interact with manually, expect lets you write a program to interact with it instead. So expect is a nice choice when you need to work with a program you don’t control. In our case, we did control gdb; choosing expect was a hack to save time modifying gdb, a hack we are still paying for nearly 20 years later.
Expect uses an embedded Tcl interpreter, so expect programs are Tcl programs. This is a good use of Tcl: it means that expect has a full programming language for writing scripts. Since interacting with other programs is all about strings, it’s perfectly reasonable to use Tcl, which is also all about strings. The consequence for DejaGNU, though, is that DejaGNU is written in Tcl.
Once Cygnus was using DejaGNU as a test harness for gdb, it seemed natural to use it as a test harness for gcc as well. But of course gcc is not an interactive program, so the advantages of using expect no longer applied. The disadvantages of Tcl remained intact.
Cygnus specialized in cross-compilers, so DejaGNU grew the ability to build programs for target boards and run them there, using various different communication mechanisms. None of this had much to do with expect or Tcl, but it was all written in Tcl because that was the mechanism available. All this support is the main reason it is difficult to move away from DejaGNU today.
At least on a native system, it’s natural to want to run tests in parallel. That’s only become more important over the years. Unfortunately, Tcl doesn’t support threads (there is a thread extension available these days, but it is written in such a way that it would have to be integrated into expect before DejaGNU could use it). It’s easy enough to write Tcl code to start gcc a bunch of times, but it’s much harder to write Tcl code to examine those results. The gcc testsuite does now run in parallel, but it does so by manually creating subsets of tests and invoking DejaGNU multiple times in parallel to run those subsets. This works but is hardly optimal.
As a highly dynamic interpreted language, Tcl is relatively slow. The expect program is quite clever and sets up pseudo terminals in order to properly interact with general programs, and effort that is useless when testing a simple program like gcc. A significant amount of the CPU time taken by a gcc testsuite run is for expect, time which is largely wasted.
The DejaGNU code is complex and hard to read. This is not entirely the fault of Tcl, but Tcl is partly to blame as DejaGNU struggles with namespace issues. Function and variable names are constructed at runtime to avoid namespace collisions, which makes it very hard to figure out what code will run. It plays games like having tests return the name of the function to run to report whether the test succeed (e.g., return “pass” to invoke the function named
pass), which sounds almost clever until you realize that there is nothing which prevents you from returning an invalid value.
I could continue with more specific horrors from DejaGNU, but those are in principle fixable. I hope that the earlier points show that DejaGNU itself is broken by design. We need to move away from it.
Unfortunately DejaGNU’s large knowledge base of how to run programs on embedded systems, a knowledge base which is largely represented in hand-written Tcl code, is very hard to get around. Some of these scripts can be automatically translated into a better test harness, in that they simply set flags for various tools and set a communication mechanism. Many others will require hand conversion.
Unfortunately DejaGNU has more or less blighted the world of free test harnesses. There is CodeSourcery’s qmtest program, but I don’t know how widely that is used. Fortunately, test harness need not be particularly complex. I don’t think it would be that hard for a thoughtful person to replace DejaGNU for gcc testing, and I think the benefits would be manifold. Replacing it for gdb testing would be harder, as the gdb tests rely more on string matching. As I mentioned above that is in itself a bug, but it means that recreating the tests is hard.
There are various test scripts which are built around DejaGNU’s log files, basically attempts to parse human readable information. Those will have to change for any new test harness.
Although I don’t have a good alternative, I hope I have at least demonstrated that DejaGNU must go. Effort put into working with DejaGNU is effort wasted.