Archive for November, 2010

Complexity

I’ve noticed that I have a bias toward complexity. I tend to try to look at problems from all the angles I can think of. Most problems are complex from at least some points of view; otherwise, they would not be problems at all. Looking at problems from all angles encourages a focus on their complex aspects rather than their simple ones.

I do profoundly believe H.L. Mencken’s classic quote: “For every complex problem, there is a solution that is simple, neat, and wrong.” For any problem, it’s very tempting to shear away the complexity and push forward with a solution that solves the simpler part. Unfortunately, solving the simpler part of the problem tends to make the complex part both worse and harder to solve later.

This is a common problem in programming. I think it is the primary reason that programs become harder to maintain over time. It is very hard to keep a focus on solving complex problems rather than ignoring the complexity in order to solve a simpler problem. This is also a common problem in politics, and I’m sure we can all find our favorite examples there.

At the same time, I’ve come to realize that focusing too much on complexity is a different and more subtle trap. It encourages you to focus on the specific issues rather than the general ones. Sometimes a complex problem can be solved by solving a different, larger problem. Sometimes you need to take a step back. The trick is not to ignore the complexity, but to be open to the possibility that it simply doesn’t matter.

When I was younger, I always felt that Alexander cheated when he simply cut the Gordian Knot. He didn’t actually solve the problem, and indeed I think the analogy is often used poorly, to suggest that it’s somehow smarter to not solve the problem. However, I’ve slowly come to realize that there is a good use of the analogy. I don’t know whether Alexander’s solution was a good one; he didn’t live long enough to prove it one way or another. But there are times when that is the right approach.

I apologize for the abstract nature of this post. I’ve been thinking of this in the context of working on Go. I think that working on the project has helped me a lot in terms of looking for the larger simplicity without ignoring the real complexity.

Comments (1)

Versioning

One of the very nice features of Go is the package system. This permits compilation to be much faster than it is in C++. In C++, if library A depends on library B, and library B depends on library C, that generally means that header files in A include header files in B, and header files in B include header files in C. So when compiling A, the compiler has to parse C’s header files as well. In a large program this quickly adds up to hundreds of thousands of lines of code being parsed for each input file. To make matters worse, parsing happens to be the very slowest part of a non-optimizing C++ compilation, taking up nearly 50% of overall compilation time.
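
To make that chain of includes concrete, here is a tiny sketch with made-up headers (written in C for brevity, though the mechanism is exactly the same for C++ headers): library A only uses B, but because B’s header includes C’s header, the compiler must parse all of C’s declarations every time it compiles a file in A.

    /* c.h -- a hypothetical low-level library header.  */
    struct c_buffer {
      char *data;
      unsigned long len;
    };
    struct c_buffer c_read (const char *path);

    /* b.h -- library B depends on C, so its header must include c.h.  */
    #include "c.h"
    struct b_record {
      struct c_buffer raw;
      int id;
    };
    struct b_record b_parse (struct c_buffer buf);

    /* a.c -- library A only uses B, but because b.h includes c.h the
       compiler still parses every declaration in c.h when compiling
       this file.  */
    #include "b.h"
    int a_record_id (const char *path)
    {
      return b_parse (c_read (path)).id;
    }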

Go’s package system means that this does not happen. If package A imports package B but does not import package C, then when compiling A the compiler only needs to read B’s exported data, not C’s. B’s exported data may include some of C’s types, in which case those types will be present in B’s exported data. In practice, though, this only happens for a limited subset of C’s types, and of course it does not include C’s functions, variables, or constants at all. The effect is that the Go compiler sees far fewer lines of code during a compilation. Plus, of course, parsing Go is faster than parsing C++. This is the main reason why the Go compiler is so much faster than a C++ compiler.

Many languages have similar package systems, of course. It works very well for Go but it’s hardly a unique feature.

But what I want to talk about now is dynamic linking. An issue that always arises in dynamic linking is versioning. Your program expects a certain ABI. The shared object you link against provides a certain ABI. If those ABIs don’t match, you are in trouble. C and C++ have no internal support for providing a consistent ABI. The developer of the shared library must arrange for that to happen through some other mechanism.

On GNU/Linux, this mechanism is symbol versioning. This works well for C code. When used carefully, it permits the shared library developer to provide backward compatibility for the library: a program compiled against version N of the shared library, and linked dynamically against version N+1, will run correctly. Symbol versioning is all done outside of the C language itself, using asm statements in the code and linker version scripts when creating the shared object. The programmer has to understand which changes affect the ABI and which do not. In general, providing backward compatibility this way requires great care.
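
As a rough illustration of that mechanism, here is a minimal sketch of a versioned C library. The library, function, and version node names are all invented for the example; a real library would have many more symbols and a more carefully designed version script.

    /* libfoo.c -- a minimal sketch of GNU symbol versioning for a
       hypothetical library with two incompatible versions of foo().  */

    /* The old implementation, kept so that programs linked against
       version 1.0 of the library keep working.  */
    int foo_v1 (int x)
    {
      return x + 1;
    }

    /* The new, incompatible implementation.  */
    int foo_v2 (int x)
    {
      return x * 2;
    }

    /* Bind each implementation to the exported name "foo" at a specific
       version.  The "@@" form marks the default version, the one used
       when new programs are linked against the library.  */
    __asm__ (".symver foo_v1, foo@LIBFOO_1.0");
    __asm__ (".symver foo_v2, foo@@LIBFOO_2.0");

    /* A matching linker version script, passed via --version-script
       when building libfoo.so:

         LIBFOO_1.0 {
           global: foo;
           local: *;
         };
         LIBFOO_2.0 {
           global: foo;
         } LIBFOO_1.0;
    */

With this in place, a program linked against the old libfoo.so keeps calling foo@LIBFOO_1.0 and gets the old behavior, while newly linked programs pick up foo@@LIBFOO_2.0.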

For C++, backward compatibility is much harder to implement. C++ programs are normally written using classes with methods. Some methods are inlined and some are not. A program compiled against a specific version of the header files will have some methods inlined and some not, and those inlined methods will assume a specific layout for the class. The effect is that almost any change to a class can change the ABI of a shared library. Providing backward compatibility means providing old versions of all the class methods which may not have been inlined. Of course that class may invoke other classes. In effect, for backward compatibility in the face of changes to class layout, the library has to carry a copy of the old class as well as the new one, and the methods of the old class have to be given specific symbol versions.

Unfortunately the current tools provide no good way to support this. Symbol versions are assigned based on the name of the function. In C code, old copies of a function can be renamed in the source code and then mapped back to the desired name, with a specific symbol version, via asm statements, as in the sketch above. Those asm statements refer to the mangled names of the functions, which in C are simply the source-level names. In C++ the mangled names are different. So a programmer who wants to provide backward compatibility has to rename the old class and then write asm statements for all of its mangled method names.

This is so hard to get right that, as far as I know, nobody even attempts it. This has real consequences. The type std::string in gcc’s standard C++ library is a reference-counted, copy-on-write implementation of string. That was a fine implementation in the single-threaded world, but it is a poor choice in today’s multi-core, multi-threaded world, because it means that the reference counts must be manipulated with relatively slow atomic instructions. The gcc library includes a more efficient non-copy-on-write implementation of string under the name __gnu_cxx::__vstring. It would be nice to be able to make __gnu_cxx::__vstring the default std::string. But that would require using symbol versioning to keep providing the old std::string to all existing C++ programs.

One interesting possibility might be to add a class attribute which changes the name mangling used for that class. If that were done carefully, it might be possible to use that in header files to control the names used for all class methods. Then it would be possible to use a version script to give appropriate versions to all the methods of the two similar classes. This approach just occurred to me and I don’t know if it would handle all cases.

In Go the issues are different. In Go the external ABI of a package is described by its export information. Any change to the export information may imply a change to the ABI, though not every change does: adding a new function, for example, changes the export information without breaking the existing ABI. This has an advantage over C++: it is much easier to tell when the ABI changes. In particular, the export information generated by gccgo has a checksum. Although this is not currently implemented, the checksum could be stored by the compiler for all imported packages. Then the dynamic linker, or the Go startup code, could compare the checksums of all dynamically linked packages with the checksums stored at compile time. This would give a simple and reliable test that the program has no ABI issues, something that in C/C++ is only possible if the shared library is developed by a skilled and careful programmer.

The equivalent of the C/C++ versioning scheme would be separate sets of export information for different versions of the shared library. There would have to be some way to indicate which functions/types/etc. were available in which versions, and the names they should have in those versions. I have not thought this through yet. But I’m fairly sure that Go’s package system can be used to make this aspect of dynamic linking more reliable and easier to use than the C system, and much simpler than the C++ system.

Comments (3)

QE2

In private e-mail I got a pointer to one of the real goals of the Fed’s quantitative easing, one that we’re also seeing in other countries’ reactions to the Fed’s plans: to weaken the dollar in order to reduce the U.S. trade deficit. This is a more plausible goal than the stated ones. By making it cheaper to borrow money in the U.S., the Fed is encouraging investors to borrow in the U.S. and invest the money abroad—the carry trade. This will tend to drive up prices abroad, and in particular to drive up foreign currencies relative to the dollar. That will make U.S. exports cheaper, and thus tend to reduce the U.S. trade deficit. That in turn will tend to make U.S. exporters hire more people, improving the U.S. economy.

This won’t work with China, of course, because the Chinese government controls their currency levels. But it will work with most of the other countries the U.S. is running a trade deficit with. Unless, of course, they respond in some way. The trick for them will be finding a way to respond which does not involve increasing their own deficit unsustainably and does not run them afoul of WTO rules.

It’s still a gamble by the Fed, but at least it’s one that might work. The biggest risk would seem to be a steady increase in subtle trade barriers, which will tend to hurt the entire world economy. There is a real sense in which China started the game of subtle trade barriers, and this may be the U.S. response.

Comments (9)

Quantitative Easing

The Federal Reserve Bank is about to undertake another round of quantitative easing, by purchasing up to $600 billion worth of U.S. Treasury bonds. The Fed gets to make up their own money—they don’t have to get that $600 billion from anybody else—so this is a way of expanding the overall money supply. The intent is to increase the price of U.S. Treasury bonds, thereby lowering their yield and making it more attractive for people with money to seek other forms of return. Seeking other forms of return means investing elsewhere, and in particular it means making loans to businesses which use the money to create jobs.

On paper, there is no way that this can work. The problem with the U.S. economy right now is not a shortage of available money. In fact, interest rates on loans are already quite low, so low that companies like IBM are selling bonds directly in order to borrow money very cheaply. In other words, this move by the Fed isn’t going to encourage IBM to invest any more: IBM can already get all the money it needs.

It’s small businesses that are having a harder time getting loans. They’re not having a hard time because the banks don’t have money to lend them. They’re having a hard time because the banks are being more stringent about their loan requirements. The Fed is hoping that by making investment in Treasury bonds even less worthwhile, banks will start making more small business loans in order to increase their profits. This isn’t going to work for the big banks, which are scared because they still don’t know how much they have at risk in the slow-moving unwinding of bad mortgages. They’re going to hold onto their cash. It could work for the small banks.

But even the small banks aren’t going to start loaning money to small businesses unless the small businesses start asking for it. And the small businesses aren’t going to start asking for it until people start spending more money. And people aren’t going to start spending more money until they feel that they have enough savings. The savings rate was very low for several years, around 1%, and is now up to a more respectable 6% or so. In the long run, more savings is good for the economy. In the short run, it means that people are not spending money, which means that small businesses are not starting up and hiring people.

What this means is that the Fed is pushing on a string. Increasing the money supply is a very indirect way of creating more jobs. On paper, the newly created money is just going to get saved, mostly by large banks to improve their balance sheets. In the current economy, the way to create jobs is obvious: give people money to spend. You can’t do that with income tax cuts, because most relatively poor people—the people who would spend extra money rather than saving it—pay next to no income tax. You could do it with a payroll tax holiday.

Or you could do it with stimulus spending. However, the Fed isn’t doing that, because it can’t. The Fed isn’t allowed to just hand out the money they create. They can only do specific things with it, like buy Treasury bonds. The rest of the government could hand out money, but they are unable to because of political paralysis.

The recent election is not going to help in this regard, as the Republican platform appears to be extending the Bush income tax cuts and refusing to increase spending, neither of which is going to help the current economy at all. (In fact, of course, extending the Bush tax cuts is going to increase the overall deficit far more than any plausible spending cuts would reduce it, so the claims by people like John Boehner to be interested in deficit reduction, without clearly describing the compensating spending cuts, make no sense.)

So on paper the Fed’s move is going to be useless. It has only one prospect of working: by making it clear to investors that, despite the government paralysis, the Fed is going to keep expanding the money supply until something happens. If people come to believe that, then they will be that much more likely to start spending, companies will be that much more likely to start borrowing, and jobs will start to be created. It’s a very indirect way of working. I don’t personally think it is going to work. But it could happen.

Comments (11)

GCC Summit

I gave two talks on Go at the GCC Summit last week. The first was about the gccgo frontend: the IR and the gcc interface; for that one I wrote a paper, which is mostly just a miscellany. The second was a Go tutorial, focusing on the more unusual aspects of the language. The one paper and the two slide presentations can be found on the gcc wiki.

My personal favorite presentation at the Summit was by John Regehr of the University of Utah. He spoke about using random input files to test compilers. This is a standard security testing technique, known as fuzzing; the interesting aspect of his work was the effort to generate test programs whose behavior was well defined, thus permitting his group to test the optimizers. They were able to generate an astonishing number of compiler crashes and miscompilations in every compiler they tested.

Comments (2)
