Archive for November, 2007

GCC Plugins

There has been recent discussion on the gcc mailing list about plugins. There was a very interesting paper at the last gcc summit about a plugin architecture, with some interesting examples.

I think plugins would be a useful addition to gcc. I think this mainly because many researchers and grad students want it. After many years, gcc is finally getting some traction in the compiler research community. That can only be a good thing for gcc, as it means that more people will learn gcc internals in school and be able to make useful contributions.

Plugins are also useful for people with special needs, like special gcc warnings or the more general case of static analysis. This is code which may not be appropriate for mainline gcc, but is still useful for a subset of the gcc user population.

The people opposed to plugins argue that it will make it too easy to extend gcc in a proprietary way. I think this is a red herring. It is already easy to extend gcc with proprietary code. Plugins will not make that more legal under the GPL, and they won’t make it significantly easier–the hard part is writing the code, not attaching it to gcc.

Plugins do make it easier to distribute those proprietary extensions, but I still think it is a red herring. There are no secrets in the compiler field; it’s not as though anybody can claim some sort of special advantage by keeping their plugin proprietary. And there is no money in the compiler field; nobody can become rich by selling their proprietary plugin, even if that were legal.

The decision as to whether to permit plugins is one about costs and benefits. The costs are low. The benefits are unknown but potentially high. To me the decision is a no-brainer. I’ve been surprised by the opposition I’ve seen on the gcc mailing list.


The GNU Configure and Build System

The GNU Configure and Build System consists of autoconf, automake, and libtool. I wrote an essay about them a long time ago. Slightly more recently I was a co-author of a book about them.

David MacKenzie started writing autoconf way back in 1991. I was an early beta-tester and contributed some early features. autoconf generates a configure script written in portable Bourne shell script language. That makes sense. autoconf at the time was essentially an M4 script. That made sense at the time, but no longer does. Today autoconf is essentially a perl script that invokes M4. That doesn’t make sense today either.

automake is a completely separate program. It essentially provides a simplified Makefile language. David MacKenzie started writing it in 1994 as a portable shell script plus Makefile fragments. In 1995 Tom Tromey rewrote it in perl. I first used it in 1997, and added support for conditionals. automake works with autoconf. automake translates a Makefile.am file written in the automake language into a Makefile.in file written in the make language. The configure script generated by autoconf then applies substitutions to generate a Makefile from Makefile.in.
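As a rough illustration of that last step, here is a minimal Python sketch of placeholder substitution. It assumes the familiar @NAME@ placeholder style found in Makefile.in files; the function name substitute and the sample values are made up for this example, and a real configure run does a great deal more than this.

```python
import re

# Matches placeholders of the form @NAME@, the style seen in Makefile.in.
PLACEHOLDER = re.compile(r"@([A-Za-z_][A-Za-z0-9_]*)@")


def substitute(template: str, values: dict) -> str:
    """Replace each @NAME@ with values[NAME]; unknown names are left alone.

    Hypothetical helper for illustration, not part of autoconf.
    """
    def repl(match):
        return values.get(match.group(1), match.group(0))

    return PLACEHOLDER.sub(repl, template)


# A toy Makefile.in and the values a configure run might have chosen.
makefile_in = "CC = @CC@\nCFLAGS = @CFLAGS@\nprefix = @prefix@\n"
makefile = substitute(makefile_in, {"CC": "gcc", "CFLAGS": "-g -O2"})
print(makefile)
```

In this sketch @prefix@ survives untouched because no value was supplied for it, which makes missing substitutions easy to spot in the output.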

libtool is essentially a complicated portable shell script which creates shared libraries. It was written by Gordon Matzigkeit starting in 1996. libtool provides a standard set of commands for compiling and linking to create shared libraries, and for installing them. It does this in a rather baroque manner in which what appears to be a library is actually a shell script pointing to the real library, which libtool reads when doing a link to decide just what to do. automake has support for using libtool.

This build system is used by many different tools, including pretty much all GNU tools. However, it has a major problem: it is much too complicated. It is written in portable shell, perl, and m4. It does not work in an intuitive or transparent manner. autoconf input files mix shell and m4 code. automake input files mix the automake and make languages. libtool hides its files in a hidden directory named .libs. The system does work fairly well. But it is hard to understand and hard to debug. And configure scripts generated by autoconf are slow. In a distcc environment, it is not uncommon for it to take longer to run the configure script than it takes to build the program.

We have to get rid of these tools. There have been many alternative build systems. None of them have caught on, at least not in the free software world, because they have to be installed before they can be used. The great advantage of the GNU system is that it requires nothing more than portable shell and portable make.

However, we can do better today. The GNU make program is very powerful, much more powerful than the original make program. GNU make is installed on every popular development platform. Let’s take advantage of it by assuming that it is there.

The goal would be to have people write a single Makefile instead of configure.in and Makefile.am or Makefile.in. (We would probably keep a configure script for compatibility, and to record options like --target.) The Makefile would start with a standard include directive which provided all the default commands. The rest would just be a standard GNU Makefile with some restrictions, like no %.o: %.c command, and with some standard variables to describe compilation options and the like.

The standard include would ensure that every compilation depended on config.h. config.h could be built by grepping the source code for HAVE_xxx strings and the like. Each string would map to a test which would have an appropriate rule. These rules could be run in parallel to generate tiny header files with the appropriate #define. config.h would include all the tiny header files.
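To make the grepping idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the config-have_xxx.h naming of the tiny headers, and the rule that a macro is HAVE_ followed by upper-case letters, digits, or underscores. It does not run any feature tests; it only shows how the list of needed tests and the resulting config.h could be derived from the source text.

```python
import re

# HAVE_ followed by upper-case letters, digits, or underscores (assumed rule).
HAVE_RE = re.compile(r"\bHAVE_[A-Z0-9_]+\b")


def find_have_macros(source: str) -> list:
    """Return the distinct HAVE_xxx names used in the source, sorted."""
    return sorted(set(HAVE_RE.findall(source)))


def tiny_header_name(macro: str) -> str:
    """Hypothetical naming scheme for the per-test header files."""
    return "config-" + macro.lower() + ".h"


def render_tiny_header(macro: str, found: bool) -> str:
    """Contents of one tiny header, given the result of its feature test."""
    return f"#define {macro} 1\n" if found else f"/* #undef {macro} */\n"


def render_config_h(macros: list) -> str:
    """config.h simply includes every tiny header."""
    lines = ["/* Generated: one tiny header per feature test. */"]
    lines += [f'#include "{tiny_header_name(m)}"' for m in macros]
    return "\n".join(lines) + "\n"


example_source = """\
#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif
#if defined(HAVE_MMAP) && defined(HAVE_UNISTD_H)
#endif
"""

macros = find_have_macros(example_source)
print(macros)
print(render_config_h(macros))
```

Because each tiny header depends only on its own test, a make rule per header would let the tests run in parallel under make -j, which is the point of splitting config.h up this way.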

This would provide an interface similar to autoconf/automake, but simpler to use and faster to run. Of course there would be many details to work out. But I think an approach along these lines has real promise to get us out of the current quagmire.


Pakistan in the News

When the war in Iraq was under discussion, but before it started, a friend of mine said: if you’re worried about a country which has WMDs and which supports Al Qaeda, looking at Iraq doesn’t make any sense at all. Look at Pakistan. They’ve already got the bomb. They helped create the Taliban in Afghanistan. They may be sheltering bin Laden–even at that time the common speculation was that bin Laden had crossed the border to Pakistan (of course, the term “border” is a complete misnomer for the unmarked unpatrolled line on the map which separates Afghanistan and Pakistan’s so-called “tribal areas”).

The only thing Pakistan had going for it at the time was that its local dictator, Musharraf, swore eternal allegiance to the U.S. Since then, nothing has changed. They still have the bomb (and we now know they were selling the technology to other countries). They still support the Taliban (at least, the well-funded and well-armed security service does). The general consensus is that bin Laden is happily living somewhere in the tribal areas (of course this could turn out to be wrong, but I know of no reason to think that it is).

Now Musharraf is showing his dictatorial colors even more clearly, not that they were at all hidden before. How long will we continue to support him? What will we do when he inevitably falls? We’ve got most of our military tied up in Iraq. If a radical Islamist government takes over in Pakistan, they could be a much bigger threat to the U.S. than Iraq could ever have been. Besides all the other reasons that the invasion of Iraq was a bad idea, it was a terrible worst case analysis.

That said, there is no reason to think that there will be a radical Islamist government in Pakistan. There is a solid bloc of Pakistanis who would be strongly opposed to it. But ensuring some kind of control over Pakistan’s nuclear weapons should have the highest priority for the U.S. The current Pakistani government does not want radical Islamists to get ahold of them. We should take advantage of that to work toward better control.

Let’s not forget that South Africa actually destroyed their nuclear weapons when it became clear that their government was going to change. So there is a precedent for that, although unfortunately not one that Pakistan is likely to follow.


Increasing Inequality

Inequality is increasing in the U.S. The real income of the bottom 20% is stagnant or even decreasing. The real income of the top 1% is skyrocketing. The effect is that the difference between the rich and the poor is getting steadily larger.

Is this bad? Extreme disparities of wealth are common in third world countries, but those countries are different from the U.S. in many ways. The U.S. has seen extreme wealth inequality before, during the Gilded Age around the turn of the 20th century. I’m not sure but I believe the inequality dropped somewhat during the 1920’s, helped by the introduction of easy credit. Then of course the Depression led to the backlash of the New Deal.

Increased inequality seems to lead to a loss of social cohesion. But the superrich are a small percentage of society, and it’s not clear how much they participated in society anyhow. A poor person envious of a rich person can go one of two ways: “take their money” or “earn my own money.” Pursuing the latter course is generally good for society. The degree of difference between the rich and the poor may not matter very much when it comes to making that sort of choice.

The wealthy, at least the ones who give interviews, often significantly discount the role of luck in the positions they’ve attained. In winner-take-all capitalism, somebody often does win. But exactly who wins is a matter of luck as much as anything else. However, there is nothing deeply wrong with this attitude; it’s just fatuous.

But can we expect any long-term bad effects from the increased inequality? I don’t know. It’s not obvious to me why we should. The worst possible direction would be increasing social unrest, tied to further separation of the wealthy from the rest of us.

Is the income inequality good? It’s pretty hard to see how. A capitalist society works best when some people are rich, so that everybody else has something they can work toward. But that doesn’t require anything approaching the kind of inequality we see today.

Should taxes be raised on the rich? In my view, unquestionably. Even relatively small adjustments would raise significantly more money which could be used to pay for universal health care and to pay down the significant government debts. It’s ludicrous to think that raising taxes on the rich would somehow dampen people’s desire to become wealthy themselves. And it’s ludicrous to claim that the wealthy owe nothing to the society which made it possible for them to earn their money. It is a particular scandal that social security taxes have a limit.


Hyperthreaded Memory

One thing I didn’t really touch on in my earlier notes on multi-threaded programming is memory. As processors become increasingly hyperthreaded and multicored, access to shared memory becomes the bottleneck. The obvious recourse of processor designers will be to break the sharing: each processor will have its own memory. We already see this in the Cell. And already in multi-core machines different processors have different local memory caches, although some multi-cores share an L2 cache. Those machines use complex cache snooping to maintain memory coherency among the processors.

So the highest performance of future programs is going to require many threads with processor affinity for threads, where the threads do not communicate via shared memory. Any access to shared memory is going to be a choke point, so people are going to want to write their programs to only access local memory.

That is probably a good thing for our future programming models. The difficulties with the multithreaded programming model all center on shared memory. If memory is not shared, we are in much better shape.

In this model, we need high bandwidth communication between the processors which does not go through shared memory. Ideally this will be modeled as a communication queue which can exist entirely in userland. Then different threads can exchange data via these communication queues. Presumably we would put a function call interface over the queues as well.
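Here is a minimal Python sketch of what a function call interface over such queues might look like. It is only a model of the programming discipline: Python threads do share an address space, so nothing here enforces the separation, and the names worker, call, and inbox are invented for this example. The worker keeps its state in a local variable and the only way to reach it is by sending a message.

```python
import queue
import threading


def worker(inbox: queue.Queue) -> None:
    """Owns a running total that no other thread touches directly."""
    total = 0  # local state, reachable only through messages
    while True:
        message = inbox.get()
        if message is None:  # sentinel: shut down
            return
        value, reply = message
        total += value
        reply.put(total)


def call(inbox: queue.Queue, value: int) -> int:
    """Function call interface: send a request, wait for the reply."""
    reply = queue.Queue(maxsize=1)
    inbox.put((value, reply))
    return reply.get()


inbox = queue.Queue()
thread = threading.Thread(target=worker, args=(inbox,))
thread.start()

results = [call(inbox, n) for n in (1, 2, 3)]

inbox.put(None)  # ask the worker to exit
thread.join()
print(results)  # [1, 3, 6]
```

The caller never sees the worker's total except as a reply value, so there is nothing to lock. The cost is that every interaction becomes an explicit message.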

This model is really communicating processes rather than communicating threads. Without shared memory they would only really be threads in that they would share the same instructions (paged in from the same program file) without sharing memory. Creating a new thread would be calling fork and breaking the processor affinity.

Shared memory would still be possible, of course, via the paging system. However, it would most likely require explicit acquire and release calls to control access to it.

Although it’s easier to write correct code for this model, it’s harder to write code in the first place. Casual sharing would be forbidden. Would people be willing to accept it? Is there an alternative model which gets us around the shared memory bottleneck?

