Archive for Programming

Public Development

Although it hasn’t been my habit to track other blogs closely, Ben Collins-Sussman wrote an interesting post about Programmer Insecurity. The gist of the post is that programmers need to share code early in the development process. They should not develop code in their own cave, and then unleash it on an unsuspecting world.

This is particularly an issue for the case of modifications to an existing free software project. And it’s particularly true for design rather than code. It’s most important to do your design in the open, and let other developers comment on it. Unfortunately, there can be a catch-22 here: a design without code can be vaporware, so people won’t necessarily pay attention to it. But a poor design supported by good code will also not receive a good reception.

The flip side to public development is that free software programmers can be quite harsh. This is magnified by the nature of the medium used to communicate, e-mail, and it is magnified by the fact that people from different cultures have different expectations. It’s very easy for people to send an e-mail message which is intended to be friendly criticism but which is received as an attack. I’m sure that many free software projects have permanently alienated new developers through this sort of mistake.

So my advice for new contributors is to design in the open and to learn how to read e-mail to extract the useful information while ignoring the attacks. These are two separate skills which need to be developed by aspiring free software programmers.

And my advice for people who want to become maintainers is to develop the skill of making helpful comments which are not disparaging. This requires the use of euphemisms and careful attention to language, which are not characteristics of all good programmers.

I’m not a particularly nice guy on the inside, but I’ve learned to play one by applying a set of transformations to my thoughts. A few examples:

  • You’re wrong => That turns out not to be the case.
  • This is stupid => I think you need some more thought here.
  • Learn to use the indent program => You should add spaces there, there, and there.
  • We already know this is a terrible idea => You may want to review this e-mail thread.
  • Have you heard the word “portability”? => You need to consider other types of processors.
  • What idiot told you to do it this way? => Here are some good examples that you may want to follow.

You get the idea. The point is two-fold: developers should try to extract facts from e-mail while ignoring the language, and developers should try to use friendly language to put across their points.

Fast Development

I recall a Microsoft magazine ad from the mid-1980s, back when Microsoft was best known for its compilers. The ad was four pages long. The first page said that their new development environment had the three things every programmer wanted. I didn’t use Microsoft tools in any case, but when I looked at that page, I immediately thought “I only want one thing from my development environment: I want it to be faster.” The “three things” were on the next three pages of the ad: they were speed (of compiling the program), speed (of debugging the program), and speed (of the generated code). That was an ad writer who really understood what programmers want.

I’m a long-time emacs user and I’ve never used a more integrated development environment. I played around with Eclipse a few years back and at the time it was slow. In emacs I turn off font-lock mode because it is slow. When it comes to writing code I want to be able to just write.

Of course Eclipse, and font-lock mode, bring another advantage: they make it more likely for your code to be correct as you write it, since they can point out typos right away. I typically don’t find typos until I compile the code. So I definitely see an advantage to using an editor which parses your code as you type–as long as you never have to wait for it.

Speed of compilation is still an issue, at least with C/C++. Using a compilation cluster is easy with distcc, as long as you have a few machines to run compiles on. You’re still limited by how long it takes the largest piece of your program to compile, which can easily be 30 seconds. And of course the link time is serialized, although I have ideas on how to speed that up. The ideal compilation time should be no more than a few seconds–short enough that you don’t switch to another task. I have not used Eclipse for Java development, but I gather that compilation can be very fast. I hope that Tom Tromey’s work on incremental compilation can help with C/C++.

I don’t personally find that speed of debugging is much of an issue for me these days. The debugger is not the first tool I reach for to find problems in code, and when I do, I normally find that gdb is fast enough.

Speed of development is one of the big advantages of the interpreted languages like Python. There have been interpreted environments for C/C++ in the past, trying to provide the same development time benefits while still permitting fast execution. None of them have caught on, and these days a development environment will only catch on if it is near zero-cost. I’ve never used one, so I don’t know what it is like.

In any case, I still fully agree with that old Microsoft ad: the only way to judge a development environment is by how fast it lets you write code, by how fast you can debug the code, and by how fast the program runs. Any bells and whistles which aren’t directly related to making things faster are irrelevant.

Newsqueak

A mention of Squeak in a comment on my last post reminded me of Newsqueak, an interesting little language by Rob Pike. Newsqueak has nothing to do with Squeak. Newsqueak implements Hoare’s idea of Communicating Sequential Processes.

The interesting part of Newsqueak is the channel data type. A channel is a two-way communication path. Given a variable c of type chan of int, the assignment c <- 3 sends 3 on the channel, and the expression <-c receives the next value on the channel.

Thus a channel is basically a Unix pipe, represented as a fundamental data type. It can be used for synchronization as well as communication. This should make it easier to write safe multi-threaded programs.
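To make the pipe analogy concrete, here is a minimal sketch in Python, not Newsqueak. It assumes that a bounded queue.Queue shared by two threads is a fair stand-in for a channel; that is only an approximation of the real thing. The names producer, consumer, and run are made up for this illustration.

```python
import queue
import threading

def producer(c, n):
    # Send n integers on the "channel", then a sentinel meaning "done".
    for i in range(n):
        c.put(i)            # plays the role of the send operation
    c.put(None)

def consumer(c, results):
    # Receive values until the sentinel arrives.
    while True:
        v = c.get()         # plays the role of the receive operation
        if v is None:
            break
        results.append(v * v)

def run(n=5):
    # maxsize=1 makes the sender block until the receiver catches up,
    # so the queue synchronizes the two threads as well as carrying data.
    c = queue.Queue(maxsize=1)
    results = []
    sender = threading.Thread(target=producer, args=(c, n))
    receiver = threading.Thread(target=consumer, args=(c, results))
    sender.start()
    receiver.start()
    sender.join()
    receiver.join()
    return results
```

Neither thread touches shared state other than the queue itself, which is the property that should make this style easier to get right than locks and shared variables.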

I've never programmed in Newsqueak, and I don't know if there are even any implementations out there outside of Plan 9. And of course it's possible to use Unix pipes to communicate between threads today, although the required kernel calls and marshalling may make them less efficient than in-process channels. Still it's interesting to think about threading using channels.

Scheme

When I was in school I wrote programs in Scheme and its variant T. I still remember that as the easiest programming language I’ve ever used. In Scheme you never waste time on pointless boilerplate. You just write code. In order to run some function on a bunch of data, a very common operation, you just write a function closure, which is trivial. Dynamic typing means that you don’t waste time writing types.
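For readers who have not seen the closure idiom, here is a rough sketch of it, written in Python rather than Scheme. The names make_scaler and apply_to_all are invented for the example.

```python
def make_scaler(factor):
    # The inner function closes over `factor`; no class or
    # callback interface has to be declared first.
    def scale(x):
        return x * factor
    return scale

def apply_to_all(fn, items):
    # Run one function over a bunch of data.
    return [fn(item) for item in items]

triple = make_scaler(3)
tripled = apply_to_all(triple, [1, 2, 3])
```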

In other words, Scheme has all the advantages of today’s interpreted languages like Python and Ruby. It is more powerful in practice, because Scheme makes it very easy to manipulate and evaluate Scheme code itself, something which is not feasible in most other languages. This means that Scheme can be its own macro processor.

And of course it is possible to compile Scheme to reasonably efficient machine code–not C/C++ efficient, but not bad.

So why hasn’t Scheme caught on? It still lives in various niche environments, but it is not popular. Is it as simple as people not liking prefix notation?

When people ask me what they should do to learn to program, or more commonly these days what their teenage kids should do, I always recommend Abelson and Sussman’s book Structure and Interpretation of Computer Programs (which can now be read online). It uses Scheme. It’s the best introduction I know to what computer programming is. It’s only an introduction, of course; it doesn’t cover the issues which arise in the workplace. But I think that anybody who wants to be a programmer has to be able to master the material in that book.

Multithreaded Garbage Collection

Garbage collection is the traditional solution to the problem of managing memory. Multithreaded programming is the current wave of the future. I’ve written about the difficulties of multithreaded programming before, but people are going to do it regardless. In which case: how do we do garbage collection in a multithreaded program?

Let’s assume that we don’t want to halt the whole program for the duration of garbage collection, because that is expensive. In that case, it’s not too hard to understand how it can be done if 1) you can halt the whole program (other than the garbage collection thread) for a brief period; 2) any change to a heap object puts the object on a list of changed objects; and 3) you can assume that all pointer loads and stores are atomic with respect to each other. Then the garbage collection thread can halt the program while it scans the roots, let the program run while it does a mark pass, halt the program again and scan the changed objects, and let the program run while it does a sweep. (This has to be an in-place garbage collector, not one that moves the valid objects.)
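The four phases can be sketched as a toy model in Python. This is only an illustration under loud assumptions: everything runs in one thread, the "program" is simulated by explicit calls between phases, and the Obj and Heap classes and their method names are invented here. The second pause also re-scans the roots and conservatively keeps every changed object alive, which goes slightly beyond the description above.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing pointers to other heap objects
        self.marked = False

class Heap:
    def __init__(self):
        self.objects = []     # every allocated object
        self.roots = []       # stand-in for stacks and globals
        self.changed = []     # objects written to since the last scan

    def alloc(self, name):
        obj = Obj(name)
        self.objects.append(obj)
        return obj

    def write(self, obj, target):
        # Write barrier: every store into a heap object records that object.
        obj.refs.append(target)
        self.changed.append(obj)

    def _trace(self, pending):
        # Mark everything reachable from the pending list.
        while pending:
            obj = pending.pop()
            if not obj.marked:
                obj.marked = True
                pending.extend(obj.refs)

    def scan_roots(self):
        # Pause 1: snapshot the roots; earlier changes no longer matter.
        self.changed.clear()
        return list(self.roots)

    def mark(self, snapshot):
        # The program would be running while this happens.
        self._trace(list(snapshot))

    def rescan(self):
        # Pause 2: revisit roots (an assumption) and all changed objects.
        pending = list(self.roots)
        for obj in self.changed:
            obj.marked = True          # conservative: keep it alive
            pending.extend(obj.refs)   # its pointers may be new
        self.changed.clear()
        self._trace(pending)

    def sweep(self):
        # In place: unmarked objects are dropped, survivors do not move.
        freed = [o.name for o in self.objects if not o.marked]
        self.objects = [o for o in self.objects if o.marked]
        for o in self.objects:
            o.marked = False
        return freed

def demo():
    heap = Heap()
    a, b, c = heap.alloc("a"), heap.alloc("b"), heap.alloc("c")
    heap.alloc("garbage")
    heap.roots.append(a)
    heap.write(a, b)
    snapshot = heap.scan_roots()
    heap.mark(snapshot)
    heap.write(b, c)          # the program stores into an already-marked object
    heap.rescan()
    freed = heap.sweep()
    return freed, [o.name for o in heap.objects]
```

In demo(), the store into b happens after b has already been marked. Without the changed-object list, c would be swept even though it is reachable; with it, only the object named "garbage" is freed.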

It’s possible to implement those requirements for an interpreted language, which is the traditional setting for Java. You can still JIT code that uses the heap, and it will help to do some escape analysis to see whether a heap pointer can possibly escape the function.

I don’t really see how to implement those requirements for a native code language like C++. In particular tracking the changed objects seems somewhat painful. There was a garbage collection proposal for the next C++ standard, though I believe that it may have been withdrawn. But I don’t see how to implement garbage collection efficiently in a multi-threaded program. I did some web searches, but the most helpful-sounding ideas I could find were all in academic papers which weren’t online. I wonder if there are any actual implementations which try to implement my suggested requirements.
