GCC in C++

It is time to start using C++ in gcc. gcc was originally written in C. C++ has now advanced to the point where we can reasonably take advantage of the new features that it provides. The most obvious advantage would be in data structures. gcc implements data structures which are awkward to use for different types. With C++ they could become much simpler. The target structure is naturally implemented as a base class, which would simplify target code. The double-wide integer values could be naturally represented as a small class with operators, again simplifying the code and making it easier to understand.
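As a rough illustration of the double-wide integer idea, here is a minimal sketch of a small class with operators. The name `DoubleInt`, the two 64-bit halves, and the unsigned semantics are all assumptions made for the example; none of this is taken from gcc's actual sources.

```cpp
#include <cstdint>

// Hypothetical double-wide integer: two 64-bit halves treated as one
// unsigned 128-bit value. Name and layout are illustrative only.
class DoubleInt {
 public:
  DoubleInt(std::uint64_t low = 0, std::uint64_t high = 0)
      : low_(low), high_(high) {}

  std::uint64_t low() const { return low_; }
  std::uint64_t high() const { return high_; }

  // Addition propagates the carry from the low half into the high half.
  friend DoubleInt operator+(DoubleInt a, DoubleInt b) {
    std::uint64_t low = a.low_ + b.low_;      // wraps modulo 2^64
    std::uint64_t carry = low < a.low_ ? 1 : 0;
    return DoubleInt(low, a.high_ + b.high_ + carry);
  }

  friend bool operator==(DoubleInt a, DoubleInt b) {
    return a.low_ == b.low_ && a.high_ == b.high_;
  }

  // Unsigned comparison: the high half decides first, then the low half.
  friend bool operator<(DoubleInt a, DoubleInt b) {
    return a.high_ != b.high_ ? a.high_ < b.high_ : a.low_ < b.low_;
  }

 private:
  std::uint64_t low_;
  std::uint64_t high_;
};
```

With something like this, calling code can write `a + b` or `a < b` directly instead of passing separate low and high words to helper functions, which is the kind of simplification meant above.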

This would be an easy transition, as the code is already almost completely written in the shared subset of C and C++. One of the arguments against converting to C++ is that the code would be less efficient, but it’s not as though the C code would become less efficient because we were compiling with a C++ compiler. Certainly we would have to pay close attention to efficiency with new changes, but that is no different from what we do today.

The other argument against C++ is that the language has too many complicated features. I think that gcc’s review system will ensure that new code is at least as readable as the old code. In any case programmers these days learn C++ in school. It is not so complex that gcc developers cannot understand it.

The only real technical difficulty I see is that we would have to make bootstrapping work with the right libstdc++. I’m sure this is possible. We would also have to explicitly make sure that new versions of gcc can be compiled with old versions of gcc. This would be an addition to the release testing.

In the past Richard Stallman has objected to using C++ for gcc. I don’t know how he feels about it today. However, I believe that this sort of decision should be made by the actual developers.

If anybody has a principled argument against using C++ for gcc, I would very much like to hear it.

13 Comments »

  1. jldugger said,

    May 6, 2008 @ 11:33 pm

    The thing about C++ is that it’s the worst high level language out there. Template inheritance is about the most confusing thing I’ve seen in a language, and I think classes introduce a layer of ambiguity about what’s actually happening in the generated code. I suspect gcc authors might know a bit more about this than I do, however ;)

    And it should be noted that my CIS department teaches C, Java, Python and OCaml, but not C++. C is useful and will be for the foreseeable future, but what C++ brings to the table isn’t all that fantastic without an existing C codebase. So fewer schools are teaching C++, but C itself remains important as a systems language.

    This said, gcc is a unique project with a large C codebase and hopefully intelligent authors. If there are tangible benefits to the work you’d have to do in transition, you may as well try it.

  2. davem said,

    May 7, 2008 @ 1:14 am

    This would be an enormous project. I think you’re better off getting rid
    of GC from gcc, which would be so much easier.

    Sure, you could convert the gcc tree, and transition because the current
    codebase is using the C subset of C++.

    But any real use of C++ features, especially your target class idea, would
    take a lot of work. You’d either have to keep a “C” non-classed target interface
    around, add some kind of wrapper target class for “C” targets, or have a field
    day and convert everything in one go.

    I mean, by all means, if you’re motivated, go for it.

    But, actually C++ is one of the things I’ve grown to dislike about gold,
    especially after working with it a lot (writing a new target, fixing bugs)
    and looking at the generated assembler and bloat.

    I’m sorry to say this. The bloat from templates just to support both 32-bit
    and 64-bit sparc is just ridiculous. And this seems to be endemic to
    templated code. The argument seems to be that the template instantiations
    you aren’t using aren’t even executed or brought into memory, but as a long
    time C systems level programmer I find these kinds of waste unsettling. And
    the fact that C++ programmers consider it an acceptable tradeoff… even more
    unsettling.

    Really, ridding GCC of GC is a so much more worthwhile and doable
    goal than a C++ conversion. Although, I suppose C++ is part of one of
    your imagined plans to rid GCC of GC. If so, good luck :-)

  3. tromey said,

    May 7, 2008 @ 6:13 am

    Many GCC developers don’t like C++, so the principled argument here is “the developers should choose”. IME these developers are often wrong on the merits, but … I don’t know, does that matter?

    Also, while I’m in favor of using C++, I am not sure that it will help with the bigger problems in GCC. Maybe we can more easily librarify a few bits by making use of an implicit “this”. But, there are plenty of other ugly things that still require lots of slogging. It may help a bit with de-GC-ification, but even that isn’t totally clear to me.

  4. atgreen said,

    May 7, 2008 @ 6:20 am

    Have any of the GCC hackers objected yet based on compiler build times? Compare gold build times to that of GNU ld, for instance. I’ve been rebuilding GCC a lot recently for my ggx port, and it would be sad to see it get any slower considering the tremendous improvements we’ve seen over the years (mostly due to hardware improvements, granted).

  5. ncm said,

    May 7, 2008 @ 11:50 am

    Incidentally, “implicit this” is usually bad coding style.

  6. Ian Lance Taylor said,

    May 7, 2008 @ 8:55 pm

    jldugger: Interesting to hear that a school is teaching C but not C++. In any case other high level languages are not a realistic option for gcc.

    davem: I agree that getting rid of GC is a completely separate project. And you’re right: I find the template bloat in gold to be basically irrelevant. If it made gold slower, then it would be highly relevant. But in fact, it doesn’t. I didn’t introduce it because I like templates, I introduced it because it makes gold faster. In any case I don’t see any reason to use that sort of technique in gcc.

    tromey: Sure, the developers should decide. I’m casting my vote.

    atgreen: This is perhaps overly cynical, but the best way to speed up C++ compilation times would be to make them matter for gcc programmers.

  7. pinskia said,

    May 9, 2008 @ 3:51 pm

    The main problem I have with C++ is STL and how the C++ folks normally use STL in the wrong way. STL is a piece of *****. Strings in STL are a good example: there is no way to have a non-mutable string and no way to have no initialization code for non-mutable strings. Vectors are the same way but are less used that way.

    For the code I see here at Sony, they abuse templates to try to get around pointer aliasing issues and a couple of other weird things. They wrap vector types inside a class even though we define addition and a couple of other operations like subscripting (for which I am testing a patch to submit right now). I can’t tell if they are doing this wrapping for portability or just for the fun of it.

    — Pinski

  8. pinskia said,

    May 9, 2008 @ 3:52 pm

    >Incidentally, “implicit this” is usually bad coding style.

    From what I have heard, the C++ standards committee has said that implicit casts are bad and should never have gone in, so I would have expected that they would be against implicit this also.

    — Pinski

  9. Ian Lance Taylor said,

    May 9, 2008 @ 4:43 pm

    Pinski: technically strings are not part of the STL. String initialization does require running a constructor, which I agree is not ideal. In fact I think it’s normally unwise to use a static string, as they add nothing over a const char array.

    Any language can be misused. I think many would argue that gcc’s vec.h is a misuse of C. It would be much simpler–indeed, nonexistent–in C++. The interesting question is whether using C++, including the STL, makes it easier to write correct and/or faster code.
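To make the vec.h point concrete, here is a minimal sketch of how a type-safe growable array needs no project-specific machinery in C++. The `Insn` struct and the `count_matching` helper are hypothetical names invented for the example and do not correspond to anything in gcc; the only assumption is that `std::vector` would stand in for per-type, macro-generated vectors.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a compiler data type; not a real gcc type.
struct Insn {
  int opcode;
  std::string name;
};

// One function template serves every element type, with full type
// checking and no per-type macro expansion.
template <typename T>
std::size_t count_matching(const std::vector<T>& items,
                           bool (*pred)(const T&)) {
  std::size_t n = 0;
  for (const T& item : items) {
    if (pred(item)) ++n;
  }
  return n;
}

inline bool is_even(const int& v) { return v % 2 == 0; }
inline bool is_nop(const Insn& insn) { return insn.opcode == 0; }
```

The same `count_matching` works on a `std::vector<int>` and a `std::vector<Insn>`, and passing a predicate for the wrong element type is a compile-time error rather than a silent cast.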

  10. ncm said,

    May 10, 2008 @ 4:01 pm

    pinskia: string isn’t part of the STL. It’s also an unfortunate textbook example of a standard-committee orphan. The original draft standard string was simple and clean. Then some busybody decided it was too simple, and made a proposal to mess it up badly. After that was accepted (at the meeting before my first) the original designer gave up. String got very little attention for years after that, except what I could spare to keep it from getting even worse.

    Stepanov has said that if he were starting over, STL components would have no public named member functions at all. Before template function overloading that wasn’t possible, and by the time it was added it was too late to take out the member functions.

  11. Blaisorblade said,

    October 11, 2008 @ 4:46 am

    1) Not only do GCC hackers dislike C++, they are probably not (on average) that experienced with it. I have evidence of that for kernel developers – in one of the “Linux kernel in C++” discussions, somebody could claim that auto_ptr did reference counting (there was somebody who knew the facts, luckily).

    Also, _any_ big project I’ve seen has had to prepare a lot of infrastructure before becoming productive in C++. Well, that is true of any no-batteries-included language, actually.

    2) Does whoever is claiming that GC is slow have any numbers to substantiate that? GC in early Java was slow, yes. Modern Java GC is faster than Boehm-Demers-Weiser conservative collectors, which can be faster than malloc().

    http://www.ibm.com/developerworks/java/library/j-jtp09275.html?S_TACT=105AGX02&S_CMP=EDU#resources

    3) C++ template bloat is the poor man’s algorithm specialization.
    IMHO, the problem with templates is that they replace function pointers/virtual calls with function body specialization, but do so _always_.

    Virtual Machines (such as Java ones) can do adaptive virtual method inlining (a guard is added to distinguish the fast, common path from the slow one) and adaptive method specialization (generating a specialized copy of a generic method, for a given set of specific input types). Method specialization is what templates do.

    No C++ compiler that I know of despecializes templates by reinserting indirect function calls for some specializations (the less used ones). One has no choice. Virtual Machines can do that adaptively, so one can try to tune them to reduce the bloat while keeping performance high. Has anyone ever suggested doing that for C++ (with programmer annotations, like ‘inline’)?

    This is why I think in Java you can implement the Visitor pattern much faster.
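The trade-off described in point 3 can be sketched in a few lines. This is only an illustration under assumed names (`Scorer`, `DoubleFn`, `SquareFn`, `sum_virtual`, `sum_template` are all hypothetical): the virtual version compiles to one function body with an indirect call per element, while the template version is instantiated once per functor type, which typically lets the call be inlined at the cost of one copy of the loop per type.

```cpp
#include <vector>

// Dynamic dispatch: a single compiled copy of sum_virtual, with an
// indirect (virtual) call for every element.
struct Scorer {
  virtual ~Scorer() {}
  virtual int score(int x) const = 0;
};

struct DoubleScorer : Scorer {
  int score(int x) const override { return 2 * x; }
};

int sum_virtual(const Scorer& s, const std::vector<int>& xs) {
  int total = 0;
  for (int x : xs) total += s.score(x);
  return total;
}

// Static dispatch: the compiler instantiates a separate sum_template
// for each functor type F it is called with.
struct DoubleFn {
  int operator()(int x) const { return 2 * x; }
};

struct SquareFn {
  int operator()(int x) const { return x * x; }
};

template <typename F>
int sum_template(F f, const std::vector<int>& xs) {
  int total = 0;
  for (int x : xs) total += f(x);
  return total;
}
```

In C++ the programmer picks one form or the other in the source; nothing in the language switches between them adaptively at run time, which is the gap the comment above is pointing at.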

  12. Ian Lance Taylor said,

    October 13, 2008 @ 9:18 pm

    Thanks for the comments.

    1) Compiler developers are not kernel hackers. They are entirely different sets of programmers, with a surprisingly small intersection. I assure you that many, though not, of course, all, GCC developers have a deep and thorough understanding of C++. After all, somebody has to write the C++ compiler.

    GCC already includes a ton of infrastructure which already works fine in C++. There is little need for additional infrastructure.

    2) GCC uses garbage collection today. I claim, with, I think, some reason, that it is slow. Experience with Java is not directly comparable, as GCC is not written in Java.

    3) There are many advantages to JITs. There are also disadvantages, the most obvious being slower startup times, which do matter for a compiler.

    Are you trying to make an implicit argument that GCC should be rewritten in Java? That would be significantly more difficult than rewriting it in C++, sufficiently so that I doubt anybody would make the effort.

  13. Allan McRae » Blog Archive » GCC in C++ - One day this will feature a witty tagline… said,

    October 4, 2010 @ 5:57 am

    […] of C++ in the GCC codebase. This is not a particularly sudden decision… I originally saw this proposed by Ian Lance Taylor on his blog a couple of years ago. He also has some good slides about how using […]
