Generics

The goal of generics in a programming language is to save programmer time. Generics permit you to write code that works with multiple types only once. They also permit one programmer to write generic algorithms which you can use with your program’s types. This kind of thing is attractive to anybody who has written a lot of C code, where you find yourself writing the same linked list and hash table code over and over again. The core code will use void * pointers but for everything else you need type casts. Those type casts are mechanical and tedious.

I would say that the most significant advantage of C++ over C is the C++ mechanism for generic programming, which is templates. C++ has a powerful template system which is somewhat like a macro system built into the language proper. It’s not really a macro system—in a true macro system the code would be parsed only at template instantiation time, but in fact C++ code is parsed at template definition time—but I’ve discovered that people who are not deeply familiar with the language think of it that way, and it is an effective approach. C++ pays for this power in compilation time, as each template must be separately compiled for each type for which it is instantiated. It also pays in language complexity, which is most visible to users in the widely noted incomprehensible error messages. For a while the new C++ standard introduced the idea of concepts which would at least improve the error messages, but concepts themselves turned out to be very complex and were dropped from the standardization process for now.

Other languages have other approaches to generics. I know these languages less well, so I hope I don’t make any mistakes. Java restricts generics to types which are inherited from Object, which permits the templates to be instantiated a single time; a cost of Java’s approach is that the actual types are lost (this is known as type erasure). C# instantiates generics at runtime using a JIT; all types which are representable as pointers can share a single instantiation. ML has two different generics implementations: simple polymorphism which can be used for, e.g., type-safe containers, and a powerful system in which you can write a transformation which takes a set of types, variables, and functions and produces a different set; the latter system is pretty much as powerful as C++ templates but does not need to be compiled each time (I think) and produces comprehensible error messages. LISP has an extremely powerful macro system which simply rewrites the program, something which is relatively easy to do in LISP due to the simple syntax; this is so powerful that it lets you add new syntactic constructs to your program.

I’ve been thinking about generics because of Go. Go’s goals include fast compilation and efficient execution. Those goals are in conflict when it comes to generics. Go is not interpreted, and it is not the case that every type looks like a pointer. Therefore efficient execution suggests separate compilation for each instantiation, which of course slows down compilation time. ML’s approach is quite powerful, but it may be complex to use in simple cases, and it’s hard to understand how to implement it for Go. Relying on every type matching the empty interface doesn’t help when somebody wants to write a generic function for chan T, since chan int is not the same as chan interface{}.

Implementing generics in Go looks like a struggle among the goals of fast compilation time (i.e., avoid multiple compilation of generics), easy programming (i.e., have some form of generics), and fast execution (i.e., different types are represented differently). All languages have those goals in different degrees, and all languages seem to compromise on one of them when it comes to implementing generics. The question for Go is whether we can find an appropriate compromise.


Posted

in

by

Tags:

Comments

18 responses to “Generics”

  1. Octoploid Avatar
    Octoploid

    If you want modern system programming language to succeed
    in the marketplace today, generics must be the very core around which all other language features circle.
    And not an afterthought and a hack as in »Go«.
    I wish you the best luck, but I think »Go« is doomed because
    of this, although it is a nice language.

  2. Ian Lance Taylor Avatar

    Can you name a modern system programming language which was designed with generics at the core?

  3. fdeweerdt Avatar
    fdeweerdt

    Haskell ? Just kidding.

    Although it makes it easy for you to shoot yourself in the foot, I kinda like the idea of a templating system being just a text manipulation tool (a la C).
    I think that the C macro language is too simplistic (no loops, functions, variable passing etc..), but the preprocessing of text at compile time is quite elegant, IMO. You can even do way more that generics.

    The meaningful error message issue might come into play though.

    Regards,
    Frederik

  4. Ian Lance Taylor Avatar

    The main reason that error messages for C++ templates are so incomprehensible is that there is no way to describe any requirements on the type used to instantiate the template. If you accidentally try to instantiate a template using an inappropriate type, you wind up getting a complex set of errors referring to both the template code and your code, with the root error being something like “no such method.” Concepts were an attempt to address that issue in C++0x.

    In Go we could avoid that sort of issue by having the generic specify an interface which any instantiating type must meet. The compiler can confirm that the generic code only uses methods provided by that interface. When the generic is instantiated, the compiler can confirm that the instantiating type does in fact meet the interface. So I think the the major class of incomprehensible error messages can be avoided.

  5. Octoploid Avatar
    Octoploid

    >Can you name a modern system programming language which was designed with generics at the core?

    No, but I was thinking of Stroustrup’s mantra: “Inside C++, there is a much smaller and cleaner language struggling to get out”¹. And initially I hoped that »Go« might be that language, but it isn’t because of its lack of a generic core.

    ¹ See his HOPL-III paper, chapter »Beyond C++«:
    http://www.research.att.com/~bs/hopl-almost-final.pdf

  6. jldugger Avatar

    Actually, Java recently supports generics in close syntax to C++. There may be some subtle semantic difference I’m unaware of, but it solves type erasure.

    It’s probably impossible to implement a systems language with generics; what you need is a type system that reflects the hardware: registers, pins, memory mapped devices, etc. C gets many people far enough, mainly by substantial use of void casting.

  7. Ian Lance Taylor Avatar

    Thanks for the comments.

    Go can support hardware level programming, but that’s not really what I mean by a systems language. I mean a language which can be used for writing web servers, databases, etc. In that sense, C++ is certainly a systems language.

  8. andhow Avatar
    andhow

    On the subject of C++’s template system, I’ve found that there are really two distinct uses being serviced by templates. On the one hand you have the ability to write generic algorithms/data structures as you’ve described and as typified by the STL. On the other hand, you have a rather awkward system for writing meta-programs that are evaluated at compile time. While concepts will help some of the usability woes of templates-as-generics, templates-as-meta-programs will remain as arcane and unnatural as ever.

    If you are considering generics for Go perhaps you could consider giving C++-like meta-programming “first-class” support since, at least from my experience, it has so much to offer the systems programmer. By first-class, I mean being able to write a meta-function as an actual function over types/values/functions instead of:

    template struct MetaFunctionName {
    typedef ReturnValueExpression result;
    };

    being able to do real reflection (in meta-programs) instead of:

    template struct IsConst { static bool result = false; };
    template struct IsConst { static bool result = true; };

    and being able to compose code and data types using traditional data manipulation instead of:

    template struct Compose {};
    template
    struct CopyEach<Cons> : CopyEach { Head h; };
    template struct CopyEach {};

    or at least thats what I wish for several times a week…
    (P.S. I enjoy your blog)

  9. andhow Avatar
    andhow

    Oh dear, all my angle brackets are gone. Oh well, hopefully the idea gets through…

  10. Ian Lance Taylor Avatar

    Go has full type reflection for all types at runtime, and it has a type which all values can be assigned to at runtime (interface{}), so it’s not clear to me that C++ meta-programming gives much additional expressive power to Go. It could make a program more efficient, I suppose, but I’m not sure it could make a program noticeably easier to write. Though it’s also quite possible that I’m missing something.

    (I think that to get angle brackets you need to write ampersand-l-t-semicolon and ampersand-g-t-semicolon.)

  11. ncm Avatar

    I completely disagree with the claim that “The goal of generics in a programming language is to save programmer time.” It’s one effect, but of little value. This matters, because Ada’s and Java’s generics appear to have been designed as if that were the intent, and they’re both failures, not useful enough to justify the extra complexity. (Ada’s designers had an excuse.) C++ generics, like Haskell’s, make the language fundamentally more powerful, in that they allow you to write libraries that would otherwise be impossible.

    Efficiency is part of expressive power. If you don’t care about efficiency, a Turing Machine suffices.

  12. Ian Lance Taylor Avatar

    But anything you can write in a C++ template, you can write without using templates, by instantiating the templates yourself. You just wind up with a lot of duplicated code (possibly an exponential amount of duplicated code). That is the sense in which I mean that templates exist to save programmer time.

    Yes, C++ templates also give you a more efficient runtime than, e.g., Java generics do. They do that at the cost of increased compilation time and larger programs.

    Perhaps you mean something else, but I’m not sure what.

  13. andhow Avatar
    andhow

    True, runtime reflection/meta-programming is more powerful. However, the associated runtime cost of runtime meta-programming can lead to a conflict between elegance and efficiency. With compile-time meta-programming, you can run wild to achieve the desired library interface knowing that, as long as it meta-evaluates down to efficient runtime data structures and code, you are perfectly justified in writing sophisticated meta-programs. Not so when meta-evaluation occurs at runtime.

    As for slow compile time, a good deal of it is not essential to the problem. Having an actual interpreter evaluate meta-programs, instead of having the compiler simulate beta-reduction via template instantiation, would surely lead to orders of magnitude improvements over templates-as-meta-programs.

    Question for you: does the way Go build programs allow it to make any improvements over the way C++ compilers re-compile and instantiate templates in each translation unit?

  14. dan.berindei Avatar
    dan.berindei

    In the C# case I don’t think the JIT makes any difference: both .NET and Mono have ahead-of-time compilers that handle generics just fine.

    It’s also possible to have code sharing in C++, but most compilers don’t do it automatically – you have to write a partial specialization templated on T* and delegate the implementation to non-template functions.

    I personally don’t consider template code bloat is such a big problem in C++ because a lot of the generated code ends up being inlined anyway – so it doesn’t matter how many different templates you instantiate, it only matters how many call sites you have. But instantiating all those templates does take its toll on the compile time…

  15. Ian Lance Taylor Avatar

    The only advantage I see that Go has is that the compiler imports compiled packages rather than including uncompiled source code. That means that there is a known place where generic instantiations could be stored. So the compilation process could first see whether a generic has already been instantiated for a specific type or types, rather than having to instantiate it in every compilation unit.

    I would have to think about using an interpreter to evaluate meta-programs, that sounds quite complex to me. I’m not sure that relying on runtime reflection is all that bad when you consider the possibility that the compiler can optimize the runtime reflection code using the known type information. Admittedly that approach can not easily eliminate unnecessary fields in generic data structures.

  16. ncm Avatar

    “But anything you can write in a C++ template, you can write without using templates, by instantiating the templates yourself.”

    Who is this “you” person? Very typically, the person who wrote a library has a completely different skill set than the person writing the program that uses the library. In the case of VSIPL++, for example, the people who wrote it know about optimizing fundamental matrix and signal processing operations on a wide variety of multiprocessor architectures, while the person using it knows about axial tomography or terminal ballistics.

    The power of C++ templates lies not in instantiating trivially variant copies of big functions you write, but in what they can make happen at compile time, stitching together fragments of code into a whole you might not otherwise know how to write, and that the library coder could not have foreseen.

    But the the “most significant advantage of C++ over C” isn’t templates. It’s the destructor. Templates come second, exceptions third.

    David Vandevorde has a design for a module system for C++ that he will probably propose for ISO C++1x. It will fix much of the compilation speed problem. It would be nice to have an equally powerful language not obligately compatible with C, and (therefore) much simpler, but we don’t, and nobody’s working on one. (Daveed was, for a while, but he stopped.)

  17. Olreich Avatar
    Olreich

    Generics are, from what I can tell, a form of function creation that allow one to define functions that are not based on any specific types, but rather that are based on specific functionality of those types, such as addition, multiplication, the ability to be stored, etc.

    So, taking the List idea from Java.util.List (I think) and the List implementation in C++, it could easily be reproduced with a Vector based on the interface{} type. Since the interface{} satisfies all types, one can then store anything. As for having specific abilities being required, use a different interface. The one issue is that once there are multiple different types that need to have different abilities for a generic function to work, the system breaks down, because one would have to tie any code based on the interfaces into the Generic library.

    So, if Go already has something that can mimic Generics, but is limited by the requirement of linking all of the implementation packages to the Generic package, I think I have a solution.

    With interfaces, you can define interfaces that rely on one another, but you cannot define interfaces that reference other interfaces dynamically. Example:

    type A interface {
    Assemble(int) B
    }

    type B interface {
    GetAmt() int
    }

    The A interface does not accept any type that has the function Assemble and a return type that satisfies B, rather it looks for a return type of B. This is a minute, but significant difference. Suddenly, I cannot create an orthogonal function that is independent of the interfaces, rather I must use the B interface to get something that implements the A interface.

    This, in my opinion is a problem, though I’m not sure how easily it is solved. Syntactically I would envision something readily simple, like so:

    type A interface {
    Assemble(int) satisfies(B)
    }

    This would then check that the return type of Assemble satisfies interface B, allowing any type that satisfies B to be the return type. I have no idea how much complexity this would add to the language, or even if that is a good syntax to use, but the concept would seem to fulfill all generics desideratum and maybe even take it a step further.

    Of course, my experience with Generics is limited, as I’ve not used them extensively. So, if this is wildly off-base or doesn’t fit the bill, please divulge the bill or explain what’s wrong with it.

  18. Ian Lance Taylor Avatar

    Thanks for the comment. In your example

    type A interface {
    Assemble(int) satisfies(B)
    }

    where you want to permit the function to return a value of any type which satisfies the interface B, how do you implement a value of type A? In a normal interface, the caller knows the type it will receive. In your example it does not. I don’t see how to implement that in a compiled language like Go.

Leave a Reply