GCC Inline Assembler

GCC’s inline assembler syntax is very powerful and is the best mechanism I know of to mix assembly code and optimized C/C++ code. It lets you take advantage of assembly features like add-with-carry or direct calls into the kernel without losing optimizations. I don’t know of any other approach which supports that.

That said, the inline assembler syntax is also a set of traps for the unwary. Because the compiler applies optimizations around the assembler code, the inline assembler construct must precisely describe what the inline assembler code does. This is done by using constraints and by listing registers and memory that are clobbered, that is, changed in a way which cannot be easily described. Constraints are underdocumented, machine specific, and easy to get wrong.
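To make the idea of constraints and clobbers concrete, here is a minimal sketch of a direct call into the kernel. It assumes x86-64 Linux and falls back to the C library elsewhere. The names raw_getpid and check_raw_getpid are hypothetical, made up for this illustration. The "a" constraint names a specific register (rax), and the clobber list tells the compiler what the instruction changes beyond the declared output.

```c
#include <assert.h>
#include <unistd.h>      /* getpid(), used for comparison and as the fallback */
#include <sys/syscall.h> /* SYS_getpid */

/* Hypothetical helper: ask the kernel for the process ID directly. */
long raw_getpid(void)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a" (ret)              /* output: the result comes back in rax */
        : "0" ((long) SYS_getpid) /* input: the call number goes in the same register */
        : "rcx", "r11",           /* the syscall instruction overwrites rcx and r11 */
          "memory"                /* conservative: a system call may touch memory */
    );
    return ret;
#else
    /* Assumption: on other targets just use the library wrapper. */
    return (long) getpid();
#endif
}

/* Self-check: the raw call should agree with the C library wrapper. */
void check_raw_getpid(void)
{
    assert(raw_getpid() == (long) getpid());
}
```

If the rcx and r11 clobbers were left out, the compiler would be free to keep a live value in one of those registers across the asm. That is exactly the kind of mistake that tends to work in a small test and fail in a larger function.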

For a complex and underdocumented construct like inline assembler, it is naturally tempting to simply copy some existing example. Unfortunately, even minor changes to the assembler code can require changes to the constraints. Unfortunately, there is no automated way to check whether you got them right. Unfortunately, it is common for incorrect constraints to work fine in simple cases and break in complex ones, or to work fine with one gcc release and break with another.

So using inline assembler really requires reading and understanding the documentation. In particular, the = and & constraint modifiers must be used correctly. On non-orthogonal machines like the x86, the register class constraints must be used correctly. In many cases it will be better to simply write the assembler code in a separate file and call it.
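As an illustration of why = and & matter, here is a sketch of a 128-bit add built on add-with-carry. It assumes x86-64 and AT&T syntax (the gcc default), with a plain C fallback elsewhere. The function name add128 is hypothetical. Each output is written by a mov before the asm has finished reading its inputs, so each needs & (early clobber). Without it, the compiler may place an output in the same register as an input that is read later.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: (r_hi:r_lo) = (a_hi:a_lo) + (b_hi:b_lo), modulo 2^128. */
void add128(uint64_t a_lo, uint64_t a_hi, uint64_t b_lo, uint64_t b_hi,
            uint64_t *r_lo, uint64_t *r_hi)
{
    uint64_t lo, hi;
#if defined(__x86_64__)
    __asm__ (
        "movq %[alo], %[lo]\n\t"   /* lo is written here, before blo/ahi/bhi are read */
        "addq %[blo], %[lo]\n\t"   /* sets the carry flag */
        "movq %[ahi], %[hi]\n\t"   /* mov leaves the flags alone; hi written before bhi is read */
        "adcq %[bhi], %[hi]"       /* add with the carry from the low halves */
        : [lo] "=&r" (lo),         /* = : write-only output; & : early clobber */
          [hi] "=&r" (hi)
        : [alo] "r" (a_lo), [blo] "r" (b_lo),
          [ahi] "r" (a_hi), [bhi] "r" (b_hi)
        : "cc"                     /* the condition codes are changed */
    );
#else
    /* Assumption: portable fallback so the sketch builds on other targets. */
    lo = a_lo + b_lo;
    hi = a_hi + b_hi + (lo < a_lo);
#endif
    *r_lo = lo;
    *r_hi = hi;
}
```

The asm is deliberately not volatile. Its whole effect is described by the outputs, so the compiler may drop it if the result is never used.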

Several years ago I sketched out a different approach that might be easier to use in some cases. However, actually implementing something along those lines requires embedding the assembler into the compiler. This is unlikely to ever actually happen. I’m certainly not working on it.



Comments

6 responses to “GCC Inline Assembler”

  1. ppluzhnikov

    The Sun compiler supports this in an arguably cleaner way:
    http://groups.google.com/group/comp.unix.solaris/msg/4fa096b9fe24c2bc

  2. Dan Villiom Podlaski Christiansen

    I believe Apple added support for assembly blocks when using the -fasm-blocks flag. How does your suggestion compare to it?

  3. Ian Lance Taylor

    Thanks for the comments. The problem with both the Sun compiler approach and the Apple compiler approach is that they don’t (as far as I know) allow you to express clobbers. That means that you can’t use them to implement supervisor calls (or system calls if you are using a regular Unix kernel). The supervisor call will clobber some specified set of registers. You need to be able to explain that to the compiler so that the optimizers understand what the asm block is doing.

    gcc’s asm syntax allows you to specify exactly what the asm generates, and thus the optimizer can actually delete the asm if it is unused (and not volatile). This is very useful when working with things like add-with-carry, where the optimizer can reasonably determine that the result is not needed in some cases.

    In other words these other asm syntaxes are easier to use but they are not, I believe, as powerful.

  4. davem

    As much as I love GCC’s inline asm feature (and no, Sun’s is definitely
    not better, it is so much less powerful) one thing that irks me is that
    there is no way to interface with the condition codes.

    Actually, because of this, I kind of find it amusing that you mention add
    with carry instructions, heh.

    So you can’t use inline asm to generate condition codes that if
    statements act upon, for example. And this leads to a lot of extra
    unnecessary code.

    It also makes things like extremely space and time efficient assertion
    implementations not possible either. We have this macro called BUG_ON()
    in the Linux kernel that just traps on a given condition, but with certain config
    options enabled it also will provide the source file and line number in the
    logs when it triggers.

    The most optimal implementation would be a builtin_trap() that allowed
    annotations to be associated with the trap instruction address. The
    power of builtin_trap() is that it is “noreturn”, and also on platforms like
    sparc we can use conditional trap instructions. Nearly zero cost assertions 🙂
    But when you try to use inline asm to add instruction pointer based annotations
    you can’t do anything sane because getting at the condition codes calculated
    by an expression is simply not possible.

    This applies both on the way into the asm and also on the way out.

    It would be really hard to provide this kind of facility for a number of
    reasons. For one thing, gcc likes to invert tests and use the reverse
    branch during optimizations and code generation, and there is no
    easy way to express that in the asm syntax.

    I think the ARM backend of GCC tried to support condition code access
    with some kind of asm syntax, but that got ripped out because it could
    never work reliably.

  5. alexr

    Metrowerks’s implementation (not merely the syntax that Apple cloned with -fasm-blocks) was capable of scheduling and even optimizing inline asm.

    I consider clobbers and the other GCC markup to be abhorrent. Everybody gets them wrong. I mean everybody: kernel engineers and GCC committers alike.

    It’s way past time for GCC to generate object files directly.

  6. Ian Lance Taylor

    davem: you are of course right about condition codes.

    alexr: you really do need clobbers if you want to use inline supervisor calls. I don’t see any way around it.

    Generating object files directly would help somewhat with inline assembler, assuming gcc included an assembler, but it wouldn’t help much otherwise. Several years ago we looked into it, but we realized that it would only help with compile time, and the savings in compile time would be quite small, a couple of percent. The assembler is fast.
