Signed or Unsigned

C has always permitted comparisons between any two integer types, and C++ follows its lead. Comparing signed types to signed types is straightforward: you sign extend the smaller value. Likewise, when comparing unsigned types to unsigned types, you zero extend. When comparing signed and unsigned types, the rules are less clear.

The C standard specifies a type ordering, which it calls the conversion rank: long long > long > int > short > char. Types narrower than int are first promoted to int (or to unsigned int if int cannot hold all of their values). After that, if the unsigned type has the same rank as the signed type or a higher one, then the signed value is converted to the unsigned type. Note that this happens even if the types are the same size (e.g., long long and long, or long and int, are often the same size). Otherwise, if the signed type can represent every value of the unsigned type, which in practice means it has more bits, then the unsigned value is converted to the signed type. Otherwise both values are converted to the unsigned type which corresponds to the signed type.

Pre-standard K&R C used a different rule, but that is old enough now that we no longer have to worry about it.

What this rule means is that if you write portable code, such that you don’t know the sizes of types, you cannot predict whether the comparison will be done as a signed comparison or an unsigned comparison. Therefore, the gcc compiler has an option -Wsign-compare, which warns about comparisons between signed and unsigned values. However, its false positives are sufficiently awkward to avoid that, for C code, it is not part of -Wall, though it is part of -Wextra (the difference between -Wall and -Wextra is that the former gives warnings for which false positives are easy to avoid through simple code changes; the latter gives warnings which are generally useful but for which false positives are harder to avoid).

There are good reasons to use signed types: they don’t have odd behaviour around zero, so you can write i < limit - 1 without worrying about the case limit == 0. There are good reasons to use unsigned types for things like the number of elements in a container: you get the full range of sizes, rather than limiting yourself to only the non-negative half. In particular, the C++ standard containers use unsigned types for their sizes. Combining these two practices gets you in trouble with portable code. The only reasonable answer I can see for portable code is to use -Wsign-compare and work around the many false positive warnings.

Go avoids these problems in two ways. First, there are no implicit conversions, so you can never be surprised by having a comparison become unsigned when you expected signed. You have to explicitly say which type of conversion you mean. Second, Go intentionally gives up half of the addressable range: lengths and indexes are signed ints, and Go takes the philosophy that if you want a container which can hold more values than fit in a signed int, you should write a special purpose large container.


Comments

2 responses to “Signed or Unsigned”

  1. pixelbeat

    Writing a post on signed unsigned comparisons was on my todo list, thanks ! 🙂

    We’ve not enabled -Wsign-compare in coreutils at the moment, due to some ugliness that’s introduced in the code to work around the false positives. But it is a very useful warning to enable as I for one easily miss these. For a concrete example of a portability fix see:
    http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=e2dbcee4

  2. ncm

    Of course C++ has no choice but to follow C’s int semantics. It would be stupid to make an unconstrained language follow C’s mistakes.

    I have occasionally seen code that assumed that signed ints roll over nicely to a negative value when they overflow, e.g. “if (i + 100 < i) …”. A good optimizing compiler will omit the comparison because signed overflow is undefined behavior. A better compiler will emit a warning about omitting the comparison. When last I checked, gcc didn’t.

    Off topic, I saw code recently like 'if (foo() == "bar")' which passed tests because most compilers merge (some) manifest string constants. I would expect a good compiler to warn about this, and treat the expression as always false.
