Having just read Graham Lee’s latest post, “What happens when you add one to an integer?”, I had a couple of comments.
First, Graham asserts that “computers don’t natively handle negative numbers”. This is sort-of true, but only sort-of. It is true from the perspective that there is no way to store a ‘+’ or ‘-’ sign in a flip-flop; it would have to be encoded somehow. That’s a pretty silly argument, though, because technically you aren’t storing ‘0’s and ‘1’s in your flip-flops either; it’s really an electrical charge that you’re dealing with. In any case:
- Most computers use 2’s-complement arithmetic, under which addition and subtraction are identical whether you’re dealing with signed or unsigned numbers (see the sketch below).
- Most computers that use 2’s-complement arithmetic have separate instructions for signed multiplication and division. Some also have “arithmetic shift right”, which shifts all the bits one to the right, leaving the top bit alone (as opposed to “logical shift right”, which tends to zero it).
- Most modern computers support IEEE floating point, which explicitly does support negative numbers (it uses a sign/magnitude representation).
Some older machines use 1’s-complement arithmetic or sign/magnitude even for integers. That’s pretty unusual, though, and you’re unlikely to run into such a device.
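To see the first point in action, here’s a minimal sketch (assuming a 32-bit int; note that converting an out-of-range unsigned value back to a signed type is strictly implementation-defined in C, though every 2’s-complement machine I know of simply reinterprets the bits). The same bit patterns and the same addition serve for both interpretations:

```c
#include <stdio.h>

int main(void)
{
    unsigned int a = 0xFFFFFFFFu;  /* -1 when reinterpreted as signed */
    unsigned int b = 5;
    unsigned int sum = a + b;      /* one adder circuit; wraps modulo 2^32 */

    printf("unsigned: %u + %u = %u\n", a, b, sum);             /* 4294967295 + 5 -> 4 */
    printf("signed:   %d + %d = %d\n", (int)a, (int)b, (int)sum); /* -1 + 5 -> 4 */
    return 0;
}
```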
One other observation I’d make is that in some problem domains, another kind
of arithmetic is interesting, namely saturating arithmetic. With saturating
arithmetic, INT_MAX + 1 = INT_MAX. Why is this used, and what supports it?
It’s good for audiovisual processing; you don’t want sample values overflowing
and wrapping. Saturating arithmetic is usually available on DSPs and also in
the vector units of modern microprocessors.
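In portable C you’d have to synthesise it yourself; here’s a minimal sketch (the function name is mine) that clamps rather than wrapping, checking *before* the addition so the signed overflow never actually happens:

```c
#include <limits.h>

/* Saturating add: clamps to INT_MAX/INT_MIN instead of wrapping.
   The range checks run before the addition, so no undefined
   signed overflow ever occurs. */
int sat_add(int a, int b)
{
    if (a > 0 && b > INT_MAX - a)
        return INT_MAX;
    if (a < 0 && b < INT_MIN - a)
        return INT_MIN;
    return a + b;
}
```

With this, sat_add(INT_MAX, 1) returns INT_MAX, exactly the behaviour described above.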
If we’re talking about the C programming language, C99 §6.5 ¶5 says:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behaviour is undefined.
while §6.2.5 ¶9 states that:
… A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
The combined effect of these is indeed to render the result of adding one to the maximum value of a signed integer undefined as far as C is concerned. If you’re using some other programming language, on the other hand, what happens in such a case may be perfectly well defined (maybe you get an exception, perhaps the type of the result will be different, or maybe it’s defined to wrap or even saturate).
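You can see both rules in a few lines (a sketch; the signed case is guarded so the undefined evaluation never actually runs):

```c
#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned int u = UINT_MAX;
    printf("%u\n", u + 1u);   /* well defined by 6.2.5p9: reduced mod 2^N, prints 0 */

    int i = INT_MAX;
    if (i < INT_MAX)          /* i + 1 would be undefined by 6.5p5, so guard it */
        i = i + 1;
    printf("%d\n", i);        /* still INT_MAX */
    return 0;
}
```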
In fact, a number of microprocessor architectures maintain an overflow flag (typically labelled the “V” flag) to allow assembly programmers to detect just this case and handle it as they wish. C compilers sometimes have a flag that causes them to emit code to test the V flag and abort execution if overflow is detected.
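(GCC’s -ftrapv is one example of such a compiler flag, and GCC and Clang also offer __builtin_add_overflow for checked arithmetic.) In plain C you can compute the same condition the V flag records without ever performing the overflowing addition; a sketch, with a helper name of my own choosing:

```c
#include <stdbool.h>
#include <limits.h>

/* Reports whether a + b would overflow, without evaluating the
   signed addition itself - the same condition the hardware V flag
   records after an add. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow here */
    return a < INT_MIN - b;       /* likewise for INT_MIN - b */
}
```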
As regards choosing types for variables in your program, I disagree with Graham’s implication that it’s best to use signed types even for notionally unsigned quantities. Neither approach protects you from unexpected overflow, and using a signed type for a notionally unsigned quantity just means you’ve added an extra failure mode (before, when it was unsigned, it could be too large, but at least it was always positive; as a signed value, it could still be too large, but it might also now be unexpectedly negative).
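A hypothetical illustration of that extra failure mode (the function, its names, and the 64-byte limit are all mine): a notionally unsigned count held in an int can go negative, sail past a maximum-size check, and then quietly convert to an enormous size_t:

```c
#include <string.h>

/* 'count' is a notionally unsigned quantity stored in a signed int. */
void clear_items(char *buf, int count)
{
    if (count <= 64) {                  /* a negative count passes this check... */
        memset(buf, 0, (size_t)count);  /* ...and becomes a huge size_t here */
    }
}
```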
There is a loop-related argument about signed versus unsigned variables that people sometimes trot out, namely:
```c
#include <stdio.h>

void countdown(void)
{
    unsigned n;

    for (n = 9; n >= 0; --n)    /* BUG: the loop condition never fails */
        printf("%u\n", n);
}
```
See the problem? That’s right, n >= 0 is always true, because n is unsigned; decrementing it from 0 results in a large positive number. Some people claim that, as a result, it’s “safer” to use a signed integer, though personally I think that’s a red herring.

The correct way to write this loop with unsigned is
```c
for (n = 10; n-- > 0;)      /* tests, then decrements: stops after n == 0 */
    printf("%u\n", n);      /* prints 9 down to 0 */
```
The final part of Graham’s post contains an odd example about using a uint8_t to hold a small count, but then failing to properly check it when using it. I must admit I fail to see the point he’s trying to make here: if you don’t check it, you don’t check it, and it seems no safer to use a uint16_t or even an int, because the problem is not overflow but the missing check. Put another way, given something like
```c
#include <stdint.h>
#include <stdio.h>

#define MAX_ITEMS 16

static int items[MAX_ITEMS];

void print_items(uint8_t count)
{
    uint8_t i;

    /* count is never checked against MAX_ITEMS, so any value
       greater than 16 reads past the end of the array */
    for (i = 0; i < count; ++i)
        printf("%d\n", items[i]);
}
```
changing the uint8_t to uint16_t is just not a fix for the actual problem. The fix is something more like
```c
#include <assert.h>

void print_items(uint8_t count)
{
    uint8_t i;

    assert(count <= MAX_ITEMS);     /* validate the value before using it */
    for (i = 0; i < count; ++i)
        printf("%d\n", items[i]);
}
```
though in a shipping application you might not actually want to assert() on failure.