
Complete newbie programmer naivete, I'm afraid.

Unsigned integers have a big "cliff" immediately to the left of zero.

Its behavior is not undefined, but it is not portable either. For instance, subtracting one from 0U wraps around to UINT_MAX, and that value is implementation-defined. It's safe in that the machine won't catch fire, but what good is that if the program isn't prepared to deal with the sudden jump to a large value?

Suppose that x and y are small values, in a small range confined reasonably close to zero. (Say, their decimal representation is at most three or four digits.) And suppose you know that x < y.

If x and y are signed, then you know that, for instance, x - 1 < y. If you have an expression like x < y + b in the program, you can happily change it algebraically to x - b < y if you know that overflow isn't taking place, which you often do if you have assurance that these are smallish values.

If they are unsigned, you cannot do this.

In the absence of overflow, which happens away from zero, signed integers behave like ordinary mathematical integers. Unsigned integers do not.

Check this out: downward counting loop:

  for (unsigned i = n - 1; i >= 0; i--)
  { /* Oops! Infinite! */ }
Change to signed and it's fixed — hopefully prompted by a compiler warning that the loop guard expression is always true due to the type.

Even worse are mixtures of signed and unsigned operands in expressions; luckily, C compilers tend to have reasonably decent warnings about that.

Unsigned integers are a tool. They handle specific jobs. They are not suitable as the all-purpose integer you reach for by default.



> Suppose that x and y are small values, in a small range confined reasonably close to zero. (Say, their decimal representation is at most three or four digits.)

There's the rub! You need to justify characterizing your values in that way, which means either explicit range checks and assertions, or otherwise deriving them from something that applies that guarantee in turn. And that justification is more work than just making your code correct for every value. I mean, if x and y are both smallish, is x*y also smallish?

The "nearby cliff" is a good thing, in that it makes errors come out during testing rather than a month after you ship. Handwaving about "reasonably close to zero" is begging for trouble.

In the absence of automatic bigints, unsigned integers are easier to make correct.


What bugs have you seen that were caused in part by the choice to use signed integers?


The most dramatic example is the INT_MIN/-1 case, since that causes an outright crash.

For example, Windows has the ScaleWindowExtEx function, which scales a window by some rational number, expressed as the ratio of two ints. Using signed arithmetic is already suspicious: what does it mean to scale a window by a negative fraction? But of course they forgot about the INT_MIN/-1 case, and the result is a BSOD. http://sysmagazine.com/posts/179543/

http://kqueue.org/blog/2012/12/31/idiv-dos/ has some others. Fun stuff.


Every single one of those cases involves integers that must be signed for the interface to work. They're implementing an interpreted language with signed integers, or, in the one other case (ScaleWindowExtEx), manipulating values that are signed and that have meaning when negative. The one place I saw an INT_MIN / -1 bug (in code review, IIRC) was also an interpreted language, implementing a modulo operator. These are bugs in signed integer usage, but they're not bugs caused by a decision to use signed integers, because in these cases there was no choice. They aren't representative of what you see when you do have a choice, say, using signed integers for array indices and sizes.

The question of what bugs you actually see was meant to be a personal one, not one about some dramatic bug you've read about on the internet. The answer to which is the better choice is determined by how much damage is caused by one choice versus the other, and you get that answer by noting how frequently you get bugs in practice as a result of such decisions. (Not that thought experiments and imagination have no place, but this is clearly a question where you can talk yourself into any direction.) For example, I've never had problems with C/C++ undefined behavior on signed integer overflow, while you're spending a lot of time talking about it. I have seen bugs caused by unsigned and signed integer usage that fit into other categories, though.


The ScaleWindowExtEx example certainly has no legitimate reason to accept signed ints.

Personally, the bug I introduce most often with signed ints is a failure to range-check for negative values, e.g.:

    void *get(int idx) { assert(idx <= arr.size()); return arr[idx]; }


The "legitimate reason" is that it's scaling a signed value, that has meaning when negative.

> void *get(int idx) { assert(idx <= arr.size()); return arr[idx]; }

You got problems there even if idx and arr.size() are unsigned.


Without signed integers, the scaling function will turn arguments like (-4, -3) into garbage.

In order to disallow negatives, you need signed arguments, or else to reduce the range. (Say the numerator and denominator cannot exceed 255 or whatever). Otherwise the function has no way of knowing whether argument values (UINT_MAX-3, UINT_MAX-2) are a mistaken aliasing of -4, -3 or deliberately chosen positive values.


Garbage in, garbage out, as they say.

For all we know, ScaleWindowExtEx did have a domain check that disallowed negatives, but put it after the division.


It permits negatives! SetWindowExtEx permits negatives!


> for (unsigned i = n - 1; i >= 0; i--)

A defined-behavior alternative:

  for( size_t i = n; i --> 0; )


size_t is also unsigned (no idea why). The signed equivalent is ssize_t.

Edit: Sorry, missed the "i-- > 0" at first. The code works, but not because of changing "unsigned" to "size_t".


Sizes can be >2 GB on a 32-bit system. Not sure how ssize_t works there — is it 64 bits then?

It makes sense to use unsigned for sizes to save a bit (or 32 bits per size). Also less invalid possible inputs to handle.


By the way, ssize_t is POSIX, from <sys/types.h>, not ISO C.


No worries. I nearly left it "unsigned", but my OCD kicked in. Too many 64-bit conversion warnings stain my psyche...


> but what good is that if the program isn't prepared to deal with the sudden jump to a large value

Aaaand you only need to care about this case.

For an unsigned int just check: x < (your max value)

For signed ints: x < (your max value) AND x >= 0

Oh, the downward counting loop example. Is that really done so frequently? I can't remember the last time I wrote one — I'd much rather have an upward loop and compute y = MAX_VALUE - x (adding 1 if needed).

Quite funnily, if you write a loop in x86 assembly with the LOOP instruction, it naturally counts downward: LOOP decrements ECX and jumps back only while ECX is nonzero.

Don't use a for, just: if (n) { i = n; do { i--; /* body uses i */ } while (i); }



