I've been hit by nasty slow-downs with denormal numbers a couple of times. It's much easier to hit these problems with single-precision floating point, where gradual underflow sets in much sooner than with doubles.
As a folk theorem of numerics goes, though, such problems are often a symptom that you're not doing quite the right thing. Should your numbers really be that small? Often recasting the problem slightly avoids having to turn on low-level flags or other hacking.
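To make that concrete, here's a quick sketch (mine, assuming typical IEEE-754 hardware) of how much sooner a float drops into the subnormal range than a double:

    // Count how many halvings it takes to leave the normal range.
    #include <cfloat>
    #include <cstdio>

    int main() {
        float  f = 1.0f;
        double d = 1.0;
        int f_steps = 0, d_steps = 0;
        while (f >= FLT_MIN) { f *= 0.5f; ++f_steps; }  // first subnormal float
        while (d >= DBL_MIN) { d *= 0.5;  ++d_steps; }  // first subnormal double
        // Expect roughly 127 vs. 1023 on IEEE-754 hardware.
        std::printf("float subnormal after %d halvings, double after %d\n",
                    f_steps, d_steps);
        return 0;
    }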
I was astonished in my computer architecture class as an undergraduate to find out that the C language rules of the time (K&R C; ANSI C later relaxed this) required every single-precision floating-point value to be converted to double precision before any operations were done on it. So working with floats rather than doubles in C was not only more vulnerable to underflow and what-not, it was also slower. Since then I've never declared a variable to be a float again.
The semantics of floating point arithmetic in C do not allow the first line to be eliminated. It has non-trivial behavior in the case of signaling NaNs or denormals or -0.0.
It can if you set the DAZ (denormals are zero) flag on the Pentium. I don't know if the C & IEEE-754 standards require the optimizer to preserve semantics in that case, but I can see why compiler writers would stay clear of such an optimization.
Neither the C standard nor IEEE-754 specifies anything about floating-point operations in non-standard modes like DAZ or FTZ (flush to zero). (Nor do they require that support for signaling NaNs be implemented.) The behavior of -0.0 suffices to block the optimization under strict fp modes, however.
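For reference, this is roughly how DAZ/FTZ get turned on for x86 with SSE intrinsics; it's platform- and compiler-specific, not something either standard covers:

    // Enable flush-to-zero and denormals-are-zero in the SSE control register.
    #include <xmmintrin.h>   // _MM_SET_FLUSH_ZERO_MODE
    #include <pmmintrin.h>   // _MM_SET_DENORMALS_ZERO_MODE (needs SSE3)

    void enable_daz_ftz() {
        _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);          // subnormal results become 0
        _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);  // subnormal inputs treated as 0
    }

Note this only affects SSE code, and it silently changes results, which is exactly why compilers won't assume it for you.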
I think you might have missed something in the OP.
I got the impression that the "+ 0.1" was enough to raise the value of the floating point number being operated on, so that it was not denormalized.
So, the "+ 0.1" version was faster because the numbers it generated were not denormalized. It had nothing to do with the special properties of 0 versus 0.1f (a float).
Neither of the two explanations on the SO post made any assumption about the "theoretical optimizing compiler" that you mention, right?
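For what it's worth, here's a minimal reconstruction of the effect (my own sketch, not the OP's exact code):

    // Without the +0.1f/-0.1f pair, y decays into the subnormal range and
    // stays there; with the pair, the tiny value is absorbed by 0.1f and y
    // snaps to exactly 0, which is fast.
    #include <cstdio>

    int main() {
        volatile float y = 1.0f;
        for (long j = 0; j < 9000000L; ++j) {
            y = y * 0.999f;   // slow decay toward zero
            y = y + 0.1f;     // comment out these two lines and, on hardware
            y = y - 0.1f;     // that takes a microcode assist for subnormals,
                              // the loop gets dramatically slower
        }
        std::printf("final y = %g\n", (double)y);
        return 0;
    }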
Interesting discussion. Definitely something for Lua programmers to keep in mind, since Lua's default number type is a C double. I did a quick test on my ancient MacBook, and the translation of the C++ code into Lua showed a slowdown for the denormal case, although not as dramatic as in the original problem (note: I changed the iteration count from 9,000,000 to 900,000 because the original count took way too long).
Yes, floating point is scary. I always try to stick with integers/fixed point if at all possible, especially if there is a loop with addition; that can eat 1 bit of precision per iteration if you're unlucky.
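To illustrate the kind of drift I mean (made-up example, standard C++ only):

    // Summing a value with no exact binary representation drifts;
    // the equivalent integer arithmetic stays exact.
    #include <cstdio>

    int main() {
        float sum_f = 0.0f;
        long  cents = 0;
        for (int i = 0; i < 1000000; ++i) {
            sum_f += 0.01f;   // 0.01 is not exactly representable in binary
            cents += 1;       // count whole cents instead
        }
        std::printf("float sum: %f, integer cents as dollars: %.2f\n",
                    (double)sum_f, cents / 100.0);
        return 0;
    }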
I'm curious why no one has a good answer for the comments questioning why the compiler doesn't optimize out the + 0/- 0 instead of converting it to a floating-point addition, which is what triggers this issue.
The conversion is optimized out by gcc-4.6 at all positive optimization levels; the addition is optimized out at -O2 and above. But the slowness is caused by the two algorithms being different: one produces denormals while the other stays in a reasonable range.
Now that I look at the assembly, not even + 0.0f gets optimized out. If I had to guess, it could be that + 0.0f would have side-effects if y[i] happened to be a signalling NaN or something.
If x = -0.0 (yes, zero has a sign), then adding 0.0 gives +0.0, so the addition can have an observable effect; optimizing it away is wrong and generally only allowed with -ffast-math.
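A tiny check of that corner case:

    // Under round-to-nearest, (-0.0f) + 0.0f is +0.0f, so removing the
    // addition would change an observable result (the sign bit).
    #include <cmath>
    #include <cstdio>

    int main() {
        float x = -0.0f;
        float y = x + 0.0f;
        std::printf("x = %g, x + 0.0f = %g\n", (double)x, (double)y);        // -0 vs 0
        std::printf("signbit: %d -> %d\n", std::signbit(x), std::signbit(y)); // 1 -> 0
        return 0;
    }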
If you're seriously interested, read Kahan's "Branch Cuts for Complex Elementary Functions (or: Much Ado About Nothing's Sign Bit)".
TLDR: there are classes of problems for which the sign bit of zero preserves enough information to get an accurate solution that would not be possible with only an unsigned zero. These happen to turn up in certain conformal mappings that are useful for solving certain PDEs.
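A small taste of the idea (my example, not from the paper): the sign of zero selects which side of a branch cut you land on, e.g. for atan2 and hence for the complex log:

    #include <cmath>
    #include <cstdio>

    int main() {
        // C99 (Annex F) specifies atan2(+/-0, x) = +/-pi for x < 0.
        std::printf("atan2(+0.0, -1.0) = % f\n", std::atan2(+0.0, -1.0));  // +pi
        std::printf("atan2(-0.0, -1.0) = % f\n", std::atan2(-0.0, -1.0));  // -pi
        return 0;
    }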