Yes, FLT_EVAL_METHOD 2 is an extension of "keep values on the stack as much as possible" that also requires the compiler to spill intermediate results as long doubles. Apparently it doesn't cover constant folding either, but Joseph Myers implemented it in GCC, and he's extremely thorough, so I have no doubt that's allowed by the standard.
Yeah, once you are restricted to performing computation (at runtime) at a precision that does not match the specified storage precision, the compiler is kind of screwed :-/
That's the best they can do.
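To make the mismatch between evaluation precision and storage precision concrete, here is a minimal sketch in C. The helper names `eval_method` and `sum_keeps_extra_bits` are made up for illustration, and the comments assume IEEE 754 binary64 `double` with round-to-nearest-even; it is a probe, not a statement of what any particular compiler guarantees.

```c
#include <float.h>   /* FLT_EVAL_METHOD, DBL_EPSILON */

/* Evaluation method the compiler reports for this translation unit:
   0 = each operation in its own type, 1 = float and double in double,
   2 = everything in long double, negative = indeterminable. */
static int eval_method(void)
{
    return (int)FLT_EVAL_METHOD;
}

/* Hypothetical probe: returns 1 if a double expression was visibly
   evaluated with more precision than double, 0 otherwise. */
static int sum_keeps_extra_bits(void)
{
    /* volatile forces the arithmetic to happen at run time, so the
       answer does not depend on how the compiler folds constants. */
    volatile double one = 1.0;
    volatile double half_ulp = DBL_EPSILON / 2;   /* 2^-53 for IEEE double */

    /* In pure double arithmetic, 1 + 2^-53 is a tie that rounds to 1.0,
       so the difference is exactly 0. If the intermediate sum is held
       in a wider format (e.g. x87 80-bit), the 2^-53 survives. */
    return (one + half_ulp) - one != 0.0;
}
```

Calling both from `main` and printing the results shows which regime a given build is in. When `eval_method()` is 0 the probe should return 0. On a target that really evaluates in an 80-bit long double (with GCC, something like `-m32 -mfpmath=387` is one way to get there) it would be expected to return 1, since the intermediate sum never gets rounded to double.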