> What are your thoughts on whole-program optimizing compilers on this?

They're awesome, but they're not powerful enough to give you "cost-free use" as well as JITs do. Even whole-program optimizing compilers can't be as aggressive as JITs in the general case because, like all AOT compilers, they are limited to what they can prove, and there are many things a compiler simply cannot prove (at least not without extensive help from the programmer).

Now, don't get me wrong: AOT compilers are often good enough (for languages designed specifically for them), but the automatic "cost-free use" that JITs give you is also often good enough compared to cost-free abstractions. We're talking about a general philosophy taken to certain extremes. I would say that whole-program optimization of a language like ML is a middle ground between those two extremes.

> and makes JIT seem like a joke.

There's nothing stopping a JIT from doing whole-program optimization. It's just that JITs can do better with less work: they can perform the same optimizations using only local reasoning, because they're not required to prove them correct. True, they need to add a guard for each of those speculative optimizations, but in exchange they can do so much more: speculate on, and optimize, cases that even a whole-program AOT compiler can't prove. Of course, as I said in my original comment, JITs do have other costs.
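
To make the "guard" idea concrete, here is a rough hand-written sketch of what a speculative optimization amounts to. It is not the output of any real JIT, and every name in it (`Circle`, `Square`, `total_area_generic`, `total_area_speculative`) is made up for illustration. It assumes the profile so far has only ever seen `Circle` at the call site:

```python
import math


class Circle:
    def __init__(self, r):
        self.r = r

    def area(self):
        return math.pi * self.r * self.r


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


def total_area_generic(shapes):
    # What a compiler has to emit when it cannot prove the receiver type:
    # a dynamic dispatch on every call.
    total = 0.0
    for s in shapes:
        total += s.area()
    return total


def total_area_speculative(shapes):
    # Stand-in for JIT-style code, ASSUMING profiling saw only Circle here.
    # Nothing is proven; the guard just checks the assumption at run time.
    total = 0.0
    for s in shapes:
        if type(s) is Circle:                # guard: cheap type check
            total += math.pi * s.r * s.r     # inlined body of Circle.area
        else:
            # Guard failed: fall back to the slow, always-correct path.
            # (A real JIT would typically deoptimize and recompile.)
            total += s.area()
    return total
```

Both functions return the same result for any input; the speculative one just bets on the common case and pays a type check for the privilege. An AOT compiler could only drop the dispatch if it proved that nothing but `Circle` ever reaches that loop.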