> Efficient integer overflow should really be implemented in the processor.
I disagree. Doing this in HW adds a lot of complexity, verification effort, cost, and power, which is especially hard to justify when almost nobody runs code that exercises it!
The real performance hit comes instead from the compiler being forced to serialize otherwise independent instruction streams. So why spend all that HW effort when you can pay the whole cost in SW, when and only when you want to use it? Users probably won't even notice the difference between SW and HW overflow support on real workloads.
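To make "pay it in SW only when you want it" concrete, here is a rough C sketch. The function names (`checked_add`, `checked_sum`, `wrapping_sum`) are made up for illustration, and the comments about what the compiler can reorder are my reading of the argument above, not measurements.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: portable checked signed add.
 * Returns false (and leaves *out untouched) if a + b would overflow. */
static bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Unchecked path: unsigned arithmetic wraps by definition, so there is
 * no per-element test and the compiler is free to reorder or vectorize. */
static unsigned wrapping_sum(const unsigned *v, size_t n)
{
    unsigned s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Checked path: every add is followed by a test on its own result, and
 * the loop must stop at the first failure. That ordering constraint is
 * the serialization cost, and it is only paid by code that opts in. */
static bool checked_sum(const int *v, size_t n, int *out)
{
    int s = 0;
    for (size_t i = 0; i < n; i++) {
        if (!checked_add(s, v[i], &s))
            return false;
    }
    *out = s;
    return true;
}
```

Code that doesn't care about overflow calls something like `wrapping_sum` and pays nothing; code that does care calls `checked_sum` and eats the extra compare-and-branch per element. GCC and Clang also offer `__builtin_add_overflow` for the same job if you don't need strict portability.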