This analysis is correct but the solution is not feasible. Changing a modification through `x` to a modification through `y` does indeed violate the semantics of `restrict`. The problem is that in order to detect this situation, we'd have to track the provenance of integers. In this specific example, we'd have to know that replacing `xaddr` with `y2addr` affects `x` and `y`. There is general consensus that tracking provenance for integers causes many more problems than it solves, so although this would solve the problem it is not feasible. This is why weak and strict provenance are being pursued instead.
With regard to optimization passes, does it matter whether the analysis is infeasible? We agree the optimization/transformation from the first version to the second isn't valid, so I think it shouldn't have been done.
The original version with two stores and one load doesn't seem to have a problem. Having the optimizer punt when it gets confused by integer to pointer casts seems acceptable.
It's not sufficient to say "there's an int2ptr cast, so stop optimization." The code can be broken up across several function calls, so the pass doing the optimization can't tell that an int2ptr cast is around to shut it down. (Most optimizations are intraprocedural, and even if they could look interprocedurally, there's no guarantee that the function body is available for inspection--the functions could be in different C source files.)
Instead, you'd have to prove that no int2ptr cast is possible anywhere, which is generally an impossible analysis.
> It's not sufficient to say "there's an int2ptr cast, so stop optimization."
Complicated or not, it's necessary that optimizations do not break correct code.
There doesn't seem to be a problem (UB or otherwise) in the first function at the top of the article, but the second one has a clear aliasing problem that violates the promise `restrict` makes. That translation was invalid.
If you can't track provenance to un-restrict the pointers because it's infeasible, then you have to give up on at least one of the optimization passes. In this case, the optimizations used are very fundamental, and giving up on any one of them unilaterally would be catastrophic for performance. The provenance models being suggested add more nuance to the model (pointer provenance) so that we can keep all of the optimization passes while preventing these cases from being optimized incorrectly. Weak provenance says we can't optimize away the pointer-to-integer cast; strict provenance says we must provide provenance for integer-to-pointer casts. Weak provenance is broadly compatible with existing code (the compiler's semantics change), whereas strict provenance is not (the language's semantics change). The tradeoff is that strict provenance leads to better optimization in general.
Catastrophic sounds strong. As far as `restrict` goes, C was never that far behind Fortran in performance.
And if maintaining `restrict` for other passes is really important, maybe the order of the passes should be changed. I'm not pretending compiler optimization is simple, but I can't see any situation where having an incorrect optimization pass run first is the right thing to do. The broken pass needs to be fixed, and it shouldn't have emitted code with incorrect `restrict` on it.