That too, but the eternal riddle of optimizer passes is which ones reveal structure and which obscure it. Do I unroll loops first, or strength-reduce first? If there are heuristics capping the maximum complexity for unrolling or inlining, then the answer might be “both”.
And then there’s tuning for a processor family versus tuning for this exact model.
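
To make the unroll-versus-strength-reduce question concrete, here is a toy sketch in Python. Everything in it is hypothetical: the cost table, the `UNROLL_BUDGET` threshold, and the two "passes" are made up for illustration and do not model any real compiler. It only shows how a complexity cap can make the result depend on pass order, and why running a pass twice can be the way out.

```python
# Toy model: a loop is (trip_count, list of op names in the body).
# Costs and the budget are invented numbers, not taken from any real compiler.
COST = {"mul": 3, "add": 1, "load": 1}
UNROLL_BUDGET = 12  # hypothetical cap on the total cost of an unrolled loop


def body_cost(body):
    return sum(COST[op] for op in body)


def strength_reduce(loop):
    """Pretend every multiply by the induction variable becomes an add."""
    trip, body = loop
    return (trip, ["add" if op == "mul" else op for op in body])


def unroll(loop):
    """Fully unroll, but only if the unrolled code stays under the budget."""
    trip, body = loop
    if trip > 1 and trip * body_cost(body) <= UNROLL_BUDGET:
        return (1, body * trip)
    return loop  # too expensive: leave the loop alone


def run(passes, loop):
    for p in passes:
        loop = p(loop)
    return loop


loop = (4, ["mul", "load", "add"])  # cost 5 per iteration, 20 if unrolled

unroll_first = run([unroll, strength_reduce], loop)
reduce_first = run([strength_reduce, unroll], loop)
both = run([unroll, strength_reduce, unroll], loop)

print("unroll first:", unroll_first)  # unroll is rejected, loop stays rolled
print("reduce first:", reduce_first)  # cheaper body now fits the budget
print("both:        ", both)          # second unroll attempt succeeds
```

In this toy the multiply makes the loop too expensive to unroll, so unrolling first is a no-op; strength-reducing first brings it under the cap. Scheduling unroll on both sides of the reduction gets the same result without having to guess the order, at the price of a longer pipeline.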