You can autogenerate all reasonable variants of the code if necessary; there aren't that many architectures these days. SIMD instructions are usually confined to small, local parts of the code, so this shouldn't blow up the binary.
The point is to not have to write repetitive source code many times.
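As a rough sketch of what that could look like in C++: one templated kernel written once, instantiated for a few lane counts, and selected at runtime. All names here (`add_arrays`, `select_add`, the `vector_bits` parameter) are made up for illustration, and the width-to-variant mapping is an assumption. To get genuinely different instructions per variant, you would still compile each one with target-specific flags or use intrinsics; this only shows how the source avoids repetition.

```cpp
#include <cstddef>

// Hypothetical kernel: written once, parameterized by lane count.
// The fixed-size inner loop is a shape compilers can usually vectorize.
template <std::size_t Lanes>
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    // Process full blocks of `Lanes` elements.
    for (; i + Lanes <= n; i += Lanes) {
        for (std::size_t lane = 0; lane < Lanes; ++lane) {
            out[i + lane] = a[i + lane] + b[i + lane];
        }
    }
    // Scalar tail for the leftover elements.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

using AddFn = void (*)(const float*, const float*, float*, std::size_t);

// Hypothetical dispatcher: maps a vector width in bits to a variant.
// The caller is assumed to have detected the width some other way.
inline AddFn select_add(unsigned vector_bits) {
    switch (vector_bits) {
        case 512: return &add_arrays<16>;  // 16 floats per 512-bit vector
        case 256: return &add_arrays<8>;   // 8 floats per 256-bit vector
        case 128: return &add_arrays<4>;   // 4 floats per 128-bit vector
        default:  return &add_arrays<1>;   // scalar fallback
    }
}
```

Each extra variant costs one line in the dispatcher and one small instantiated function in the binary, which is why the size overhead tends to stay modest.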