I myself implemented one in the SSE4/AltiVec days (later extended to AVX, AVX-512 and NEON). There were only a few options then, but now everyone seems to be doing it.