I think the AVX512 benefits come mainly from the gather/scatter instructions.

What is interesting is that, in their current implementation, these instructions aren't very beneficial [1], [2].

I vaguely remember that the first implementations of the gather/scatter instructions were no faster than a sequence of individual accesses to the different memory locations.
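As a rough sketch of the two access patterns being compared: the first function below does one ordinary load per index, and the second does the same work with a hardware gather. The function names `gather_scalar` and `gather_simd` are made up for illustration, and the sketch assumes the 8-wide AVX2 gather intrinsic rather than an AVX512 one, only because AVX2 is more widely available. If the compiler is not targeting AVX2, the second function simply falls back to the scalar loop.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef __AVX2__
#include <immintrin.h>
#endif

/* Scalar baseline: one independent load per index. */
static void gather_scalar(const float *base, const int32_t *idx,
                          float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = base[idx[i]];
}

/* Same result, using a hardware gather for blocks of 8 indices
 * when AVX2 is enabled at compile time (assumption: e.g. -mavx2).
 * Leftover elements, or all elements without AVX2, use scalar loads. */
static void gather_simd(const float *base, const int32_t *idx,
                        float *out, size_t n)
{
    size_t i = 0;
#ifdef __AVX2__
    for (; i + 8 <= n; i += 8) {
        /* Load 8 indices, then fetch base[idx[i..i+7]] in one instruction. */
        __m256i vi = _mm256_loadu_si256((const __m256i *)(idx + i));
        __m256 v = _mm256_i32gather_ps(base, vi, 4); /* scale = sizeof(float) */
        _mm256_storeu_ps(out + i, v);
    }
#endif
    for (; i < n; ++i)
        out[i] = base[idx[i]];
}
```

Both functions produce identical output; whether the gather version is actually faster depends on the microarchitecture, which is the point in question here.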

So it may come in handy that AMD has a much higher core count, because each thread will have less memory to access.