
this isn't going to be any better than just loading from the two buffers into registers and then storing them back the other way around, like:

  a = load(ap)
  b = load(bp)
  store(ap, b)
  store(bp, a)
  ap += step; bp += step;
any instructions to "do the swap" are a waste because loads and stores are generally separate instructions in SIMD instruction sets (and even if that weren't the case, that's how they would get executed anyway)
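
for concreteness, here is a minimal sketch of that loop in C with SSE2 intrinsics. the function name `swap_buffers` is made up for illustration, and it assumes the two buffers don't overlap; unaligned loads/stores are used so there is no alignment requirement, and a scalar loop handles the tail (or everything, if SSE2 isn't available):

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/* Hypothetical helper: swap n bytes between two non-overlapping buffers. */
static void swap_buffers(uint8_t *ap, uint8_t *bp, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* 16 bytes per iteration: two loads, then two stores the other way around. */
    for (; i + 16 <= n; i += 16) {
        __m128i a = _mm_loadu_si128((const __m128i *)(ap + i));
        __m128i b = _mm_loadu_si128((const __m128i *)(bp + i));
        _mm_storeu_si128((__m128i *)(ap + i), b);
        _mm_storeu_si128((__m128i *)(bp + i), a);
    }
#endif
    /* Scalar tail (and the whole buffer when SSE2 is not available). */
    for (; i < n; i++) {
        uint8_t t = ap[i];
        ap[i] = bp[i];
        bp[i] = t;
    }
}
```

there is no swap instruction anywhere in there, just the two loads and two stores per 16-byte chunk.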

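if you want to avoid polluting the cache, SSE has non-temporal instructions that bypass it (streaming stores such as MOVNTDQ, plus the SSE4.1 streaming load MOVNTDQA, though that one only bypasses the cache on write-combining memory), which might be worthwhile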
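
a rough sketch of the cache-bypassing variant, under some assumptions: it uses non-temporal *stores* (`_mm_stream_si128`) rather than non-temporal loads, since those are the ones that work on ordinary memory; the name `swap_buffers_nt` is made up; and the streaming path is only taken when both pointers are 16-byte aligned, which `_mm_stream_si128` requires. whether it actually helps depends on the workload, so measure before relying on it:

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics, including _mm_stream_si128 */
#endif

/* Hypothetical helper: swap n bytes using non-temporal stores when possible. */
static void swap_buffers_nt(uint8_t *ap, uint8_t *bp, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Streaming stores need 16-byte-aligned addresses; otherwise skip this path. */
    if ((((uintptr_t)ap | (uintptr_t)bp) & 15u) == 0) {
        for (; i + 16 <= n; i += 16) {
            __m128i a = _mm_load_si128((const __m128i *)(ap + i));
            __m128i b = _mm_load_si128((const __m128i *)(bp + i));
            _mm_stream_si128((__m128i *)(ap + i), b); /* non-temporal store */
            _mm_stream_si128((__m128i *)(bp + i), a);
        }
        _mm_sfence(); /* order the streaming stores before later stores */
    }
#endif
    /* Scalar fallback for the tail, unaligned buffers, or non-SSE2 targets. */
    for (; i < n; i++) {
        uint8_t t = ap[i];
        ap[i] = bp[i];
        bp[i] = t;
    }
}
```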

edit: this might be useful in a SIMD context where you need to swap two registers, where the cost of using another register is higher than the cost of the 3 arithmetic instructions. i could totally imagine that happening, but it's nothing to do with caches or memory



