> On a modern x86 cpu the ‘xchg’ instruction performs a swap and can do so entirely in the front-end via register renaming. It doesn’t even require a micro-op.
This is only true for AMD CPUs; on Intel, xchg is 3 uops. Still better than the xor trick, though.
Source: https://www.uops.info/html-instr/XCHG_R64_R64.html
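For anyone unfamiliar with the xor trick, here is a minimal C sketch of it next to a plain temporary-variable swap. The function names `swap_xor` and `swap_tmp` are made up for illustration, and a compiler is free to lower either one however it likes, so this shows the dependency structure of the trick rather than the instructions that actually get emitted.

```c
#include <assert.h>
#include <stdint.h>

/* XOR swap: three XORs, each depending on the result of the previous one. */
static void swap_xor(uint64_t *a, uint64_t *b) {
    assert(a != 0 && b != 0);
    if (a == b) return;   /* aliasing guard: x ^ x would zero the value */
    *a ^= *b;             /* a = a0 ^ b0 */
    *b ^= *a;             /* b = b0 ^ (a0 ^ b0) = a0 */
    *a ^= *b;             /* a = (a0 ^ b0) ^ a0 = b0 */
}

/* Plain swap through a temporary: the two loads are independent of each other. */
static void swap_tmp(uint64_t *a, uint64_t *b) {
    assert(a != 0 && b != 0);
    uint64_t t = *a;
    *a = *b;
    *b = t;
}
```

The serial chain in `swap_xor` is the reason it tends to compare poorly: each step has to wait for the one before it, whereas a register swap has no such chain.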