For those who are still wondering the actual reason for the extra instruction af...

ndesaulniers · on May 12, 2019

Sorry, I must've misread the godbolt output; I had too many window splits open. Will update the article. Thanks for pointing it out. :)

klodolph · on May 13, 2019

The reason is that there’s no space between the instruction address and the first byte of the instruction, but there are spaces between later bytes.
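
A minimal sketch of how that misreading can happen. The line layout and the address `4` are assumptions modeled on the description above (address fused to the first byte, later bytes separated by spaces), not actual godbolt output, and `count_bytes` is a hypothetical helper. The bytes `31 c0` are the usual two-byte encoding of `xor eax, eax`.

```python
def count_bytes(dump_line: str) -> int:
    """Count instruction bytes in a line shaped like '4:31 c0'.

    Assumed layout (hypothetical): '<address>:<byte> <byte> ...',
    with no space between the colon and the first byte.
    """
    _address, _, byte_text = dump_line.partition(":")
    return len(byte_text.split())


# Hypothetical dump line for `xor eax, eax` (encoding 31 c0).
line = "4:31 c0"

# Naive reading: split on whitespace and treat the first token as the
# address. "4:31" swallows the first byte, so only "c0" gets counted.
naive = len(line.split()) - 1

# Splitting at the colon first recovers both bytes.
actual = count_bytes(line)

print(naive, actual)
```

Reading the first whitespace-separated token as the address gives one byte; splitting at the colon gives two.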

saagarjha · on May 12, 2019

> a xor r32, r32 is 2 bytes, not 1

Does the article imply otherwise?

biesnecker · on May 12, 2019

Yes.

> So Clang can potentially save you a single instruction (xorl %eax, %eax) whose encoding is only 1B, per function call to functions declared in the style f(), but only IF the definition is in the same translation unit and doesn’t differ from the declaration, and you happen to be targeting x86_64.

saagarjha · on May 12, 2019

I couldn't seem to find that bit, thanks for pointing it out to me.

ndesaulniers · on May 12, 2019

I just updated the article, too, sorry if that caused confusion.

cblum · on May 12, 2019