
For those who are still wondering about the actual reason for the extra instruction after reading all that, it has to do with the calling convention: when calling a variadic function in SysV AMD64, AL holds an upper bound on the number of vector registers used for parameters. I believe the Microsoft x64 one doesn't do that.
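To make the AL rule concrete, here is a minimal sketch in C. The names `sum_doubles`, `call_with_two_doubles`, and `call_with_no_doubles` are hypothetical, made up for illustration; the comments describe what compilers targeting SysV AMD64 typically emit, which you can check by compiling with `-S` or pasting it into godbolt.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic function: sums `count` arguments of type double. */
double sum_doubles(int count, ...)
{
    assert(count >= 0);

    va_list args;
    va_start(args, count);

    double total = 0.0;
    for (int i = 0; i < count; i++)
        total += va_arg(args, double);

    va_end(args);
    return total;
}

/* Two doubles are passed in XMM0 and XMM1, so on SysV AMD64 the caller
 * is expected to put a value of at least 2 in AL before the call
 * (compilers typically emit something like `movl $2, %eax`). */
double call_with_two_doubles(void)
{
    return sum_doubles(2, 1.5, 2.5);
}

/* No vector registers are used for arguments here, so the caller zeroes
 * AL instead, typically with `xorl %eax, %eax`. That is the extra
 * instruction being discussed. */
double call_with_no_doubles(void)
{
    return sum_doubles(0);
}
```

The callee's prologue can use AL to decide whether it needs to spill the XMM argument registers for `va_arg` at all, which is presumably why the convention bothers with it.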

Also, a xor r32, r32 is 2 bytes, not 1.
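A rough sketch of where the 2 bytes come from, assuming the common `31 /r` opcode form and only the eight legacy registers (EAX through EDI, numbered 0 to 7). The helper `encode_xor_self` is hypothetical, not taken from any assembler:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the encoding of `xor reg, reg` for a
 * 32-bit legacy register (0 = EAX ... 7 = EDI) into `out`.
 * Returns the number of bytes written, or 0 if `reg` is out of range. */
size_t encode_xor_self(unsigned reg, uint8_t out[2])
{
    if (reg > 7)
        return 0; /* R8D..R15D need a REX prefix, which is not handled here */

    out[0] = 0x31;                               /* opcode: XOR r/m32, r32 */
    out[1] = (uint8_t)(0xC0 | (reg << 3) | reg); /* ModRM: mod=11, reg, rm */
    return 2;
}
```

For EAX this gives the bytes 31 C0: one opcode byte plus one ModRM byte, hence 2 and not 1.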




Yep: https://godbolt.org/z/BMjD0Y

Sorry, I must've misread the godbolt output because of too many window splits. Will update the article. Thanks for pointing it out. :)


The reason is that there’s no space between the instruction address and the first byte of the instruction, but there are spaces between later bytes.


> a xor r32, r32 is 2 bytes, not 1

Does the article imply otherwise?


Yes.

> So Clang can potentially save you a single instruction (xorl %eax, %eax) whose encoding is only 1B, per function call to functions declared in the style f(), but only IF the definition is in the same translation unit and doesn’t differ from the declaration, and you happen to be targeting x86_64.


I couldn't seem to find that bit, thanks for pointing it out to me.


I just updated the article, too, sorry if that caused confusion.


Yes.



