I was able to shave off one additional byte with this: ... xor rax, rax ; = 0 in...

bd01 · 2025-01-03T18:13:40 1735928020

That second snippet is pretty funny:

  push 1
  pop rax
  pop rdi

You can't push a value once and pop it twice, that's not how a stack works! You're popping something else off the stack. So why does this even work?

Linux passes your program arguments on the stack, with argc on top. So when you don't pass any arguments, argc just HAPPENS to be 1. Which you then pop into rdi. Gross!
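A minimal sketch to see that stack layout directly, assuming NASM syntax on x86-64 Linux (the file name `argc.asm` is made up for illustration). It pops whatever is on top of the stack at process entry and returns it as the exit status:

```nasm
; argc.asm (hypothetical file name): exit with argc as the status code
; build: nasm -f elf64 argc.asm && ld -o argc argc.o
global _start

section .text
_start:
    pop rdi         ; top of the stack at process entry is argc
    push 60         ; 60 = sys_exit on x86-64 Linux
    pop rax
    syscall         ; exit(argc)
```

Running `./argc; echo $?` should print 1 (only the program name is counted), while `./argc a b; echo $?` should print 3. So the `pop rdi` trick only yields 1 when the program is started without arguments.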

michidk · 2025-01-04T13:03:43 1735995823

Of course, you are completely right. That was an oversight from trying to correct my mistake as quickly as possible.

With that fixed, is there any reason not to use push here?

bd01 · 2025-01-05T00:42:39 1736037759

Yes, because:

  push 1       ; 6A 01 (2 bytes)
  pop rdi      ; 5F    (1 byte)

is longer than a simple:

  mov edi, eax ; 89 C7 (2 bytes)

michidk · 2025-01-05T11:10:17 1736075417

I think your statement might only apply to 32 bit (one of the constraints mentioned early in the blog post was 64 bit).

But even if it were 32 bit, we wouldn't have to copy a 1, since the syscall number for sys_write would be 4 instead of 1.

I get the same total size with both variants in 64 bit mode.

  push 1
  pop rax
  mov rdi, rax

where the last instruction assembles to 48 89 C7 (3 bytes)

seems to be same in size as

  push 1
  pop rax
  push 1
  pop rdi

where the last two instructions assemble to 6A 01 5F (3 bytes)

bd01 · 2025-01-05T13:46:16 1736084776

That's because you're using `mov rdi, rax` again. You keep changing `edi, eax` to `rdi, rax`. Why?

The default operand size in 64-bit mode is, for most instructions, still 32 bits. So `mov edi, eax` encodes the same in 32- and 64-bit mode.

For `mov rdi, rax` you need an extra REX prefix byte [1], that's the 48 you're seeing above, but you don't need it here.

[1] https://wiki.osdev.org/X86-64_Instruction_Encoding#REX_prefi...
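For comparison, here are the two encodings side by side, as a small sketch in NASM syntax (the file name `enc.asm` is only an example). The shorter form is safe here because, in 64-bit mode, writing a 32-bit register zeroes the upper 32 bits of the full 64-bit register:

```nasm
; enc.asm (hypothetical file name): compare the two mov encodings
; build: nasm -f bin enc.asm -o enc.bin
; the bytes can then be inspected with a hex dump or: ndisasm -b 64 enc.bin
bits 64

    mov edi, eax    ; 89 C7     (2 bytes), upper half of rdi is zeroed
    mov rdi, rax    ; 48 89 C7  (3 bytes), 48 is the REX.W prefix
```

Since rax holds 1 at this point, both instructions leave rdi = 1; the REX.W prefix only costs a byte.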

michidk · 2025-01-05T13:56:55 1736085415

okay, I didn't know that, thanks for the background. I wonder why the assembler would not optimize this though.

I noticed that I could then also shave off one more byte by using `lea esi, [rel msg]` instead of `lea rsi, [rel msg]`.
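The same REX prefix accounts for the byte saved on `lea`. A sketch of both encodings in NASM syntax, with a placeholder `msg` label standing in for the string from the blog post:

```nasm
; compare the two lea encodings in 64-bit mode
bits 64

    lea rsi, [rel msg]  ; 48 8D 35 <disp32>  (7 bytes)
    lea esi, [rel msg]  ; 8D 35 <disp32>     (6 bytes), address truncated to 32 bits

msg: db "hi", 10        ; placeholder data, not the original message
```

The 32-bit form only produces a usable pointer if `msg` ends up below 4 GiB. That is an assumption about how the binary is linked and loaded: it typically holds for a small, statically linked, non-PIE executable, but not in general.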

michidk · 2025-01-04T13:10:37 1735996237

should be

  ...
  push 1 ; syscall: sys_write
  pop rax
  push 1
  pop rdi

of course