The smallest Hello World program

bd01 · 2025-01-02T20:32:48 1735849968

This is pretty bad. Let's start with the very first instruction:

  mov rax, 1

An actual "mov rax, 1" would assemble to 48 B8 01 00 00 00 00 00 00 00, a whopping TEN bytes.

nasm will optimize this to the equivalent "mov eax, 1" (B8 01 00 00 00), that's 5 bytes, but still:

  xor eax, eax ; 2 bytes
  inc eax      ; 2 bytes

would be much smaller. Second line:

  mov rdi, 1

You already have the value 1 in eax, so a "mov edi, eax" (two bytes) would suffice. Etc. etc.
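
For anyone who wants to check the arithmetic, here is a rough Python tally of the encodings mentioned above. The byte values are the usual 64-bit mode encodings as far as I know; the names `ENCODINGS` and `sizes` are made up for this sketch.

```python
# Byte encodings (64-bit mode) for the instruction variants discussed above.
# The imm64 form of "mov rax, 1" is the one quoted in the comment.
ENCODINGS = {
    "mov rax, 1 (imm64 form)": bytes.fromhex("48b80100000000000000"),
    "mov eax, 1": bytes.fromhex("b801000000"),
    "xor eax, eax / inc eax": bytes.fromhex("31c0" "ffc0"),
    "mov edi, eax": bytes.fromhex("89c7"),
}


def sizes(table):
    """Return {description: length in bytes} for each encoding."""
    return {name: len(code) for name, code in table.items()}


if __name__ == "__main__":
    for name, size in sizes(ENCODINGS).items():
        print(f"{size:2d} bytes  {name}")
```

This only counts bytes. It does not assemble anything, so it is worth confirming against a real nasm listing.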

xpasky · 2025-01-02T20:43:55 1735850635

  push 1
  pop rax

is even shorter (credit: https://old.reddit.com/r/programming/comments/q6mnz1/what_is...)

musicale · 2025-01-03T03:38:00 1735875480

I feel like I shouldn't love x86 encoding, but there is something charming about this. Probably echoing its 8-bit predecessors. It seems like it's designed for tiny memory environments (embedded, bootstrapping, etc.) where you don't mind taking a hit for memory access.

rep_lodsb · 2025-01-02T21:17:40 1735852660

Linux initializes all general purpose registers to zero. It's not documented AFAIK, but should be reliable - it has to init them to some value anyway to avoid leaking kernel state. So you can get away with:

    mov     al,1       ;write
    mov     edi,eax    ;handle=stdout
    mov     esi,msg    ;assumes load address below 4G
    mov     dl,msg.len
    syscall
    mov     al,60      ;assuming syscall succeeded, EAX was bytes written
    xor     edi,edi
    syscall

The load address stays constant unless there's some magic GNU extension header to enable ASLR. If we could get the code loaded below 64K, we could save another byte by using SI instead of ESI; however this doesn't work by default, you'd have to run 'echo 0 > /proc/sys/vm/mmap_min_addr' as root first.
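
To make the zero-register assumption concrete, here is a small Python model of partial register writes. It is only a sketch: `write_al` and `write_eax` are hypothetical helpers, not anything from a real library. The point is that `mov al, 1` only produces syscall number 1 if the rest of rax was already zero.

```python
MASK64 = (1 << 64) - 1


def write_al(rax, value):
    """Model `mov al, value`: only the low 8 bits of rax change."""
    return (rax & ~0xFF & MASK64) | (value & 0xFF)


def write_eax(rax, value):
    """Model `mov eax, value`: a 32-bit write zero-extends into all of rax."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(write_al(0, 1)))              # zeroed register: syscall number is 1
    print(hex(write_al(14, 60)))            # after write() returned 14: exit is 60
    print(hex(write_al(0xDEADBEEF00, 1)))   # stale upper bits survive the write
    print(hex(write_eax(0xDEADBEEF00, 1)))  # a 32-bit write clears them
```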

bd01 · 2025-01-02T21:48:34 1735854514

Initial register state is documented to be undefined except for rbp, rsp and rdx [1].

Can you say for certain that no other Linux version ever used GPRs to pass something else?

[1] System V ABI, page 29 (last line) and 30, https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf

rep_lodsb · 2025-01-02T22:01:20 1735855280

For certain? No, but I wouldn't expect it. Not sure what that function pointer in rdx is intended for, but Linux doesn't use it.

(Note for pedants: rsp is technically a "general purpose register", but of course it is initialized to point to the userspace stack instead of zero.)

retrac · 2025-01-02T21:52:14 1735854734

Assuming it is initially zero

   inc eax

is a byte shorter than mov al, 1

rep_lodsb · 2025-01-02T21:56:15 1735854975

Yes, but only in 32 bit mode. Not that it matters, except for the hypothetical future processor or Linux kernel that is no longer compatible with that :)
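
A quick Python sketch of why the one-byte form goes away: in 64-bit mode the bytes 0x40..0x4F were reassigned to REX prefixes, so `inc eax` needs the two-byte FF C0 form. The helper name `meaning_of` is invented for illustration, and only 0x40..0x47 are modelled.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def meaning_of(opcode, mode_bits):
    """Say what a single byte in 0x40..0x47 means in 32-bit or 64-bit mode."""
    if not 0x40 <= opcode <= 0x47:
        raise ValueError("only 0x40..0x47 are modelled here")
    if mode_bits == 32:
        return f"inc {REG32[opcode - 0x40]}"
    if mode_bits == 64:
        return "REX prefix (not an instruction by itself)"
    raise ValueError("mode_bits must be 32 or 64")


# Encodings of the two candidates in 64-bit mode.
INC_EAX_64 = bytes.fromhex("ffc0")  # inc eax
MOV_AL_1 = bytes.fromhex("b001")    # mov al, 1

if __name__ == "__main__":
    print(meaning_of(0x40, 32))
    print(meaning_of(0x40, 64))
    print(len(INC_EAX_64), len(MOV_AL_1))
```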

michidk · 2025-01-03T10:04:35 1735898675

I was able to shave off one additional byte with this:

  ...
  xor rax, rax       ; = 0
  inc rax            ; = 1 - syscall: sys_write
  mov rdi, rax       ; copy 1 - file descriptor: stdout
  lea rsi, [rel msg] ; pointer to message
  mov rdx, 14        ; message length
  syscall
  ...

  $ nasm -f bin -o elf elf.asm; wc -c elf; ./elf
  166 elf
  Hello, World!

So I guess NASM already optimizes this quite well

However, using the stack-based instructions as xpasky hinted at:

  ...
  push 1             ; syscall: sys_write
  pop rax
  pop rdi       ; copy 1 - file descriptor: stdout
  lea rsi, [rel msg] ; pointer to message
  push 14            ; message length
  pop rdx
  syscall
  ...

I get down to 159 bytes! I updated the article to reflect that

bd01 · 2025-01-03T18:13:40 1735928020

That second snippet is pretty funny:

  push 1
  pop rax
  pop rdi

You can't push a value once and pop it twice, that's not how a stack works! You're popping something else off the stack. So why does this even work?

Linux passes your program arguments on the stack, with argc on top. So when you don't pass any arguments, argc just HAPPENS to be 1. Which you then pop into rdi. Gross!
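
Here is a toy Python model of that, assuming the usual initial stack layout (argc on top, then the argv pointers, then a NULL). `run_snippet` is a made-up name and strings stand in for the pointers.

```python
def run_snippet(argv):
    """Model `push 1; pop rax; pop rdi` against a Linux-style initial stack."""
    # The end of the list is the top of the stack. At process entry the top
    # holds argc, followed by the argv entries and a terminating NULL.
    stack = [None] + list(reversed(argv)) + [len(argv)]
    stack.append(1)    # push 1
    rax = stack.pop()  # pop rax -> the 1 we just pushed
    rdi = stack.pop()  # pop rdi -> argc, NOT a second copy of the 1
    return rax, rdi


if __name__ == "__main__":
    print(run_snippet(["./elf"]))         # no extra arguments: argc is 1
    print(run_snippet(["./elf", "foo"]))  # one extra argument: argc is 2
```

So the snippet only passes 1 in rdi when the program is started with no arguments.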

michidk · 2025-01-04T13:03:43 1735995823

Of course - you are completely right, an oversight in wanting to correct my mistake as quickly as possible.

With that fixed, is there any reason not to use push here?

bd01 · 2025-01-05T00:42:39 1736037759

Yes, because:

  push 1       ; 6A 01 (2 bytes)
  pop rdi      ; 5F    (1 byte)

is longer than a simple:

  mov edi, eax ; 89 C7 (2 bytes)

michidk · 2025-01-05T11:10:17 1736075417

I think your statement might only apply to 32 bit (one of the constraints mentioned early in the blog post was 64 bit).

But even if it was 32 bit, then we wouldn't have to copy a 1, since the syscall number for sys_write would be 4 instead of 1.

I get the same total size with both variants in 64 bit mode.

  push 1
  pop rax
  mov rdi, rax

Assembling to 48 89 C7 (3 bytes)

seems to be same in size as

  push 1
  pop rax
  push 1
  pop rdi

Assembling to 6A 01 5F (3 bytes)

bd01 · 2025-01-05T13:46:16 1736084776

That's because you're using `mov rdi, rax` again. You keep changing `edi, eax` to `rdi, rax`. Why?

The default operand size in 64-bit mode is, for most instructions, still 32 bits. So `mov edi, eax` encodes the same in 32- and 64-bit mode.

For `mov rdi, rax` you need an extra REX prefix byte [1], that's the 48 you're seeing above, but you don't need it here.

[1] https://wiki.osdev.org/X86-64_Instruction_Encoding#REX_prefi...
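
For illustration, a short Python sketch that splits a REX byte into its bits (layout 0100WRXB, per the page linked above). `decode_rex` is a hypothetical helper name.

```python
def decode_rex(byte):
    """Split a REX prefix (bit layout 0100WRXB) into its four flag bits."""
    if byte & 0xF0 != 0x40:
        raise ValueError("not a REX prefix")
    return {
        "W": (byte >> 3) & 1,  # 1 = 64-bit operand size
        "R": (byte >> 2) & 1,  # extends the ModRM reg field
        "X": (byte >> 1) & 1,  # extends the SIB index field
        "B": byte & 1,         # extends the ModRM rm / SIB base field
    }


MOV_RDI_RAX = bytes.fromhex("4889c7")  # mov rdi, rax
MOV_EDI_EAX = bytes.fromhex("89c7")    # mov edi, eax

if __name__ == "__main__":
    print(decode_rex(MOV_RDI_RAX[0]))
    print(MOV_RDI_RAX[1:] == MOV_EDI_EAX)
```

The two instructions share the same opcode and ModRM bytes. The only difference is the leading 48, which sets W.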

michidk · 2025-01-05T13:56:55 1736085415

okay, I didn't know that, thanks for the background. I wonder why the assembler would not optimize this though.

I noticed that I could then also shave off one more byte by using lea esi, [rel msg] instead of lea rsi, [rel msg].

michidk · 2025-01-04T13:10:37 1735996237

should be

  ...
  push 1 ; syscall: sys_write
  pop rax
  push 1
  pop rdi

of course

michidk · 2025-01-03T01:55:31 1735869331

Thanks, that makes total sense. I was so focused on the ELF part that I didn't even consider optimizing the initial assembly further. Will fix it and edit the article.

Tepix · 2025-01-02T21:19:46 1735852786

Here's a tiny DOS COM file that does it in 18 bytes:

    ;; 18 bytes
    DB 'HELLO_WOIY<$'  ; executes as machine code, returning SP to original position without overwriting return address
    
    mov  dx, si    ; mov dx,0100h MS-DOS (all versions), FreeDOS 1.0, many other DOSes
    xchg ax, bp    ; mov ah,9     MS-DOS 4.0 and later, and FreeDOS 1.0
    int  21h
    ret

(credits: https://stackoverflow.com/questions/72635031/assembly-hello-...)
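
For the curious, a rough Python decoder for the string part, to see what it executes and that SP ends up where it started. It is a sketch that only handles the 16-bit one-byte inc/dec/push/pop opcodes and `cmp al, imm8`; the function name `decode` is invented.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def decode(data):
    """Decode one-byte inc/dec/push/pop opcodes plus `cmp al, imm8`.

    Returns (listing, net change in SP). Only the opcodes needed for this
    string are handled; anything else raises ValueError.
    """
    listing, sp_delta, i = [], 0, 0
    while i < len(data):
        op = data[i]
        reg = REG16[op & 7]
        if 0x40 <= op <= 0x47:
            listing.append(f"inc {reg}")
            sp_delta += 1 if reg == "sp" else 0
        elif 0x48 <= op <= 0x4F:
            listing.append(f"dec {reg}")
            sp_delta -= 1 if reg == "sp" else 0
        elif 0x50 <= op <= 0x57:
            listing.append(f"push {reg}")
            sp_delta -= 2
        elif 0x58 <= op <= 0x5F:
            listing.append(f"pop {reg}")
            sp_delta += 2
        elif op == 0x3C:
            i += 1  # consume the immediate byte
            listing.append(f"cmp al, {data[i]:#04x}")
        else:
            raise ValueError(f"unhandled opcode {op:#04x}")
        i += 1
    return listing, sp_delta


if __name__ == "__main__":
    listing, sp_delta = decode(b"HELLO_WOIY<$")
    print("\n".join(listing))
    print("net SP change:", sp_delta)
```

The two `dec sp` are undone by `pop di`, and the later `push di` / `pop cx` pair cancels out. The push writes below the original SP, so the return address stays intact.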

rep_lodsb · 2025-01-02T21:53:38 1735854818

Well, it prints something that is the same length as the correct message at least.

musicale · 2025-01-03T03:43:02 1735875782

COM files for CP/M and DOS really are a no-nonsense executable format.

I'm a bit disappointed that Linux (or BSD, macOS, etc.) doesn't support them (or similar) out of the box, though Windows will sort of run them via ntvdm.

smokel · 2025-01-02T20:35:56 1735850156

My favorite language for implementing short Hello World programs in is HQ9+ [1].

Joking aside, this page [2] used to be a great tutorial on writing small ELF binaries, but I'm not sure whether it will still work in 64-bit land. It proved very helpful for writing a 4K intro back in 1999.

[1] https://esolangs.org/wiki/HQ9%2B

[2] https://www.muppetlabs.com/~breadbox/software/tiny/teensy.ht...

mrfinn · 2025-01-02T20:22:37 1735849357

These challenges are funny - they remind me of the old days. Back in the DOS/Windows days, we used to have the .com format, which was perfect for tiny programs. One could even write a program of less than 10 bytes that could actually do something!

We've come a long way since then, and it's like, at some point, nobody cared about optimizing executable size anymore.

secondcoming · 2025-01-02T20:45:59 1735850759

People in the embedded space care. Symbian OS was compiled for small size, only certain parts were allowed to use O3, such as the jpeg decoder.

theandrewbailey · 2025-01-02T21:26:47 1735853207

Some people care about executable size, (mostly) everyone else ships Electron apps.

ramon156 · 2025-01-03T00:39:31 1735864771

Tell it to my boss, he wants his app last week.

hinkley · 2025-01-02T21:21:38 1735852898

I learned to write COM programs at some point but quickly unlearned it. There were some spots where you can use them and not .bat files, but outside of that it’s a lot.

smokel · 2025-01-02T21:47:48 1735854468

  debug
  -a 100
  178A:0100 int 19
  178A:0102
  -r cx
  CX 0000
  :2
  -n reboot.com
  -w
  Writing 00002 bytes
  -q

dim13 · 2025-01-03T15:07:16 1735916836

Some more:

Quick'n'dirty:

    .model small
    .code
     org 100h
    start:
     int 19h                 ; Bootstrap loader
    end start

More "correct":

    .model small
    .code
     org 100h
    start:
     db 0EAh                 ; Jump to Power On Self Test - Cold Boot
     dw 0,0FFFFh
    end start

Even more "correct":

    .model small
    .code
     org 100h
    start:
     mov ah,0Dh
     int 21h                 ; DOS Services  ah=function 0Dh
                             ;  flush disk buffers to disk
     sti                     ; Enable interrupts
     hlt                     ; Halt processor
     mov al,0FEh
     out 64h,al              ; port 64h, kybd cntrlr functn
                             ;  al = 0FEh, pulse CPU reset
    end start

mrfinn · 2025-01-03T10:06:01 1735898761

Great example, a two-byte reboot utility. From the times when we could turn off the computer with a push of a button without fearing a global catastrophe...

xpasky · 2025-01-02T20:30:58 1735849858

JMP FFFF:0000

mrfinn · 2025-01-02T21:44:48 1735854288

INT 13h... uff chills

5- · 2025-01-02T22:08:13 1735855693

here's an 80 byte x86_64 linux 'hello world' (okay, not 'Hello world!'). convert to binary with xxd -r -p:

  7f454c46488d3537000000ffc7b20eeb03003e00
  b001eb1a01000000050000001800000000000000
  1800000005000000b03c0f05ebfa380001006865
  6c6c0000010068656c6c00006f20776f726c640a

i'm sure this can be improved -- but i could never get any x86_64 linux elf to under 80 bytes. see if you can fit the exclamation point still.
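
if you don't have xxd handy, a python sketch along these lines should do the same conversion (names like `HEX_DUMP` and `to_binary` are just for illustration; it only checks the size and the ELF magic and does not run anything):

```python
HEX_DUMP = """
7f454c46488d3537000000ffc7b20eeb03003e00
b001eb1a01000000050000001800000000000000
1800000005000000b03c0f05ebfa380001006865
6c6c0000010068656c6c00006f20776f726c640a
"""


def to_binary(dump):
    """Turn a plain hex dump (whitespace ignored) into bytes, like `xxd -r -p`."""
    return bytes.fromhex("".join(dump.split()))


if __name__ == "__main__":
    blob = to_binary(HEX_DUMP)
    print(len(blob), "bytes")
    print("ELF magic ok:", blob[:4] == b"\x7fELF")
```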

michidk · 2025-01-03T10:29:21 1735900161

Yeah I thought sth like this is possible, but (correct me if I'm wrong) this (ab)uses the ELF header and puts data in there, which goes against my requirement

> It should be a ‘proper‘ executable binary according to the spec

5- · 2025-01-03T18:07:18 1735927638

yes, this one conforms to 'whatever linux agrees to exec(2)', which apparently is a lot that is out of spec.

whynotmaybe · 2025-01-02T20:40:56 1735850456

Could a script be a program?

Because it would be much smaller in a bat file that contains:

echo Hello World!

theandrewbailey · 2025-01-02T21:22:45 1735852965

For the purposes of this challenge, no.

> Let’s first establish some rules for our ‘Hello World’ program:

> It should be able to execute directly without passing to any other programs first (so no decompression)

> It should be a ‘proper‘ executable binary according to the spec

hinkley · 2025-01-02T21:20:00 1735852800

Include the shebang but it’s still crazy how big minimal programs are.

__m · 2025-01-03T08:28:38 1735892918

php would be just

Hello World

gr33kdude · 2025-01-02T20:27:37 1735849657

Linking a similar, very popular past example of this: Teensy: https://www.muppetlabs.com/~breadbox/software/tiny/teensy.ht...

BergAndCo · 2025-01-03T06:42:53 1735886573

Thank you. I knew I had read somewhere that someone put the program in the ELF header itself and got it down to 45 bytes; this is that exact post.

musicale · 2025-01-03T03:15:37 1735874137

I realize TFA is trying for object code, but for source code, QuickBASIC (and its successors) isn't bad:

    ? "hello, world!"

PILOT eliminates the quotes:

    T:hello, world!

Of course a typical REPL (Python, JavaScript, Lisp, etc.) will print out something similar (but often quoted) if you just type the quoted string.

And I'm sure there is already some language (call it HELLO) which simply prints "hello, world!" for an empty program.

charlieyu1 · 2025-01-03T03:21:58 1735874518

There are probably some golfing languages out there where an empty program outputs Hello World.

musicale · 2025-01-03T03:25:00 1735874700

I'm certain there is, but I don't have a reference for it yet other than my imaginary HELLO (Highly Efficient Limited Line Output) language.

fjfaase · 2025-01-03T10:07:31 1735898851

Would it be fair to name the program 'Hello World!' and then use argv[0], which is on the stack, to print out 'Hello World!'?

michidk · 2025-01-03T10:22:34 1735899754

Hehe a nice idea. But then it would not always print Hello World and you cannot execute it directly.

xpasky · 2025-01-02T20:29:43 1735849783

Now, can we make it even smaller applying https://nathanotterness.com/2021/10/tiny_elf_modernized.html ? We shouldn't need the full ELF header...

xpasky · 2025-01-02T20:32:28 1735849948

Oh it has been done: https://nathanotterness.com/2021/10/hello_105.asm

oneshtein · 2025-01-02T21:00:19 1735851619

The smallest "hello, world" programs in Rust I did for Arduino are 294 bytes for blink and 388 bytes for hw.

Multicomp · 2025-01-03T00:33:52 1735864432

129 bytes...supposedly that's 2 punch cards according to Dr gpt. Small program!