X86 boot sector written in C (2010) (crimsonglow.ca)
111 points by doubcoid on Oct 24, 2014 | 23 comments



Note that this is not the MBR, but what is known as the "volume boot record" (VBR) which is located at the start of each partition. And to be honest, I don't think this code is significantly more readable than the Asm version... or maybe it's the lack of comments. It'd also be interesting to compare the compiled version's code with the handwritten Asm one.

Here is the annotated disassembly of the DOS 6.2 VBR: http://www.tburke.net/info/ntldr/bootsect.txt
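To make the MBR/VBR distinction concrete, here is a minimal C sketch of how the sector holding a partition's VBR can be located from the MBR. It assumes the classic MBR layout (four 16-byte partition entries starting at byte 446, with the starting LBA at bytes 8 to 11 of each entry); the names `read_le32` and `vbr_lba` are hypothetical and do not come from the article's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Classic MBR layout: four 16-byte partition entries start at byte 446. */
enum { MBR_TABLE_OFFSET = 446, MBR_ENTRY_SIZE = 16, MBR_ENTRIES = 4 };

/* Read a little-endian 32-bit value without relying on host byte order. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: return the starting LBA of partition `index` (0..3),
 * which is the sector holding that partition's VBR. Returns 0 if the slot
 * is unused or the index is out of range. */
static uint32_t vbr_lba(const uint8_t mbr[512], size_t index)
{
    if (index >= MBR_ENTRIES)
        return 0;
    const uint8_t *entry = mbr + MBR_TABLE_OFFSET + index * MBR_ENTRY_SIZE;
    if (entry[4] == 0x00)          /* partition type 0 marks an empty slot */
        return 0;
    return read_le32(entry + 8);   /* bytes 8..11: first sector (LBA) */
}
```

The MBR itself always sits at LBA 0 of the disk, while each VBR sits at whatever LBA its partition entry reports.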


This is probably one of the few times assembly is more readable and easier to understand than C. Many years ago I used to teach OS classes at a local university, where we would build a simple bootstrap using NASM. The source code includes 32-bit setup (A20, etc.) and the GDT, IIRC:

https://github.com/eduardordm/sisop/blob/master/sisop.asm


Going into protmode in the boot sector is extremely early. Most OSs I've seen will stay in realmode (or "unreal" mode) for a little bit longer, so they can use the BIOS to set up some more stuff before making that leap.

Also noticed a minor optimisation:

    mov AL, [SI]    ; fetch one byte of the string and load it into AL
    inc SI

Could've been replaced with a 'lodsb', saving 2 bytes. Ditto for MOV BH/MOV BL with MOV BX. :-)


Going down the size optimization rabbit hole a little bit (not criticism, I just enjoy this sort of puzzle).

Two bytes shorter:

  call print_str
  ret
  =>
  jmp print_str

Three bytes shorter; this one may be a little too clever, since it depends on the low bit of CR0 being 0 initially:

  mov EAX, CR0
  or EAX, 1
  =>
  mov EAX, CR0
  inc AX
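Why the second trick depends on the low bit being 0 can be checked with a small C model of the two instruction sequences. This is only a sketch: the function names are made up for illustration, and plain integers stand in for the CR0 register.

```c
#include <stdint.h>

/* Model of `or EAX, 1`: sets bit 0 and leaves every other bit alone. */
static uint32_t set_pe_with_or(uint32_t cr0)
{
    return cr0 | 1u;
}

/* Model of `inc AX`: adds 1 to the low 16 bits only; a carry out of AX
 * does not reach the upper half of EAX. */
static uint32_t set_pe_with_inc_ax(uint32_t cr0)
{
    uint16_t ax = (uint16_t)(cr0 & 0xFFFFu);
    ax = (uint16_t)(ax + 1u);
    return (cr0 & 0xFFFF0000u) | ax;
}
```

With bit 0 clear, both functions return the same value. With bit 0 already set, the `inc` version carries into bit 1 and clears bit 0, which is the failure case the caveat above is about.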


Could've been replaced with a 'cld' and 'lodsb', saving 1 byte.


Actually, I really prefer writing that in C rather than asm: you can use bitfields to define the GDT, IDT, page directories and so on. Doing everything in ASM is a PITA, for debugging and for readability.

(I am a teaching assistant at one of these courses and hate when students do everything in asm.. but they are allowed to do as they want, as long as it works as specified).
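As an illustration of the bitfield approach mentioned above, here is a minimal sketch of a GDT segment descriptor in C. It assumes GCC or Clang targeting x86, where bitfields are allocated from bit 0 upward; the C standard leaves that order implementation-defined, so the struct is not portable. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One 8-byte segment descriptor, fields listed from bit 0 upward.
 * Assumption: GCC/Clang bitfield allocation on little-endian x86. */
struct gdt_entry {
    uint64_t limit_low   : 16;  /* limit bits 0..15 */
    uint64_t base_low    : 24;  /* base bits 0..23 */
    uint64_t type        : 4;   /* segment type (0xA = execute/read code) */
    uint64_t s           : 1;   /* 1 = code or data segment */
    uint64_t dpl         : 2;   /* privilege level */
    uint64_t present     : 1;
    uint64_t limit_high  : 4;   /* limit bits 16..19 */
    uint64_t avl         : 1;
    uint64_t long_mode   : 1;
    uint64_t db          : 1;   /* 1 = 32-bit default operand size */
    uint64_t granularity : 1;   /* 1 = limit counted in 4 KiB pages */
    uint64_t base_high   : 8;   /* base bits 24..31 */
} __attribute__((packed));

static_assert(sizeof(struct gdt_entry) == 8, "descriptor must be 8 bytes");

/* Flat 4 GiB ring-0 code segment: base 0, limit 0xFFFFF pages. */
static struct gdt_entry flat_code_segment(void)
{
    struct gdt_entry e = {0};
    e.limit_low = 0xFFFF;
    e.limit_high = 0xF;
    e.type = 0xA;
    e.s = 1;
    e.present = 1;
    e.db = 1;
    e.granularity = 1;
    return e;
}

/* Copy the descriptor into a plain integer so it can be inspected. */
static uint64_t gdt_entry_raw(struct gdt_entry e)
{
    uint64_t raw;
    memcpy(&raw, &e, sizeof raw);
    return raw;
}
```

Under those assumptions, the flat code segment encodes to the familiar raw value `0x00CF9A000000FFFF`, which is what a hand-written assembly table would typically spell out byte by byte.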


Why is debugging a pita?


I agree... I did a similar thing here:

https://github.com/aosmith/toy-os

The x86 code seems like a bear but once it's explained it is fairly simple.


>However, as a philocalist and masochist I felt compelled to write legible code and decided to use C.

Having looked at the author's boot.c, I now understand that legibility is in the eye of the beholder.

Still, it's a neat and impressive hack. I wish I knew more about operating systems.


I'm actually pretty surprised and impressed at how short and legible this turned out. Out of curiosity I've written code to do I/O through the BIOS and parse files out of FAT, and it didn't turn out this short.


Pretty cool. But can you make the source code readable via the browser (e.g. Content-Type: text/plain)?


Why read only three sectors when you can read the whole OS into memory? Take a look at these:

https://github.com/jhallen/joes-sandbox/tree/master/boot

They're assembly language MBR and Linux boot loaders I wrote long ago. The bootloaders understand EXT2 or FAT, yet still fit in 1K.


I think it's pretty cool. I know it's a bit of an odd thing to notice, but I thought it was almost odd that it is on Google Code and not GitHub. Not that there's anything wrong with that; it's just that when I see something new and open source that isn't on GitHub, it surprises me a little.

But it looks like this was actually started over 4 years ago; I didn't notice the date of the article. I was curious whether anyone else noticed it like I did.


Yes, it was good for the time, but I wouldn't recommend using Google Code anymore.


Why?


GitHub is easier to browse (both over the site and through individual repos). Also, GitHub has better integration between issues, commits, and pull requests.


>The last 2 B of the boot sector must have the value 0xAA55; this value is known as the boot signature.

Shouldn't this be 0x55AA because of little endian? Otherwise nice writeup, but unreadable code.


No, the 0x55 comes first.
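The two views in this exchange can be reconciled with a short C sketch: the byte 0x55 is stored at offset 510 and 0xAA at offset 511, and reading that pair as a little-endian 16-bit word gives 0xAA55. The function names here are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

enum { SECTOR_SIZE = 512 };

/* Store the boot signature byte by byte, so the result does not depend on
 * the byte order of the machine running this code. */
static void write_boot_signature(uint8_t sector[SECTOR_SIZE])
{
    sector[510] = 0x55;   /* low byte of 0xAA55, at the lower address */
    sector[511] = 0xAA;   /* high byte of 0xAA55 */
}

/* Read the last two bytes back as a little-endian 16-bit word. */
static uint16_t read_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    return (uint16_t)(sector[510] | (sector[511] << 8));
}
```

So "0xAA55" describes the value as a little-endian word, while "0x55 comes first" describes the order of the bytes on disk.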


I'm not sure what's going on with that site, but it won't scroll in my browser. The HTML source is also a bit ... special.


It's not special, it's that the byte charts for the memory view image things are apparently drawn by horribly abusing CSS and `<li>` tags.

Also, you use Opera as well?


Yeah, no other browser has the features Opera 12 has, so there's no other option.


Can we get a reference? Where is this layout coming from?


Which layout? The x86 boot process is pretty well documented and not something that needs a source at this point. Either way, he says he learned all this by studying MS-DOS.



