
To decouple from the object file format, you want it to dump assembly? Assembly output still marks sections, though I suppose limiting to .text and .data gives you nearly lowest-common-denominator sections. For very simple executable layouts that don't have separate sections, I suppose the assembler could ignore all section markings.

Unless your program doesn't take any arguments or inputs, you're still dependent upon ABI specifics for how arguments are passed or how the bootloader passes boot parameters to the program.

In short, it's pretty easy to get 90% of what you're proposing, but it gets very complicated very quickly to rid yourself of nearly every platform dependency, particularly if you're supporting a wide range of platforms from 8-bit MCUs without IP-relative addressing to x86-64 and aarch64 to platforms like DEC Alpha AXP where the firmware is essentially a single-tenant hypervisor (and the OS kernel runs in user mode from a hardware perspective).




>"In short, it's pretty easy to get 90% of what you're proposing..."

It's pretty easy to get 100% of what I'm proposing -- simply use D! <g> :-)

(Or, some other past/present/future language authored with the same set of underlying ideas...)

https://softwareengineering.stackexchange.com/questions/2444...

https://cloudcomputingtechnologies.com/the-importance-of-dec...


I'm pretty sure betterC output still heavily depends on the features of the object format you're outputting.


Future OS programmers -- should understand the following:

If there is an object format present, that is, if a file is an ELF file (as opposed to a pure raw binary file with no ELF header/sections/information in it) -- then that means that there is an expectation that it will be loaded into memory via an external loader program or externally executing code of some sort...
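
To make the ELF-versus-raw distinction concrete, here is a minimal Python sketch (my own illustration, not taken from any particular toolchain) that tells the two apart by looking at the first four bytes of the file. The name `is_elf` and the sample byte strings are made up for the example; the magic value 0x7F 'E' 'L' 'F' is the one the ELF specification defines.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file (e_ident[0..3])

def is_elf(image: bytes) -> bool:
    """Return True if the image starts with the ELF magic number."""
    return image[:4] == ELF_MAGIC

# Hypothetical sample inputs, for illustration only.
elf_like = ELF_MAGIC + b"\x02\x01\x01" + bytes(9)  # magic + rest of a 16-byte e_ident
raw_code = b"\xeb\xfe"                              # x86 "jmp $": machine code, no header

print(is_elf(elf_like))  # True
print(is_elf(raw_code))  # False
```

A raw binary has nothing like this to inspect: whoever jumps into it simply has to know in advance what it is.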

This loader program could be implemented as part of the code of an operating system, or it could exist as an independent program or code that executes as part of a boot program.

The presence of an object format, any object format -- says that there is the expectation of an external/independent set of code which will be used to appropriately relocate (move) its code in memory prior to passing control to it, AKA executing it.
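
As a rough sketch of what "relocate" means here, assume a toy image format (hypothetical, and far simpler than real ELF relocations) in which the image was linked as if loaded at address 0, and a separate list gives the offset of every 32-bit little-endian absolute address inside it. The loader's job is then to add the real load address to each of those slots. The names `relocate`, `reloc_offsets`, and the sample image are all invented for the example.

```python
import struct

def relocate(image: bytes, reloc_offsets: list[int], load_base: int) -> bytes:
    """Add load_base to each 32-bit little-endian address slot listed in reloc_offsets."""
    patched = bytearray(image)
    for off in reloc_offsets:
        (addr,) = struct.unpack_from("<I", patched, off)
        struct.pack_into("<I", patched, off, (addr + load_base) & 0xFFFFFFFF)
    return bytes(patched)

# Toy image: a 4-byte pointer to offset 8, 4 bytes of padding, then the data it points at.
image = struct.pack("<I", 8) + bytes(4) + b"DATA"
loaded = relocate(image, [0], 0x10000)

print(hex(struct.unpack_from("<I", loaded, 0)[0]))  # 0x10008
```

Something has to run that loop before the image's pointers are valid, and that "something" is the external code being discussed.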

If, say, someone is using GRUB to load an OS, then GRUB (or specific parts/modules of it) can indeed be used to load/relocate ELF files containing the OS's kernel, prior to passing control to it.

It's one potential way to do things...

But not all OSes use, nor choose to use, GRUB...

Not all OSes (especially many classic or homebrewed ones) expect to be relocated in memory via an external loader...

Some OSes choose to take responsibility for their own relocation in memory IF they need it -- some defer this to a custom-programmed boot sector -- but the point is, if there's an object format present (that is, a higher-level container such as ELF around the binary machine code), then that suggests there's an expectation that the OS code will be relocated by relocation code external to it...

What's the problem with that?

Well, if that expectation is present -- now there's a dependency on that relocation code existing...

I'll say that again:

IF that expectation is present -- now there's a dependency on that relocation code existing...

If that code brings in another library or codebase, which in turn brings in another library or codebase -- now we have the unnecessary dependency, complexity and inauditability demon all over again...

That is, if that code needs to exist, it makes the boot sequence more complex and it makes the compiler which needs to emit it more complex...

The net complexity is increased.

Which runs counter to principles of understandability, simplicity, transparency, and auditability in codebases...

That's the bigger picture...

Object formats may be necessary for application programmers to get their application to run on a given operating system, but when someone is an OS programmer -- they do in fact have a choice of whether to use container file formats -- or not.

Similarly, when someone is a compiler writer -- they do have a choice of whether to make a given container output format compulsory or optional... my suggestion to that community is that if a happy user base is desired, then they should always make it optional...


If I understand you correctly, you're advocating for always supporting a minimalist COM-like object format. The object code needs to be in some file, and formatted in some way.

That's fine, but it's different from being object format independent.


I'm advocating for choice -- on behalf of compiler users...

If the compiler user wants their compiler to output an object code format, it should output that object code format.

If the compiler user wants their compiler to output pure binary, it should output pure binary.


What does "pure binary" even mean, though? I think you mean a bare-bones COM-like format where the binary gets loaded at an ABI-determined static address in memory, the instruction pointer is set to the first byte of the binary, and you're off to the races.

This "pure binary" is still an object format, and still depends (at minimum) on the format's convention for the address at which the binary is loaded. Note that this is the case even when writing firmware ROMs, where the architecture determines where the boot ROM gets mapped in memory.
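
A small sketch of that dependency, in Python. The bytes below are a hand-assembled 16-bit x86 program in the style of a DOS .COM file (my own illustration; the helper name `build_flat` is hypothetical). It has no header at all, yet the `mov dx, imm16` instruction embeds the absolute address of the message, so the image only works if the loader places it where the build assumed, which is offset 0x100 for .COM.

```python
import struct

def build_flat(origin: int) -> bytes:
    """Emit a headerless 16-bit x86 image that assumes it is loaded at `origin`."""
    code_len = 8                   # total size of the four instructions below
    msg_addr = origin + code_len   # absolute address of the message once loaded
    code = (
        b"\xb4\x09"                              # mov ah, 9   (DOS print-string function)
        + b"\xba" + struct.pack("<H", msg_addr)  # mov dx, msg_addr
        + b"\xcd\x21"                            # int 21h
        + b"\xc3"                                # ret
    )
    assert len(code) == code_len
    return code + b"Hi$"           # '$' terminates the string for function 9

com_image = build_flat(0x100)      # .COM convention: loaded at offset 0x100
other = build_flat(0x0)            # same program, built for a different load address

print(com_image.hex())             # b409ba0801cd21c3486924
print(com_image == other)          # False
```

The two images differ only in the embedded address, which is exactly the part that the loading convention dictates.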


"Pure binary" means pure binary.

Old compilers used to output pure binary by default; newer ones will usually choose a container format (ELF, EXE, etc.) by default, and may not have the pure binary option available.



