Writing a bare-metal RISC-V application in D (zyedidia.github.io)
141 points by teleforce on Sept 1, 2023 | hide | past | favorite | 23 comments



>"It turns out D has introduced a mode called betterC (sounds exactly like what I want), which essentially disables all language features that require the D runtime. This makes it roughly as easy to use D for bare-metal programming as C."

This is a great idea -- for any past, present or future compiled language -- give it a compile mode which disconnects/decouples it from any runtime or libraries or OS dependencies or other special features or container formats (ELF, EXE, .so, .dll, etc., etc.)

That is, just the plain vanilla language!

(Hey, maybe this mode should be called 'vanilla'? You know, like: '$ mycompiler -vanilla <other command-line parameters here>...'?)

Anyway, a great idea from D!


To decouple from the object file format, you want it to dump assembly? Assembly output still marks sections, though I suppose limiting to .text and .data gives you nearly lowest-common-denominator sections. For very simple executable layouts that don't have separate sections, I suppose the assembler could ignore all section markings.

Unless your program doesn't take any arguments or inputs, you're still dependent upon ABI specifics for how arguments are passed or how the bootloader passes boot parameters to the program.

In short, it's pretty easy to get 90% of what you're proposing, but it gets very complicated very quickly to rid yourself of nearly every platform dependency, particularly if you're supporting a wide range of platforms from 8-bit MCUs without IP-relative addressing to x86-64 and aarch64 to platforms like DEC Alpha AXP where the firmware is essentially a single-tenant hypervisor (and the OS kernel runs in usermode from a hardware perspective).


>"In short, it's pretty easy to get 90% of what you're proposing..."

It's pretty easy to get 100% of what I'm proposing -- simply use D! <g> :-)

(Or, some other past/present/future language authored with the same set of underlying ideas...)

https://softwareengineering.stackexchange.com/questions/2444...

https://cloudcomputingtechnologies.com/the-importance-of-dec...


I'm pretty sure betterC output still heavily depends on the features of the object format you're outputting.


Future OS programmers -- should understand the following:

If there is an object format present -- that is, if a file is an ELF file (as opposed to a pure raw binary file with no ELF header/sections/information in it) -- then that means there is an expectation that it will be loaded into memory by an external loader program or externally executing code of some sort...
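A quick way to see the distinction: an ELF file announces itself with a four-byte magic number at offset 0, while a pure raw binary carries no metadata at all -- byte 0 is simply the first instruction or datum. A minimal sketch in C++ (the `is_elf` helper is mine, just for illustration):

```cpp
#include <cstddef>
#include <cstring>

// An ELF file begins with the magic bytes 0x7f 'E' 'L' 'F'.
// A raw/flat binary has no header at all: a loader can learn
// nothing from the file itself and must rely on convention.
bool is_elf(const unsigned char *buf, std::size_t len) {
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    return len >= 4 && std::memcmp(buf, magic, 4) == 0;
}
```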

This loader program could be implemented as part of the code of an operating system, or it could exist as an independent program or code that executes as part of a boot program.

The presence of an object format, any object format -- says that there is the expectation of an external/independent set of code which will be used to appropriately relocate (move) its code in memory prior to passing control to it, AKA executing it.

If, say, someone is using GRUB to load an OS, then GRUB (or specific parts/modules of it) can indeed be used to load/relocate ELF files containing the OS's kernel, prior to passing control to it.

It's one potential way to do things...

But not all OS'es use, nor choose to use GRUB...

Not all OS'es expect (especially many classic or homebrewed ones) to be relocated in memory via external loader...

Some OS'es -- choose to take responsibility for their own relocation in memory IF they need it -- some defer this to a custom-programmed boot sector -- but the point is, if there's an object format present (aka, higher level container (ELF, etc.) construct around the binary machine code), then that suggests that there's an expectation present that the OS code will be relocated by relocation code external to it...

What's the problem with that?

Well, if that expectation is present -- now there's a dependency on that relocation code existing...

I'll say that again:

IF that expectation is present -- now there's a dependency on that relocation code existing...

If that code brings in another library or codebase, which in turn brings in another library or codebase -- now we have the unnecessary dependency, complexity and inauditability demon all over again...

That is, if that code needs to exist, it makes the boot sequence more complex and it makes the compiler which needs to emit it more complex...

The net complexity is increased.

Which runs counter to principles of understandability, simplicity, transparency, and auditability in codebases...

That's the bigger picture...

Object formats may be necessary for application programmers to get their application to run on a given operating system, but when someone is an OS programmer -- they do in fact have a choice of whether to use container file formats -- or not.

Similarly, when someone is a compiler writer -- they do have a choice of whether to make a given container output format compulsory or optional... my suggestion to that community is that if a happy user base is desired, then they should always make it optional...


If I understand you correctly, you're advocating for always supporting a minimalist COM-like object format. The object code needs to be in some file, and formatted in some way.

That's fine, but it's different from being object format independent.


I'm advocating for choice -- on behalf of compiler users...

If the compiler user wants their compiler to output an object code format, it should output that object code format.

If the compiler user wants their compiler to output pure binary, it should output pure binary.


What does "pure binary" even mean, though? I think you mean a bare-bones COM-like format where the binary gets loaded at an ABI-determined static address in memory, the instruction pointer is set to the first byte of the binary, and you're off to the races.

This "pure binary" is still an object format, and still depends (at minimum) on the format's convention for the address at which the binary is loaded. Note that this is the case even when writing firmware ROMs, where the architecture determines where the boot ROM gets mapped in memory.
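To make that convention concrete, here is a hedged sketch of what loading such a "pure binary" amounts to (the function names are invented for illustration; the load address comes from the platform's convention, e.g. offset 0x100 within the segment for DOS .COM files):

```cpp
#include <cstdint>
#include <cstring>

using entry_fn = void (*)();

// A flat-binary "loader" in its entirety: copy the image to the
// agreed-upon load address and hand back its first byte as the
// entry point. There is nothing to parse and nothing to
// relocate -- the format's convention *is* the entire contract.
entry_fn load_image(std::uint8_t *load_addr,
                    const std::uint8_t *image, std::size_t len) {
    std::memcpy(load_addr, image, len);
    return reinterpret_cast<entry_fn>(load_addr);
}

// On real hardware you would then jump: load_image(...)();
```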


"Pure binary" means pure binary.

Old compilers used to output pure binary by default; newer ones will usually choose a container format (ELF, EXE, etc.) by default, and may not offer a pure-binary option at all (though a tool like objcopy -O binary can usually strip the container off after the fact).


Which, like C++, still allows several productive features over C -- and, unlike both of them, with better defaults.


I learned a lot from this, neat article!

One tip about initializing global variables in D -- the common idiom is to use a "shared static this() {}" block. This lets you set immutable variables once at the start of the program.

Since this requires the D Runtime, you can't use it in betterC, but you can use a pragma to mark a "crt_constructor" function as an equivalent:

    pragma(crt_constructor)
    extern(C) void init_global_state() {
        // Set BSS pointers up here
    }
I'm not sure whether this works in bare-metal environments (and maybe that's why it's not used).


Why does it need the D runtime to set a variable once? This seems very complicated.


I suppose that this is needed if you want to guarantee that all the constructors that are executed before main() are executed in a certain order.

I have not checked the latest versions of the C++ standard to see if anything has changed, but in older versions it was impossible to ensure that global constructors in different translation units execute in a particular order (to be able to satisfy dependencies between them).
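For context, the usual C++ workaround for this is "construct on first use": a function-local static, which the standard does guarantee is initialized the first time control passes through it (and, since C++11, in a thread-safe way). A small sketch, with illustrative names:

```cpp
// Cross-translation-unit order of namespace-scope dynamic
// initialization is unspecified, but a function-local static is
// guaranteed to be initialized on first use -- so dependents can
// safely call this from their own initializers.
int &global_counter() {
    static int value = 42;  // runs exactly once, on first call
    return value;
}
```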

For static variables that only have initial values taken from the executable file, without initialization code, I assume that something like this is not necessary.


Still the same with regard to C++.


To be honest, I'm not entirely sure of the technical reason:

https://forum.dlang.org/post/sgiyhdnpdyxvigwtrfqk@forum.dlan...


I have also been using D as a better C (Das better C) -- it just fixes what needs to be fixed in C. Recommend it.


I haven't had a lot of use for betterC, but I'd imagine the recent ability to compile C code makes it a better option. You can add existing C code to your betterC project and get interoperability for free.


Isn't open -- or at least well-documented -- hardware one of the goals of RISC-V? The article complains a lot that the integrated hardware is horribly underdocumented.


That is the myth; the reality is that it is only an instruction set, without the actual hardware, and even the ISA itself can be extended with OEM-specific instructions.


You are right, and it's probably as "bad" as you make it sound, but we've come a long way. One tiny step at a time. Maybe one day fully open hardware will make it to the mainstream. Let's just dream a bit (and work on it).

Edit: I am aware, though, that there will always be some vendor who just copy-pastes the open hardware and sells it for cheaper, because they don't have that much R&D cost. And restricting that on whatever level (government, etc.) sounds even worse.


It's always good to dream, but millions and billions are spent on silicon implementation and production, so I think it is not very likely that anybody will share that. It's a totally different ballgame than software. But let's hope things change in the future so that this dream can come true.


RISC-V is an instruction set specification.

It doesn't concern much beyond that, so there is nothing special about "RISC-V hardware", which contains many other components at the vendor's discretion.


... and often uses the same memory-mapped peripherals the same vendor used for their Arm or 8051 or whatever chips.

Having good standard open IP for all that stuff would be great, but that's a different project than RISC-V.



