Awesome guide on how to get the best out of C++ on embedded systems.
One thing I would add is: enable all the warnings you can and make them errors: "-Wall -Wextra -Wpedantic -Werror".
I know it's hard to do in practice (who wants to fix the thousands of warnings in a well-seasoned codebase?), but it's a good idea for new projects. Especially if the team uses the same compiler (sometimes the code starts producing warnings with newer GCC versions).
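As a tiny illustration (hypothetical function, assuming GCC or clang): -Wall alone stays quiet about an unused parameter, while -Wextra flags it.

    // Compiles cleanly with -Wall; -Wextra adds -Wunused-parameter here.
    int handle_event(int event_id, void *context) {   // 'context' is never used
        return event_id * 2;
    }

    // One conventional fix: leave the parameter unnamed (or comment out the name).
    int handle_event_fixed(int event_id, void * /*context*/) {
        return event_id * 2;
    }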
Yes! This was a practice beaten into me (not literally, but close) at my first programming gig in the 1980s. "A compile should not produce warnings! If the code producing the warning is truly intentional & deliberate, then grab a senior developer to review it, and ONLY if he agrees will we issue a #pragma directive to tell the compiler to ignore it. But most of the time (99%+) there's a way to do what you want without triggering a warning."
Most of the time it was simple mistakes or laziness, like a downcast conversion leading to loss of precision, etc., but occasionally on review we'd find something really stupid/dangerous, like type mismatches on a pointer ("well, no, that index into a string should NOT be used to reference memory..."), a return before a free() call (memory leak), or uninitialized values (yay for random behavior!).
I thought sloppy, warning-riddled code was ancient history until last week, when I did a fresh install of R Studio on a server + a couple dozen packages, watching page after page of gcc compiler warnings scroll past.
> I thought sloppy, warning-riddled code was ancient history until last week, when I did a fresh install of R Studio on a server + a couple dozen packages, watching page after page of gcc compiler warnings scroll past.
Nah, just compile or use any Gtk+ based application.
I launch every program from terminals, and a terrible warning parade fills them whenever it's a Gtk application. I've come to think that the developers never actually launch their programs this way and never see the flow of warnings they trigger.
I might be misunderstanding what kind of warnings you're talking about, but OP was talking about compiler warnings during the build process and you about GTK application runtime warnings. They are different, and in the context of an application it's... well, not necessarily good, but normal. A runtime warning about unexpected input at least means the programmer anticipated that input. I'd be more scared of an app that lacks runtime warnings.
Warnings and static analysis tools are good, but wrangling them across multiple tools, architectures, and coding standards is hard. Really hard. Different compilers will throw warnings for different things. Linting gets crazy. A customer might ask you to run your code against a MISRA ruleset. A different customer may have their own coding standard. The two standards may contradict each other in areas.
Did you know that printf returns a value? Aggressive linting tools will throw a warning about an unchecked return value.
So is printf("hello world?\n") wrong? And (void)printf("hello world\n") right? I don't know anymore.
Serious question (I've been exactly the guy that planteen described -- wrangling code into submission for multiple static analysis tools and compilers, for warnings, MISRA compliance, etc.)
So my question: do you remember what tool would complain about such a construct (e.g. casting return value of printf() to void)? Reason I ask is because printf() returns a value; the cast to void is explicitly and deliberately -- right or wrong -- saying, "It's OK, I got this. I am choosing to ignore it." So a static analysis tool complaining about such a construct strikes me as a bit odd.
Older versions of glibc tagged printf, etc., with warn_unused_result, and it was extremely annoying. The rationale was understandable, but eventually removing it was the right decision.
There are a couple of other warnings I always explicitly disable, including -Wmissing-field-initializers and -Winitializer-overrides/-Woverride-init (clang/GCC).
I always use -Wall. I'd like to use -Wextra but it's problematic given that every few releases someone may introduce a new diagnostic that emits spurious warnings. When a nice clean build suddenly starts spitting out warnings you get a deluge of complaints and pointless patches at exactly the moment you have zero time to deal with them. So I usually stick to just -Wall and then run separate static analyzers (including clang static analyzer) before a release.
It doesn't help that with clang you now have twice the problems, and often twice the number of workarounds required. Notice above that disabling initializer-override warnings requires a different option on clang than on GCC. And while recent versions of GCC and clang let you disable certain warnings using pragmas, you still need to special-case each compiler.
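A rough sketch of what that special-casing looks like (the warning names are the real ones for initializer overrides, but treat the structure as illustrative):

    #if defined(__clang__)
    #pragma clang diagnostic push
    #pragma clang diagnostic ignored "-Winitializer-overrides"
    #elif defined(__GNUC__)
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Woverride-init"
    #endif
    /* ... designated initializers that intentionally override fields ... */
    #if defined(__clang__)
    #pragma clang diagnostic pop
    #elif defined(__GNUC__)
    #pragma GCC diagnostic pop
    #endif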
And then there are the people who build your software with -Wextra regardless and complain when they get spurious diagnostics. IIRC, GCC only enables -Woverride-init with -Wextra whereas clang enables the similar check with -Wall. I explicitly disable it for GCC precisely because invariably people will try to compile stuff with -Wextra.
Given how much larger C++ is than C, I can only imagine C++ developers have (or will have in the future) many more headaches of this kind. (But not with initializer overrides specifically as C++ rejected named initializers, which is the only way to override initializers. It's one of the ways that C and C++ have irreversibly split.)
I think you're right. I usually find MISRA gives decent warnings, and solving them is quite instructive. I remember, for example, that John Carmack had nice things to say about static code analysis.
Author here - thanks! I left out advice like "turn on all the warnings" because I was trying to focus on problems specific to embedded platforms. I'd strongly argue that advice like that is great regardless of what your target is.
> One thing I would add is: enable all the warnings you can and make them errors: "-Wall -Wextra -Wpedantic -Werror".
Note:
GCC (and probably other compilers) for embedded systems won't accept a 'void' return type for 'main' without a warning. Since that return type is allowed by the standard for freestanding implementations (it's implementation-defined) and is commonly used for 'main' in embedded programming, compilation may fail at 'main' under -Werror even if the code is strictly conforming.
> It shall have a return type of type int, but otherwise its type is implementation-defined. All implementations shall allow both of the following definitions of main:
> int main() { /* ... */ } and int main(int argc, char* argv[]) { /* ... */ }
> In the latter form argc shall be the number of arguments passed to the program from the environment in which the program is run. If argc is nonzero these arguments shall be supplied in argv[0] through argv[argc-1] as pointers to the initial characters of null-terminated multibyte strings.
---
No version of the C or C++ specs has said void main is acceptable (apart from the implementation-defined, optional main for non-hosted systems). However, most compilers accept it as a language extension.
A return statement in main was compulsory until C99, after which the compiler assumes return 0 if none is specified. Some compilers had that behaviour already, however.
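For reference, a minimal bare-metal-style sketch that keeps both the standard and -Werror happy:

    // 'int' satisfies the standard (and the warning), even though a
    // bare-metal main() typically never returns.
    int main(void) {
        // hardware/peripheral init would go here
        for (;;) {
            // main loop
        }
    }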
Five years of first-hand experience with embedded C++ here.
I used to work with a big C++ codebase which implemented ISDN/R2/SIP stacks (and a few more protocols), running inside an embedded ARM. Now I mostly work with a C codebase that does similar things.
I can say with good confidence that most of what the article says about compiler flags, etc., can be regarded as premature optimization - i.e., you only need to start from that point if you know your environment is really constrained and exceptions are too much of a cost, which is usually not the case on modern embedded ARM systems.
As for the advantages, I completely agree on the safety issue. Reuse is also much easier: using generic programming to better encapsulate your abstractions, including functions acting on different types (templates), you can rely on sizeof and/or properties of the types themselves to specialize the right behaviour - all without a single virtual call or dynamic dispatch. And modern C++ is making this much easier to use.
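A rough sketch of the kind of thing I mean (illustrative, not from any real codebase):

    #include <cstddef>
    #include <cstdint>
    #include <type_traits>

    // Pack any unsigned integer into a byte buffer, little-endian.
    // sizeof(T) drives the loop; everything resolves at compile time,
    // with no virtual calls and no dynamic dispatch.
    template <typename T>
    void put_le(T value, std::uint8_t *out) {
        static_assert(std::is_unsigned<T>::value, "unsigned integer types only");
        for (std::size_t i = 0; i < sizeof(T); ++i) {
            out[i] = static_cast<std::uint8_t>(value >> (8 * i));
        }
    }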
TBH I never used a single virtual in any of my code, though we did have them in a few places in the stack, away from the performance-sensitive code paths.
Compared to C++, there are a lot of repeated patterns in the C code I work on nowadays, and it's a bit frustrating, as I can't really abstract them; there's some repetition of similar algorithms in different places. You can start doing callbacks, some ugly acrobatics with macros, and other alternatives, but then the code becomes unreadable, a minefield, or both.
> you just need to start from this point if you know your environment is really constrained and exceptions are too much of a cost - which is usually not the case in modern embedded ARM systems.
ARM systems really run the gamut these days - while many are quite powerful (see: my phone), many also fall on the more traditional microcontroller side, with very little in terms of flash and RAM. In case it wasn't clear, this article focused on the latter.
Great! But when you have hard timing requirements measured in microseconds, you generally need some fine-grained control over the instructions the CPU ends up running. All the knobs and levers C and C++ offer you are quite useful here.
Indeed, in the latter case exceptions really are an unnecessary cost.
In fact, another argument that could be made against exceptions is that they're not a particularly good abstraction for error handling in the first place. Explicit error handling - including try!() à la Rust and/or monadic composition of functions - usually leads to better error coverage, as safety is enforced by the type system itself.
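For illustration, a bare-bones sketch of that style in C++ (hypothetical types; an expected/outcome-style library does this properly, and this assumes C++14 or later):

    enum class Error { None, Timeout, BadInput };

    template <typename T>
    struct Result {
        T value{};
        Error error = Error::None;
        bool ok() const { return error == Error::None; }
    };

    Result<int> read_sensor() {
        return {42, Error::None};       // stub for illustration
    }

    Error process() {
        auto r = read_sensor();
        if (!r.ok()) return r.error;    // roughly what Rust's try!()/? expands to
        // ... use r.value ...
        return Error::None;
    }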
> operator delete is required whenever we give a base class a virtual destructor—as is standard practice—even if we never heap-allocate an object of that class.
I'm not sure why you would need a virtual destructor if you don't heap-allocate the class and are never going to call delete; it will never be polymorphically deleted anyway, so you can do with a non-virtual destructor.
The main motivation is just that experienced C++ devs (myself included) have the rule, "base classes should have virtual destructors" drummed into their heads. Defining operator delete (even if the compiler elides it since, like you pointed out, it's never needed) allows devs to continue that habit without any negative consequences.
This makes instances of all derived classes at least as large as a pointer -- many of them will add data members and be larger yet, which can disqualify them from being passed in a general-purpose register (details depend on the ABI). Depending on the domain, this can have a dramatic adverse effect on the overall performance of the code.
EDIT: actually, adding a virtual function or a destructor alone (without any size considerations) makes the type non-POD, which prevents it from being returned in a register.
To be honest, a derived class already cannot be a POD. But stuffing the vtable pointer into every derived class is indeed wasteful for embedded programming, and it opens another way to exploit your system (if you care about security) by leaving code pointers at fixed addresses in r/w memory.
Great points. In our cases, these objects were few in number and were accessed infrequently through a pointer, so I don't think there's much to be concerned about. Avoiding unnecessary vtables is definitely something to keep in mind, though.
Also, an operator delete is very much not required if you give a base class a virtual destructor, only if you are using a custom allocator; the OP is probably missing something in his description.
This is not about defining a custom operator delete, it's about defining it at all: when compiling with -ffreestanding it is not provided, and apparently virtual destructors synthesize a call to operator delete.
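In practice the stubs can be trivial; a sketch, assuming heap deletion genuinely never happens at runtime:

    #include <cstddef>

    // Never expected to execute; a trap or assert here would catch
    // an accidental heap delete.
    void operator delete(void *) noexcept { }
    void operator delete(void *, std::size_t) noexcept { }   // sized variant (C++14)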
That's kind of circular... having a virtual destructor is useful only if you actually are going to call delete. If you want to make sure the object is never heap-allocated, just declare operator new as deleted and the issue will always be caught at compile time instead of only having a chance of being caught at runtime.
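For example (a sketch; the class name is made up):

    #include <cstddef>

    struct UartDriver {
        virtual ~UartDriver() = default;
        // Forbid heap allocation of this hierarchy at compile time.
        static void *operator new(std::size_t) = delete;
        static void *operator new[](std::size_t) = delete;
    };

    // UartDriver d;             // fine: static or automatic storage
    // auto *p = new UartDriver; // error: use of deleted function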
A few years ago I wrote an article entitled "Should you use C++ for an embedded project?": http://blog.brush.co.nz/2011/01/cpp-embedded/ -- however, my embedded experience is mostly on very small micros, with less than 256KB flash and 64KB RAM. In those cases, I'd still very much lean towards the explicitness and simplicity of C. For less resource-constrained systems, C++ would be reasonable.
There is a tax that you pay by introducing C++, and that is chasing down things like inadvertent copy constructor/assignment operator calls, automatic memory allocs, and heavy-weight operator overloads, all of which are an inexhaustible source of untraceable performance bugs. In other words, C++ as a language is right at the sweet spot where it does not guarantee safety, while at the same time making performance unpredictable.
I never understood why people argue that operator overload is a bad thing. In a language that doesn't support it, you're going to have a function "T add(T, T)" which is pretty much the same, and can do anything. Overloading "T operator+(T, T)" is mostly syntactic sugar.
The Arduino environment is C++; the Arduino people just don't emphasize that. Much of the stuff that needs heap allocation isn't there, but all the compile-time class machinery is.
crosstool-NG is a PITA; the ARM team maintains a solid toolchain for the Cortex series at http://launchpad.net/gcc-arm-embedded/ and you can even install it from a PPA on Debian/Ubuntu systems.
For the larger parts there is also a perfectly functional memory allocator in newlib. So you can "go wild" if you want, although as the author points out, newlib is not without its race conditions for multi-threaded code.
The main issue with just using newlib's allocator is that we're using FreeRTOS, and (AFAIK) there isn't a good way to make the two aware of each other. One could probably also hook operator new up to FreeRTOS, but we try to avoid dynamic memory allocation after init, like the article said.
> (AFAIK), there isn't a good way to make the two aware of each other
When you set up FreeRTOS you configure which heap implementation it uses. They provide a number of heapX.c files you can choose between. heap3.c uses malloc directly for instance, the others use built-in implementations of a heap algorithm of varying complexity.
If you want to use your own allocator, you just write your own heapX.c file. Use the malloc one as a template and call the newlib allocator instead of malloc. Link that in instead of any other heapX file and voila, FreeRTOS is using newlib to allocate memory.
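For concreteness, a sketch of such a file modeled on heap_3.c, which wraps the C library allocator in a scheduler suspend/resume:

    /* heap_newlib.c -- illustrative, based on the heap_3.c approach */
    #include <stdlib.h>
    #include "FreeRTOS.h"
    #include "task.h"

    void *pvPortMalloc(size_t xWantedSize) {
        void *pv;
        vTaskSuspendAll();              /* serialize access to newlib's allocator */
        pv = malloc(xWantedSize);
        (void)xTaskResumeAll();
        return pv;
    }

    void vPortFree(void *pv) {
        if (pv != NULL) {
            vTaskSuspendAll();
            free(pv);
            (void)xTaskResumeAll();
        }
    }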
IIRC, the concern was in the details of newlib's allocator. Its sbrk looks for a linker symbol, then starts slicing memory off of that address. There were worries that unless we were careful, it might stomp on FreeRTOS. (These worries could have been unfounded, but we didn't look into it deeply. We only dynamically allocate at startup, so we'd prefer FreeRTOS's heap_1.c anyways.)
I don't follow. If an allocator is optimized for our needs (i.e., allocating everything up front and never freeing), and is easier to set up than the general-purpose allocator in newlib, how is using it "technical debt"?
"These worries could have been unfounded, but we didn't look into it deeply. We only dynamically allocate at startup, so we'd prefer FreeRTOS's heap_1.c anyways."
The definition of technical debt is not taking the time to understand the components of the system, in favor of getting the system working to a given deadline.
Allocators are "well understood" by the code, which is to say that a lot of programmers assume they know what the allocator can do and what it can't do.
You've chosen the route of a static allocator that never frees because it meets your current requirements; however, you've done so under the cover of the API of a more common allocator. This works great, you ship your project, and you move on.
Now, months or years later, someone else comes along to add a feature or change some small bit. They need an allocator that can allocate and free from the heap, but since they don't know your allocator can't free, they just use the function calls. Everything links, but the calls to free() don't actually do anything, so they run out of memory and die. They start debugging the problem and realize you didn't do the work to understand the allocator and make it work in your setup. So in addition to doing that work for their own code, they either have to go back and rewrite your code to use the fully functional allocator, or they leave your code alone (which adds still more technical debt, since you now have two allocators in the code operating under different principles).
This is how software systems get so broken over time: people get something working and move on, without taking the time to think about how it will be used in the future, or whether it even can be. The code that sits there and will trip up future you or a future maintainer is "debt" that eventually has to be paid. Sometimes you can pay it with a refactor; sometimes you have to throw it all out and start over.
You're really making a bunch of uncharitable assumptions about our situation.
First, you assume that this behavior isn't well understood, well documented, and even expected in my organization and our corner of the industry. The reason FreeRTOS ships an allocator that doesn't free is that this is standard practice in embedded systems with hard timing requirements.
Then you guess that we've taken no precautions against some future developer not having this knowledge. The allocator in question asserts if the user tries to free to avoid the very scenario you describe, and attempting to use newlib's allocator asserts with an explanation of why it's unused.
Technical debt is an issue I take very seriously, and I work hard to document the design decisions we've made. Choosing to use a tool that's designed specifically for our use case, instead of taking the time to set up a general-purpose allocator with overhead we don't care for, is hardly a debt that needs to be paid off.
And you are reading way more into my comment than I wrote, sorry for that.
The only point I make is this phrase:
"We didn't look into it deeply" ... "works for our purpose."
Is the definition of technical debt. The opposite of technical debt is "We looked at the options available, we clarified the requirements and the assumptions we depend on in the code for it to work, and we put these regression tests into place to warn us if someone tried to use the system in a way that violated one of our assumptions."
That sentence, which I quoted and you wrote (it's right up there in your comment), is my definition of technical debt. Given that my explanation wasn't clear, and that your response above still doesn't mention how you ensured this, it's pretty clear we have very different ideas about what technical debt is and how it is mitigated.
What is a good resource for getting started with this kind of development? I've done some write-an-OS tutorials but nothing with bare-metal or RTOS yet, and I'd like to.
Is it necessary to call the constructors in the manner specified in the article? What would be the problem with using placement new in a designated block of memory?
The approach I laid out makes sure that the constructors for all of your global/static objects get called. AFAIK, there's no reason you couldn't do everything "by hand", but it seems like a lot more work for questionable gain.
I prefer forcing top-down initialization using placement new for any globals with static extent; that way initialization order can be carefully controlled. E.g. the OS gets started before we even bother bringing anything else up, other initialization happens in a separate task, then worker tasks are brought up, etc. Everything is sequenced, and we don't have to go to the linker script to figure out what's actually getting called.
Since the globals are singletons, any attempt to get their instance when they've already initialized is an error that will blow the program up. Helps keep us honest.
However, there's no reason it has to be overly manual. For example, you can make all the global object classes mixins of a single variadic template and use a variadic constructor and member methods to bring them all online in one fell swoop, even parameterizing construction via policy classes passed as template parameters. (Plus you get the added benefit of knowing, via sizeof() on the variadic type, the footprint of all the global objects.)
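A rough sketch of the explicit-init half of that pattern (illustrative names; assumes C++17 for inline statics):

    #include <cassert>
    #include <new>

    template <typename T>
    class Instance {
    public:
        template <typename... Args>
        static T &init(Args &&... args) {
            assert(obj_ == nullptr && "already initialized");
            obj_ = new (storage_) T(static_cast<Args &&>(args)...);
            return *obj_;
        }
        static T &get() {
            assert(obj_ != nullptr && "used before init");
            return *obj_;
        }
    private:
        alignas(T) inline static unsigned char storage_[sizeof(T)];
        inline static T *obj_ = nullptr;
    };

    // Initialization order is now explicit in code, not in the linker script:
    //   Instance<Scheduler>::init(/*...*/);   // hypothetical classes
    //   Instance<Logger>::init(/*...*/);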
The gain would be avoiding the dependence on the linker-script solution, particularly if you're tweaking linker scripts across different compilers/build environments.
Embedded systems are quite varied these days. The one in the article sounds like it's at the higher end of the scale. In more resource-constrained environments, C is still king (though I'd prefer something far more robust than C that works well on resource-constrained platforms and cross-compiles to ANSI C).
I think once your system is large enough, C++ is not your only option, many languages can be used.
I've always read that there is no reason C++ cannot be used in any environment where C is used, assuming the compiler exists. There is nothing about C++ that makes it inherently less suited to resource-constrained systems than C, and in fact with C++14 there is a tremendous amount of computation (including some instance methods!) that can be done at compile time. Scott Meyers shows some of this in his presentations on embedded systems at various conferences. It really opened my eyes to how useful templates can be in tight systems, and those don't exist in C.
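A tiny sketch of the sort of compile-time computation being described (C++14's relaxed constexpr allows loops; the CRC constant is just an arbitrary example):

    // Evaluated entirely by the compiler -- handy for building lookup
    // tables with zero runtime cost.
    constexpr unsigned crc32_entry(unsigned i) {
        unsigned c = i;
        for (int k = 0; k < 8; ++k) {
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        }
        return c;
    }
    static_assert(crc32_entry(0) == 0u, "computed at compile time, not at runtime");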
To expand on this, C++ fundamentally uses the same machine model as C (or at least one that's very, very close). Where C++ differs from C is the array of high-level language constructs, which map down to that common machine model (this is what is meant by "zero-cost abstractions"). The idea is that when you use a construct like templates or virtual functions, what it compiles down to is not significantly different from the analogous code you would write in C (hand-duplicated routines with different concrete types and structs-of-function-pointers, resp.).
There are exceptions (no pun intended) to this, namely exceptions, RTTI, global construction/destruction, and the intricacies of new/delete. As the article points out (as will anyone who advocates the use of C++ in constrained environments), none of these are really necessary to use C++ effectively and can be safely stubbed out.
Absolutely. There was a fantastic talk at CppCon last year, using C++17 to program a game for the Commodore 64. Using standard C++ idioms, the optimizer was easily able to make it compile down to efficient machine code.
Agreed. It's worth noting that Arduinos, by default, are programmed in C++ (thanks to the AVR port of GCC), and it's been this way more or less since their first version, which (as far as I recall) was a fairly small 8-bit micro. There are plenty of even less powerful micros, but it seems like architecture and tooling are really more of a constraint than capacity (e.g. low-end PIC micros have a less C-friendly architecture).
Agree. There is a lot to be gained from using C++ with absolutely no or imperceptible overhead.
I guess the only common embedded systems I would think twice before using C++ nowadays is 8051 and the very low end AVR/PIC (which can have as little as 32 bytes of RAM).
> Agree. There is a lot to be gained from using C++ with absolutely no or imperceptible overhead.
I would agree only if there is sufficient discipline and experience on the team, management included. Spaghetti C++ can be a lot harder to untangle than spaghetti C, in my experience.
Except that C++ is like 4 totally different languages over the years.
So, which one should I write in? And this company hates exceptions while that company doesn't allow dynamic allocations so I've got different dialects even IF you can get everybody to agree on which dialect of C++ you should write in.
And, why should I learn C++14 when C++17/18/19 will be another different language?
No, I'd rather put in the effort to make Rust usable. The C++ ship has sailed.
> And, why should I learn C++14 when C++17/18/19 will be another different language?
While it is indeed a pain to look at source code and understand the differences, the C++ standard gets far fewer releases than, say, Python or Ruby, and I doubt anyone can list all the differences between language versions without searching for them.
> No, I'd rather put in the effort to make Rust usable. The C++ ship has sailed.
Me too, but when it comes to what customers allow me to use, it is C++.
Sony and BMW are just now migrating to C++11 (not C++14, C++17, ...), how many years do you think it will take for something else to reach similar scale?
Which is why I am supportive of all attempts to improve the overall quality of C++, even C (UNIX variants are not going away) for that matter.
Rust still needs to improve a few things, which are anyway part of the 2017 roadmap, before Sony, BMW, Apple, Microsoft, IBM and similar companies decide to go Rust instead of C++yz or Swift.
It is a large language, but for me it's just an awesome toolset to work with. It's not for everyone.
Also, you are shouting a bit of hyperbole. Each new standard does not erase the prior standard. C++17 does not remove the features of C++14. Each new version focuses on different things, and they have been really great additions.
> Also, you are shouting a bit of hyperbole. Each new standard does not erase the prior standard. C++17 does not remove the features of C++14. Each new version focuses on different things, and they have been really great additions.
I am not shouting hyperbole at all. Each new standard completely changes the way you are supposed to write new C++ code. The problem is that the legacy code was written the old way, with all the problems that entails.
So, you have to know everything about the new way AND you have to know all the pitfalls of the old ways for when you need to go debugging through the old code. Um, no thanks.
I've been following C++ since 1996-ish. I stopped following it after 2011. It's just not worth the pain.
I'm incredibly interested in getting Rust into embedded devices, but it's a much larger sell than moving from C to C++14. And how will C++17 onward be so much different? C++11 was a huge inflection point, no doubt, but the standards committee works tirelessly to make things backwards-compatible.
As the article and others here have said, if you have the requisite proficiency and understanding of C++ and the C++ compiler(s) that you're using, and if you are using a high quality optimizing compiler like GCC, there is no reason that C++ code will be any larger in footprint than C code.
As the author points out, C++ offers very significant and tangible benefits, and in the right hands/with the right discipline should be less error prone and result in more efficient code. In my opinion, C++'s greatest benefit for embedded programming is that it has a wealth of abstractions that have no runtime overhead.
There are several reasons that C is still king, but I suspect the main one is portability. Your company may need to be able to port its software to some obscure microcontroller where the only compiler available is a buggy C89 implementation. Or if it has C++ support, odds are that it is inefficient and out of date.
Because of this and other reasons, the labor pool of "deeply" embedded software engineers has a heavy bias towards C over C++. This compounds the problem, reducing the incentive for embedded engineers to learn modern C++.
I don't think we read the same article; "higher end of the scale" is not going to be bothered by RTTI and exceptions which TFA suggests disabling. Overhead of C++ without those vs C is pretty minimal.
I find that -ffreestanding is too limiting in the presence of a decent C library. If newlib and libsupc++ are available, then you probably don't want -ffreestanding. Instead, provide your own syscalls and rely on the linker to tell you when you've inadvertently pulled in something extra heavy.
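For anyone curious what "provide your own syscalls" means in practice, a hedged sketch of the usual newlib retargeting stubs; the exact set you need depends on what you pull in, and '_end' is assumed to be defined by your linker script:

    #include <sys/stat.h>

    extern "C" {

    extern char _end;                 /* assumption: end-of-bss symbol from the linker script */
    static char *heap_ptr = &_end;

    void *_sbrk(int incr) {           /* backs newlib's malloc */
        char *prev = heap_ptr;
        heap_ptr += incr;             /* a real stub should also check for stack collision */
        return prev;
    }

    int _write(int fd, const char *buf, int len) {
        (void)fd; (void)buf;
        return len;                   /* discard, or push bytes to a UART */
    }

    int _close(int fd) { (void)fd; return -1; }

    int _fstat(int fd, struct stat *st) {
        (void)fd;
        st->st_mode = S_IFCHR;        /* report everything as a character device */
        return 0;
    }

    }  /* extern "C" */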
Nice article! I was doing C++ for device drivers in the early 90s and my advice back then was more or less the same. Open source toolchains didn't really exist back then, at least on the PC, so some stuff was harder to achieve.
I would not use C++ on firmware though. Usually memory/space requirements are far more important than OOP lang features. And good software can be developed without them.
C++, especially modern standards of the language, offers far more than OOP features.
/If/ you know the language and tools well, there is no reason that C++ won't be as efficient in space and time as the equivalent C code. C++ offers real gains to be had over C for embedded software, with rich (if a bit cryptic) zero-overhead abstractions, and higher level constructs that can eliminate entire classes of errors.
Nothing guarantees that it is zero-overhead; it all depends on the effort, quality, and mood of the C++ compiler. Whereas in C you can at least rely on the fact that, even without any complex optimisation, it will at worst produce machine code that looks 99% like the C code, because the mapping is nearly 1-to-1. Good luck telling what complexity is hidden behind a C++ operator.
Right, that was my point. You have to spend time and effort understanding the compiler's workings, time that could be spent on your actual problem and feature implementation. For firmware I wouldn't.