> C instead of C++ because "it's faster" (spoiler: it probably doesn't matter for your project)
If your C is faster than your C++ then something has gone horribly wrong. C++ has been faster than C for a long time. C++ is about as fast as it gets for a systems language.
What is your basis for this claim? C and C++ are both built on essentially the same memory and execution model. There is a significant set of programs that are valid as both C and C++ -- surely you're not suggesting that merely compiling them as C++ will make them faster?
There's basically no performance technique available in C++ that is not also available in C. I don't think it's meaningful to call one faster than the other.
This is really an “in theory” versus “in practice” argument.
Yes, you can write most things in modern C++ in roughly equivalent C with enough code, complexity, and effort. However, the disparate economics are so lopsided that almost no one ever writes the equivalent C in complex systems. At some point, the development cost is too high due to the limitations of the expressiveness and abstractions. Everyone has a finite budget.
I’ve written the same kinds of systems I write now in both C and modern C++. The C equivalent versions require several times the code of C++, are less safe, and are more difficult to maintain. I like C and wrote it for a long time, but the demands of modern systems software are beyond what it can efficiently express. Trying to make it work requires cutting a lot of corners in practice. It is still suited to more classically simple systems software, though I really like what Zig is doing in that space.
I used to have a lot of nostalgia for working in C99 but C++ improved so rapidly that around C++17 I kind of lost interest in it.
None of this really supports your claim that "C++ has been faster than C for a long time."
You can argue that C takes more effort to write, but if you write equivalent programs in both (i.e. ones that use comparable data structures and algorithms) they are going to have comparable performance.
In practice, many best-in-class projects are written in C (Lua, LuaJIT, SQLite, LMDB). To be fair, most of these projects inhabit a design space where it's worth spending years or decades refining the implementation, but the combination of performance and code size you can get from these C projects is something that few C++ projects I have seen can match.
For code size in particular, the use of templates makes typical C++ code many times larger than equivalent C. While a careful C++ programmer could avoid this (i.e. by making templated types fall back to type-generic algorithms to save on code size), few programmers actually do this, and in practice you end up with N copies of std::vector, std::map, etc. in your program (even the slow fallback paths that get little benefit from type specialization).
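To be concrete, the fallback technique can look something like this (a hypothetical sketch with illustrative names, not any particular library's API): a single type-erased core compiled once, with only a thin typed wrapper stamped out per T.

    #include <cstddef>
    #include <cstdlib>

    // Shared, type-erased growth routine: one copy for the whole program.
    // Sketch only; restricted to trivially copyable element types.
    inline void* grow_raw(void* data, std::size_t* cap, std::size_t elem_size) {
        std::size_t new_cap = *cap ? *cap * 2 : 8;
        void* p = std::realloc(data, new_cap * elem_size);
        if (p)
            *cap = new_cap;
        return p;
    }

    // Thin typed facade: only this small part is instantiated per T.
    template <class T>
    struct SmallVec {
        T* data = nullptr;
        std::size_t len = 0, cap = 0;
        bool push_back(const T& v) {
            if (len == cap) {
                void* p = grow_raw(data, &cap, sizeof(T));
                if (!p) return false;
                data = static_cast<T*>(p);
            }
            data[len++] = v;
            return true;
        }
        ~SmallVec() { std::free(data); }
    };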
Having written a great deal of C code, I made a discovery about it. The first algorithm and data structure selected for a C program stays there. It survives all the optimizations, refactorings and improvements. But everyone knows that finding a better algorithm and data structure is where the big wins are.
Why doesn't that happen with C code?
C code is not plastic. It is brittle. It does not bend, it breaks.
This is because C is a low level language that lacks higher level constructs and metaprogramming. (Yes, you can metaprogram with the C preprocessor, a technique right out of hell.) The implementation details of the algorithm and data structure are distributed throughout the code, and restructuring that is just too hard. So it doesn't happen.
A simple example:
Change a value to a pointer to a value. Now you have to go through your entire program changing dots to arrows, and sprinkle stars everywhere. Ick.
Or let's change a linked list to an array. Aarrgghh again.
Higher level features, like what C++ and D have, make this sort of thing vastly simpler. (D does it better than C++, as a dot serves both value and pointer uses.) And so algorithms and data structures can be quickly modified and tried out, resulting in faster code. A traversal of an array can be changed to a traversal of a linked list, a hash table, a binary tree, all without changing the traversal code at all.
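For instance, a traversal written once against the iterator interface in C++ survives a container swap untouched (a minimal sketch, names are mine):

    #include <iostream>
    #include <list>
    #include <set>
    #include <vector>

    // Written once against the iterator interface; knows nothing about layout.
    template <class Container>
    long long total(const Container& c) {
        long long sum = 0;
        for (const auto& v : c)
            sum += v;
        return sum;
    }

    int main() {
        std::vector<int> arr{3, 1, 2};  // contiguous array
        std::list<int>   lst{3, 1, 2};  // doubly linked list
        std::set<int>    tree{3, 1, 2}; // balanced binary tree
        // Switching between array, list and tree changes no traversal code.
        std::cout << total(arr) + total(lst) + total(tree) << '\n';
    }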
C and C++ do have very different memory models, C essentially follows the "types are a way to decode memory" model while C++ has an actual object model where accessing memory using the wrong type is UB and objects have actual lifetimes. Not that this would necessarily lead to performance differences.
When people claim C++ to be faster than C, that is usually understood as C++ providing tools that make writing fast code easier than C does, not as the fastest possible implementation in C++ being faster than the fastest possible implementation in C. The latter is trivially false, as in both cases the fastest possible implementation is the same unmaintainable soup of inline assembly.
The typical example used to claim C++ is faster than C is sorting: C, due to its lack of templates and overloading, needs `qsort` to work with void pointers and a pointer to a function, which is very hard on the optimiser, whereas C++'s `std::sort` gets the actual types it works on and can directly inline the comparator, making the optimiser's job easier.
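Concretely, both calls below sort the same data; the difference is what the optimiser sees at each call site (a minimal sketch):

    #include <algorithm>
    #include <cstdlib>

    // qsort sees only void pointers and an opaque function pointer,
    // so every comparison goes through an indirect call.
    static int cmp_int(const void* a, const void* b) {
        int x = *static_cast<const int*>(a);
        int y = *static_cast<const int*>(b);
        return (x > y) - (x < y);
    }

    void sort_both_ways(int* a, int* b, std::size_t n) {
        std::qsort(a, n, sizeof(int), cmp_int);
        // std::sort is instantiated with the concrete element and comparator
        // types, so this comparison is trivially inlined.
        std::sort(b, b + n, [](int x, int y) { return x < y; });
    }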
Try putting objects into two linked lists in C using sys/queue.h and in C++ using the STL. Try sorting the linked lists. You will find C outperforms C++. That is because C’s data structures are intrusive, such that you do not have external nodes pointing to the objects to cause an extra random memory access. The C++ STL requires an externally allocated node that points to the object in at least one of the data structures, since only one container can manage the object lifetimes and thus be able to concatenate its node with the object as part of the allocation. If you wish to avoid having object lifetimes managed by containers, things become even slower, because now both data structures have an extra random memory access for every object. This is not even considering the extra allocations and deallocations needed for the external nodes.
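The layout in question looks roughly like this with <sys/queue.h> (type and field names are illustrative):

    #include <sys/queue.h>

    struct item {
        int key;
        LIST_ENTRY(item) by_key; /* links for list #1, embedded in the object */
        LIST_ENTRY(item) by_age; /* links for list #2: same object, two lists */
    };

    LIST_HEAD(key_list, item);
    LIST_HEAD(age_list, item);

    /* No per-node allocation: inserting links the object itself, and walking
     * either list touches only the objects, never an external node. */
    void link_item(struct key_list *keys, struct age_list *ages, struct item *it)
    {
        LIST_INSERT_HEAD(keys, it, by_key);
        LIST_INSERT_HEAD(ages, it, by_age);
    }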
That said, external comparators are a weakness of generic C library functions. I once manually inlined them in some performance critical code using the C preprocessor:
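Roughly the following shape (a hypothetical sketch of the technique, not the actual patch):

    #include <stddef.h>

    /* Generate a type-specialized search with the preprocessor so the
     * comparison is an inline expression rather than a function-pointer call. */
    #define DEFINE_BSEARCH(name, type, cmp)                             \
        static const type *name(const type *a, size_t n, type key)      \
        {                                                               \
            size_t lo = 0, hi = n;                                      \
            while (lo < hi) {                                           \
                size_t mid = lo + (hi - lo) / 2;                        \
                if (cmp(a[mid], key) < 0)                               \
                    lo = mid + 1;                                       \
                else                                                    \
                    hi = mid;                                           \
            }                                                           \
            return (lo < n && cmp(a[lo], key) == 0) ? &a[lo] : NULL;    \
        }

    #define INT_CMP(x, y) (((x) > (y)) - ((x) < (y)))
    DEFINE_BSEARCH(bsearch_int, int, INT_CMP) /* comparator fully inlinable */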
It seems like your argument is predicated on using the C++ STL. Most people don’t for anything that matters, and it is trivial to write alternative implementations that have none of the weaknesses you describe. You have created a bit of a strawman.
One of the strengths of C++ is that it is well-suited to compile-time codegen of hyper-optimized data structures. In fact, that is one of the features that makes it much better than C for performance engineering work.
Most C++ code I have seen uses the STL. As for “hyper optimized” data structures, you already have those in C. See the B-Tree code whose binary search routine I patched to run faster. Nothing C++ adds improves upon what you can do performance-wise in C.
You have other sources of slowdowns in C++, since the abstractions have a tendency to hide bloat, such as excessive dynamic memory usage, use of exceptions and code just outright compiling inefficiently compared to similar code in C. Too much inlining can also be a problem, since it puts pressure on CPU instruction caches.
C and C++ can be made to generate pretty much the same assembly, sure. I find it much easier to maintain a template function than a macro that expands to a function as you did in the B-Tree code, but reasonable people can disagree on that.
Abstractions can hide bloat for sure, but the lack of abstraction can also push coders towards suboptimal solutions. For example, C code tends to use linked lists just because they are easy to implement when a dynamic array such as std::vector would have been more performant.
Too much inlining can of course be a problem; the optimizer has loads of heuristics to decide if inlining is worth it or not, and the programmer can always mark the function as `[[gnu::noinline]]` if necessary. Just because C++ makes it possible for the sort comparator to be inlined does not mean it will be.
In my experience, exceptions have a slightly positive impact on codegen (compared to code that actually checks error return values, not code that ignores them) because there is no error checking on the happy path at all. The sad path is greatly slowed down though.
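A tiny sketch of what that means for the happy path (illustrative functions, not from any real code base):

    #include <cstdlib>
    #include <stdexcept>

    // Error-code style: every call site pays a branch even when nothing fails.
    static int parse_checked(const char* s, bool* ok) {
        char* end;
        long v = std::strtol(s, &end, 10);
        *ok = (end != s);
        return static_cast<int>(v);
    }

    int sum_checked(const char* a, const char* b) {
        bool ok;
        int x = parse_checked(a, &ok);
        if (!ok) return -1;                 // branch on the happy path
        int y = parse_checked(b, &ok);
        if (!ok) return -1;                 // and another
        return x + y;
    }

    // Exception style: the happy path is straight-line code; the cost moves
    // to the (hopefully rare) unwind on the sad path.
    static int parse_throwing(const char* s) {
        char* end;
        long v = std::strtol(s, &end, 10);
        if (end == s) throw std::invalid_argument("not a number");
        return static_cast<int>(v);
    }

    int sum_throwing(const char* a, const char* b) {
        return parse_throwing(a) + parse_throwing(b);
    }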
Having worked in highly performance sensitive code all of my career (video game engines and trading software), I would miss a lot of my toolbox if I limited myself to plain C and would expect to need much more effort to achieve the same result.
Having worked on performance sensitive code (OpenZFS), I have found less to be more.
While C code makes more heavy use of linked lists than C++ code, most of the C code I have helped maintain made even heavier use of balanced binary search trees and B-trees than linked lists. It also used SLAB allocation to amortize allocation costs. In the case of OpenZFS, most of the code operated in the kernel where external memory fragmentation makes dynamic arrays (and “large” arrays in general) unusable.
I think you have not seen the C libraries available to make C even better. libuutil and libumem from OpenSolaris make doing these things extremely nice. Some of the first code I wrote professionally (and still maintain) was written in C++. There really is nothing from C++ that I miss in C when I have such libraries. In fact, I have long wanted to rewrite that C++ code in C since I find it easier to maintain due to the reduced abstractions.
This is not a convincing argument for C. None of this matches my experience across many companies. In particular, the specific things you cite — excessive dynamic memory usage, exceptions, bloat — are typically only raised by people who don’t actually use C++ in the kinds of serious applications where C++ is the tool of choice. Sure, you could write C++ the way you describe but that is just poor code. You can do that in any language.
For example, exceptions have been explicitly disabled on every C++ code base I’ve ever worked on, whether FAANG or a smaller industrial company. It isn’t compatible with some idiomatic high-performance software architectures so it would be weird to even turn it on. C++ allows you to strip all bloat at compile-time and provides tools to make it easy in a way that C could only dream of, a standard metaprogramming optimization. Excessive dynamic allocation isn’t a thing in real code bases unless you are naive. It is idiomatic for many C++ code bases to never do any dynamic allocation at runtime, never mind “excessive”.
C++ has many weaknesses. You are failing to identify any that a serious C++ practitioner would recognize as valid. In all of this you also failed to make an argument for why anyone should use C. It isn’t like C++ can’t use C code.
This risks becoming a no true Scotsman, but it is indeed true that there is really no common idiomatic C++. Even the same code base can use vastly different styles in different areas.
Even regarding exceptions, I would not touch them anywhere close to the critical path, but, for example, during application setup I have no problem with them. And yet I know of people writing very high performance applications who are happy to throw on the critical path as long as it is a rare occurrence.
> Sure, you could write C++ the way you describe but that is just poor code.
C++ puts people into a sea of complexity and then blames them when they do not get a good result. The purpose of high level programming languages is to make things easier for people, not to make them even more likely to fail to write good code and then blame them when they do not.
> For example, exceptions have been explicitly disabled on every C++ code base I’ve ever worked on, whether FAANG or a smaller industrial company.
Unfortunately, C++ does not make exceptions optional, and even if you use a compiler flag to disable them, libraries can still throw them. Just allocating memory can throw unless you use the nothrow version of the new operator. Alternatively, you could use malloc with a placement new and then manually call the destructor before freeing the memory. As far as I know, many C++ developers who disable exceptions do neither. Then when a memory allocation fails, their program terminates.
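For reference, the two alternatives just mentioned, as a minimal sketch (Widget is an illustrative type):

    #include <cstdlib>
    #include <new>

    struct Widget { int x = 42; };

    Widget* make_nothrow() {
        // Returns nullptr on failure instead of throwing std::bad_alloc.
        return new (std::nothrow) Widget;
    }

    Widget* make_malloc() {
        void* mem = std::malloc(sizeof(Widget));
        if (!mem) return nullptr;
        return new (mem) Widget;   // placement new: construct in place
    }

    void destroy_malloc(Widget* w) {
        if (!w) return;
        w->~Widget();              // manual destructor call, as described above
        std::free(w);
    }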
That said, there are widely used C++ code bases that rely on exception handling. One of the most famous is the Windows NTFS driver, which reportedly makes heavy use of structured exception handling:
> It isn’t compatible with some idiomatic high-performance software architectures so it would be weird to even turn it on. C++ allows you to strip all bloat at compile-time and provides tools to make it easy in a way that C could only dream of, a standard metaprogramming optimization. Excessive dynamic allocation isn’t a thing in real code bases unless you are naive. It is idiomatic for many C++ code bases to never do any dynamic allocation at runtime, never mind “excessive”.
I have seen plenty of C++ software throw exceptions under Wine, since it prints information about them to the console. It is amazing how often exceptions are used in the normal operation of such software. Contrary to your assertion that everyone turns off exceptions, these exceptions are caught and handled. Of course, this goes unseen on the original platform, so the developers likely have no idea about all of the exceptions their code throws.
As for excessive dynamic memory allocations, C++11 introduced move semantics to eliminate a major source of them, and that very much was a problem in real code. It still can be, since you need to always use std::move and define move constructors. C++ itself tends to discourage use of intrusive data structures (as they break encapsulation), which means doing more dynamic allocations than C code does since heavy use of intrusive data structures in C code avoids allocations that are mandatory without them.
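A minimal illustration of the move-semantics point (sketch):

    #include <string>
    #include <utility>
    #include <vector>

    std::vector<std::string> build() {
        std::vector<std::string> out;
        std::string line = "a long line whose buffer lives on the heap..........";
        out.push_back(line);             // copies: allocates a second buffer
        out.push_back(std::move(line));  // moves: steals the existing buffer
        return out;                      // moved (or elided), never deep-copied
    }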
> C++ has many weaknesses. You are failing to identify any that a serious C++ practitioner would recognize as valid. In all of this you also failed to make an argument for why anyone should use C.
My goal had been to say that performance was not a reason to use C++ over C, since C++ is often slower. Had my goal been different, there is plenty of material I could have used. For example, I could have noted the binary compatibility breakage that occurred for C++11 in GCC 5.0, the horrendous error messages from types that are effectively paragraphs, changes in the STL definitions across different versions that break code, and many other things, but I was not trying to throw stones at an obviously glass house.
> It isn’t like C++ can’t use C code.
It increasingly cannot. If C headers use variably modified types and do not provide a guard macro with an alternative for C++ that turns them into regular pointers, C++ cannot use the header. Here is an example of code using them that a C++ compiler cannot compile:
Unfortunately, Stepanov and the STL are widely misunderstood. Stepanov's core contribution is the set of concepts underlying the STL and the iterator model for generic programming. The set of algorithms and data structures in the STL was only supposed to be a beginning; it was never supposed to be a finished collection. Unfortunately many, if not most, treat it that way.
But if you look beyond it, you can find a whole world of libraries that extend the STL. If you are not happy, say, with unordered_map, you can find more or less drop-in replacements that use the same iterator-based interface, preserve value semantics and use a common language to describe iterator and reference invalidation.
Regarding your specific use case, if you want intrusive lists you can use boost.intrusive, which provides containers with STL semantics except that it leaves ownership of the nodes to the user. The containers do not even need to be lists: you can put the same node in linked lists, binary trees (multiple flavors), and hash maps (although those are not fully intrusive) at the same time.
These days I don't generally need boost much, but I still reach for boost.intrusive quite often.
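Usage looks roughly like this (a sketch; the tag and type names are mine): the objects own their links, the containers allocate nothing, and tagged hooks let one object sit in several containers at once.

    #include <boost/intrusive/list.hpp>

    namespace bi = boost::intrusive;

    struct tag_a; struct tag_b;

    // Two hooks, so the same object can live in two lists simultaneously.
    struct Item : bi::list_base_hook<bi::tag<tag_a>>,
                  bi::list_base_hook<bi::tag<tag_b>> {
        int key = 0;
    };

    using ListA = bi::list<Item, bi::base_hook<bi::list_base_hook<bi::tag<tag_a>>>>;
    using ListB = bi::list<Item, bi::base_hook<bi::list_base_hook<bi::tag<tag_b>>>>;

    int main() {
        Item items[3];
        ListA a;
        ListB b;
        for (auto& it : items) {   // inserting allocates nothing
            a.push_back(it);
            b.push_front(it);
        }
        a.sort([](const Item& x, const Item& y) { return x.key < y.key; });
        a.clear();                 // unlink before the objects go away;
        b.clear();                 // lifetimes stay with the caller
    }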
I have met a number of people who will not use the Boost libraries. It has been so long that I have forgotten their reasons. My guess is that it had to do with binary compatibility issues.
Except that nothing forbids me from using two linked lists in C++ via sys/queue.h. That is exactly one of the reasons why Bjarne built C++ on top of C, and unfortunately also a reason why we have security pain points in C++.
Yet the C++ community is continually trying to get people to stay away from anything involving C. That said, newer C headers using _Generic, for example, are not usable from C++.
Because C++ was "TypeScript for C": plenty of room for improvement that WG14 has refused to act on for the last 50 years.
Yes, most language features past the C89 subset are not supported, aside from the C standard library, because C++ has much better alternatives. Why use _Generic when templates are a much saner approach than type dispatching with the preprocessor?
However, that is beside the point: 99% of C89 code, minus a few differences, is valid C++ code, and if the situation so requires, C++ code can be written exactly the same way.
And let's not forget most FOSS projects have never moved beyond C89/C99 anyway, so stuff like _Generic is of limited relevance.
C, unlike C++, does not really force new versions onto you, even if dependencies begin using them. That said, Linux switched to C11. Newer versions of C will gradually be adopted, despite the incompatibilities this causes for C++.
As for WG 14, they incorporated numerous C++isms into C. While you claim that they did not go far enough, I am sure you will find many who would say that they went too far.
> I very much doubt it, when someone decides to make full use of recent ISO C on a C library header file.
I have tested this WRT _Generic, as it was a concern. It turns out that GCC will accept it even in older language modes, which permits compatibility. You might feel that this is wrong, but that is how things are right now.
In my experience, templates usually cause a lot of bloat that slows things down. Sure, in microbenchmarks it always looks good to specialize everything at compile time; whether this is what you want in a larger project is a different question. And then, a C compiler can also specialize a sort routine for your types just fine. It just needs to be able to look into it, i.e. it does not work for qsort from the libc. I agree with your point that C++ comes with fast implementations of algorithms out of the box. In C you need to assemble a toolbox yourself. But once you have done this, I see no downside.
The performance gain comes not from eliminating the function overhead, but enabling conditional move instructions to be used in the comparator, which eliminates a pipeline hazard on each loop iteration. There is some gain from eliminating the function overhead, but it is tiny in comparison to eliminating the pipeline hazard.
That said, C++ has its weaknesses too, particularly in its typical data structures, its excessive use of dynamic memory allocation and its exception handling. I gave an example here:
Nice catch. I had goofed by omitting optimization when checking this from an iPad.
That said, this brings me to my original reason for checking this, which is to say that it did not use a cmov instruction to eliminate unnecessary branching from the loop, so it is probably slower than a binary search that does:
It should be possible to adapt this to benchmark both the inlined bsearch() against an implementation designed to encourage the compiler to emit a conditional move to skip a branch to see which is faster:
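Something along these lines (adapted from the well-known branchless lower-bound pattern; a sketch, not the actual code):

    #include <cstddef>

    // The data-dependent branch becomes a conditional move, so there is no
    // branch to mispredict inside the loop. Assumes n > 0; if every element
    // is less than x, this returns a pointer to the last element.
    const int* branchless_lower_bound(const int* a, std::size_t n, int x) {
        const int* base = a;
        while (n > 1) {
            std::size_t half = n / 2;
            if (base[half - 1] < x)    // compilers typically emit cmov here
                base += half;
            n -= half;
        }
        return base;
    }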
My guess is the cmov version will win. I assume this merits a bug report, although I suspect improving this is a low priority, much like my last report in this area:
> If your C is faster than your C++ then something has gone horribly wrong. C++ has been faster than C for a long time.
In certain cases, sure - inlining potential is far greater in C++ than in C.
For idiomatic C++ code that doesn't do any special inlining, probably not.
IOW, you can rework fairly readable C++ code to be much faster by making an unreadable mess of it. You can do that for any language (C included).
But what we are usually talking about when comparing runtime performance in production code is the idiomatic code, because that's how we wrote it. We didn't write our code to resemble the programs from the language benchmark game.
I doubt that, because C++ encourages heavy use of dynamic memory allocations and data structures with external nodes. C encourages intrusive data structures, which eliminate many of the dynamic memory allocations done in C++. You can do intrusive data structures in C++ too, but it clashes with the object-oriented idea of encapsulation, since an intrusive data structure touches fields of the objects inside it. I have never heard of someone modifying a class definition just to add objects of that class to a linked list, for example, yet that is what is needed if you want to use intrusive data structures.
While I do not doubt some C++ code uses intrusive data structures, I doubt very much of it does. Meanwhile, C code using <sys/queue.h> uses intrusive lists as if they were second nature. C code using <sys/tree.h> from libbsd uses intrusive trees as if they were second nature. There is also the intrusive AVL trees from libuutil on systems that use ZFS and there are plenty of other options for such trees, as they are the default way of doing things in C. In any case, you see these intrusive data structures used all over C code and every time one is used, it is a performance win over the idiomatic C++ way of doing things, since it skips an allocation that C++ would otherwise do.
The use of intrusive data structures also can speed up operations on data structures in ways that are simply not possible with idiomatic C++. If you place the node and key in the same cache line, you can get two memory fetches for the price of one when sorting and searching. You might even see decent performance even if they are not in the same cache line, since the hardware prefetcher can predict the second memory access when the key and node are in the same object, while the extra memory access to access a key in a C++ STL data structure is unpredictable because it goes to an entirely different place in memory.
You could say that if you have the C++ STL allocate the objects, you can avoid this, but you can only do that for one data structure. If you want the object to be in multiple data structures (which is extremely common in C code that I have seen), you are back to inefficient search/traversal. Your object lifetime also becomes tied to that data structure, so you must be certain in advance that you will never want to use it outside of that data structure, or else you must do, at a minimum, another memory allocation and some copies that are completely unnecessary in C.
Exception handling in C++ also can silently kill performance if you have many exceptions thrown and the code handles it without saying a thing. By not having exception handling, C code avoids this pitfall.
Ahh yes, now we are getting somewhere. "C++ is faster because it has all these features, no not those features nobody uses those. The STL, no, you rewrite that"
The poster you are responding to is correct. Modern C++ has established idiomatic code practices that are widely used in industry. Imagining how someone could use legacy language features in the most naive possible way, contrary to industry practice, is not a good faith argument. You can do that with any programming language.
You are arguing against what the language was 30-40 years ago. The language has undergone two pretty fundamental revisions since then.
> If your C is faster than your C++ then something has gone horribly wrong. C++ has been faster than C for a long time. C++ is about as fast as it gets for a systems language.