As a C programmer for 20+ years, I find most "don't use C" articles banal and snobbish (even when parts of them are correct) but this one's really good. Well reasoned, well explained, not condescending at all. Kudos.
I like the fact that textual includes are at the top of the list. What an abomination! Modules have been known and clearly superior since forever, both for correctness and for compile times. Every time I see macro abuse in C, or exploding compile times in C++ (which oddly doesn't even get mentioned in this section) I cry a little for the future of my profession. Optional delimiters, default fall-through, weak typing - check, check, check. Good points, well argued. I've had to fix bugs caused by all of these. I happen to agree on type first, disagree on single return (and it's a shame Zig doesn't seem to be on the radar especially because of this), but these are certainly discussions worth having and the OP provides a good starting point.
I work with a large vendor code base that is ~300k lines of C. There are 515 different #ifdef conditional symbols (thank you, grep). The code base started probably >20 years ago and has had who knows how many engineers working on it. There are 20+ years of standards' evolution in this code, across who knows how many chipsets, the vendor changing hands, etc. And yet, in spite of the maze, the code works very well!
I would love to see a new language, technology, something, take on this sort of problem. C works so well in this sort of problem because of the preprocessor's ability to strip out bits & pieces that aren't necessary at the time.
We need something that will allow us to fall into the pit of success rather than succumb to the easy solution of using #ifdef for new, optional features.
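For readers who haven't lived in this kind of code, a tiny sketch of the pattern being lamented; FEATURE_5GHZ and the init functions are made-up names, not anything from a real driver:

    /* All names here are hypothetical, for illustration only. */
    extern void init_5ghz_radio(void);
    extern void init_2ghz_radio(void);

    void radio_init(void)
    {
    #ifdef FEATURE_5GHZ
        init_5ghz_radio();   /* compiled out entirely on 2.4GHz-only builds */
    #endif
        init_2ghz_radio();   /* always present */
    }

It's seductive because the stripped-out code costs literally nothing in the binary, which is exactly why it proliferates.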
Is anything new really needed here? The problem that the kernel etc. solves with ifdefs and function pointers doesn't seem that different than what one can do with interface inheritance in just about any OO language. With smart inlining and LTO it can even be efficient. The problem is how to do those platform/feature checks at build time instead of run time, but I'd say that's the build system's problem ('make' or equivalent) not the language's. Ifdef there makes a lot more sense, even for languages that use modules.
> I would love to see a new language, technology, something, take on this sort of problem.
I know for a fact that Rust can do this[0], though I couldn't speculate as to whether it would be better or worse than the equivalent done with the CPP (I am fairly convinced it would be harder to fuck up at least), and I would be very surprised if the other modern systemsy languages (D, zig, etc.) didn't offer similar things.
I suppose you mean something besides dead code elimination? Because that looks like it could be solved with proper encapsulation and abstraction mechanisms in the language, which C lacks.
It's a pretty amazing problem--a wifi chipset. The driver runs on Linux and Windows, and also supports 3x different CPU architectures (x86, mips, arm) and multiple bus interfaces (SoC internal, usb, pci). Supports disabling optional features so 2.4Ghz-only chipsets have a smaller memory footprint. Can build with/without different security options. And that's just starting in on the huge number of IEEE 802.11 standards across the last 20 years. Some chips support some of the features, some don't -- but the driver still has to work on all of them.
The Linux kernel is an amazing example of how to do ^^^that and still produce beautiful code. It uses a great number of function pointers to accomplish its encapsulation goals.
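For anyone who hasn't seen the pattern, here's a minimal sketch of the ops-struct style the kernel uses -- the names are hypothetical, not actual kernel APIs:

    #include <stddef.h>
    #include <stdio.h>

    /* The "interface": one set of function pointers per backend. */
    struct net_ops {
        int (*send)(void *dev, const void *buf, size_t len);
    };

    /* One concrete backend (a stand-in for, say, a PCI implementation). */
    static int pci_send(void *dev, const void *buf, size_t len)
    {
        (void)dev; (void)buf;
        printf("pci: sending %zu bytes\n", len);
        return 0;
    }

    static const struct net_ops pci_ops = { .send = pci_send };

    /* Generic code only ever talks to the interface. */
    static int transmit(const struct net_ops *ops, void *dev,
                        const void *buf, size_t len)
    {
        return ops->send(dev, buf, len);
    }

    int main(void)
    {
        char pkt[64] = {0};
        return transmit(&pci_ops, NULL, pkt, sizeof pkt);
    }

Which ops struct gets wired up can be decided at runtime (probe) or at build time (link in only one backend), which is where this intersects with the #ifdef discussion above.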
Dead code elimination would require all code paths to be compilable by the compiler being run.
That need not be the case, and typically isn’t. If you do
#if USE_FOO
#include <foo.h>
#endif
the foo.h header only needs to be present on platforms where USE_FOO is defined, and needs to be compilable only on those platforms (it might use nonstandard compiler extensions, for example containing assembly instructions).
This is something Common Lisp does with its reader macros. There's a global variable FEATURES, which is populated by one's implementation (and so might indicate OS, implementation &c.). When one reads in a file of code, there are two reader macros, #+ and #-, which conditionally include code based on whether or not symbols are present in FEATURES. E.g. prefixing a form with #+sbcl includes it only when :SBCL is in FEATURES, and #-sbcl does the reverse.
Yes, it's a good summary of points, although a few are very much a personal preference ({} delimiters).
What's not discussed is the layer below the language that so often ties us to it anyway: linkage. C also does not require anything that can be called a "runtime", although you usually get some in libc. These factors result in it being easier to write libraries in C that can be imported by FFI into more modern languages than the other way round.
Very good point about linkage and libraries. The fact that very few of the newer languages (except e.g. Rust and Zig) can be used to write universally usable libraries is unfortunate. The fact that some languages (e.g. Go) have trouble even consuming such libraries is downright antisocial.
D has a "Better C" mode which removes the runtime and some of the language features. It kind of feels like C with modules, without macro abuse, without fallthrough, strong typing.
Multiple return has two distinct use cases - to return multiple actual values, or to return an error status along with an actual value. Many languages can handle both cases by making it easy to create and return a list/tuple/whatever, and that satisfies both cases well enough for me. Others provide Optional or Error types which satisfy only the second case but do so in an even more concise way (no packing and unpacking of those lists/tuples/whatever). Zig is in that category, with some additional twists like errdefer and how null is handled. I'm not 100% sure I'd make the choices they did, but I think those sections of the Zig documentation are well worth reading to spur more thoughts about alternatives to C-style single return or exceptions.
Return status and data is a great argument for multiple value returns, but it's simple enough to do in C. You just need an ADT and interface. Who hasn't done OO in C with mixed results?
Yes, you can do that, but it's always going to be far more cumbersome than native multi-value return would be. Even in the simplest case (stack allocation in the caller), you have to define a new struct that has no other purpose, in a place visible to both caller and callee. You also have to refer to the status and value as struct members rather than simple variables. That's already worse than using a return value for the status and a reference parameter for the value.
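To make the comparison concrete, here is a minimal sketch of both styles under discussion; the names are hypothetical:

    #include <stdio.h>

    /* Style 1: a struct whose only purpose is to carry status + value. */
    struct div_result {
        int ok;
        int value;
    };

    static struct div_result checked_div(int a, int b)
    {
        if (b == 0)
            return (struct div_result){ .ok = 0 };
        return (struct div_result){ .ok = 1, .value = a / b };
    }

    /* Style 2: status as the return value, result via a reference parameter. */
    static int checked_div2(int a, int b, int *out)
    {
        if (b == 0)
            return -1;
        *out = a / b;
        return 0;
    }

    int main(void)
    {
        struct div_result r = checked_div(7, 2);
        if (r.ok)
            printf("struct style: %d\n", r.value);

        int q;
        if (checked_div2(7, 2, &q) == 0)
            printf("out-param style: %d\n", q);
        return 0;
    }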
I guess you could say that multi-value return is just syntactic sugar for more reference parameters. That's mostly true at the source level, but at the object level not quite. Using a reference parameter requires an extra stack operation (or register allocation) to push the parameter's address, in addition to whatever was done to allocate the parameter itself. Then the callee has to access it via pointer dereferences. With true multi-value return neither is necessary. The exact details are left as an easy exercise for any reader who knows how function calls work at that level. (BTW yes, interprocedural optimization can make that overhead go away, but a lot of code is written in C specifically to avoid relying on super-smart compilers and runtimes.)
Multi-value return isn't a huge deal, though. I appreciate having it in Python, but in almost thirty years I rarely missed it in C. While I wouldn't necessarily oppose its addition, I think it's more in keeping with C's general spirit of simplicity to leave it out.
Who says the struct has no other value? An ADT is the only place this approach makes any sense and these are not typically comprised of two values/references alone.
An interface devised just to unwind information from a two-member struct is beyond 'cumbersome'.
I tried to love Zig but the lack of destructors is killing me! They're such a useful tool for resource management, and apparently they weren't added in order to make all control flow "explicit" or something.
I've never actually used Zig (yet), but I think their choice is reasonable. It's not an object oriented language. Using destructors for resource management in such languages is great, but I've also seen a lot of C++ code that abuses the object-lifetime machinery. The most common is lock pseudo-objects, which "exist" only so that they can be released via a destructor when they go out of scope. Those aren't real objects. They don't contain any data. They're abstractions, which should be handled with other guard/context language constructs. I'll admit that "defer" is a bit low-level, but it does handle most resource-release use cases in a way that's consistent with the rest of Zig.
What non-object-oriented approach would you suggest instead?
> Using destructors for resource management in such languages is great, but I've also seen a lot of C++ code that abuses the object-lifetime machinery.
That's not abuse, that's the most important pattern in C++ (RAII). You mustn't think of C++ "objects" as Java or C# or Smalltalk objects - they are not, they are deterministic resource managers before everything.
> You mustn't think of C++ "objects" as Java or C# or Smalltalk objects
As long as people persist in calling them objects, and use all of the other object-oriented concepts/terminology such as classes and inheritance, people will expect them to be objects. That's not unreasonable. It's absurd to look down your nose at people who are taking you at your word.
These other uses are hacks. If you want to be able to attach something to a scope that's great, actually it's a wonderful idea, but just be honest about it. Make scopes a first-class concept, give things names that reflect their scope-oriented meaning and usage. Python's "with" is a step in the right direction; even though the implementation uses objects, they're objects that implement a specific interface (not just creation/destruction) and their usage is distinct. That separation allows scope-based semantics to evolve independently of object-lifetime semantics, which are already muddled by things like lambdas, futures, and coroutines. Tying them together might be an important pattern in C++, but it's also a mistake. Not the first, not the last. Making mistakes mandatory has always been the C++ way.
> If you want to be able to attach something to a scope that's great, actually it's a wonderful idea, but just be honest about it.
We don't want to attach resources to scopes, we want to attach them to object lifetimes. That's why defer/with/unwind-protect are not alternatives to RAII. The lifetime of an object I pushed to a vector is not attached to any lexical scope in the program text, it is attached to the dynamic extent during which the object is alive. While a scope guard always destroys its resource at the end of a block, RAII allows the resource lifetime to be shortened, by consuming it inside the block, or prolonged, by moving it somewhere with a dynamic extent that outlives the end of the block.
Here's an example where defer solves nothing: if I ask Zig to shrink an ArrayList of strings, it drops the strings on the ground and leaks the memory, because Zig has no notion of the ArrayList owning its elements. You need to loop over the strings you are about to shrink away and call their destructors, which is literally the hard part, since the actual shrink method just assigns to the length field. The lack of destructors (Zig has no generic notion of a destructor) here impedes generic code, since what you do for strings is different than what you do for ints.
RAII guards are real objects, they contain real data (drop flags), and they make code safer and more generic. If you don't like RAII, show comparable solutions (which scope guards are not), don't just call it a hack and adduce philosophical notions of how OOP should work.
> We don't want to attach resources to scopes, we want to attach them to object lifetimes.
Is that the royal "we"? Because for people who aren't you, it's only true some of the time. Sure, true resource acquisition/release is tied to object lifetimes. That's almost a tautology. But that doesn't work e.g. for lock pseudo-objects, which very much are expected and meant to be associated with a scope. It just happens to work out because the object and scope lifetimes are usually the same, but it's still a semantic muddle and it does break for things like lambdas and coroutines.
> if I ask Zig to shrink an ArrayList of strings
That's a silly and irrelevant example, having more to do with ownership rules (which C++ makes a very unique mess of) than with scopes vs. objects. Any Zig code anywhere that shrinks a list of strings had better handle freeing its (now non-) members. No, defer doesn't cover that case. Yes, destructors would, but this isn't an OO language. The obvious solution (same as in C) is to define a resize_string_list function. Again, what can you suggest in a non-OO language that's better?
> The lack of RAII here impedes generic code
You really don't want to get into a discussion about C++ and generics. Trust me on that. Yes, you need to do different things for strings and ints, but there are many ways besides C++'s unique interpretation of RAII (e.g. type introspection) to handle that.
> If you don't like RAII, show comparable solution
Done. Your turn. If you want to be constructive instead of just doctrinaire, tell us what you'd do without OO to address these situations better than existing solutions.
P.S. Also, what's with all the nonsense-word accounts in this thread taking offense at things said to jcelerier and responding with exactly the same points in exactly the same tone?
They are not comparable to RAII, see my comment above. Even the fact it has to bifurcate into defer and errdefer suggests that it lacks the generality to replace a totalizing resource management solution.
No, they're not the same as your beloved RAII, but this isn't an object-oriented language so it doesn't have destructors. For the third time, what better solution do you propose for a non-OO language?
This seems equivalent to scope(exit)/scope(success)/scope(failure) in D. The drawback of this construct is that you need to repeat 2 or 3 lines of code each time you need safe cleanup or a commit/abort type of construct. This can become pretty repetitive pretty fast.
Python's indentation for blocks isn't holding up to the test of time.
After suffering badly formatted code in my early career, I found Python's approach refreshing. But after suffering one too many bad merges where indentation was left mangled, I've concluded we need braces. With braces you can automatically reformat everything instantly without even thinking about it, eliminating merge ambiguity. gofmt is the better solution because it declares a strict representation that can be automatically enforced.
You could retort that better rebase/merge practices could alleviate some of these issues but if that discipline could be enforced on arbitrary groups of humans, we wouldn't have cheered Python's forced indentation in the first place.
I'm not an expert on Python by any means, but if I understand correctly, the end of the indent level is the only way you know that the block ended. That seems to me to make it impossible to write semantics-aware merging (presuming I understood what you meant by the term).
What I thought you meant is that you need something that will straighten out the mangled whitespace. I didn't think that was possible, because the only thing that marks where blocks begin and end is whitespace. If the whitespace gets mangled, there's no other syntax you can use to straighten it out.
But I'm beginning to think you meant that you want a merge tool that doesn't mangle the whitespace in the first place. If the pre-merge whitespace is unmangled, and the merge tool understands Python whitespace, then at a minimum it should be able to flag ambiguous changes and ask for help.
Right, a merge tool that understands Python semantics at some level should be able to detect when a block's indentation has changed, and adjust the indentation of any changes to that block correspondingly.
> Python's indentation for blocks isn't holding up to the test of time.
Citation needed. Python's popularity seems to only be growing, and in all kinds of industries. It's hard to find an area Python hasn't touched yet.
Also, as you mentioned, it's pretty standard to have some sort of flake8 as part of your CI, and it would certainly detect problems with bad indentation that would cause code issues.
If the basic flake8 tests pass, then you should certainly have some unit / integration tests as part of your merge request...
Nowadays there are even tools like black [https://github.com/ambv/black] that do auto-formatting, just like gofmt.
A quick tour through history will show that minor language defects have never been a showstopper for adoption. Python is popular in spite of its minor irritations (and all languages have minor irritations in some form).
Further, all the human conditions that resulted in bad code style before are still present and showing through in Python codebases. Namely organizations with loose standards or inadequate tooling where language problems get magnified. This is remarkably evident in companies that used to be 100% C/C++ or Java shops and never quite figured out unit tests.
Also indentation-based syntax makes it (almost?) impossible to have an expression based language and IMO delimiters are definitely a price worth paying for that.
There are a few more faults in C that I would call out that make it actually a pretty bad low-level language:
* No bitcast operator. Your options are a) use the union trick; b) use the memcpy trick (sketched just after this list); or c) take an address, cast it to a different pointer type, and hope that your compiler gives you a pass on technically violating the standard here. Even C++ didn't get bitcast until C++20.
* No SIMD vector types. Of course, SIMD vectors are even more type-punning heavy than integers and floats, so you do need a good bitcast operator to get anywhere.
* Volatile and atomic are type qualifiers. These ought to be properties of the memory accesses; making them qualifiers on types obfuscates which memory accesses they apply to. If you look at the Linux kernel, it doesn't use volatile but instead uses a READ_ONCE macro that acts much like a volatile load.
* Bitfields are a mistake. They combine especially poorly with the vague properties of volatile. How many memory accesses are required in this program:
    struct {
        volatile unsigned a : 3;
        volatile unsigned b : 2;
    } foo;

    foo.a = 3;
    foo.b = 0;
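Going back to the first bullet, a minimal sketch of the "memcpy trick" for bitcasting, assuming a 32-bit float on the target:

    #include <stdint.h>
    #include <string.h>

    /* Reinterpret the bits of a float as a uint32_t without violating
       strict aliasing. Compilers typically reduce this to a register move. */
    static uint32_t float_bits(float f)
    {
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        return u;
    }

It works, but it's a workaround rather than an operator, which is the bullet's point.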
Point taken that newbies are going to be confused that "odd / even == integer", but they'd be equally confused about why odd / even yields something close to what they expected but not quite (floating point errors).
In both cases, they're probably going to be confused about why an equality test a few lines later fails every once in a while.
The concept of divide vs. div & mod is one that a programmer eventually needs to understand. More importantly, not everything should be optimized for newbies. The context driven / operator is appropriate in programming languages designed for experienced programmers.
Upon further thought... Couldn't the argument be rephrased as "integers are broken numbers"? It sounds silly, but from a newbie's perspective the same problem exists with "int f = 38.2 * 25;".
The problem isn't necessarily that / does integer division: it's that it ONLY SOMETIMES does integer division. It should either always do proper division, or always do integer division, but not try and guess what the programmer wants.
An example from C#: when I was a Unity developer, I frequently needed to figure out what the aspect ratio of the screen is. You'd think this would work:
float aspect = Screen.width / Screen.height;
since 99% of all parameters in Unity are 32 bit floats (common in gamedev). But no! Screen.width and Screen.height happen to be integers, so this particular line silently returns that the screen is actually square. I've literally run into this exact bug with Screen.width/height three or four different times. Every time I feel like an idiot, even though it's not my fault that C# inherited C's dumb division operator.
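The same trap exists in plain C; a minimal sketch with made-up numbers:

    #include <stdio.h>

    int main(void)
    {
        int width = 1920, height = 1080;

        float wrong = width / height;                /* integer division happens first: 1.0 */
        float right = (float)width / (float)height;  /* ~1.7778 */

        printf("wrong = %f, right = %f\n", wrong, right);
        return 0;
    }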
Python does it correctly, and C-like languages should as well. Division used with two integers should always return a floating point number, and there should be a separate operator for integer division. It makes no sense the way it's done now.
As for the confusion between the assignment operator and the equality operator, I took somebody's advice a long time ago and always put the literal on the left of the equality operator.
3 = x
Doesn't solve everything, but it does help: the compiler rejects an accidental assignment like the one above, whereas "x = 3" in a condition compiles (at best with a warning).
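In C terms (this is a fragment, not a complete program):

    int x = 5;

    if (x = 3) { }       /* accidental assignment: compiles, condition is always true */
    /* if (3 = x) { } */ /* with the literal on the left, the same typo is a compile error */
    if (3 == x) { }      /* the intended comparison */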
The real problem is using '=' as the assignment operator. I think this was a serious design flaw. Of course some languages use ':=' which is better. I prefer just ':'. I see many languages that use '=' in some contexts and ':' in other contexts. Members/properties/fields quite often get assigned using ':'. I say make it universal and reserve '=' for equality.
1. "Single return and out parameters" should have a special mention for Haskell, since Haskell doesn't even have multiple input parameters!
2. Python has assignment expressions now.
Overall, it's a pretty good list of shortcomings of C, but I disagree with several of the points: (a) special-casing subtraction lexing to be whitespace sensitive is silly, and (b) integer division is essential whenever working with arrays or modular arithmetic, and converting types explicitly, like Rust mandates, is definitely the way to go. Who knows if I'd want a float, double, rational or currency type to be the output, anyway?
Numbers are hard. They can be made to look easy, but that just sweeps the corner cases under the rug.
Probably what people want is a Number type providing integer bignums for those situations when you don't want to care, a set of "machine integer" types with controllable overflow handling, IEEE floating point, a Rational/Fraction so you can handle 1/3 correctly, a Money type with controllable rounding, and COBOL-style "picture" types.
Oh, and complex numbers, and matrix types of all of the above.
There's only one nitpick I could find in the text: in Rust, semicolons are not delimiters; instead they distinguish statements from expressions. It's clearest when returning. Rust blocks evaluate to their last expression, so `3 + 5`, which evaluates to `8: i32`, is different from `3 + 5;`, which evaluates to `(): ()`.
I'm not sure if that makes things better, but it could be worth a special mention.
Julia doesn't use ~ for logical not; it's used for bitwise not (it does work as logical not, but only because Bool is a subtype of Integer; I've never seen it explicitly recommended).
"In Lua, point would be the x value, and y would be silently discarded. I don’t tend to be a fan of silently throwing data away, but I have to admit that Lua makes pretty good use of this in several places for “optional” return values that the caller can completely ignore if desired."
No. Just no.
This is several orders of magnitude worse than most of the rest of the list. (Which is mostly personal opinions.)
I found this essay to be like a lot of other essays comparing languages, though the comparisons were broader. The author studiously avoided any mention of C++, or how it differs from C and other languages. (Or is C++ supposed to have the same answers as C? If so, FAIL.)
Almost all the attention went to superficial details that really don't make much difference. Your fingers and your eyes learn the gestures and signposts, and they fade into the background. (But modules and namespaces matter.)
What is left is whether you can express what you need to, at all. Whether, having expressed it, you can pack it up into a library that anybody can use without knowing all about how it was put there. Whether, finding a library, you can actually use it in your runtime environment without it costing more overhead than if you wrote your own, or demanding runtime concessions different from what you have committed to for your code or for other libraries you want to use. Obligate GC is death for interoperability.
The only languages that excel there are C++, Rust, and D. (I include Rust because it is well on its way, and will get there before long if its users can wean themselves off of ARC boxing.) None of the other languages are really even trying. It's tragic. Haskell and the MLs could be good at libraries, were it not for their obligate-GC problem. The other languages with big library ecosystems are slow, so overhead isn't noticed.
There has been more than enough time to come up with something to unseat C++. Part of the problem is that the main incubator for new languages has been academia, and academics won't even discuss a language that is not obligate-GC. We need a language that will be equally good at copying register values to and from ALUs and memory buses, driving vector pipelines, orchestrating legions of GPU cores, and wiring up FPGA subunits. (I have not seen an FPGA compatible with GC.) If we end up programming our FPGAs in C++, it will be the fault of everyone who failed to unseat it by making a better language than it.
I think that assignment by destructuring gets a bad rap here. It's a great convenience and an elegant way to accept multiple return values. Or rather, to maintain a coherent type system where there is only one return value but it can be of a compound type.
To me a big mistake in C is its multidimensional arrays. For example, it is not possible to write a function which multiplies two rectangular matrices, since the sizes cannot vary in that manner (m×n and n×k, with m, n, and k variable).
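For context, the classic workaround is to pass flat storage plus explicit dimensions and index by hand (C99 variably modified parameter types address some of this, but plenty of code can't or doesn't use them). A minimal sketch:

    /* Multiplies an m x n matrix A by an n x k matrix B into an m x k matrix C,
       all stored row-major in flat arrays. */
    static void matmul(const double *a, const double *b, double *c,
                       int m, int n, int k)
    {
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < k; j++) {
                double sum = 0.0;
                for (int p = 0; p < n; p++)
                    sum += a[i * n + p] * b[p * k + j];
                c[i * k + j] = sum;
            }
        }
    }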
On the other hand, C has so many goodies which ought to be done right and better in modern languages, but often are not:
1. Variadic functions like printf. It sucks to wrap arguments into a list just for this.
2. Setjmp/longjmp and nonlocal returns
3. Union data types
4. Conditional macro directives to compile debug statement versions when needed.
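A small sketch combining points 1 and 4 above -- debug_log and DEBUG are made-up names:

    #include <stdarg.h>
    #include <stdio.h>

    /* Point 1: a variadic function using <stdarg.h>, forwarding to vfprintf.
       Point 4: the body compiles to nothing unless DEBUG is defined. */
    static void debug_log(const char *fmt, ...)
    {
    #ifdef DEBUG
        va_list ap;
        va_start(ap, fmt);
        vfprintf(stderr, fmt, ap);
        va_end(ap);
    #else
        (void)fmt;
    #endif
    }

    int main(void)
    {
        debug_log("starting up, %d widgets\n", 42);
        return 0;
    }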
It's easy to criticise C or patronise it by saying that it was good for its time, but the reality is that many of its features (or what they attempt) are futuristic even today.
I don't think variadic functions are uncommon. Try Idris - not only does it have them, but you can actually statically typecheck the arguments against the format string - all in regular code, with no special help from the compiler!
Lisps do NOT do textual inclusion. Lisp systems are created from code, not text.
I guess that the author only saw the use of `use-package' or the `:use' option of `defpackage', but this is not necessary (and not generally used) to refer to other namespaces.
The actual use of `defpackage' is often quite close to how Clojure does it.
Lisps in fact do textual inclusion. In ANSI Common Lisp, when you (load "foo.lisp"), and foo.lisp isn't a compiled file, it gets read and processed as text.
The symbols in the compiled file get read in the current package. The *package* variable is dynamically rebound, over the lifetime of the load, to its existing value, so that if the file happens to change it, that effect is undone when the load finishes.
The loaded file source can arrange for its bulk to be read in its own namespace, or it can be processed in the parent namespace, which is the best of both worlds.
When a file is compiled, then it's no longer textual inclusion: best of both worlds again.
This is all so reasonably designed that I copied the salient aspects of things like load and compile-file and all that jazz nearly as-is into TXR Lisp, which isn't an ANSI CL implementation and is free to do things differently.
On Optional block delimiters, the author recommends programmers in C like languages ALWAYS use braces.
I agree! I've stuck hard and fast to this rule since... I was programming Qbasic as a kid. Back then it was for different reasons, but the practice stuck with me as I learned new languages.
Integer division: I think the best thing to do is to produce an exact rational, like Clojure that was mentioned, and most Lisps, where it got that from.
I don't understand the modulo issue with negative numbers, since you can always just use an unsigned number or write a representation layer without much hassle.
Though maybe I'm too deep down the rabbithole already and simply got used to it.
I like this list, although I want to nitpick the use of the word "monad", since it seems to (a) put people off these ideas, either due to them seeming scary or pompous/ivory-tower and (b) confuse those who are otherwise receptive to the idea.
Rather than call this "monadic error handling", I'd just say that results are wrapped up so that errors can be distinguished from successful results. Usually that's done by wrapping in a list (or, if the language supports it, an "Optional"/"Maybe" type, which is just a list truncated to 1 element).
Adding this extra structure lets us distinguish things like "the query died" (an empty list) from "there was no match" (a list containing an empty list). If we'd used NULL to indicate failure, we wouldn't be able to distinguish between these situations (or indeed if there was a match, whose value happened to be NULL!).
Naively we might think this requires a lot of length-checking and unwrapping, but we can avoid that by using list operations that are (hopefully) familiar to every programmer, like "map", "concatenate" and "singleton".
It turns out that those operations form a monad, but it seems overly dramatic and confusing to name the approach using that terminology. Sure it's nice that we can abstract out this interface, but we don't need that much abstraction when our whole ecosystem is using a single, specific implementation like "Optional".
Incidentally, there's a really nice paper on this called "How to replace failure by a list of successes" ( https://rkrishnan.org/files/wadler-1985.pdf ), which shows how normal, non-truncated lists actually implement backtracking search (assuming our lists are lazily generated, e.g. like in Haskell or using an iterator).
Note that being "monadic" specifically means we're able to 'collapse' these lists, i.e. concatenate a list-of-lists into a list ("singleton" comes from a weaker notion called 'applicative', "map" comes from an weaker notion called 'functor'). Collapsing lists removes the distinctions that we introduced, since "concat([[]])" and "concat([])" are both "[]", making our result act more like NULL. So calling this approach "monadic" is actually emphasising the wrong part!
"Do you struggle to track down the source of NULL values in your code? With monadic error handling you can struggle to track down 'Nothing' values instead!"
The real improvement is from the non-monadic interface, like "map", which preserves these distinctions.
When I was closer to my teenage years, I tried what I considered a sizable variety of programming languages and over time gravitated more towards C and Lua and less towards everything else, the latter which is specially mentioned a few times in this article.
I found that they both shared a philosophical simplicity (even if it only seemed that way with C, considering how much complexity you later learn about) and over a decade later I've not found the same philosophy in any other programming languages.
They all tend to be written with the goal of adding features that are supposed to make the programmer's life easier—and here's the distinction—rather than designing a language that is powerful but simple.
This of course has shortcomings of its own, but the trade-offs are ones that I typically seek.
I'd love to be educated on similarly easy languages.
I'm not sure what the lingua franca of the future for software developers should look like, but I'd posit that it probably should be slightly more complicated than C or Lua in terms of looks. At least in terms of optional standard libraries provided for things like cache levels and GPU support.
Perhaps that's outside the scope of what a programming language should provide to users, though? I'm not sure. It seems like we sit on a lot of complexity and don't use it as efficiently as we could, though. Maybe the underlying virtual machines or compilers should be doing some of these things for us, as they currently do, but with greater reach.
The complaint about null as a bottom type, and the claim that Haskell fixes it, isn't quite right.
Haskell has an explicit value ⊥ that is part of every type. This is very close to Null with regards to the complaints of OP.
The difference is that ⊥ can't exist in a Haskell program. The moment you get it your program will either crash or loop forever.
(The reason it is useful has to do with lazy evaluation)
> #include is not a great basis for a module system
And that is perfectly OK! Any "module system" sucks big time, creates more problems than it solves, and should be avoided whatever the cost. Textual includes are great, but of course they should not be used for silly module systems.
One problem with module systems (in compiled languages) is that they form dependency chains.
Imagine you change code in an upstream module. Now the compiler has to recompile all downstream modules. In C and C++ this only happens if you change the header file.
(On the other hand modern development techniques emphasize tests so you might only recompile your module and the testsuite until all tests pass and only then recompile all modules, minimizing the impact.)
For a full recompilation you can parallelize C and C++ compilations much better than any module system I know.
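Concretely, the arrangement being described -- file names made up for illustration:

    /* foo.h -- the interface downstream code depends on */
    int foo_compute(int x);

    /* foo.c -- changing this recompiles only foo.o */
    #include "foo.h"
    int foo_compute(int x) { return x * 2 + 1; }

    /* bar.c -- needs recompiling only when foo.h (the interface) changes */
    #include "foo.h"
    int bar_use(void) { return foo_compute(20); }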
The C language is not designed for building huge programs by accretion of modules. The idea is that you build many independent programs, and then you glue them together using scripts.
> The idea is that you build many independent programs, and then you glue them together using scripts.
That doesn't exactly describe kernels or embedded systems, which are C's strongest bastions. Whether it's well designed for the purpose or not, whether modules are appropriate in that context or not, a significant majority of the C code out there (including most common web servers, databases, etc.) does not fit your description at all.
Building small programs and gluing together with scripts is great, but hardly relevant. You don't need includes or modules for that. What if you do need to build one large program, like those aforementioned web servers or databases, or (what I work on) a storage server? That's where the difference between textual inclusion and modules really comes into play, and modules are strictly better than includes in every way.
I think the problem here is that you're confusing modules with things built on top of modules - specifically package managers. A lot of the package managers out there are horrible and create more problems than they solve, but that has almost nothing to do with modules as a language construct.
>The C language is not designed for building huge programs by accretion of modules.
Can you give a single concrete example of a problem, because the above are all noops semantically speaking...
>The idea is that you build many independent programs, and then you glue them together using scripts.
This is a non-starter for most use cases outside pipeable shell commands (which are not the only kind of programs people want to write).
People need, and write, and have written for decades, large programs in C, and programs in C which have from 10s to 100s of header files included (including recursively from included libs).
Like the etymology of a word doesn't necessarily convey its meaning, the "original intent" of something is often meaningless as to its actual practical use.
The idea you mention is valid, and is part of the Unix philosophy.
But it was never the idea that C should be used JUST for that.
In fact the first use of C was to write a whole operating system.
Well, the systems I'm developing now consist of small interlocking C++ processes which use UNIX IPC to communicate across address spaces. It's not what most people are used to, but it works very well in this case. They aren't shell commands like you're thinking of, but they are discrete programs.
Oh, I don't know. Gluing the programs together using RPC and REST seems to be working so well for everyone else. Why do you even need a shell or unix style pipelines, right?
You use the MMU to create virtual address spaces and the kernel to provide a message passing interface. The MMU may cause you to incur some latency but in exchange you get some pretty good isolation between components of your system. Even more so if your kernel is proven using formal methods.
There's no need to get snarky. I've used similar parts before, for example TI's Hercules RM4 does not have an MMU either and so what I said wouldn't apply there. But that doesn't mean you would never have an MMU.
Indeed, my apologies. I actually quite like the idea. It reminds me a lot of Erlang, where you set up a bunch of small processes and if any of them die, you can a) try to recover quickly, or b) run in reduced functionality mode. I just haven't been fortunate enough to work with embedded processors where it'd be a good fit; they've either been on the small side (Cortex M, MSP430, AVR) or on the big side (iMX6, AM335x) and just run full-blown Linux.
It's definitely a bit long, but well worth it.