There was a thread posted on the Rust forum a while back that laid out the goals of this project [0]:
> A friend of mine (Luke) has been talking about the need for a Rust frontend for GCC to allow Rust to replace C in more places, such as system software. To allow some types of safety-critical software to be written in Rust, the GCC frontend would need to be an independent implementation of Rust, since the relevant specs require multiple independent implementations (so just attaching GCC as a new backend for rustc wouldn't work).
Luke:
> The goal is for the GNU Compiler Collection to have a peer level front end to the gfortran frontend, gcc frontend, g++ frontend and all other frontends.
> The goal is definitely not to have a compiler written in rust that compiles rust code [edit: unless there is an acceptable bootstrap process, and the final compiler produces GNU assembly output that compiles with GNU gas]
> The goal is definitely not to have a hard critical dependence on LLVM.
> The goal is to have code added, written in c, to the GNU Compiler Collection, which may be found here https://gcc.gnu.org such that developers who are used to gcc may compile rust programs and link them against object files using binutils ld.
> The primary reason why I raised this topic is because of an experiment permitting rust modules to be added to the linux kernel: https://lwn.net/Articles/797828
> What that effectively means if that takes off is that the GNU Compiler Collection, which would be incapable of compiling that code, would be relegated to a second class citizen for the purposes of compiling the largest software project on the planet: the linux kernel.
> Thus it is absolutely critical that GCC be capable not just of having rust capability but of having up to date rust capability.
It would be a grave error to have coded the Rust frontend to Gcc in C (or, as written above, "in c"). Gcc is now a C++ codebase, and new components should be coded in modern C++, for better productivity, performance, safety, and maintainability.
(You might prefer not to consider modern C++ more productive, performant, safe, and maintainable than C (and reflexively downvote), but the statement remains true: Gcc did transition to C++, for reasons, and all the improvements that have kept Gcc competitive with Clang were made since. Clang and LLVM are also coded in C++, also for reasons.)
I disagree with the idea of using C++, even with some of its more modern features. If I were choosing a compiled language for system development right now, I would choose C or Rust.
For starters, C++ does not have a stable ABI. Rust plans to have a stable ABI. Right now both fall back to the C ABI, but at least stability is a goal of the Rust devs.
C++ is going to be harder to bootstrap than either Rust's core or C when trying to build a compiler for a new system/platform. C is probably the easiest bootstrap of the three.
Rust and C are generally more performant than C++ code. Also, in the realm of low-end microcontrollers, C code tends to have the smallest binary image size.
In C++, if you need to interact with other languages, a lot of the time you're stuck with the C ABI, which forces you to avoid many C++ features/constructs. As for Rust, it's not an object-oriented language, and its unsafe blocks generally make it easier to at least keep Rust code idiomatic.
C++ is also a massive kitchen sink of a language with many ways to hide gotchas. Probably the worst thing in C is its null-terminated strings, but C++ suffers from that too. C is simple, and Rust has so far made good design choices for the most part.
There are also choices in the C++ language, some even part of the standard library, that seemed like a good idea at the time but in hindsight are not that great an idea. For example, operator overloading.
Although we are talking about a GCC front end, so I would be surprised to see both C and C++.
Where did you ever get the idea that C and Rust are more performant than C++? Modern C++ produces the fastest code of any language in wide use, empirically, and I don't think many people would dispute that. People don't use C++ because they love its elegance (who would?); they use it for its raw power and expressiveness.
C has not been used for applications where performance is a priority for many years now. C lacks the expressiveness to make many software optimizations practical. I've written database engines in both C and modern C++, and it is no contest: C++ is much more concise while producing faster code. People primarily still use C for portability.
> If I was choosing a compiled language for system development right now I would choose C or Rust.
I'm not quite sure what systems development has to do with compiler writing? (I assume by system development, you mean something like writing OS kernels etc?)
E.g. there are good reasons to stay away from automatic memory management when you are writing an OS; you want control over that. But those reasons hardly apply when you run a simple user-space program like a compiler.
System programs can run in user space, for instance a runtime for another language.
As for a compiler, ideally it's easy to port, since it's kind of the bedrock for getting any other system going. Although for some architectures or use cases a compiler is too large to be self-hosted, or self-hosting doesn't make sense, so it's not always a deal-breaker for a compiler.
You seem to be confused about what C++ and C are, and even about what Gcc is and its relationship to programs it compiles.
C++ is a language defined by ISO committee SC22/WG21 via the 14882 series of Standards, most recently C++20 (superseding C++17).
C is defined by SC22/WG14, in the same way.
The most widely used implementations of these Standards -- Gcc, Clang, and MSVC -- are in fact the same project in each case (although MS until recently implemented a long-superseded C Standard), and are themselves C++ programs.
Portability of compilers is not needed, in general, to "get a system going", because cross-compilation is a mature and long-supported technique. Gcc is very frequently ported anyway, for practical reasons, via cross-compilation. Being coded in C++, and capable of compiling and cross-compiling itself, Gcc remains quite portable, as its frequent ports demonstrate.
It is hard to imagine what could be considered a "deal-breaker" in this context. By the evidence, the portability of C++ is never a deal-breaker as an implementation language for a compiler: the compilers actually used in "get[ting] any other system going" are, with only rare exceptions, themselves coded in C++.
(The exceptions are not about portability but about the memory footprint of the resulting compiler, which is a product not of its source language but of the power of the compiler's optimizer, which is not always seen as necessary.)
In any case, the implementation language used for the compiler has absolutely no effect on the stability of an ABI for any target language. ABI stability is a choice made by language designers to favor backward link-compatibility over convenient expression of new features or bug fixes. It is hard for me to imagine the level of confusion that would produce this error.
> The most widely used implementations of these Standards -- Gcc, Clang, and MSVC -- are in fact the same project in each case (although MS until recently implemented a long-superseded C Standard), and are themselves C++ programs.
Yes. Though as far as suitability for 'systems programming' is concerned they might as well be Python programs or bash scripts, as long as they produce suitable output in an adequate amount of time.
Is a great deal of actual Rust behavior not fairly intimately tied to LLVM? As far as I know it leaks various LLVM details, and much of the documentation describes various functions as being little more than thin wrappers around LLVM internals.
IMHO, "great deal" is overemphasizing it. There are some things, yes, but they are mostly smaller, more niche details that are even probably things we'd choose ourselves in many cases.
That being said, there's also cases where Rust does not have LLVM semantics, and that can cause bugs. Some famous examples being the loop optimization miscompilation, and &mut T currently not being marked noalias.
> Some famous examples being the loop optimization miscompilation, and &mut T currently not being marked noalias.
I don't think that's a case where Rust "doesn't have LLVM semantics", since the miscompilation was reproduced in standard C. Rather, it's that Rust actively and ubiquitously leverages otherwise rarely-exercised LLVM features, revealing a bunch of bugs (either leftovers or breakages) in them.
The latter, you're right, I did make a mistake here. I was thinking of it as "Rust doesn't just do whatever LLVM does, see, we have semantics and LLVM compiles them wrong, so this is an example of that" but I forgot that actually, we do specify "this follows what LLVM does" currently, and the miscompilation is just a plain bug. Extra embarrassing because IIRC I was the one who made the PR saying that.
The former though is an area of significant divergence between Rust and C++ semantics, and LLVM directly following C++ semantics.
FWIW, `noalias`, aka `__restrict__`, generally works fine, but the inliner translates it into `noalias` metadata, which is still ambiguous and also not handled properly in some cases. Loop unrolling is one of them, but that will be fixed with https://reviews.llvm.org/D92887; the more general problem is the ambiguity itself.
There has been a major effort underway for a while now to revamp the `noalias`/`restrict` handling. It is taking quite long because it is hard and complex and we want to get it right.
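For readers wondering what's at stake: Rust's exclusivity rule for `&mut T` promises exactly what C's `restrict` does, so marking it `noalias` would let LLVM avoid reloads like this (a minimal illustrative sketch, not code from the thread):

```rust
// `a` and `b` are guaranteed by Rust's rules never to alias. With the
// `noalias` attribute, LLVM may keep `*a` in a register across the store
// to `*b`; without it, it must conservatively reload `*a`.
fn bump_both(a: &mut i32, b: &mut i32) -> i32 {
    *a += 1;
    *b += 1;
    *a + *b
}
```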
TL;DR: C++ says that an infinite loop with no side effects is UB, Rust does not. Empty loops in Rust will disappear entirely when they should really loop forever.
(I edited my comment slightly because, re-reading, "blindly" sounds too negative.)
Great point! If anyone is curious about how to write an “empty” loop that is not compiled away, I've seen this common implementation that doesn’t even need the standard library:
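Something along these lines (a reconstruction, since the snippet isn't shown above; a volatile read counts as a side effect the optimizer must preserve, and `core::ptr::read_volatile` needs no std):

```rust
// A loop the optimizer may not delete: volatile accesses are always
// treated as observable side effects.
fn spin_forever() -> ! {
    let x = 0u8;
    loop {
        let _ = unsafe { core::ptr::read_volatile(&x) };
    }
}
```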
Since https://reviews.llvm.org/D86841, ~ Aug 2020, Clang is able to reason about the forward progress guarantees of the input. For the LLVM-IR handling see https://reviews.llvm.org/D86233. While we are still adding more fine-grained support of different language standards and corner cases, e.g., constant loop condition in C vs. C++, you should be able to tell LLVM a loop is allowed to loop forever w/o side-effect. In fact, that is the new default. That said, LLVM still removes calls to functions without side effect unconditionally. We are working on that, i.a., https://reviews.llvm.org/D94106.
Generally, yes. Though, as always, these things are hard to summarize in one sentence and depend on the standard version (pre/post C++11). This post might shed some light on the options we are considering for Clang right now: https://reviews.llvm.org/D94367#2489090
Just what is gained by failing to loop forever? It seems like a really low-value optimization. Nearly always, the optimization would violate the programmer's intent or it would change an ordinary bug into a confusing bug. An infinite loop is not very many bytes on any processor.
The standard may allow the generation of insane code, but a decent-quality compiler does not do so. The standard ought to be fixed.
Infinite loops were made UB because guaranteeing their behavior prevents some important code-motion optimizations and makes the memory model hard to reason about. You can find the details in the papers leading to the C++11 memory model.
> The infinite loop being UB was added because it prevents some important code motion optimizations
Which ones?
If these optimizations are so important, how come Rust was designed in a way that makes them impossible? Also, how does this square with, e.g., the benchmark game results, which show that Rust is faster than C for all benchmarks considered there?
Store sinking for example, which is unsafe unless the loop is guaranteed to terminate.
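To make that concrete (an illustrative sketch, not an example from the thread): dead-store elimination across a loop is one form of store sinking that needs the termination guarantee:

```rust
// The compiler would like to delete store (a): store (b) overwrites it
// and nothing in between reads `flag`. But if the loop never terminates,
// (b) never executes, and deleting (a) is observable (e.g. by another
// thread). C++'s forward-progress rule makes such a loop UB, licensing
// the optimization; Rust gives no such license, so LLVM must prove
// termination first.
unsafe fn overwrite(flag: *mut i32, spin: *const bool) {
    *flag = 1;      // (a)
    while *spin {}  // never touches `flag`; may loop forever
    *flag = 2;      // (b)
}
```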
C does not have this guarantee (at least not in all cases). Also, Rust is compiled with the LLVM backend, so my understanding is that in practice Rust assumes that loops terminate. See:
There are LLVM directives that could be added to prevent the optimization, but they are rejected by the Rust maintainers exactly because they would cause performance regressions.
Letting the compiler assume that a "switch" without a default will never go there is a great optimization too. Why not put that in the standard?
Actually, this is worse. Letting the loop be UB is like letting a true "if" be UB. Just never mind the code actually written; surely it doesn't mean what it says.
> Letting the compiler assume that a "switch" without a default will never go there is a great optimization too. Why not put that in the standard?
That's actually the case already. A missing default is UB if the switch condition does not match any of the cases. And C compilers already optimize accordingly.
I'd find that very surprising and interesting - I can't imagine which semantics could leak from LLVM into a language specification. Can you give some examples?
I'm not sure to what capacity GCC has a similar instruction, but I've heard that LLVM's use of signed integers here is apparently nonstandard and is what led to Rust's decision to limit vectors to a certain size: https://doc.rust-lang.org/nomicon/vec-alloc.html
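For reference, the limit surfaces in `std::alloc::Layout`, which rejects sizes that overflow `isize` so that pointer offsets stay within LLVM's signed-index assumptions (a small illustration):

```rust
use std::alloc::Layout;

fn main() {
    // On a 64-bit target this exceeds isize::MAX bytes, so the layout is
    // rejected before any allocation is even attempted.
    let too_big = Layout::from_size_align(isize::MAX as usize + 1, 1);
    assert!(too_big.is_err());
}
```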
I don’t know why you’re italicizing LLVM, gcc, and Rust, but I (and perhaps others) find it relatively jarring as my brain automatically parses it as emphasis even though it clearly isn’t intended to be.
I only bring it up because it pretty drastically harms readability (for me at least, and to a degree I frankly find surprising). Just thought you may want to know.
This is common nomenclature to enhance readability in fields that have a large number of neologisms and proper nouns, such as some technical domains, medicine, investment banking, and so on. It's particularly useful when names are in the same language as the base language (e.g. in English "Peter" is easily recognized as a name, but "better fish" means something in the language, but can also be used as a proper noun, so using italics for proper nouns serves to disambiguate the intended meaning).
This is indeed a point of contention among different style guides, whether software titles should be italicized. Though the position that they should uniformly be italicized is the minority one, it is certainly used by some outlets with considerable weight, and I believe it to be the consistent one if titles of books and films are italicized too.
Some style guides make the even more inconsistent distinction that only titles of video games be italicized, but titles of “application software” not be.
I suppose that on H.N., where such software is frequently mentioned, it does stand out more.
Looking at their commenting history, they have a habit of italicising really often, so it's not specific to this topic. It does indeed make a lot of their comments annoying to read without obviously adding any value.
> It does indeed make a lot of their comments annoying to read without obviously adding any value
I'm going to assume that you mean that the italics don't add value, and not their comments as a whole; I think your comment could be read either way. Anyway, even so, others have since pointed out reasons why they might be used to doing so.
IMHO, "Just some pointer arithmetic" is selling it a bit short. There's a lot of stuff there around provenance that matters, that is, there's a reason this is an intrinsic and not just a subtraction after casting both to an int.
Glad you're going slow on the stable ABI and resisting the pressure to put out something half-baked. The C++ ABI is horribly fragile and complex. Unless the pitfalls of C++ can be avoided, making no ABI promises is better.
If you want to distribute stable shared libraries, you write to the C ABI, not the C++ ABI.
Likewise, Rust users should write to the C ABI for stable shared libraries. (I believe there are a bunch of mechanisms to do that, though I'm not really a Rust user.)
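For anyone curious, the standard mechanisms look roughly like this (a sketch, not from the thread: `extern "C"` plus `#[no_mangle]` for functions, `#[repr(C)]` for layout, typically built as a `cdylib`):

```rust
// Exported with an unmangled name and the platform's C calling convention.
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

// C-compatible field layout, so a matching C struct declaration works.
#[repr(C)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}
```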
Yes, you have to do a lot of extra work. That's working as intended.
It's sort of an oxymoron to expect to use every C++ or Rust feature in your user-facing ABI and have it be stable. They are incredibly rich languages, with drastically different notions of "function" than C has (let alone other constructs, like data layout).
There's also the possibility of an opt-in stable ABI (using the `repr` mechanism). I personally really like the idea of there being an opt-in cross-language ABI at a higher level (or supporting more use cases) than the C ABI (which is quite limiting).
I don't understand what your point is. The KDE example you linked shows that you can in fact guarantee a stable C++ ABI (although it requires a lot of care). The fact that they go through all that trouble is a hint that a C ABI is in fact too restrictive and not expressive enough for a large framework like KDE.
Specifically KDE relies (among other things) on the ABI stability of the layout of virtual tables which is certainly not part of the C ABI.
That's fair, there is a middle ground between C and C++. My point is that you can't expect the compiler to do everything for you in terms of creating a stable ABI for shared libraries. I think a lot of people are under the impression that this is purely a compiler feature.
What I expect to happen is that rather than "Rust gets a stable ABI" it will be "Rust very gradually builds upon the C ABI for selected features".
Most C++ libraries have templates in their signatures these days for efficiency, and Rust has a similar flavor (monomorphization). I think there is a fundamental tradeoff there between stability and performance, and both languages have a heavy emphasis on the latter. I could be missing something as I certainly don't know all the details of the C++ ABI.
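To make the tradeoff concrete (an illustrative sketch): a generic function like the one below has no single compiled symbol a shared library could pin down; each downstream crate instantiates its own copy, much as C++ templates must live in headers.

```rust
// There is no stable entry point for `max_of` as such; `max_of::<i32>`,
// `max_of::<f64>`, etc. are generated inside the *caller's* compilation,
// so the "interface" depends on code the library never sees.
pub fn max_of<T: PartialOrd + Copy>(items: &[T]) -> T {
    let mut best = items[0];
    for &x in &items[1..] {
        if x > best {
            best = x;
        }
    }
    best
}
```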
Even with templates you can still keep the ABI stable; see for example libstdc++. Which is not surprising, as they compile down to the equivalent C code.
I think your example is better evidence for my view. I remember this issue being mentioned in a CppCon talk: C++11 broke the ABI for the STL (for efficiency, as far as I remember), and the binary interface to libstdc++ changed as a result:
So the ABI can leak implementation details in nontrivial ways that library authors usually don't consider. It's better to have something explicit in the code, e.g. under extern "C".
The point is not that making a stable ABI for every language feature is impossible; just that it's hard, fragile, and maybe not worth the effort. If you really want stability, then use fewer features, more like C. There is probably some middle ground that's richer, but templates are known to cause problems.
The ABI break in libstdc++ has nothing to do with templates and it was not done by mistake.
The layout of std::string changed to conform to the new standard; exactly the same thing would have happened if std::string were a C POD type. There is no way around that.
In fact libstdc++ to this day still has the option to be compiled with the old ABI.
There have been other minor ABI breaks, mostly to fix bugs, which affect only a small number of programs.
Pure C libraries have exactly the same ABI stability issues caused by changes in the layout of structures; to avoid this, C libraries expose only opaque pointer-sized handles to heap-allocated objects. This is similar to C++ libraries only exposing pointers to virtual interfaces, but for high-performance libraries, for example containers, this is not considered acceptable.
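The same opaque-handle idiom carries over to Rust, for illustration (hypothetical names, not from the thread): only a pointer crosses the boundary, so the private layout can change freely.

```rust
pub struct Engine {
    state: u64, // private; free to change without breaking the ABI
}

#[no_mangle]
pub extern "C" fn engine_new() -> *mut Engine {
    Box::into_raw(Box::new(Engine { state: 0 }))
}

#[no_mangle]
pub unsafe extern "C" fn engine_free(e: *mut Engine) {
    if !e.is_null() {
        drop(Box::from_raw(e)); // take ownership back and drop
    }
}
```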
The point is that data layout is an implementation detail not part of the C++ API, and exposing that through shared libraries isn't a great idea. People do it, but it's a big mess and not particularly well documented.
You can write conformant STL implementations that have completely different layouts. But the GNU one no longer had a valid layout, so it had to change.
So when you expose libstdc++ as a shared library, you're exposing a bunch of details that aren't part of the C++ standard.
If you write it in a header, then you can expect it to break callers via shared library. And templates must be in headers. This is a fundamental language issue.
I don't use Rust, but the point is "making a stable ABI" will expose it to these sorts of problems, i.e. implementation details leaking on specific architectures, outside of the language.
The projects that care about ABI stability, e.g. sqlite, don't expose layouts. They have only functions and not data in their headers.
Your original point was that for binary compatibility you write against the C ABI. I'm claiming that the C ABI has exactly the same fragile layout limitations and you can use the exact same workarounds in C++ (i.e. only expose pointers or make sure that your layout doesn't change).
I'm also claiming that templates are a red herring and have very little to do with ABI.
Yes, you should do that now, but only because Rust doesn't have a stable ABI. If/when it does, then there's nothing wrong with distributing shared libraries using the Rust ABI, just as there isn't really anything wrong with doing it in C++ nowadays as long as you don't use bleeding-edge C++ stuff. The C++ ABIs are pretty stable.
> there isn't really anything wrong with doing it in C++ nowadays as long as you don't use bleeding edge C++ stuff
That's the whole point of the KDE doc I linked. It's not a reasonable strategy to use unrestricted C++, just like it's not a reasonable strategy to use unrestricted Rust. You actually have to design an ABI, not just rely on the compiler to do it for you.
I thought there was a GNOME doc too, but I couldn't find it. The point remains: there are lots of things in C++ that people who care about stability don't use at their ABI boundaries.
Everything is tradeoffs. Stable ABI can also lead to other issues, like performance problems. C++ is dealing with some of these right now, and there are some situations (very very micro benchmarks, to be clear) where Rust is faster than C++ due to ABI issues.
That affects mostly GCC and Clang; other compiler vendors are happier to break the ABI between major releases, as long as they aren't forbidden by the ISO C++ specification.
I was already sharing templates and classes across DLLs with Windows 3.x compilers, and I keep doing it with VC++ to this day.
On Linux with Qt and Gtkmm projects, and on macOS/iOS with their system frameworks.
Which is the biggest reason I cannot put up with cargo's model of compiling every single project from scratch after a git clone.
As we have discussed at length, while it might not be a priority right now, it is certainly an adoption blocker among some of the Ada, Delphi, Swift, Objective-C, C, and C++ communities.
You are exposing an imprecision in the way that this is usually talked about, yes. If you are willing to use the C ABI, then you can use a combination of that repr and annotations on your functions and produce a shared object in Rust with a stable ABI.
What people usually mean here is that Rust would have its own stable ABI, that you would get “for free,” without needing to do that work. (It’s never actually free of course... but that’s yet another detail that is usually papered over when people talk about this.)
I suspect it's too late, but one lesson that can be learned from D is that having one frontend implementation with multiple backend glue layers is much more convenient than having one frontend in D and one in C++. (I believe Iain Buclaw is in the process of moving the D frontend in GCC over to the proper D one, as it's been lagging behind.)
There are pros to that approach, but also cons. The major con is that keeping the rustc frontend would make an existing Rust compiler be a requirement, and not having that makes bootstrapping easier, which is a major pro.
It also looks rather incomplete. Are they not planning on implementing borrowck? If chalk/polonius were ready and had a C API the roadmap would begin to make sense.
Makes sense to me. As mrustc[0] mentions, implementing validation in a secondary compiler is much less important, because you can always just run the reference implementation as a glorified linter in the meantime.
That's what I don't get. I looked through the available documentation briefly and didn't see any mention that this is intended to be a `mrustc`; it really seems to want to be a `rustc`. I'm not aware of other GCC frontends being less than complete compilers for their respective languages, and while I think that "rust without borrowck" is an interesting point in design space (discriminated unions, generics, macros, traits, closures), "rust without borrowck" is not rust.
I'm not sure if what I'm asking makes sense, but since it's written for a new backend, would the authors have to bootstrap it using a different toolchain? I guess what I'm asking is, could they use the LLVM Rust to build the GCC frontend or do they have to start all over with a different base language to get a first working version of a rust compiler?
I see, so it’s written in C++. Would it remain that way, though? AFAIK the LLVM Rust is, itself, written in Rust, right? I imagine that it would be a goal to do the same for gcc.
There is absolutely no reason why the GCC front-end needs to be written in Rust. The reason the LLVM front-end was written in Rust initially was so they could immediately test and use new features in what was at the time also the largest program in Rust. Re-writing the GCC front-end in Rust would just prolong an already rather unfortunate bootstrap problem with the language and it should be strongly discouraged.
As it stands, there are a few options for bootstrapping the official Rust compiler from source with just a C/C++ compiler:
* Compile OCaml (implemented in C), use it to build the original Rust compiler written in OCaml, and then build each successive version of the language until you hit 1.49. This option is not fun.
* Compile mrustc, a C++-implemented compiler that supports Rust 1.29. Use that to build the actual Rust 1.29, then iterate, building your way all the way up to 1.49. Less bad, but still not fun.
* Compile the 1.49 compiler to WASM and run it via a C/C++ implemented runtime in order to compile itself on the target system. This would also mean packaging and distributing the WASM generated code, which some distributions would refuse. I also am not sure if it's even currently feasible, as I don't follow the WASM situation closely.
A compliant, independent C++ implementation that could be built in the ten minutes it takes to build GCC itself would be a very good thing to have and would be more friendly to distribution maintainers.
Like I mentioned in my post, it wouldn't be for some, depending on how seriously they take things. I'm well aware many would refuse generated artifacts, even compiled-to-source artifacts. Others, not so strict, may accept it, since the WASM blob would be the same across targets, unlike an executable, so it lessens the burden: maintainers only have to generate it once for their distribution. It was never something I suggested the Rust maintainers provide a blob for, either.
Regardless, it was worth mentioning as a potential option. I am one of a handful of maintainers for an experimental distribution where packages are either compiled or interpreted from tarballs, and this would be something we'd consider. I'd MUCH rather have the GCC front-end option, however. So far, we've simply not packaged Rust and have accepted that as dead-ending our Firefox package. This may potentially revive it.
Not a requirement for a GCC front-end, and certainly not worth sacrificing a potentially faster path to bootstrapping the official compiler implementation. You should be worried about the ease with which various systems can bootstrap and adopt the language, which is a mostly solved problem for C/C++ but not a given for Rust itself. Some maintainers will absolutely refuse to bootstrap off of binary artifacts compiled on other systems; others won't even accept 'compiled to C' artifacts.
Keeping a viable C++ implementation as part of GCC would be the smartest decision.
The Bootstrappable builds folks dislike binary artifacts so much they are implementing bootstrapping a full Linux system from only 512 bytes of machine code plus all the source code:
Who are these groups who are demanding such easy bootstrapping? OS or distro developers? Programmers working on embedded and/or safety critical systems?
I know OpenBSD avoids rust because of the bootstrapping issue, but they also avoid LLVM because of a licensing issue.
Bootstrapping is important, but I believe that GCC already allows non-primary (i.e. optional) language frontends to be written in other languages. The Ada front end is written in Ada, for example.
There are still out-of-bounds concerns and chasing through NULL pointers. (Not arguing against gcc's quality or stability, just listing memory safety concerns that transcend memory deallocation)
It could take the same approach as GDC and GNAT, with the frontend written in the language itself (D and Ada respectively) and shared with other implementations that use different backends.
The LLVM-based Rust compiler uses a lot of unstable/nightly-only Rust features internally. So even if this project got to the point where it could compile all stable Rust programs, I think it would take quite a bit more work than that to be able to compile `rustc` itself. (It might be that the unstable stuff is mostly in the standard library and not the compiler itself? Does it make a difference?)
> It might be that the unstable stuff is mostly in the standard library and not the compiler itself? Does it make a difference?
It might. There are at least two major things off the top of my head, regarding libstd:
1. specialization is needed for performance around String and &str
2. const generics are needed to support some trait implementations (see the sketch at the end of this comment)
We currently allow some stuff like this to leak through, in a sense, when we're sure that we're actually going to be making things stable someday. An alternative compiler could patch out 1, and accept slower code, but 2 would require way more effort.
There has been some discussion about trying to remove unstable features from the compiler itself, specifically to make it easier to contribute to, but it is unlikely that they will be removed from the current implementation of libstd entirely for some time.
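For item 2 above, the shape of the problem looks like this (an illustrative sketch, not std's actual code): before const generics, a trait could only be implemented for arrays one length at a time, which is why std historically stopped at length 32.

```rust
trait Describe {
    fn describe(&self) -> String;
}

// One impl covers arrays of every length N; without const generics this
// required a macro emitting separate impls for [T; 0] through [T; 32].
impl<T, const N: usize> Describe for [T; N] {
    fn describe(&self) -> String {
        format!("array of {} elements", N)
    }
}

fn main() {
    assert_eq!([1, 2, 3].describe(), "array of 3 elements");
}
```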
I believe that it does, yes. We haven't hit that in stable yet, though, so I am speaking purely in the present. You're right that it's looking good for stabilization in a few months, but anything can happen, in theory. I'm more trying to illustrate the point, that significant features can depend on something that's unstable currently. Those impls landed well before it was even marked as being stable in nightly.
> It's managed to build rustc from a source tarball, and use that rustc as stage0 for a full bootstrap pass. Even better, from my two full attempts, the resultant stage3 files have been binary identical to the same source archive built with the downloaded stage0.
If it was written in rust, there's no reason they couldn't use LLVM rust to start development until it became self-hosting. (Same as you can develop a self-hosting C compiler by starting with gcc)
A bit off topic, I hope someday GCC's build system gets overhauled. A huge advantage of LLVM is that it is quite easier to rebuild the runtime libraries without rebuilding the compiler. With GCC that's a pain, unless one takes the time to re-package GCC very carefully like https://github.com/richfelker/musl-cross-make and https://exherbo.org/.
Maybe getting some new GCC devs in there with projects like this would help with that?
Don't try to compile this with `make -j`. I tried, and my system ran out of RAM and swap and started OOM-killing things. I have 16 threads and 32 GB of RAM.
Did you try `make -j16`, since you have 16 cores? `-j` with no number means spawn as many parallel jobs as possible, which could be hundreds or thousands depending on the project. I learned this the hard way a while ago myself when I used $(nproc) incorrectly on a project and got the same OOM/swapping death spiral!
Ecstatic to see this. Once this stabilizes, I can switch my shop to Rust and not look back. If I find some spare time I will absolutely try to find a way to contribute.
Do you mean that you will use this, or just have the alt implementation as a checkbox item? Because (I'm just guessing, of course) for this to be a mature alternative compiler we might be looking at 5 or 10 years in the future, or never. Just being realistic; things take time to grow (and it's also uncertain whether they can ever keep pace with the rapidly developing Rust project).
With all this said, I would love for them to succeed, for multiple reasons. Including <3 GPL.
I mean that I'd use it. If it takes a while, then maybe I can get by with mrustc for a bit (haven't tried yet), but the net of it is that I need to support systems which LLVM does not, which has kept me from using Rust in production thus far.
Does this mean that any compilation target currently supported by GCC would now allow Rust to target that architecture? Specifically, I'm thinking of PPC cores with the VLE extension, which LLVM does not support (as far as I'm aware).
> The developers of the project are keen “Rustaceans” with a desire to give back to the Rust community and to learn what GCC is capable of when it comes to a modern language.
So what's the answer right now? How does GCC measure up "against" Rust?
Yeah, sadly. The project never had many developers, because tackling AOT compilation for Java, alongside its dynamism, is an effort that requires a full-time job, so only commercial offerings like Excelsior JET were competitive.
I wouldn't say 'better'. Certainly, they're different. GCC does better in some ways, but worse in others. Somewhat notably, GCC doesn't do as good of a job at register allocation on some RISC architectures.
What I really hope is that the Rust community doesn't go out of its way to make this easier. Communicate and let value come back, but there's a significant amount of value in keeping a single backend that everyone relies on. CPython has done the Python community a lot of good by keeping one official compiler/toolchain (despite the great work done by projects like JPython/PyPy).
The only way to do this properly, if desirable, is to make GCC an official backend of the main frontend. That will defocus some progress that happens with LLVM (every feature has to be implemented on both backends) and can make dev lives hard (eg “oh this problem comes up with GCC so use the LLVM backend “). The value would be if the majority of bugs/features are in the shared frontend.
This project though seems like a parallel implementation of Rust. That’s valuable for the community and inevitable as a part of successful growth. I don’t believe it’s beneficial to the community though if this grows beyond a toy, niche project.
>> The only way to do this properly, if desirable, is to make GCC an official backend of the main frontend. That will defocus some progress that happens with LLVM (every feature has to be implemented on both backends) and can make dev lives hard (eg “oh this problem comes up with GCC so use the LLVM backend “)
No. The correct way is to create a Rust language specification that describes what the correct behavior is.
Then whether LLVM, GCC, or something else is used does not matter. There won't be one implementation with defacto behavior, there will be multiple implementations that follow the spec.
It doesn't even have to stop progress. C++ does a new rev of the standard every 3 years. Rust has editions, which is a similar but less rigorous concept, and which could be made more rigorous.
Exactly. The C++ standard (https://isocpp.org/std/the-standard) is the specification that describes what a compiler must do to implement a particular "version" of C++ (for example, C++ 20).
Just as there is a C++ standard and multiple C++ compilers that implement the standard, there should be a Rust standard and multiple implementations.
This is the way that very few languages work. Python is a great example of an extremely popular, mature language that does not work this way. It is unclear that most people think that this sort of process is required for “maturity.”
If this is the benchmark, then among popular languages you basically have C, C++, C#, JavaScript, and... is that it?
The other languages without a spec don't market themselves a C/C++ replacement or as systems languages.
Having a webapp depend on a CPython implementation detail is very different from having a kernel depend on an implementation detail of a language without a spec that was used to implement it.
And languages actually stack on top of each other. Imagine depending on a CPython implementation detail that depends on an implementation detail of a specific C compiler. Those things do happen and they make programmers' lives miserable sometimes, but imagine how often that would happen if C didn't have a spec that all compilers strive to implement.
Those are intentional deviations from the standard, and I think there's a reasonable assumption that those stay the same/compatible from version to version. I think the other kind are a bit more problematic, and without a standard it can be difficult to tell whether your assumptions are reasonable.
Sure, and people in that space do care more. But it's clearly not a blocker, even for the stuff you're talking about. It's not a hard requirement to get Rust code into Linux, for example. (Not to mention what my sibling talks about, which is very much already true.)
There are more languages that have standards, sure, but the languages you mention, at this point, and in general, are mostly relegated to historical maintenance and maybe some very specific niches. To me personally, a “mature” language that’s not really being used isn’t the goal. And if developers truly believed standardization was valuable, you’d expect this property to give them a significant leg up against the sorts of languages like Ruby, Python, PHP, Perl, TypeScript, etc.
That being said, I do think, considering it longer, there are languages that I am missing, like SQL, and ones that have a spec, even if it’s not under an ECMA/ISO process, like Java and Go, that I was forgetting.
I still think that using this as a necessary condition for “maturity” is misguided.
Well, I agree that standardisation and "maturity" (whatever that may mean) are not connected, but I think most C and C++ programmers (to name two very widely used languages that I personally have a lot of knowledge of) find the respective standards for those languages very helpful, and somewhat lament the lack of formal standards for other languages they may use.
The C and C++ specs are full of UB though, more like a "minimum common denominator" for compiler devs than a strict spec that users can rely on. It's way too easy to write spec-abiding code that behaves differently with different compilers/architectures/optimizations.
Specs are definitely a good thing to have, but often they're just brandished as a bullet point without looking at the details: how good is the spec, what does it add over the existing tests/CI/RFCs/proofs, etc.
Yes. I do think that standardization is generally good. I am pro Rust getting a specification! I just think it is one pro among many, and not something required for success.
In a past life I worked on a COBOL compiler for IBM.
There is a COBOL specification, but AFAIK nobody actually implements it fully. Implementations pick and choose new features based on customer demand.
Also, COBOL is dominated by large legacy codebases, which means that if there's a discrepancy between an implementation and the specification, the users normally don't want it fixed, because they may have written code that depends on the "incorrect" behaviour and it's a lot of work to audit.
IBM built a new backend for its COBOL compiler using the optimizer from their JVM. IIRC by the time I left it generated code that was ~2x faster than the old compiler, but uptake was still slow because of migration concerns. In particular, we spent a lot of time working on features to help guarantee that code with some forms of undefined behaviour would have the same result as the old version.
Such standardized specifications are typically a response to many divergent implementations existing and a need to standardize a common ground between them.
Few languages find such specification before that time.
Python also lacks it despite some competing implementations, since none actually diverge enough from CPython.
PyPy famously diverges from it a lot (namely for native modules).
Standardization is helpful for tool building and experiments (i.e. here are the invariants we'll never change). Languages don't work that way, and seeing how C/C++ have evolved (or really failed to do so at a meaningful pace), I'm under the impression (clearly unpopular, given the downvotes) that standardization and multiple competing toolchains are the cause of a lot of unnecessary complexity (not just within the language but also for users of said language).
Having multiple independent implementations (and a spec) is key to being a serious portable systems language, a claim that Rust already makes. I believe it's too early for that, but sincerely wish them success achieving those goals in the future and having multiple implementations is certainly going to help here.
> eg “oh this problem comes up with GCC so use the LLVM backend “
The point of having multiple implementations is making the language independent of the underlying system. A programming language is an abstraction. The only way to test whether an abstraction is a good one is trying out how well it abstracts away various underlying systems. That's why Rust needs ports to various architectures, OSes and compiler backends. Having GCC as a backend counts double here, because you get a few new target architectures for free with the port not just a new backend.
It's of course much different with Python, but I can still clearly see how having a single implementation define the standard hurts the language.
Strongly disagree. It will only benefit the language to have more than one quality implementation. C++ has benefited hugely by the competition between g++ and clang; both compilers have gotten much, much better. To be fair, it will take a while before the GCC Rust front end is competitive, but for some purposes it doesn't have to be, like bootstrapping.
If "progress" means "rapidly add more and more new features in each release", multiple implementations will slow things down. But that problem can be addressed with editions: at some point, if the project is a success, the gcc front end will be a feature-complete version of some Rust edition, plus enough extra features to build an older version of the Rust compiler. At that point, you have a better solution to the bootstrapping problem (how to get a Rust compiler when you only have a C compiler and you want to build everything from source and not trust some binary you download from somewhere).
>If "progress" means "rapidly add more and more new features in each release", multiple implementations will slow things down
This has to happen at some point anyway otherwise we'll just get another C++. And I don't think Rust would benefit from that. New languages are designed to fix problems with the old ones, not to replicate them after all.
I just hope the designers will choose that point wisely.
To some extent, given the use of macros and Haskell-like libraries, it is already another C++.
Besides, if Rust doesn't become another C++, it won't fulfil the industry needs that C++ caters to, so while it might become a success in some domains, it won't replace C++ in OS and GPGPU SDKs.
Rust will not in any case replace C++, in any core application area.
Rust might move in alongside C++, in some, in time. Or, Rust could still very possibly fizzle. That would be the normal course of events for a new language, barring a miracle as was dispensed to Javascript, Java, C++, and vanishingly few others.
Will Dart survive and thrive? Kotlin? Scala? Clojure? Go? All doubtful, based on prior experience. Having a lot of code and a lot of users does not seem to suffice. Many other languages had those, and faded. Ada even had $billions in backing, and faded.
What we can say confidently about Rust's future is that it is not certain to fade. The miracle has come in less deserving cases.
Remember that Rust is supposed to be an alternative to C or C++, where the "this problem comes up with GCC so use the LLVM backend" is really rare and generally a sign of a bad codebase (exceptions being things like the Linux kernel that are so huge, optimized and domain-specific that they often end up relying on compiler dialects).
One exception to this is Visual Studio's toolchain but let's not talk about Visual Studio's toolchain on a weekend...
0: https://users.rust-lang.org/t/call-for-help-implementing-an-...