I’ve spent 30 years working on compilers. They have bugs. Lots of them. With tha...

jcranmer · 2025-02-24T02:49:01 1740365341

Compilers are multimillion line programs, and they have an error rate which is commensurate with multimillion line programs.

That said, I think like half the bugs I see get filed against the compiler aren't actually compiler bugs but errors in user code--and this is already using the filter of "took the trouble to file a compiler bug." So it's a pretty good rule of thumb that it's not a compiler bug, unless you understand the compiler rules well enough to articulate why it can't be user error.

LiamPowell · 2025-02-24T02:56:41 1740365801

It's not quite half the bugs on GCC's bug tracker, but it's very high: https://gcc.gnu.org/bugzilla/report.cgi?x_axis_field=&y_axis...

It's around 10% invalid bugs and another 10% duplicates. A lot of them that I've seen, including one of mine, are a result of misinterpreting details of language standards.

marckerbiquet · 2025-02-24T10:16:27 1740392187

Compilers have a huge advantage over other programs: they are fully deterministic since they depend only on input files, command line arguments and a few environment variables. That makes bugs easier to reproduce and fix compared to interactive applications, programs with networking, multi-threading...

perching_aix · 2025-02-24T11:17:44 1740395864

Pretty sure most modern compilers are multithreaded, and do exhibit a slew of practical nondeterminisms, which is how/why projects like Reproducible Builds were formed.

jcranmer · 2025-02-24T12:46:45 1740401205

Most compilers are single-threaded for most of the compilation process--at the very least, compiling a single file (translation unit) is almost always done using just one thread.

However, nondeterminism does creep in at various places in the compiler. Sorting an array by pointer value is an easy way to get nondeterminism. But the most common nondeterminism in a build system comes not from the compiler but the filesystem--"for file in directory" usually sorts the files by inode, which is effectively nondeterministic across different computers.
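A minimal sketch of both points, in Python rather than in any real compiler's code. `list_sources`, `Symbol`, and `emit_order` are hypothetical names made up for illustration; the idea is just to order by a stable key instead of by whatever the filesystem or the allocator hands back.

```python
import os


class Symbol:
    """Hypothetical stand-in for a compiler-internal object."""

    def __init__(self, name):
        self.name = name


def list_sources(directory, suffix=".c"):
    # os.listdir() returns entries in arbitrary order, so sort explicitly
    # to get the same order on every machine.
    return sorted(n for n in os.listdir(directory) if n.endswith(suffix))


def emit_order(symbols):
    # Sorting with key=id would order by object identity (the memory address
    # in CPython), which can differ from run to run. Sort by name instead.
    return sorted(symbols, key=lambda s: s.name)
```

The same pattern applies in C or C++: sort by a name or sequence number carried in the object, not by the pointer.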

perching_aix · 2025-02-24T12:47:50 1740401270

Yes, that's why I was so careful with the wording. Timestamps are another example.
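For the timestamp case, the Reproducible Builds project specifies a `SOURCE_DATE_EPOCH` environment variable holding a Unix timestamp that tools should embed instead of the current time. A rough sketch of how a build tool might honor it; `build_timestamp` is a hypothetical helper, not part of any real compiler.

```python
import os
import time


def build_timestamp(environ=os.environ):
    # Prefer the externally pinned timestamp so repeated builds embed
    # the same value; fall back to the wall clock otherwise.
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is not None:
        return int(raw)
    return int(time.time())
```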

AlotOfReading · 2025-02-24T02:22:51 1740363771

It's amazing how many compiler issues never translate into meaningful deviations at the level of application behaviour. Code tends to be highly resilient to small execution errors, seemingly by accident. I wonder what a language/runtime would look like if it were optimized to maximize that resilience, i.e. every line could miscompile in arbitrary ways. Is there a smarter solution than computational redundancy without an isolated verifier system?

mturmon · 2025-02-24T18:12:42 1740420762

Interesting comment.

I do a lot of numerical programming. When developing programs based on optimization, in particular, a similar robustness-to-error property shows up. Your implementation can have bugs, but it's generally hill-climbing, so the results often look OK.

If you really want to verify correct operation, you have to construct hard cases, or compare with another implementation, or look at intermediate state variables, or examine the cost function at very high numerical precision, to detect trouble. Run-of-the-mill inputs will not tickle the bug hard enough to notice.
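A toy illustration of that effect, assuming a made-up quadratic cost and a deliberately wrong gradient (all names here are hypothetical). Gradient descent with the buggy gradient still lands on the minimum, so the output looks fine; only comparing against a second implementation (central finite differences) exposes the bug.

```python
def cost(x, y):
    return x * x + 10.0 * y * y


def buggy_grad(x, y):
    # Deliberate bug: the y component should be 20*y, not 2*y.
    return (2.0 * x, 2.0 * y)


def numeric_grad(f, x, y, h=1e-6):
    # Independent implementation: central finite differences.
    gx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return (gx, gy)


def descend(grad, x, y, step=0.05, iters=2000):
    # Plain gradient descent from the starting point (x, y).
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y


def gradient_mismatch(grad, f, x, y):
    # Largest component-wise gap between the two implementations.
    ax, ay = grad(x, y)
    nx, ny = numeric_grad(f, x, y)
    return max(abs(ax - nx), abs(ay - ny))
```

Here `descend(buggy_grad, 3.0, -2.0)` ends up essentially at the origin, while `gradient_mismatch(buggy_grad, cost, 1.0, 1.0)` is large because the y components disagree.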

IgorPartola · 2025-02-24T05:40:22 1740375622

I am very curious, if these bugs are that common then why don’t we see more programs with weird bugs when they are running and especially having them be documented? Is it because when an unknown bug turns out to be a compiler bug and not a code error it gets fixed right away and with little fanfare? Or that there is some sort of resiliency built into the compiled code that can mask compiler bugs? Or is there some other factor?

Also how easy is it to discover a compiler bug and how easy is it to identify that a bug in your executable is due to a compiler bug?

starspangled · 2025-02-24T08:23:40 1740385420

Compilers run enormous regression suites, and the CI/git/bisect/etc. style of development has made bugs harder to check in and quicker to squash in a lot of cases, I would say.

I have found a number of compiler bugs in GCC and LLVM (and GAS and LLVM AS). Almost without fail they have been in the use of new features (certain new instructions, new ABI / addressing model) or esoteric things (linker script trickery, unusual use of extended inline asm) etc. where the compilers probably had no or very little "real" code to test against, other than presumably some simple things and basic unit tests when they checked in said features.

Unless you're doing _really_ unusual things, or exercising new paths that don't just get picked up when compiling existing code (e.g., like many/most optimizations would), it's just not that likely you'll write code that triggers some unique path / state that has a noticeable bug.

To identify that a bug is a compiler bug involving silent bad code generation, you basically assume the compiler is correct until you start to narrow the problem down to a state which should be impossible. After you put in enough assertions and breakpoints and logging (some of which might make the problem mysteriously go away) and reach the point of banging your head on the table, you start side-eyeing the compiler. If you know assembly you might start looking at some assembly output. Or you would start trying to make a reduced reproducer case. E.g., take the suspect function out on its own and make some unit tests for it. A tool like C-Reduce can sometimes help if it's not a relatively simple small function.
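To give a feel for what reducing a reproducer means, here is a toy greedy line-removal loop. It is only a sketch of the general idea and far simpler than what C-Reduce actually does; `reduce_lines` and the `still_fails` predicate are hypothetical, and in practice the predicate would compile the candidate and check that the bad behaviour is still there.

```python
def reduce_lines(lines, still_fails):
    """Greedily drop lines while the failure predicate keeps holding."""
    current = list(lines)
    changed = True
    while changed:
        changed = False
        i = 0
        while i < len(current):
            # Try the program with line i removed.
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate  # line i was not needed; keep it out
                changed = True
            else:
                i += 1  # line i is needed to reproduce; move on
    return current
```

For example, with a stand-in predicate that "fails" only when two specific lines are both present, the loop strips everything else and returns just those two lines.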

How quickly you reach that point where you can actually start to narrow down on a possible compiler bug entirely depends on the problem. If it's causing some memory ordering or race condition or silent memory corruption that is only detected later or can only be reproduced at a customer sporadically, then who knows? Could be months, if ever. Others could be an almost immediate assert or error log or obvious bad result that you could debug and file a bug report in a day.

octachron · 2025-02-24T09:35:35 1740389735

A significant factor in my experience is that a lot of programs are quite similar from a compiler perspective: they use a well-trodden set of features and combine them in a predictable way. Compiling those regular programs is well-tested and well-understood. Compiler bugs tend to be relegated to the exotic paths, when using language features in novel and interesting ways.

whizzter · 2025-02-24T13:31:15 1740403875

Large functions are a particular breeding ground.

Ages ago, working on PS2 games, one of our guys had a particularly huge "do-animations-and-interpolations-and-state-and-everything-for-the-hero-in-one-huge-switch" thingy (not uncommon to encounter in games) that crashed GCC, so the function was split up.

In the sequel I think a similar function grew enough that they not only had to split the function but also spread it across multiple files to avoid miscompiles.

Most recently I was generating an ORM binding (C#) from the database model of an ERP system, and for mysterious reasons the C# runtime was crashing without stack traces, etc. (no debugger help). Having seen things like this before, I realized that one of the auto-generated functions was huge, so I split it up into multiple units and, lo and behold, it worked.

(Having written a tiny JVM once, I also remembered that jump instructions are limited to 64 KB; not 100% sure if the .NET runtime inherited that... once it worked I didn't put any effort into investigating the causes.)

Most of the time though compiler bugs aren't the worst (unless they help cause confusion in already hard scenarios).

alexey-salmin · 2025-02-24T05:57:24 1740376644

> I am very curious, if these bugs are that common then why don’t we see more programs with weird bugs when they are running and especially having them be documented?

Any given program has N "native" bugs and M bugs introduced by the compiler. I think as long as N >> M you won't really notice. Even if you stumble across a compiler bug by chance, proving it is a nightmare: there's so much UB everywhere that any possible output is technically correct. Exceptions are compiler crashes but those are rare.

In my experience most compiler bugs were found by well-tested and proven software during an update of the compiler version or a switch of compilers. That kind of corresponds to the prerequisite of "N is small".