Perhaps I missed it, but too bad OP didn't submit a fix to LLVM as well, or at least file a bug report there. Sure, rustc was emitting bad IR, but LLVM shouldn't crash! It should catch the issue and exit cleanly with an error message. Probably would have been easier to debug this issue if LLVM hadn't crashed in the first place, too.
Either way, a really fun read. For some reason I enjoy reading debugging stories for bugs that I almost certainly wouldn't be able to solve myself.
> It should catch the issue and exit cleanly with an error message.
IIRC LLVM's IR verification is not enabled in release builds. In other words, if you're using rustc with a debug version of LLVM, the error message should pop up.
EDIT: "...LLVM's IR verification is not enabled in release builds..." is wrong: LLVM doesn't turn off verification based on its build mode. It is up to the user of LLVM, namely rustc in this case, to enable verification. For instance, you can verify the IR after each optimization pass (which is pretty expensive) by configuring `llvm::StandardInstrumentation` properly. Or you can verify the IR before codegen by toggling one of `llvm::TargetMachine`'s options. Clang always enables the latter (verification before codegen) by default but disables the former, regardless of the optimization level or its build mode.
...and it's probably not enabled because the Rust compiler is already slow enough as it is? But yeah, I guess it's a fair trade-off having 0.0001% of builds crash if it makes the other 99.9999% a bit faster...
In either case dereferencing a null pointer would still be a bug, right? Or is it kind of "all bets are off" if you feed LLVM bad IR and don't enable verification?
I think even deriving the argument might invoke undefined behaviour here, but I'm not certain.
If I understand correctly, it's undefined behaviour to do this:
int *ptr = (int*)42;
As, in order to avoid undefined behaviour, ptr should only be assigned a valid address of an int, or the 'address' one past the end of an array of int, or NULL (logical zero).
It's possible things are different when the type is void*, I'm not certain.
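To make the three "safe" kinds of pointer value listed above concrete, here is a rough C sketch. The function name `sum_or_zero` is made up for illustration, and the sketch does not settle how the standard classifies the `(int*)42` cast itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums n ints starting at first, or returns 0 for NULL. */
int sum_or_zero(const int *first, size_t n) {
    if (first == NULL)            /* NULL may be stored and compared, never dereferenced */
        return 0;
    const int *end = first + n;   /* one past the end: fine to form, not to dereference */
    int total = 0;
    for (const int *p = first; p != end; ++p)
        total += *p;              /* p always points at a live int inside the array */
    return total;
}
```

Calling it with the address of a single `int` and `n == 1` also works, since a lone object behaves like a one-element array for pointer arithmetic.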
No, because the defined contract for free() is that you pass it a pointer that was previously returned from malloc(), that you don't try to free() it twice, etc.
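One common way to stay inside that contract, sketched in C. `make_buffer` and `release_buffer` are hypothetical names for illustration, not part of any library; the only standard behaviour relied on is that `free(NULL)` does nothing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a zeroed buffer from malloc(), or NULL on failure. */
char *make_buffer(size_t n) {
    char *p = malloc(n);
    if (p != NULL)
        memset(p, 0, n);
    return p;
}

/* Hypothetical helper: frees the buffer and clears the caller's pointer,
   so a second call passes NULL to free(), which is a no-op. */
void release_buffer(char **slot) {
    free(*slot);
    *slot = NULL;
}
```

This only protects the one pointer variable you pass in; any copies of the old address elsewhere are still dangling.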
LLVM makes trade-offs when enabling or disabling certain checks, including assertions and IR verification, primarily to keep compilation time acceptable in release builds. The idea is that we catch as many bugs as possible in the debug version of LLVM so that the release version can run fast.
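A toy C sketch of that trade-off, in case it helps. Everything here (`ToyInst`, `toy_verify`, `toy_total_operands`, the opcode encoding) is invented for illustration and has nothing to do with LLVM's real data structures or verifier. The point is only the shape: an explicit check the caller has to opt into, next to an `assert` that disappears when `NDEBUG` is defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy instruction; this is not an LLVM data structure. */
typedef struct {
    int opcode;        /* toy encoding: 0 = add, 1 = ret */
    int num_operands;  /* add expects 2 operands, ret expects 1 */
} ToyInst;

/* Explicit verifier: compiled into every build, but only runs if the caller asks. */
bool toy_verify(const ToyInst *insts, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        if (insts[i].opcode < 0 || insts[i].opcode > 1) return false;
        if (insts[i].opcode == 0 && insts[i].num_operands != 2) return false;
        if (insts[i].opcode == 1 && insts[i].num_operands != 1) return false;
    }
    return true;
}

/* Consumer that trusts its input. The assert is removed when NDEBUG is defined,
   and it never checks operand counts, so malformed input can slip through. */
int toy_total_operands(const ToyInst *insts, size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; ++i) {
        assert(insts[i].opcode == 0 || insts[i].opcode == 1);
        total += insts[i].num_operands;
    }
    return total;
}
```

An `add` with five operands is rejected by `toy_verify` but sails through `toy_total_operands` in any build, which is roughly the situation when a frontend skips verification.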
If you feed LLVM bad IR, all bets are off. LLVM's assertions and IR verifier are impressively comprehensive, but not a guarantee.
For example, LLVM has a pointer type and a void type, but you may not make a pointer-to-void type, see https://llvm.org/docs/LangRef.html#pointer-type . If you do call `Type::getVoidTy(C)->getPointerTo()` then LLVM will hit an assertion only in a build with assertions enabled. Without assertions LLVM may silently execute UB.
> It should catch the issue and exit cleanly with an error message.
Probably not. As the author describes, LLVM has a tool to check for invalid IR, which they used to investigate the issue and generate an explanatory error message.