That's a great first step, but ready your pitchforks for this next take, because the next step is to completely eliminate Turing-complete languages and arbitrary I/O access from standard build systems. 99.9% of all projects could be built with trivial declarative rulesets.
Java's Maven is an interesting case study, as it tried to be this: a standard project layout, a standard dependency mechanism, pom.xml as the standard metadata file, and a standard workflow with standard targets (clean/compile/test/deploy). Plugins for what's left.
There might have been a time when it worked, but people started to use plugins for all kinds of reasons, quite good ones in most cases: FindBugs, code coverage, source code generation, ...
Today, a Maven project without plugins is rare. Maven brought us 95%, but there is a long tail left to cover.
Most of these could still be covered with I/O limited to the project files, though.
There is a sizable movement in the Rust ecosystem to move all build-time scripts and procedural macros to (be compiled to) WASM. This allows you to write Turing-complete, performant code to cover all use cases people can reasonably think of, while also allowing trivially easy sandboxing.
It's not perfect, for example some build scripts download content from the internet, which can be abused for information extraction. And code generation scripts could generate different code depending on where it thinks it's running. But it's a lot better than the unsandboxed code execution that's present in most current build systems, without introducing the constraints of a pure config file.
But the xz backdoor didn’t involve a build script that tried to compromise the machine the build was running on. It involved a build script that compromised the code being built. Sandboxing the build script wouldn’t have helped much if at all. Depending on the implementation, it might have prevented it from overwriting .o files that were already compiled, maybe. But there would still be all sorts of shenanigans it could have played to sneakily inject code.
I’d argue that Rust’s biggest advantage here is that build scripts and procedural macros are written in Rust themselves, making them easier to read and thus harder to hide things in than autotools’ m4-to-shell gunk.
But that means it’s important to build those things from source! Part of the movement you cite consists of watt, which ships proc macros as precompiled WebAssembly binaries mainly to save build time. But if watt were more widely adopted, it would actually make an attacker’s life much easier since they could backdoor the precompiled binary. Historically I’ve been sympathetic to dtolnay’s various attempts to use precompiled binaries this way (not just watt but also the serde thing that caused a hullabaloo); I hate slow builds. But after the xz backdoor I think this is untenable.
Ultimately you're going to have to be adept at stuff like the Underhanded C Contest to spot this kind of thing in any Turing-complete language, so the idea of auditing the source is unreliable at best. So I'd take another page from the Java/Maven ecosystem and require hashed+signed binaries, with the possible addition of requiring builds to be performed on a trusted remote host so that at least we can verify the binary is produced from the same source and results in the same output.
But determined actors with access are always going to try to thwart these schemes, so verification and testing is going to need to step up.
> to spot this kind of thing in any Turing-complete language
I’m not sure I follow. The bar for passing review and audits is “people can understand it”, at least to the point of preventing someone smuggling in a custom public key and other shenanigans – as opposed to “all valid programs in the language”. While I agree Turing-completeness opens up undeniable complexity, it’s the best we've got. And there are huge variations in obscurity across languages and tooling.
Underhanded C is meant to create code that is transparent enough to convince the reader that it's obviously and clearly doing one thing, while subtly doing something else.
reproducible builds never made sense to me. if you trust the person giving you the hash, just get the binary from them. you don't need to reproduce the build at all.
if you trust that they're giving you the correct hash, but not the correct binary, then you're not thinking clearly.
One thing a culture of reproducible builds (and thus stable hashes) does provide, however, is a lack of excuse as to why the build isn't reproducible. Almost nobody will build from source - but when there are bugs to be squashed and weird behavior, some people, sometimes, will. If hashes are the norm, then it's a little harder for attackers to pull off this kind of thing, not because you trust their hash rather than their blob, but because they need to publish both at once - thus broadening the discovery window for shenanigans.
To put it another way: if you don't have reproducible builds and you're trying to replicate an upstream artifact then it's very hard to tell what the cause is. It might just be some ephemeral state, a race, or some machine aspect that caused a few fairly meaningless differences. But as soon as you have a reproducible build, then failure to reproduce instantly marks upstream as being suspect.
It's also useful when you're trying to do things like tweak a build - you can ensure you're importing upstream correctly by checking the hash of what you're making against what upstream published, and then even if your downstream version isn't binary-identical to upstream (e.g. optimized differently, or with some extra plugins somewhere), you can be sure that changes between what you're building and upstream are intentional and not spurious.
It's clearly not a silver bullet, sure. But it's not entirely useless either, and it probably could help as part of a larger ecosystem shift; especially if conventional tooling created and published these by default such that bad actors trying to hide build processes actually stick out more glaringly.
Actually, this xz-utils attack was a sort of reproducible-build issue. But the twist is that it wasn't the binary that wasn't built reproducibly - it was the tarball. The natural assumption is that it just reflected the public git repository. It didn't.
Debian's response is looking to be mandating that the source must come from the git repository, not a tarball. And it will be done using a script. And the script must produce the same output every time, i.e. be reproducible. Currently, Debian's idea of "reproducible" means it reflects the source Debian distributes. This change means it will reproducibly reflect the upstream sources. That doesn't mean it will be the same - just that it's derived in a reproducible way.
As for trusting the person who gave you the hash: that's not what it hinges on. It hinges on the majority installing a binary from a distro. That binary can be produced reproducibly, by an automated build system. Even now, Debian's binaries aren't created just once by Debian. There are Debian builders all around the planet creating the binary, and verifying it matches what's being distributed.
Thus the odds of the binary you are running not being derived from the source are vanishingly low.
So unless something is hidden in the source itself, a reproducible build gives you the confidence to believe that what you're running is fully vetted and clean.
I see what you’re saying, but I don’t buy that reproducible builds actually solve anything, especially long term. As this whole xz thing has shown us, lots of things fly under the radar if the circumstances are right. This kind of thing will absolutely happen again. In the future it may not even require someone usurping an existing library, it could be a useful library created entirely by the hacker for the express purpose of infiltration a decade later.
Reproducible builds are a placebo. You must still assume there are no bad actors anywhere in the supply chain, or that none are capable of hiding anything that can be reproducibly built, and we can no longer afford to make that assumption.
A reproducible vulnerability is still a vulnerability.
You're focusing on the wrong issue. Just because reproducible builds don't solve all avenues of attack doesn't mean they're worthless. No, a reproducible build does not give you any confidence about the quality of the source. It gives you confidence that you're actually looking at the correct source when your build hash matches the published hash, nothing more.
A window that can be smashed in is still a vulnerability, so there is no value in people locking their front doors.
> I see what you’re saying, but I don’t buy that reproducible builds actually solve anything,
You're right. They don't solve anything, if you restrict "anything" to mean stopping exploits from entering the system. What reproducible builds do is increase visibility.
They do that by ensuring everybody gets the same build, and by ensuring everybody has the source used to do the build, and by providing a reliable link in the audit trail that leads to when the vulnerability was created and who did it. They make it possible to automatically verify all that happened without anybody having to lift a finger, and it's done in a way that can't be subverted as Jia Tan did to the xz utils source.
Ensuring everyone has the same build means you can't just target a few people; everyone is going to get your code. That in turn means there will be many eyes like Andres looking at your work. Once one of them notices something, one of them is likely to trace it back to the identity that added it. Thus all these things combine to have one overall effect - they ensure the problem and its cause are visible to many eyes. Those eyes can cooperate and combine their efforts to track down the issue, because they are all absolutely guaranteed to be looking at the same thing.
If you don't think increasing visibility is a powerful technique for addressing problems in software, then you have a lot to learn about software engineering. There are software engineers out there who have devoted their entire professional lives to increasing visibility. Without them creating things that merely report and preserve information, like telemetry and logging systems, software would be nowhere near as reliable as it is now. Nonetheless, if the point you are making is that when we shine a little sunlight into a dark corner, the sunlight itself does nothing to whatever we find there, you're right. Fixing whatever the sunlight / log / telemetry revealed is a separate problem.
You aren't correct in thinking visibility doesn't reduce these sorts of attacks. By making the attack visible to more people, it increases the chance it will be noticed on any given day, thus reducing its lifetime and making it less useful. Worse, if the vulnerability was maliciously added, the best outcome for the identity is what happened here - the identity was burned, all its contributions were also burned, and so a few years of work went down the drain. At worst someone is going to end up in jail, or dead if they live in an authoritarian state. The effect is not too different from adding surveillance cameras to self-checkouts. Their mere presence makes honesty look like a much better policy.
Reproducible builds let you combine trust from multiple verifiers. If verifiers A, B and C verify that the build produces the stated hash, then you can trust the binary if any of A, B or C is trustworthy.
Or in other words, the point is not for everyone to verify the build produces the expected binary since that would indeed make the published binaries pointless. Instead, most people trust the published hash because it can be independently verified and anyone can call out the publisher of the binary if it doesn't match.
It's easier to hide nastiness in a binary than it is in source. And indeed this xz hack too relied on multiple obfuscated binary blobs. The hash helps because it makes it a little harder to hide things like this. It's not a silver bullet, but in this specific instance it would have made it harder to hide that malicious m4/build-to-host.m4 in the tarball - after all, had the attacker done that despite publishing a hash, they would have needed to publish a hash that included the modified build script, but then anybody building from the git repo would have had a different hash, and that's a risk for detection.
Reproducible builds and hashes thereof aid in transparency and detecting when that transparency breaks down. Of course, it doesn't mean hackers can't hide malicious code in plain sight, but at least it makes it slightly harder to hide between the cracks as happened here.
"slightly harder" isn't enough. That's what I'm saying that people are not accepting.
The days of 3rd party libraries simply being trusted because they're open source are slowly coming to an end. The problem is not unreproducible builds; the problem is that we implicitly trust source code found online.
Projects relying on 100s of 3rd party libraries are a major problem, and no one seems to care. They like it when their Rust build lists all the libraries that they didn't have to write as it compiles them. They like it when they can include an _extremely_ small Node library which itself relies on 4 other _extremely_ small Node libraries, which each rely on another 4 _extremely_ small Node libraries, until you have a node_modules directory with 30,000 packages in it.
We don't know how to write software any more. We know how to sew bits of things together into a loincloth, and we get mad when the loincloth makes us itch. Of course it does, you didn't LOOK at anything you are relying on. You didn't stop to think, even for a moment, that maybe writing the thing instead of trusting someone else with it was even an option.
As we continue doing this, relying on 3rd parties to write the heavy lifting code, we lose the skill to write such code ourselves. We are transferring the skills that we need into the hands of people that we can't trust, when viewed from a high level.
We need to get back to small applications, which we know because we wrote them in their entirety, and that someone else can trust because the code for the thing is 2500 lines long and has extremely few, if any, dependencies.
We need to get away from software which imports 30,000 (or even 100) third party libraries with implicit trust because it's open source.
"All bugs are shallow with enough eyes" requires that people use their eyes.
Sometimes the perfect is the enemy of the good. As the XZ saga shows, even clearly exceptionally well organized attackers don't have an easy time injecting hacks like this; and things that increase the attackers costs or risks, or reduce their reward can be useful even if they don't solve the problem entirely. Reproducible builds are useful; they don't need to be silver bullet.
compression. encryption. handshakes. datetime. rng or prng. i'm right there with ya but nontrivial tasks require nontrivial knowledge. i don't have an answer for obfuscated backdoors like here, or bad code that happens, but i do know if i tried to audit that shit, i'd walk away telling everyone i understood nothing
If everything is done carefully enough with reproducible builds, I think using a binary whose hash can be checked shouldn't be a great extension of trust.
You could have multiple independent autobuilders verifying that particular source does indeed generate a binary with the claimed hash.
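As a rough sketch of what the checking side could look like (the builder URLs, artifact name, and rebuild script are made-up placeholders, and this presumes the build is already bit-for-bit reproducible):

    # Sketch: rebuild locally, then cross-check against hashes published by
    # independent autobuilders. URLs, file names and the rebuild script are
    # hypothetical placeholders.
    import hashlib
    import subprocess
    import urllib.request

    BUILDERS = [
        "https://builder-a.example.org/xz-5.6.1.tar.gz.sha256",
        "https://builder-b.example.org/xz-5.6.1.tar.gz.sha256",
    ]

    subprocess.run(["./rebuild-from-pinned-source.sh"], check=True)

    with open("dist/xz-5.6.1.tar.gz", "rb") as f:
        local = hashlib.sha256(f.read()).hexdigest()

    published = []
    for url in BUILDERS:
        with urllib.request.urlopen(url) as resp:
            published.append(resp.read().decode().split()[0])

    # Trust combines: any one honest builder disagreeing is enough to flag it.
    if any(h != local for h in published):
        raise SystemExit("hash mismatch - artifact or source is suspect")
    print("all builders agree:", local)

The expensive part (rebuilding) only has to be done by a handful of independent parties; everyone else just compares short strings.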
The xz backdoor did hide the payload in test data, though. Proper sandboxing could have meant that, e.g., building the binary wouldn't have access to the test data.
When your programmatic build steps are isolated in plugins, then you can treat them like independent projects and apply your standard development practices like code review and unit tests to those plugins. Whereas when you stuff programmatic build steps into scripts that are bundled into existing projects, it's harder to make sure that your normal processes for assuring code quality get applied to those pieces of accessory code.
My standard development practices like code review and unit tests do not scale to review and test every dependency of every dependency of my projects. Even at company-wide scale.
I'm not saying that. I'm just saying that improved ability to apply such development practices is one benefit of using a plugin-style architecture for isolating the programmatic steps of your build pipeline. It's not perfect but in many ways it's still a significant improvement upon just allowing arbitrary code right in the pipeline definition.
Maybe now that we have things like GitHub Actions, Bitbucket Pipelines, etc., which can run steps in separate containers, most of those things could be moved from the Maven build step to a different pipeline step?
I'm not sure how well isolated the containers are (probably not very – I think GitHub gives access to the Docker socket) and you'd have to make sure they don't share secret tokens etc., but at least it might make things simpler to audit, and isolation could be improved in the future.
For the purpose of this conversation we mostly just care about the use case of someone grabbing the code and wanting to use it in their own project. For this use case, dev tools like findbugs and code coverage can be ignored, so it would suffice to have a version of the build system with plugins completely disabled.
Code generation is the thornier one, and we can at least be more principled about it than "run some arbitrary code", and at least it should be trivial to say "this codegen process gets absolutely no I/O access whatsoever; you're a dumb text pipeline". But at the end of the day, we have to Just Say No to things like this. Even if it makes the codebase grodier to check in generated code, if I can't inspect and audit the source code, that's a problem, and arbitrary build-time codegen prevents that. Some trade-offs are worth making.
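To make "dumb text pipeline" concrete, here's a minimal sketch of what that interface could look like; the generator name is invented, and env={} only illustrates the narrow interface, it isn't a sandbox by itself:

    # Sketch: the build invokes a code generator purely as stdin -> stdout text.
    # "spec_to_code" is a hypothetical generator binary; it is never handed file
    # paths, secrets or network access - only the spec text.
    import subprocess

    with open("api_spec.xml", "r", encoding="utf-8") as f:
        spec_text = f.read()

    result = subprocess.run(
        ["spec_to_code"],   # hypothetical generator, ideally itself sandboxed
        input=spec_text,
        capture_output=True,
        text=True,
        env={},             # no ambient environment leaks in
        check=True,
    )

    with open("src/generated_api.rs", "w", encoding="utf-8") as f:
        f.write(result.stdout)

Actual enforcement still needs an OS- or WASM-level sandbox around the generator process, but it's the narrow interface that makes such a sandbox trivial to apply.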
The xz debacle happened partially because the generated autoconf code was provided. Checking in generated code is not that much better. It's a bit more visible, but not many people will spend their limited time validating it, as it's not worth it for generated code. xz also had inscrutable test files checked in, and nobody could tell they contained encrypted malware.
I'm not a fan of generated code. It tends to cause misery, being in a no man's land between code and not-code. But it is useful sometimes, e.g. Rust generating an API from the OpenGL XML specs.
Sandboxing seems the least worst option, but it will still be uninspected half code that one day ends up in production.
> The xz debacle happened partially because the generated autoconf code was provided.
The code was only provided in a roundabout way that was deliberately done to evade manual inspection, so that's not a failure of checking in generated code, that's a failure of actually building a binary from the artifacts that we expect it to be built from. Suffice to say, cutting out the Turing-complete crap from our build systems is only one of many things that we need to fix.
In this case, I think the GP is absolutely right. If you look at the infamous patch with a "hidden" dot, you may think "any C linter should catch that syntax error and immediately draw suspicion." But the thing is, no linter at the moment exists for analyzing strings in a CMakeLists.txt file or M4 macro. Moreover, this isn't something one can reliably do runtime detection for, because there are plenty of legitimate reasons that program could fail to compile, but our tooling does not have a way to clearly communicate the syntax error being the reason for a compilation failure.
It could be required to declare which error should happen, and at what line it should trigger.
And the C program should still be syntactically valid and pass linting.
One could create a tool which would take a folder of .c files to check, each with such declarations, and output something similar to what the configure script does.
Configure scripts tend to be a very slow part of project builds, so a new tool that supported parallel compile/run of these checks would also be a considerable speedup, in addition to being more reviewable, verifiable and less error prone.
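Something like this, as a rough sketch (the EXPECT annotation format is invented here, cc is assumed to be on PATH, and it only checks pass/fail rather than the specific error and line suggested above):

    # Sketch: a folder of .c probe files, each declaring its expected outcome on
    # its first line, compiled in parallel instead of serially like ./configure.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor
    from pathlib import Path

    def run_check(probe: Path):
        # e.g. first line:  /* EXPECT: compile */   or   /* EXPECT: fail */
        expect_compile = "EXPECT: compile" in probe.read_text().splitlines()[0]
        result = subprocess.run(
            ["cc", "-c", str(probe), "-o", "/dev/null"],
            capture_output=True,
        )
        compiled = result.returncode == 0
        return probe.stem, compiled, compiled == expect_compile

    probes = sorted(Path("checks").glob("*.c"))
    with ThreadPoolExecutor() as pool:
        for name, compiled, as_declared in pool.map(run_check, probes):
            if not as_declared:
                print(f"warning: {name} did not behave as declared - review it")
            print(f"HAVE_{name.upper()}={1 if compiled else 0}")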
All good things if you want to use the autotools. And other tools too.
I've been porting and compiling Unix programs since 5 years before Linux. There was a lot of work needed to port between Unix variants, or even worse, non-Unix variants. So I can see the problem that the autotools solve.
But I never spent much time learning the details on how the autotools work. When './configure' has some problem, it is a very steep uphill battle to get any traction on the issue.
So I'd be more in the camp of looking for a different solution than autotools.
Yeah, a lot of the complexity that configure scripts are supposed to solve is massively reduced now, both because there is better harmonization between platforms and because many checks in an N-year-old script are probably no longer relevant.
So a first pass should probably be to eliminate as many checks as possible. But there will almost always be a need for a couple of build-time checks/switches - those should be supported by some kind of tooling.
CMake has a set of standardized macros to perform checks on the platform being built for, including many specialized for typical use cases, like checking if a C header is present. And the generic CheckCSourceCompiles allows specifying a regex to match for the check to be considered "failed".
So it seems considerably better than configure scripts. Probably also better than my idea :p
I don't know how easy it would be for an adversary to do something "underhanded" with those, though, or how easy such a thing would be to spot.
> What would that accomplish? It certainly wouldn't have stopped this attack.
We could write an entire PhD thesis on the number of dire technical failings that would need to be addressed to stop this attack, so while this alone wouldn't have stopped it, it would have required the actor to come up with another vector of code injection which would have been easier to find.
> Only if you forbid bootstrapping
Codebases that bootstrap are the 0.1%. Those can be built via `bash build.sh` rather than deceptively hiding a Turing-complete environment behind a declarative one. Even if you need to have these in your trusted computing base somewhere, we can focus auditing resources there, especially once we've reduced the amount of auditing that we need to do on the other 99.9% of codebases now that we've systematically limited the build-time shenanigans that they can get up to.
Concretely, what security issues are solved by forcing the build specification language to be Turing incomplete? My guess is the answer is "none."
At worst, you're actually creating more holes. The reason autoconf/automake exist and M4 scripts are innocuous in the first place is because the build system uses an underpowered language and developers have to turn to code generation to get around it.
If you kneecap the build system's language you're not solving problems. You're creating them.
> it would have required the actor to come up with another vector of code injection which would have been easier to find.
If make were standardized and could programmatically determine the environment it's run under and write full programs, then the attack vector wouldn't exist in the first place.
> Codebases that bootstrap are the 0.1%.
We have different experiences, because ime it's close to 100% especially when you include transitive dependencies. When you care about supply chain security you care about being able to bootstrap from sources for your code and all your dependencies, and it's almost guaranteed that one of your dependencies needs to be bootstrapped.
I think that what we need is good sandboxing. All a sandboxed Turing-complete language can do is perform I/O on some restricted area, burn CPU and attempt to escape the sandbox.
I would like to see this on the language level, not just on the OS level.
I have been thinking the exact same thing, and specifically I would like to try implementing something that works with the Rye python manager.
Say I have a directory with a virtualenv and some code that needs some packages from PyPI. I would very much like to sandbox anything that runs in this virtualenv to just disk access inside that directory, and network access only to specifically whitelisted URLs. As a user I should only need to add "sandbox = True" to the pyproject.toml file, and optionally "network_whitelist = [...]".
From my cursory looking around, I believe Cloudflare sandbox utils, which are convenience wrappers around systemd Seccomp, might be the best starting point.
Edit: or just use Firejail, interesting...
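As a very rough sketch of the wrapper side (assuming Firejail is installed, the project lives under $HOME, the venv is at .venv, and the pyproject.toml keys above are hypothetical and not parsed here):

    # Sketch: run a command from the project's virtualenv under Firejail,
    # confined to the project directory with networking cut off entirely.
    # URL whitelisting would need a filtering proxy on top of this; Firejail
    # alone can only allow or deny the network.
    import subprocess
    import sys
    from pathlib import Path

    project_dir = Path.cwd()

    cmd = [
        "firejail", "--quiet",
        f"--whitelist={project_dir}",  # only this subtree of $HOME stays visible
        "--net=none",                  # no network access at all
        str(project_dir / ".venv/bin/python"),
    ] + sys.argv[1:]

    raise SystemExit(subprocess.run(cmd).returncode)

So something like `python sandboxed_run.py some_build_step.py` would see no network and nothing outside the project directory; the pyproject.toml flag would then just decide whether the wrapper gets applied.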
You mention sandboxing on the language level, but I don't think it is the way. Apparently sandboxing within Python itself is a particularly nasty rabbit hole that is ultimately unfruitful because of Python's introspection capabilities. You will find many dire warnings on that path.
Looking at the state of software security in the rest of the world, this may not be much of a disincentive. At some point we need to knuckle down and admit that times have changed, the context of tools built for the tech of the 80s is no longer applicable, and that we can do better. If that means rewriting the world from scratch, then I guess we better get started sooner rather than later.
This doesn't read as a technical failure to me. This was 99% social engineering. I understand that the build system was used a vector, but eliminating that vector doesn't mean all doors are closed. The attacker took advantage of someone having trouble.
While I do agree that M4 is not great, I don't believe any alternative would have prevented this attack: you could try translating that build-to-host file to, say, Python with all the evals and shell one-liners, and it would still not be immediately obvious to a distro package maintainer doing a casual review after work: for them it would look just like your average weekend-project hacky code. Even if you also translated the one-liners, it wouldn't be immediately obvious if you were not suspicious. My point is, you could write similarly obfuscated code in any language.
What's inscrutable code? Was it m4 or sh or the combination of the two?
Who will pay for all the rewriting you want done? Or even just for the new frameworks that are "scrutable"? How do we guarantee that the result is not inscrutable to you or others?
There is so much knee-jerking in this xz debacle.
(And I say this / ask these questions with no love for autoconf/m4/sh.)
I do think we have enough eyeballs at this point that we should stop entertaining the Dancing Bear in low level libraries and start insisting on crisp, self-explaining code. There are a lot of optimizations pushed into compilers these days, and there are a lot of architectural changes that can make things fast without making them inscrutable.
We should be moving from No Obvious Bugs to Obviously No Bugs (Tony Hoare).
That would be an improvement for sure... but this is not fundamentally a technical problem.
Reading the timeline, the root cause is pure social engineering. The technical details could be swapped out with anything. Sure, there are aspects unique to xz that were exploited, such as the test directory full of binaries, but that's just because they happened to be the easiest target to implement their changes in an obfuscated way. That misses the point - once an attacker has gained maintainership you are basically screwed, because they will find a way to insert something, somehow, eventually, no matter how many easy targets for obfuscation are removed.
Is the real problem here in handing over substantial trust to an anonymous contributor? If a person has more to lose than maintainership then would this have happened?
That someone can weave their way into a project under a pseudonym and eventually gain maintainership without ever risking their real reputation seems to set up quite a risk free target for attackers.
> Is the real problem here handing over substantial trust to an anonymous contributor?
Unless there's some practical way of comprehensively solving The Real Problem Here, it makes a lot of sense to consider all reasonable mitigations, technical, social or otherwise.
> If a person has more to lose than maintainership then would this have happened?
I guess that's one possible mitigation, but what exactly would that look like in practice? Good luck getting open source contributors to accept any kind of liability. Except for the JiaTans of the world, who will be the first to accept it since they will have already planned to be untouchable.
> I guess that's one possible mitigation, but what exactly would that look like in practice? Good luck getting open source contributors to accept any kind of liability. Except for the JiaTans of the world, who will be the first to accept it since they will have already planned to be untouchable.
It's not necessary to accept liability, that's waived in all FOSS licenses. What I'm suggesting would only risk reputation of non-malicious contributors, and from what I've seen, most of the major FOSS contributors and maintainers freely use their real identity or associate it with their pseudonym anyway, since that attribution comes with real life perks.
Disallowing anonymous pseudonyms would raise the bar quite a bit and require more effort from attackers to construct or steal plausible looking online identities for each attack, especially when they need to hold up for a long time as with this attack.
Agreed it's mainly a social engineering problem, BUT also can be viewed as a technical problem, if a sufficiently advanced fuzzer could catch issues like this.
It could also be called an industry problem, where we rely on others' code without running proper checks. This seems to be an emerging realization, with services like socket.dev starting to appear.
Autotools-based build infra is always crufty, fragile, slow, and fugly. In a lot of cases, it sucks because it's what we have right now without doing the work to replace it with something equally flexible.
Build infrastructure should be minimal, standardized, and not subject to endless special, undocumented fragility.
cmake, just, conan, meson, and bazel (and forks) exist.
But I've still yet to see a proper build system that does feature detection in parallel and concurrently, or supports live, incremental, continuous, cached rebuilding.
That's an improvement, but ultimately not enough, actually. The article hits at this, and you should definitely read it:
Being able to send xz patches to the Linux kernel would have been a nice point of leverage for Jia Tan's future work. We're not at trusting trust [1] levels yet, but it would be one step closer.
Nonsensical argument. JavaScript and TypeScript projects are developed with patches of the original source not some compressed artefact. Take your trolling elsewhere
Javascript is almost always executed in sandboxed, unprivileged environments. The issue here is that this type of obfuscation is easy to add in core os libraries. The JavaScript ecosystem, for all the hate that it gets, makes it super easy to sandbox any running code.
It doesn't matter if it's minified or obfuscated because you basically have to run unknown, untrusted code everywhere while browsing the web with JavaScript turned on. So the ecosystem and tooling is extremely resilient to most forms of malicious attacks no matter how minified or obfuscated the js you're running is. The complete opposite is true for bash and shell scripting in general
It's time to stop accepting that the way we've done this in the past is the way we will continue doing it ad infinitum.