The title of the submission is somewhat of a bait; unfortunately, the Cargo.lock doesn't seem to be public. Since my current Rust side-project also has some kind of database (along with, well, a p2p system) and also totals 454 dependencies, I've decided to do a breakdown of my dependency graph (also because I was curious myself):
- 85 are related to gix (a Rust reimplementation of git; 53 of those are gix itself, a project unfortunately infamous for splitting things into crates that probably should've been modules)
- 91 are related to pgp and all the complexity it involves (aes with various cipher modes, des, dsa, ecdsa, ed25519, p256, p384, p521, rsa, sha3, sha2, sha1, md5, blowfish, camellia, cast5, ripemd, pkcs8, pkcs1, pem, sec1, ...)
- 71 are related to http/irc/tokio (this includes a memory-safe tls implementation and an http stack: percent-encoding, mime, chunked encoding, ...)
- 26 are related to the winapi (which I don't use myself, but which are still part of the resolved dependency graph)
- 8 are related to web assembly (unused when compiling for Linux)
- 2 are related to android (also unused when compiling for Linux)
In some ways this is a reminder of how much complexity we're building on top of for the sake of compatibility.
Also keep in mind "reviewing 100 lines of code in 1 library" and "reviewing 100 lines of code split into 2 libraries" is still pretty much the same amount of code (if any of us actually reviewed all their dependencies). You might even have a better time reviewing the sha2 crate vs the entirety of libcrypto.so, if that's all you needed.
My project has been around for (almost) two years, I scanned every commit for vulnerable dependencies using this command:
for commit in $(git log --all --pretty='%H'); do git show "$commit":Cargo.lock > Cargo.lock && cargo audit -n --json | jq -r '.vulnerabilities.list[] | (.advisory.id + " - " + .package.name)'; done | sort | uniq
I got a total of 25 advisories (basically what you would be exposed to if you ran all binaries from every single commit simultaneously today). Here's the list:
I guess I'm doing fine. Keep in mind, the binary is fully self-contained; there is no "look, my program has zero dependencies, but I need to ship an entire implementation of the GNU operating system along with it".
> when there is no reasonable packaging story for the language
For context: I've been around in the Debian Rust team since 2018, but I'm also a very active package maintainer in both Arch Linux and Alpine.
Rust packaging is absolutely trivial with both Arch Linux and Alpine. For Debian specifically there's the policy of "all build inputs need to be present in the Debian archive", which means the source code needs to be spoon-fed from crates.io into the Debian archive.
This is not a problem in itself, and cargo is actually incredibly helpful when building an operating system, since things are very streamlined and machine-readable instead of everybody handrolling their own build systems with Makefiles. Debian explicitly has cargo-based tooling to create source packages. The only manual step is often annotating copyright attributions, since this cannot be done sufficiently well automatically.
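For illustration, the workflow is roughly this (a sketch assuming the debcargo tool used by the Debian Rust team; crate name and version are just examples):

# fetch a release from crates.io and generate a Debian source package
debcargo package sha2 0.10.8
# what remains is mostly reviewing/annotating debian/copyright by hand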
The much bigger hurdle is the bureaucratic overhead. The librust-*-dev namespace is for the most part very well defined, but adding a new crate still requires an explicit approval process, even when uploads are sponsored by seasoned Debian Developers. There was a request for auto-approval for this namespace, like there is for llvm-* or linux-image-*, but back then (many years ago) this was declined.
With this auto-approval rule in place it would also be easier to temporarily have multiple versions of a crate in Debian, to make library upgrades easier. This needs to be done sparingly however, since it takes up space in Packages.xz, which is downloaded by all users with every `apt update`. There's currently no way to make a package available only to build servers (and anyone who wants to act as one), but this concept has been discussed on mailing lists for this exact reason.
This is all very specific to Debian, however; I'm surprised you're blaming Rust developers for it.
> but adding a new crate still requires an explicit approval process, even when uploads are sponsored by seasoned Debian Developers.
What’s the additional security benefit of the explicit approval?
A major problem with Rust (for me) is the multitude of dependencies required even for trivial software, which complicates supply-chain monitoring.
> but given the relative "freshness" of your typical Rust package vs say... Debian stable
That's not how Debian development is done; fresh software is uploaded only to unstable and eventually ends up in a stable release. As a maintainer, Debian stable is something you support, not something you develop against.
This is outdated information. Debian (and other distros) already had their own SBOM format, called buildinfo files, which encode this kind of information.
In Debian stable ripgrep on amd64 is currently on version 13.0.0-4+b2.
If you want certainty, you'd use Reproducible Builds and know for sure which compiler generated this binary from the given source code.
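As a sketch of what that verification can look like with Debian tooling (debrebuild ships in devscripts; the package and file names here are placeholders):

# re-create the build environment recorded in the .buildinfo and rebuild
debrebuild --buildresult=./rebuilt ripgrep_13.0.0-4+b2_amd64.buildinfo
# bit-for-bit identical hashes tie the binary to that exact source and toolchain
sha256sum ripgrep_13.0.0-4+b2_amd64.deb ./rebuilt/ripgrep_13.0.0-4+b2_amd64.deb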
This assumes the code is open source, that you know which specific source code release was used, and that the author of the binary made an effort to document the build environment that was used, e.g. through an attached SBOM.
The build server now needs the actual build dependencies instead of relying on pre-built intermediate build artifacts. This is a good thing, and should be expected from anything that claims to "build from source".
It's going to be a trivial amount of extra work. Packaging build systems (well, the mainstream ones: Debian, RPMs, ebuilds etc.) have the concept of runtime and build-time dependencies, so the maintainers will just need to add a few more packages to the build deps list.
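For example, in a Debian source package the two kinds of dependencies are declared separately; a hypothetical debian/control excerpt:

Source: myapp
Build-Depends: debhelper-compat (= 13), cargo, librust-serde-dev
Package: myapp
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}

Switching from a pre-built artifact to building from source usually just means growing the Build-Depends line.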
Many distributions have a policy of not using generated files, and more are implementing one after the xz hack. Most package systems (not just Debian) have the concept of build and deploy dependencies. It is of course more work, and package systems tend not to be well documented, so it can be hard to figure out, but it is possible: if you ask the experts, they will help.
I'm not a fan of the "Reproducible tarballs" section, because it's explicitly about pre-processing the source code with autotools, instead of distributing a pure, unaltered git snapshot (which `git archive` can already generate in a deterministic way).
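For reference, such a snapshot can be generated deterministically from a tag like this (project name and tag are placeholders; gzip -n keeps the timestamp out of the compressed output, so repeated runs are byte-identical):

git archive --format=tar --prefix=myproject-1.2.3/ v1.2.3 | gzip -n > myproject-1.2.3.tar.gz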
The section that follows then mentions signing the pre-processed source code, which I think is the wrong approach. It creates a difficult situation, because some people strongly encourage signed source code, yet I think autotools is part of the build process and should run on the build server (and be double-checked by reproducible builds). If people pre-process the .orig.tar.xz they upload to Debian, this pre-processing won't be covered by reproducible builds, because it happens undocumented.
The patch for "reproducible tarballs" is quite involved[0] and has rookie mistakes like "pin a specific container image using `@sha256:...` syntax, but then invoke `apt-get update` and `apt-get install` to install whatever Debian ships at that time".
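If you pin the image, the package state needs to be pinned too, e.g. against snapshot.debian.org; a sketch (image digest and snapshot date are placeholders):

docker run --rm debian@sha256:<digest> sh -c '
  echo "deb [check-valid-until=no] https://snapshot.debian.org/archive/debian/20240101T000000Z/ bookworm main" > /etc/apt/sources.list
  apt-get update && apt-get install -y build-essential
'
# check-valid-until=no is needed because the snapshot's Release files have long expired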
6. Patching the hell out of a project without pushing fixes upstream.
7. Inability or failure to source upstream from multiple independent sources, compare them, and verify chain/web of trust using cryptographic signatures.
8. Not following reproducible build guidelines.
9. Not using build caching like sccache (see the sketch after this list).
10. Not building from reproducible sources periodically on the client-side to verify binaries are identical to those built by others.
11. Dependency hell of traditional (non-nix) packages importing zillions of packages.
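For point 9, wiring sccache into a Rust build is nearly a one-liner (a sketch; sccache hooks into cargo via the RUSTC_WRAPPER variable):

export RUSTC_WRAPPER=sccache
cargo build --release
sccache --show-stats   # verify cache hits on subsequent builds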
Maybe you could get involved by pointing out the mistake and proposing the alternative? I imagine that downstream can't easily switch to another distribution method without notice.
The GitHub implementation of git archive does its best to be deterministic. Some reproducible build systems, e.g. Bazel, rely heavily on that.
GitHub had a bug early last year[0] that broke that determinism, and it caused a huge uproar. So, through a mixture of promises to individual projects and the sheer number of projects relying on it, GitHub's git archive has ossified into being deterministic (unless they want to lose a lot of goodwill among developers).
In my experience, yes. Provided it is done with a known git binary etc.
Best to containerize workflows like this with hash-locked deps so they can be easily verified by others even far in the future with any OCI compatible toolchain.
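A sketch of such a workflow (the image digest is a placeholder; the container pins the exact git version used to cut the tarball):

docker run --rm -v "$PWD:/src" -w /src alpine/git@sha256:<digest> \
  archive --format=tar --prefix=proj-1.0/ v1.0 > proj-1.0.tar
sha256sum proj-1.0.tar   # compare against the published checksum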
RSA has been known to be problematic for a very long time at this point; hopefully it just fades into obscurity soon and we can move on to less error-prone ciphers.
The paper also describes how all programming languages that use BigInt data types are likely to be affected. It also specifically refers to the ECDSA TLS attack from 2015, as well as the Minerva/Raccoon attacks which work in a similar way.
As isogeny key exchange algorithms are likely to be all vulnerable after the SIKEp434, SIKEp503, SIKEp610, and SIKEp751 fiasco(s) of last year, I'm curious as to what kind of alternatives you suggest?
Honestly, at this point I'd say anyone who claims there even are "less error-prone ciphers" just hasn't worked in cryptography long enough to have learned otherwise. And I'd go as far as saying it's probably not possible at all, due to how we design CPU hardware and its memory interactions.
Even if it's not the software (and it runs in constant time to prevent timing analysis), there have been lots of acoustic side-channel attacks that just analyze the sounds the transistors on the CPU make. So all your random no-op instructions are kind of useless against that, too.
That is only because the way RSA key exchange has been implemented everywhere involves padding. RSA-KEM[1] has been available since forever and no padding is required, though no one uses it for some reason.
RSA-KEM is also trivial to implement as it's virtually identical to textbook RSA, with the sole restriction that m has to be randomly sampled from 1 .. n-1.
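For reference, the whole scheme fits in two lines (the KDF choice is up to the protocol; this just restates the textbook description above):

\begin{aligned}
\textbf{Encaps}(n, e)&: \quad m \overset{\$}{\leftarrow} \{1, \dots, n-1\}, \quad c = m^e \bmod n, \quad K = \mathrm{KDF}(m) \\
\textbf{Decaps}(d, c)&: \quad m = c^d \bmod n, \quad K = \mathrm{KDF}(m)
\end{aligned}

Only c is transmitted; both sides derive the same symmetric key K, and since m is uniformly random there's no structured plaintext for padding-oracle-style attacks to latch onto.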