> 2) Rust's ecosystem makes it easy to find and integrate solutions not offered ...

fractionalhare · on Oct 18, 2020

> Often times when asking about how to solve a problem in Rust, the answer comes in the form of "use crate X" rather than an explanation of how to solve the problem using the language itself.

Do you think this is a bad thing? How frequently would you say it's a better use of a software engineer's time to carefully reimplement a commoditized library that already exists for a speed improvement or to understand it?

I don't really understand arguments that you shouldn't introduce convenience and quality of life features because programmers will lean on them too much. Leaning on them is the point; the implicit thesis is that programmers no longer have to understand what's happening under the hood. It's specialization and division of labor applied to programming.

And if you really care, you can probably just look at the source code or pick up a book to learn how it works?

burntsushi · on Oct 18, 2020

I think it's a subtle problem and it takes more than a few sentences to make the required nuance in the argument clear (or the one line "zingers" found elsewhere in this very thread). But just as one example, see my comment elsewhere in this thread about the use of globwalk in this example. That's bringing in a ton of code when compared to just using walkdir and checking the file extension directly. The crate ecosystem encourages this kind of emergent behavior.

People make the mistake of treating this issue as black and white: either you're for or against code reuse. But the reality is far more nuanced. Often, a dependency will solve a much more general problem than what you actually need, and thus, avoiding the dependency might result in solving a considerably simpler problem than what the dependency does. In exchange, you use less code, which means less to audit/review and less to compile.

Given my position as the author of a few core crates, I actually often find myself advocating against the use of those very crates when the problem could be solved nearly as simply without beinging in the dependency. (I did not author globwalk, but I did author its 'ignore' and 'walkdir' dependencies.)

greggman3 · on Oct 18, 2020

I would throw in that it's very hard to design APIs well and it's very hard to not couple things and include the kitchen sink of features.

Let's say someone makes an XML parser (just trying to pick an example). IMO a bad XML library would read files. Instead it should just, at best, take some abstract interface to read text and outside the library the docs should provide examples of how to provide a file reader, a network reader, an in memory reader, etc...

But, I rarely see that. Instead (this is more the npm experience) such library would include as many possible integrations as possible. Parse XML from files, from the network, from fetch, from web sockets, there would be a command line utility to parse XML to JSON, and some watch option that continually parses any new or updated XML file and spawns an task to do something with it, all integrated into this "XML Library". Parts of it will respond to environment variables and it will have prettifiers integrated with ANSI color codes and you'll be able to choose themes, and it will have a progress bar integrated into the command line tool for large files

And the worst part is the noobs all love it. They down load this 746 dependency XML library and then ask for even more things to be integrated.

Maybe someday there will be a language with a package manager/community/guidelines that mostly somehow rejects bad bloated packages. It seems like a nearly impossible task though.

Note: I don't know rust well but I tried up fix a bug in the docs. The docs are built with rust. The doc builder downloaded a ton of packages which to me at least was not a good sign

burntsushi · on Oct 18, 2020

Personally, I don't see this particular problem as widespread in the Rust crate ecosystem. It's definitely one way in which dependencies can proliferate, but not a terribly common one in my experience. There's a lot of focus on inter-operation using common interfaces.

fulafel · on Oct 20, 2020

Very true.

Yet another possibility that I think has a place in this nuance conversation is adapting existing code with changes. It's interesting that we don't culturally today have an accepted way of doing copying + adaptation, but over most of the history of computing it has been very common. It has the obvious downsides of not easily getting improvements and bug fixes of later versions of the upstream code, and rightly is passed over in most cases, but in some cases it's still a win.

sullyj3 · on Oct 18, 2020

It's interesting that this line of argument perhaps leads to the conclusion "keep libraries small, to allow more granular selection of dependencies".

But this is the kind of reasoning that leads to the situation in npm of thousands of tiny libraries such as "leftpad" that many systems programmers are so derisive of.

burntsushi · on Oct 18, 2020

I'm not sure that's the only choice. I think another choice is to help educate folks when a dependency could be removed in favor of a little code. Take regex or aho-corasick for example. aho-corasick is a lot of code and only has one tiny dependency itself (memchr). There's really not much left to break apart. But I wouldn't recommend the use of aho-corasick for every case in which you need to search for multiple strings in one pass. A trivial solution using multiple passes is quite serviceable in a number of cases.

leeoniya · on Oct 18, 2020

it would be great to have an https://alternativeto.net/ style recommendation engine integrated into crates.io (or npm) that can narrow down the minimal crate required for a specific use case.

burntsushi · on Oct 18, 2020

It's an interesting idea, but would take a fair bit of creativity to execute well I think. It's really hard to enumerate this grey area! But I think something that elaborates on this grey area with lots of examples would be great. Sounds like a blog post I should write. :)

leeoniya · on Oct 18, 2020

> It's an interesting idea, but would take a fair bit of creativity to execute well I think.

yeah for sure, it's a hard problem. but at least a high level listing of similar libraries in the same category with # of recursive deps and total bloat size clearly visible so it was easy to shop around and sort by whatever attribute the developer wants to prioritize.

it would be similar to category-specific filters & comparison tables for features of e.g. SLR cameras, usb3 power supplies on amazon.

burntsushi · on Oct 18, 2020

The problem is that walkdir isn't really an alternative to globwalk. It's only a good alternative in this specific simple case. globwalk even depends on walkdir. :)

leeoniya · on Oct 18, 2020

i don't see a problem with listing both, as long as the high-level featureset is sufficiently enumerated in a comparison table for me to quickly discard or add the lib to my to-try list.

burntsushi · on Oct 18, 2020

There's no problem in listing both. The point is that a high level feature list comparison wouldn't have necessarily helped here.

Having lists of alternative libraries to solve a particular problem is great. It is valuable on its own. But it doesn't really fix the nuance that I'm focusing on here in this example. And my general claim is that this sort of nuance is a fairly common problem that leads to unnecessary dependencies.

jedbrown · on Oct 19, 2020

What about a less ambitious goal: fuzzy keyword search that displays the dependency graph and cumulative build time of each node in the package graph (activity stats on hover)? Users can assess for themselves whether a simpler library meets their needs, but this would make the relevant information more accessible.

burntsushi · on Oct 19, 2020

Sure. Also sounds useful. crates.rs (an alternative UI for crates.io) has the beginnings of this by listing the binary size of the crate's dependencies. e.g., https://lib.rs/crates/globwalk vs https://lib.rs/crates/walkdir

jnxx · on Oct 19, 2020

I think that Clojure does something interesting here, but I can't quite put my finger on it. Apparently, it uses very small and abstract libraries and seems to lead to narrowed down dependencies.

tialaramex · on Oct 18, 2020

It does (inevitably) really depend on the quality of the library. We really saw this for say, SSL deprecation.

If you made good choices (e.g. maybe you depend on Python's Requests) you had to go out of your way to make problems for yourself, by default the software understands that it should prefer shiny modern cryptography, allow anything that's probably fine, and reject nasty broken stuff. What counts as "shiny modern" or "nasty broken" will vary over time, but it's not likely you, as the application developer, were best placed to decide when you wrote the code, so unless you specifically did so the runtime decides.

In contrast Microsoft shipped not one but several distinct C++ APIs and .NET APIs where you'd need to go back and periodically modify old code (e.g. to specify newer TLS versions) or now it doesn't work because there was no provision for programmers to just let the runtime do the Right Thing.

colmmacc · on Oct 18, 2020

For almost any real world application, where people's data and businesses are at stake, developer convenience is a distant priority to security. Of course it's great when we can have both, and making secure solutions the most convenient is key, but Rust doesn't quite get this right yet. If there's a security issue in one of those dependencies you are importing, how do you even know that you are impacted? What about when it's a dependency of a dependency? Making it easy for developers to suck in a huge TCB at build time has some serious downsides.

mplanchard · on Oct 18, 2020

The ecosystem also includes stuff like cargo-audit[0], with which you can easily add dependency auditing to your CI pipelines or whatever. People are definitely thinking about this problem.

[0]: https://github.com/rustsec/cargo-audit

tijsvd · on Oct 18, 2020

I very much agree with your point, and want to add that in certain contexts, the same goes for performance.

I do think that Rust making it easy doesn't necessarily mean that you have to use it. It's fine to write applications using only the stdlib, and only link to some known libs like openssl or libpq.

In Rust's defense, it is easy to list all dependencies, and they don't automagically upgrade from one checkout to another.

jamesmishra · on Oct 18, 2020

When security is a concern, developers can simply avoid using untrusted crates. But concerns over security are not reason for a package manager to not even exist. Pretty much every major language and operating system has a package manager. While supply chain attacks are real, they are not unique to Rust in any way.

vvanders · on Oct 18, 2020

As someone who's spent a few decades working in C++ codebases I'm not sure how this is a Rust specific problem.

Last large-scale C++ codebase I worked on took 60+ minutes to compile when you touched the precompiled header.

In fact, I've found it to be common to have binary-only library dependencies in C++ which makes it harder to tell what those downstream dependencies do.

allo37 · on Oct 19, 2020

I think you just run into it faster in Rust: I tried making a simple REST HTTP server (like 60 LoC) using actix-web and it pulled in over 200 sub-dependencies and took something like 10 minutes to compile.

This toy program written using Boost.Beast or something similar would compile much faster. Of course I would have spent all day setting up the library instead of waiting for it to compile...

brundolf · on Oct 18, 2020

> this inevitably leads to deep dependency trees, slower compile times

I've never thought that "having more friction when adding dependencies" is a valid strategy for preventing overuse of third-party code. Unlike other "trust the developer" scenarios, dependency bloat is a minor issue at best and is very easy to detect and diagnose. IMO you should cut down your dependencies when they actually become a problem that's bigger than the one they're solving.

> and less knowledge about what's actually happening inside most rust code bases

I actually think the package ecosystem (along with macros) is crucial for the breadth and accessibility that Rust provides. Rust is a low-level language, but you have high-level libraries at your fingertips if that's the level where you need to be working. This means that instead of writing hot-paths in a low level language and application code in a high-level language (and dealing with the added complexity in terms of FFI, builds, tooling, developer knowledge, etc), it's very possible to just write the entire thing in Rust. Again in this case, I think "forcing devs to learn how everything works" is a weak argument for increasing the friction for adding dependencies. I'm pretty sure that everything on crates.io is required to include source, not just a binary, so it's trivial to dive in and learn how it works if you're so inclined.

serial_dev · on Oct 18, 2020

I think it's important to say the silent part out loud: "just use crate x and you can look up how the crate works behind the scenes if you are curious".

akiselev · on Oct 18, 2020

The worst part is how easy it is to add derive and proc macros that kill compile times. I have a ~3k LOC project that blows up to over 45k LOC after `cargo expand` due to a dozen or so macros and the compile time is really starting to hurt iteration speed. Sadly, macros are by far one of my favorite Rust features.

I ended up paying for Clion just to get better debugger integration and a quick action for cargo expanding a file in the UI so I can copy paste the macro results into my codebase. I'm hoping that will improve incremental compilation times until I buy a Threadripper workstation.

brundolf · on Oct 18, 2020

Does the expanded code take longer to compile than the equivalent handwritten code? Is it just the expansion itself that takes a long time?

In the latter case, maybe cargo needs an intermediate macro-expansion cache (instead of just a crate-level build cache)

TheCoelacanth · on Oct 19, 2020

Often, you would write a lot less code if you were doing it by hand.

mrlonglong · on Oct 18, 2020

I adore CLion, it's brilliant for coding. At the moment I'm using it with Python and Git.

throwaway189262 · on Oct 19, 2020

Compile times in Rust are really the only problem with this. Rust with LTO on is really good about removing unused code, so it doesn't hurt you during runtime

Whenever you run anything there's millions of lines involved across the OS API's directly or indirectly, so in practice it doesn't matter what's happening behind the scenes. In languages with fast compile times like Go and Java people add dependencies without even caring. And I think that's how it should be.

All our Java backends at work are like 200MB and it doesn't matter and nobody cares. It still compiles in less than ten seconds, and all that code barely affects startup time.

agumonkey · on Oct 18, 2020

I'm not a rust/cpp guy but I have a feeling that people skilled enough to need rust or cpp can probably use public libs for rapid prototyping and then recreate a smaller version if needs be.

hannofcart · on Oct 18, 2020

This statement is a bit confusing: are you saying that there is such a thing as too much code reuse? That it should be necessarily a bit difficult to integrate a third party library in your codebase?

skohan · on Oct 18, 2020

There's always a balance point. For something like JSON parsing, or an HTTP client, for sure I want to use a library. But when code reuse means pulling in a giant library to do one specific thing, or else introducing a runtime like tokio where it otherwise might not be needed, I would probably rather implement a small tailored solution specifically for my use-case.

mypalmike · on Oct 18, 2020

Ideally, there should be no harm in pulling in a library for just 1% of its functionally. A combination of careful library design and tooling should take care of dependency bloat ("shaking the tree"). In practice of course, libraries don't always consider this aspect, and the tooling never seems quite up to the task.

skohan · on Oct 19, 2020

Yeah I think it’s also a matter of API design. Like a crate which by necessity has to handle the “general case” to some degree will often have more API complexity than a solution which is tailored to one specific use-case. For instance, you might have to concern yourself with configuration parameters which have zero relevance to your use-case.

Measter · on Oct 18, 2020

In one sense, the way that larger crates in Rust are fragmented does address some of that.

This does result in large dependency trees, but it does also mean that if you need that one specific functionality, you may not need to pull in the full library, but only that smaller section.