I think it's a subtle problem, and it takes more than a few sentences (or the one-line "zingers" found elsewhere in this very thread) to make the required nuance clear. But just as one example, see my comment elsewhere in this thread about the use of globwalk in this example. That's bringing in a ton of code compared to just using walkdir and checking the file extension directly. The crate ecosystem encourages this kind of emergent behavior.
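To make "checking the file extension directly" concrete, here is a minimal sketch. It is an assumption-laden illustration, not the code from the example under discussion: it uses only `std::fs` so that it stays self-contained (walkdir would replace the hand-rolled loop and handles more edge cases), and the function name `files_with_extension` is made up for this sketch.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every file under `root` whose extension equals `ext`
/// (given without the leading dot, e.g. "rs"). Hypothetical helper.
fn files_with_extension(root: &Path, ext: &str) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Directories still waiting to be read; avoids recursion.
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let path = entry.path();
            // `file_type` does not follow symlinks, so symlinked
            // directories are not traversed.
            if entry.file_type()?.is_dir() {
                pending.push(path);
            } else if path.extension().map_or(false, |e| e == ext) {
                found.push(path);
            }
        }
    }
    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}
```

Calling `files_with_extension(Path::new("."), "rs")` would list the `.rs` files below the current directory. No glob syntax is involved, which is the point: the simple case needs only an extension comparison.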
People make the mistake of treating this issue as black and white: either you're for or against code reuse. But the reality is far more nuanced. Often, a dependency will solve a much more general problem than what you actually need, and thus, avoiding the dependency might result in solving a considerably simpler problem than what the dependency does. In exchange, you use less code, which means less to audit/review and less to compile.
Given my position as the author of a few core crates, I actually often find myself advocating against the use of those very crates when the problem could be solved nearly as simply without bringing in the dependency. (I did not author globwalk, but I did author its 'ignore' and 'walkdir' dependencies.)
I would throw in that it's very hard to design APIs well and it's very hard to not couple things and include the kitchen sink of features.
Let's say someone makes an XML parser (just trying to pick an example). IMO a bad XML library would read files. Instead it should just, at best, take some abstract interface to read text and outside the library the docs should provide examples of how to provide a file reader, a network reader, an in memory reader, etc...
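As a rough sketch of that "abstract interface to read text" idea in Rust: the function below accepts anything implementing `std::io::BufRead` and never opens a file itself. `count_open_tags` is a made-up toy stand-in for a parser (it only counts opening tags and ignores comments, CDATA and malformed input), not a real XML library API.

```rust
use std::io::{self, BufRead, Read};

/// Toy stand-in for a parser: counts opening tags such as `<a>` or `<b/>`.
/// It accepts any `BufRead`, so the caller decides where the bytes come from.
fn count_open_tags<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut count = 0;
    let mut prev_was_lt = false;
    for byte in reader.bytes() {
        let b = byte?;
        // A '<' followed by anything other than '/', '?' or '!' is
        // treated as the start of an opening tag.
        if prev_was_lt && b != b'/' && b != b'?' && b != b'!' {
            count += 1;
        }
        prev_was_lt = b == b'<';
    }
    Ok(count)
}
```

The caller supplies the source: `"<a><b/></a>".as_bytes()` for in-memory text, `BufReader::new(File::open(path)?)` for a file, or a `BufReader` around a `TcpStream` for the network. None of those integrations has to live inside the library.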
But I rarely see that. Instead (this is more the npm experience) such a library would include as many integrations as possible. Parse XML from files, from the network, from fetch, from web sockets; there would be a command line utility to parse XML to JSON, and some watch option that continually parses any new or updated XML file and spawns a task to do something with it, all integrated into this "XML Library". Parts of it will respond to environment variables, it will have prettifiers integrated with ANSI color codes and you'll be able to choose themes, and it will have a progress bar integrated into the command line tool for large files.
And the worst part is the noobs all love it. They download this 746-dependency XML library and then ask for even more things to be integrated.
Maybe someday there will be a language with a package manager/community/guidelines that mostly somehow rejects bad bloated packages. It seems like a nearly impossible task though.
Note: I don't know Rust well, but I tried to fix a bug in the docs. The docs are built with Rust. The doc builder downloaded a ton of packages, which to me at least was not a good sign.
Personally, I don't see this particular problem as widespread in the Rust crate ecosystem. It's definitely one way in which dependencies can proliferate, but not a terribly common one in my experience. There's a lot of focus on inter-operation using common interfaces.
Yet another possibility that I think has a place in this nuanced conversation is adapting existing code with changes. It's interesting that today we don't have a culturally accepted way of doing copying + adaptation, even though it has been very common over most of the history of computing. It has the obvious downside of not easily getting improvements and bug fixes from later versions of the upstream code, and it is rightly passed over in most cases, but in some cases it's still a win.
It's interesting that this line of argument perhaps leads to the conclusion "keep libraries small, to allow more granular selection of dependencies".
But this is the kind of reasoning that leads to the situation in npm of thousands of tiny libraries such as "leftpad" that many systems programmers are so derisive of.
I'm not sure that's the only choice. I think another choice is to help educate folks when a dependency could be removed in favor of a little code. Take regex or aho-corasick for example. aho-corasick is a lot of code and only has one tiny dependency itself (memchr). There's really not much left to break apart. But I wouldn't recommend the use of aho-corasick for every case in which you need to search for multiple strings in one pass. A trivial solution using multiple passes is quite serviceable in a number of cases.
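A minimal sketch of the "trivial solution using multiple passes" mentioned above, assuming all you need is the leftmost match among a handful of needles. The name `find_first_of` is hypothetical; it simply runs one `str::find` pass per needle instead of building an automaton.

```rust
/// Returns `(needle index, byte offset)` of the leftmost match of any
/// needle in `haystack`, or `None` if no needle occurs.
/// Makes one full pass over the haystack per needle.
fn find_first_of(haystack: &str, needles: &[&str]) -> Option<(usize, usize)> {
    needles
        .iter()
        .enumerate()
        // One `find` pass per needle; keep the ones that matched.
        .filter_map(|(i, needle)| haystack.find(*needle).map(|pos| (i, pos)))
        // Pick the match that starts earliest; ties go to the earlier needle.
        .min_by_key(|&(_, pos)| pos)
}
```

For example, `find_first_of("the quick brown fox", &["fox", "quick", "cat"])` returns `Some((1, 4))`. The cost grows with the number of needles, so this is only reasonable when there are few of them and the haystack is not enormous, which is exactly the grey area in question.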
it would be great to have an https://alternativeto.net/ style recommendation engine integrated into crates.io (or npm) that can narrow down the minimal crate required for a specific use case.
It's an interesting idea, but would take a fair bit of creativity to execute well I think. It's really hard to enumerate this grey area! But I think something that elaborates on this grey area with lots of examples would be great. Sounds like a blog post I should write. :)
> It's an interesting idea, but would take a fair bit of creativity to execute well I think.
yeah for sure, it's a hard problem. but at least a high level listing of similar libraries in the same category with # of recursive deps and total bloat size clearly visible so it was easy to shop around and sort by whatever attribute the developer wants to prioritize.
it would be similar to category-specific filters & comparison tables for features of e.g. SLR cameras, usb3 power supplies on amazon.
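For what it's worth, the "# of recursive deps" figure is cheap to compute once the dependency graph is available. The sketch below assumes a hypothetical in-memory graph (package name mapped to its direct dependencies); it does not talk to crates.io or npm, and `transitive_dep_count` is a made-up name.

```rust
use std::collections::{HashMap, HashSet};

/// Counts the distinct transitive dependencies of `root`, where `graph`
/// maps a package name to the names of its direct dependencies.
fn transitive_dep_count(graph: &HashMap<&str, Vec<&str>>, root: &str) -> usize {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut stack: Vec<&str> = vec![root];
    while let Some(name) = stack.pop() {
        // Packages missing from the map are treated as having no dependencies.
        for &dep in graph.get(name).map(|v| v.as_slice()).unwrap_or(&[]) {
            // Only visit each package once, so shared deps and cycles are fine.
            if seen.insert(dep) {
                stack.push(dep);
            }
        }
    }
    // A cycle back to the root should not count the root as its own dependency.
    seen.remove(root);
    seen.len()
}
```

With a graph where `app` depends on `a`, `a` depends on `b` and `c`, and `b` depends on `c`, the count for `app` is 3: the shared dependency `c` is counted once.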
The problem is that walkdir isn't really an alternative to globwalk. It's only a good alternative in this specific simple case. globwalk even depends on walkdir. :)
i don't see a problem with listing both, as long as the high-level featureset is sufficiently enumerated in a comparison table for me to quickly discard or add the lib to my to-try list.
There's no problem in listing both. The point is that a high level feature list comparison wouldn't have necessarily helped here.
Having lists of alternative libraries to solve a particular problem is great. It is valuable on its own. But it doesn't really fix the nuance that I'm focusing on here in this example. And my general claim is that this sort of nuance is a fairly common problem that leads to unnecessary dependencies.
What about a less ambitious goal: fuzzy keyword search that displays the dependency graph and cumulative build time of each node in the package graph (activity stats on hover)? Users can assess for themselves whether a simpler library meets their needs, but this would make the relevant information more accessible.
I think that Clojure does something interesting here, but I can't quite put my finger on it. Apparently, it uses very small and abstract libraries and seems to lead to narrowed down dependencies.