Hacker News

I don't get what it does. Does this extension make large binary files "diffable," as it states that's the problem it solves?


Binary files are already diffable, both in how they're stored (in fact, Mercurial stores nothing but binary diffs internally) and in terms of sending around patches (that's what the Git patch format is for).

There are two problems that largefiles tries to solve. First, while binary files are technically diffable, most of the popular formats store large amounts of compressed data, so a small change to the content can alter most of the bytes in the file, and the resulting diffs are barely smaller than the file itself. Second, distributed version control systems tend to include the entire history in every repo. Combine the two and you've got a recipe for disaster: those 200 MB worth of textures that you just color-corrected are now another 200 MB of data that every last developer has to download whenever they fetch your repository.
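To make the compression point concrete, here is a rough sketch in Python. It is not anything largefiles itself does: the sample data, the helper names, and the choice of zlib are all assumptions made for illustration. It changes a single byte of the input and counts how many byte positions differ before and after compression.

```python
import random
import zlib


def count_differing_bytes(a: bytes, b: bytes) -> int:
    """Count positions where a and b differ, plus any difference in length."""
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


def make_sample(size: int = 100_000, seed: int = 0) -> bytes:
    """Hypothetical stand-in for an asset: bytes drawn from a small alphabet,
    so the data compresses reasonably well."""
    rng = random.Random(seed)
    return bytes(rng.choice(b"abcdefgh") for _ in range(size))


def flip_first_byte(data: bytes) -> bytes:
    """Return a copy of data with only its first byte changed."""
    return bytes([data[0] ^ 0xFF]) + data[1:]


if __name__ == "__main__":
    original = make_sample()
    edited = flip_first_byte(original)

    packed_original = zlib.compress(original)
    packed_edited = zlib.compress(edited)

    print("raw bytes differing:       ",
          count_differing_bytes(original, edited), "of", len(original))
    print("compressed bytes differing:",
          count_differing_bytes(packed_original, packed_edited),
          "of", len(packed_original))
```

The raw inputs differ in exactly one byte. How much of the compressed output differs depends on the data and the zlib build, but the change tends to spread well beyond one byte, which is the effect that makes diffs of compressed formats so large.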

largefiles solves this by saying that certain user-designated files are not actually stored in the repository. Instead, stand-ins are stored: one-line text files containing the SHA-1 hash of the file they represent. Whenever you update (checkout, in Git parlance) to a given revision, largefiles fetches your missing files on demand, either from the central store or (if available) from a per-user cache.
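For anyone who wants to see the stand-in idea in code, here is a minimal sketch in Python. It is not the extension's actual implementation: the function names, the directory layout, and the use of a local directory in place of the central store are all hypothetical. It only shows the shape of the scheme described above, which is to hash the file, version a one-line stand-in, and resolve the hash through a cache before going to the store.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Hash a file in chunks so large assets need not fit in memory."""
    h = hashlib.sha1()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def add_largefile(path: Path, standin_dir: Path, store: Path) -> Path:
    """Put the real file in the store under its hash and write a one-line
    stand-in; only the stand-in would be committed to the repository."""
    digest = sha1_of(path)
    store.mkdir(parents=True, exist_ok=True)
    blob = store / digest
    if not blob.exists():
        shutil.copyfile(path, blob)
    standin_dir.mkdir(parents=True, exist_ok=True)
    standin = standin_dir / path.name
    standin.write_text(digest + "\n")
    return standin


def materialize(standin: Path, dest: Path, cache: Path, store: Path) -> Path:
    """On update, turn a stand-in back into the real file, preferring the
    per-user cache and falling back to the store (a plain directory here,
    standing in for a network fetch)."""
    digest = standin.read_text().strip()
    cached = cache / digest
    if not cached.exists():
        cache.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(store / digest, cached)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(cached, dest)
    if sha1_of(dest) != digest:
        raise ValueError(f"hash mismatch for {dest}")
    return dest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        asset = root / "texture.bin"
        asset.write_bytes(b"pretend this is 200 MB of textures")
        standin = add_largefile(asset, root / "repo", root / "store")
        print("stand-in contains:", standin.read_text().strip())
        restored = materialize(standin, root / "work" / "texture.bin",
                               root / "cache", root / "store")
        print("restored matches:", restored.read_bytes() == asset.read_bytes())
```

Because the stand-in is just text, the repository history only grows by one short line per asset revision, while the blobs live outside it and are fetched per revision.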

The benefit of this approach is that, if you just want the newest revision, you don't have to also fetch all the historical versions of all the assets. The downside is that a clone doesn't, by default, have the full, reconstructable history of the entire repository. Whether this trade-off works for you will largely depend on who you are and what your workflow is, but we've found many Kiln customers who find it to be an excellent trade-off.


That's exactly what it doesn't do. It versions the checksum referring to a largefile along with the rest of your repository, meaning it's little more than sugar for a fancy "network symlink".

However, this should be enough in most circumstances, e.g. to allow representing the complete state of a Debian package repository without causing Mercurial to slow to a crawl. (Note: I just made this use case up; it's only an example.)



