
I don't want this to be interpreted as a negative comment about Pijul: I didn't try it, and don't want to judge.

My question is: what is the motivation for making a distributed VCS? Over the entire lifespan of Git, the number of times I've had more than one remote... I can probably count them on my fingers. And I've been in infra / ops for the better part of my career. And all those times were exceptions: I'd do it to fix something, or to move things around one time, and then remove the other remote. Other times it was my hobby projects, shared with someone in some weird way.

Most developers who aren't in infra will never see a second remote in their repositories even once in their career. It seems like developing this functionality adds a significant overhead both in terms of development effort and learning effort on the part of the user. So... why?




> what is the motivation for making distributed VCS?

it depends on whether you're asking about the motivations for distributed version control for linux kernel development in 2005 or the motivations for distributed version control today. git predates AWS, and predates the point where it became very easy and cost-effective for people to stand up central servers, web apps, and things of that nature. My understanding is that "emailing a patch to a mailing list" was a more reasonable workflow then, since it piggybacked off of people's email hosting providers (which, at that time, didn't even mean "everyone using gmail": back when git was created, gmail was invite-only; git predates gmail having open signups). Plus Subversion's branching model wasn't particularly great, so having different people work on things on different branches, giving them feedback, and merging the branches when they were ready wasn't a great experience.

The distributed nature of the version-control system facilitates branches, since a branch and a copy of the repo somewhere else are abstractly the same thing. Practically speaking, people don't push and pull code between their workstations and the network topologies are typically centralized in nature, but on a data level the distributed model is dual to the branching model, and the branching model is the thing that people actually care about. Although I _do_ think it's pretty neat that you can use a thumb drive or NAS as a remote instead of needing a server, it's probably not a core use-case for most people and most projects.


Git was the third VCS I had to migrate to, and I still remember the transition very well. The existence of AWS has very little to do with any of it, really...

The way programming shops used to be run around the time Git appeared is what today you'd call "self-hosting". I.e. a company would have a dedicated machine (or several, depending on the size of the codebase), and those would host the company's Git repository. Not at the start, not now, and not ever was Git primarily used as a distributed system anywhere outside the Linux kernel (and perhaps a few similar projects).

At the time Git appeared it offered some practical advantages over Subversion, which was its main competitor, but those advantages weren't due to the centralized / distributed distinction. Eventually, Subversion caught up to some of the features Git had.

In other words, what you say about making cost-effective servers is absolutely backwards: it's more expensive today to do that. Back in the day you paid for the physical components and electricity, while today you are also financing a huge infrastructure built around physical components and electricity that you don't own.

Where AWS and the like do win today is in situations where your company has multiple international offices and you need to somehow move a lot of data between them. I remember that Git was very welcome in our Israeli office (after switching from Perforce) because the other office was in Canada, and synchronization with them was painfully slow and expensive. Public cloud contributed to solving this problem, but, mostly, it "solved itself" thanks to improvements in network latency and bandwidth over time.

> The distributed nature of the version-control system facilitates branches,

This is just not true. Branches exist in both distributed and centralized VCSs. There was a time when it was "expensive" for e.g. Subversion to have branches (because, oh horror! they had to be created on the central server!) but, today, the way developers work with Git, branches are almost always duplicated on the VCS server anyway. Also, the amount of traffic needed to serve the code is really tiny compared to everything else an organization does, so it's a moot point.

> Although I _do_ think it's pretty neat that you can use a thumb drive or NAS as a remote instead of needing a server,

Nothing stops you from doing the same with a centralized VCS... This isn't a function of the distributed / centralized distinction. Maybe in a particular VCS it's harder to do, but the reason would be that it's virtually never needed, so nobody bothered to implement it / make it easy to do.


This question is really deep. The "distributed" nature of Git makes some things easier than SVN: you can scale to large teams, work offline, split a repo into independent subrepos for a while (managed via branches; good luck with the merge!).

However, this isn't really what Pijul calls "distributed": in Pijul, the term refers to the work people are doing when they collaborate on a shared file. Which data structures allow asynchronous contributions to happen? How do you represent conflicts? Those questions belong to the field of distributed computing, together with things like CRDTs and leader elections.
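
(For a flavour of the kind of data structure involved, here is a minimal grow-only counter CRDT sketched in Python. This is emphatically not Pijul's patch model, and the names are made up for illustration; it just shows the key property: replicas accept updates asynchronously, and merging is commutative, associative and idempotent, so every replica converges to the same state without a central coordinator.)

    # Minimal grow-only counter CRDT (illustrative only; not Pijul's data model).
    # Each replica increments its own slot; merging takes the element-wise max,
    # so concurrent, asynchronous updates converge without a central coordinator.
    class GCounter:
        def __init__(self, replica_id):
            self.replica_id = replica_id
            self.counts = {}  # replica id -> count

        def increment(self, n=1):
            self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

        def value(self):
            return sum(self.counts.values())

        def merge(self, other):
            # Commutative, associative, idempotent: merge order doesn't matter.
            for rid, count in other.counts.items():
                self.counts[rid] = max(self.counts.get(rid, 0), count)

    # Two replicas diverge, then merge in either order and agree on the result.
    a, b = GCounter("a"), GCounter("b")
    a.increment(2)
    b.increment(3)
    a.merge(b)
    b.merge(a)
    assert a.value() == b.value() == 5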


> (managed via branches; good luck with the merge!).

Haha... Oh, this reminds me... there's an Ada mode for Emacs that works like that. In order to build it you need to combine, in the same checkout, multiple branches that each have their own contents (i.e. it's essentially multiple repositories combined into one). It's the most bizarre way to use Git I've seen in my life (outside of total noob stuff, like the guy who wrote his entire project in .gitignore).

On a more serious note: thanks. I see now what that means. Maybe I should find time to look more into the project!


Enable development that does not rely on a central repository. For some people/projects it's important, although these days it's a minority.


My point is that it's a huge overhead for virtually no gain. This is something the Linux kernel needed because of how the community around it is organized, but no commercial product works like that. Even the vast majority of open-source projects don't work like that.

I mean, when you get hired into some programming shop, you don't go door-to-door and ask your new coworkers where their repository is, right? You open the company's wiki and it tells you where the repository is. You clone it and start working on your tickets: pull, push, rinse, repeat. There's no reason for you to pull from a colleague's remote, even if it existed -- all communication is centralized and happens through the central hub, where various corporate policies for working with the repository are enforced (protected branches, CI pipelines, etc.)

Over 99% of all developers in the world don't need this functionality. So, to say that you want to "Enable development that does not rely on a central repository" isn't answering the question. Yeah... it does, but why would you (or the authors of Pijul) care about this extremely rare case?


> why would you (or the authors of Pijul) care about this extremely rare case?

I'm the main author, and my answer is: because it allowed us to model with great mathematical rigor what conflicts are, how to represent them, and how to treat them in the most intuitive and accessible way. The rest is indeed less essential, but still nice to have (I like doing my backups on an external hard drive, using Pijul to copy my software projects).



