CVS was awful, but it was so much better than RCS, which operated only on single files. There was a time in the late 90s when CVS + Bugzilla + Tinderbox was the state of the art of CI if your toolset was all open source. I used this combination at multiple workplaces 1998-2002. It worked and we shipped, but we didn't really understand how bad the tools were.
At Sun all stuff was in Teamware, except when it wasn't, because every group at Sun could do whatever they wanted to (we used SVN in the x86 ILOM team). Teamware was good but suffered by being a wrapper on top of SCCS.
SVN was a revelation after CVS. I resisted switching to git from SVN for a long time because the mental model for SVN is so much simpler: everything is in one central place, every change is marked with a monotonically increasing version number, remember a number and you can always reproduce the state of the project. Eventually I saw the huge benefit of the "git workflow" (local branches, pull requests) for collaboration. Branches in SVN are error prone and always risked painful conflict resolution, so we made them rarely.
Perforce, which is commercial, was like stepping into another dimension (you mean all this stuff just works? Out of the box?). There's another historical timeline where Perforce has an early open-source model (like Redhat) and is the dominant VCS.
I enjoyed that remembrance of 20 years working on free-software based projects, not a bad way to start Sunday. I hope it was worth your time!
The branching isn't the problem, but the merging is. My experience is that it depends on how closely the team is working together. If everybody is working on separate files, then SVN can merge together on a file-level. If people are working on the same files, then every such file requires manual merging. Since git works on a line-by-line basis, merging is much easier. It still requires a human to look through the resulting merge for logical consistency, but it gets a lot closer on the first try.
IIRC SVN merges only created conflicts if the same lines were changed. If I edited line 10 and another developer edited line 200 there was no problem. It's been a while, but I also never encountered any serious issues with branching and merging in SVN.
We have multiple people working on the same files and use Eclipse to do line-by-line merges automatically (except when there are actually conflicts). It works fine. So I still haven't seen any real evidence of error prone merges.
If someone could come up with error-free merging for all situations, it would be the "killer app" for configuration management. I would especially look forward to merging for XML files as well.
You're never going to avoid all errors, but with additional tools SVN merging seems no more error prone than Git.
I do agree that source code control needs to move beyond lines to deal directly with nodes in a tree structure. That could be XML, JSON, a C++ parsed abstract syntax tree, etc.
So I've actually thought about this a fair bit recently, since I'm making some changes to C++ code that touch a lot of files in reasonably straightforward ways. The changes should be possible to merge conflict-free in basically 100% of cases, but since so many lines are touched, no VCS merge tool can manage it. Most of my changes are just typenames, so I figured, you could do some sort of AST-based merge to allow those changes to merge with semantic changes to the surrounding code.
But this has a lot of problems. Namely - most of the time, when two people change the same line, the changes really do conflict, and someone really does need to manually merge them. In fact, sometimes even when your VCS merges things happily, it still gets them wrong because two changes conflict semantically but don't touch the same lines.
So I guess I would say, I think merging is an impossible problem to 'solve', and going to a more granular merge strategy is actually a move in the wrong direction. You could maybe fix my particular case by 'whitelisting' those typename changes, and saying 'these are allowed to merge with anything else', but at that point... the effort required to specify that unambiguously and make it work properly is probably higher than just merging the changes.
My experiments showed that you can get good character-based diffs from using a syntax highlighting tokenizer. [1] It was faster than a pure character diff algorithm, but the resulting diffs/patches are interoperable, and provided quite reasonably semantic diffs on the examples I tested. Most tokenizers are great too because they tend to operate very well with work-in-progress code or broken/edge-case code (because they are the same sorts of tools we use in our editors).
I've not seen someone actually use it for source control, but the core idea is simple enough for the taking.
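To sketch the idea (this is only an illustration, assuming Python with the pygments library installed; it is not the actual code behind [1]): tokenize both versions with an off-the-shelf highlighting lexer, then run an ordinary sequence diff over the token stream instead of over lines or raw characters.

    # Token-level diff sketch: pygments supplies the tokenizer, difflib the diff.
    import difflib
    from pygments.lexers import get_lexer_by_name

    def token_diff(old_src, new_src, lang="python"):
        """Yield (op, old_text, new_text) hunks, comparing token streams."""
        lexer = get_lexer_by_name(lang)
        # get_tokens() yields (token_type, text) pairs; we diff on the text.
        old_tokens = [text for _, text in lexer.get_tokens(old_src)]
        new_tokens = [text for _, text in lexer.get_tokens(new_src)]
        matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
        for op, i1, i2, j1, j2 in matcher.get_opcodes():
            if op != "equal":
                yield op, "".join(old_tokens[i1:i2]), "".join(new_tokens[j1:j2])

    for op, old, new in token_diff("total = a+b\n", "total = a + b + c\n"):
        print(op, repr(old), "->", repr(new))

Because the hunks are plain text spans, they can still be flattened into ordinary character-offset patches.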
For example, I have used .gitattributes to make git use MS Word to merge docx files, and LabView to merge .vi files. TortoiseGit (for Windows) includes diff/merge driver scripts for use with MS Word files.
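The mechanism is just a custom merge driver; roughly like this, where merge-docx.sh is a placeholder for whatever wrapper actually drives Word (TortoiseGit ships its own scripts):

    # .gitattributes: route docx files to a custom merge driver
    *.docx merge=msword

    # .git/config (or ~/.gitconfig): define that driver; %O, %A and %B are
    # the ancestor, ours and theirs files that git hands to the script
    [merge "msword"]
        name = MS Word 3-way merge
        driver = merge-docx.sh %O %A %B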
While not a source control system, the JetBrains IDEs have a very good merge tool that is filetype aware. For example, WebStorm/IntelliJ Ultimate, when merging JavaScript, often has a "magic wand" option that can detect when merge conflicts are actually resolvable if you parse both sides first, and merge the AST.
It doesn't always work quite right, but it's good enough that I use it quite often.
Totally agree. Could AI or machine learning be used for this type of application? I don't know much about machine learning, but I could imagine that it could be used to learn the programming habits of a development team and, perhaps, use it as a context for determining its merging strategies.
I think my memory of pain is during merging. We settled on a fairly simple, conventional system: mainline new dev, branches for each release, and occasional personal branches for experimentation. This meant we rarely merged other than fixes to sustaining branches.
As others have mentioned, the problem with branches in SVN isn't the branching, it's the merging. IIRC, this was because SVN didn't track merges, so if for instance you have a "stable fixes for release X" branch which is repeatedly merged into the trunk, you'd soon encounter wonky behavior.
Subversion absolutely tracks merges, but you don't really need that just to keep a stable branch regularly updated. That's a very normal and expected use of a branch and should not give you "wonky behavior" in either git or svn. Thousands of projects do it every day. Any problems encountered are likely not inherent to your version control system.
It definitely didn't use to track merges. That got added after most people jumped ship to git. Before that, you had to specify the common ancestor commit of the things you were merging, which you usually didn't know and had to dig around in slow logs to find out.
It has tracked merges longer than github has existed, just to keep things in perspective. But, again, it's not required just to keep stable branches around since commit numbering is linear in svn. It's very convenient for anything less trivial than that though.
I still don’t like the Perforce model of workspaces or whatever they call it (clients?). That mapping between the server folders and your local ones. Extra complexity for no obvious benefit I could see.
We used to do code + art in the same 1TB perforce repo back when I did gamedev. Art folder was ~600GB. Having a workspace to eliminate that as a developer was awesome.
Git still fails to even approach the usefulness that P4 brought to those types of shops. Between the auto-cache proxies, dealing with 1TB+ repos and workspaces there's a reason P4 still does pretty well.
I won't reply to all the comments saying mostly the same thing, but Git doesn't have this, true. SVN, another centralized system, does have it. Shallow checkouts plus folder checkouts and you can have any combo you want. Without adding another concept to take care of.
It was great for large projects. I worked on a project that was similar to Android. Imagine 10 device trees (device drivers), two OS trees (Linux, vxWorks), and two application trees (the old stuff, the new stuff). You could define a mapping that got you just the trees you needed, which both sped up builds and VC operations (although p4's local indexing made most VC operations lightning fast).
Imagine a world where 'svn up' or 'cvs up' takes 20 minutes. Not only did the client mappings limit the scope of the operations, but local indexing brought the time for those operations down to 0-4 seconds.
The complexity is completely optional. Your client map can just map the entire depot to a single directory. And it has always been very easy to enforce this at the organizational level, with a small script in the server.
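For anyone who never touched Perforce, a client spec is just a few lines of mapping (depot paths made up here); the exclusion line is how a workspace like the gamedev one mentioned above would drop the art folder:

    Client: alice-dev
    Root:   /home/alice/work
    View:
            //depot/game/...        //alice-dev/game/...
            -//depot/game/art/...   //alice-dev/game/art/...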
Rule of thumb, one man's useless version control feature is another man's saving grace. I've never needed to open multiple changelists simultaneously in one work area, but I'm sure someone loves that feature.
Yeah, the mapping was a wart. Piper+CitC is much nicer, it's a file system that pretends you have the whole tree locally, while actually storing only the stuff you changed.
This article makes it seem like ancient history, but there are still projects around that use CVS or a very similar VCS, for example ClearCase. My software engineering career started after git got popular, but I had to work with ClearCase. I wouldn't wish it on anybody.
The inability to go back to old states of your project unless you happened to tag them makes finding the cause of bugs extremely hard. I don't know how it is in CVS, but in Clearcase you can tag only a subset of the files and what you check out from the server is determined by a complex configuration file ("the configuration specification"). This is an additional hurdle to reconstructing old states of your software because now you need to know the config-spec to do so. Even if you have the old config spec, it might contain fallback rules ("just take the latest version of the file") that effectively make it impossible to reproduce the old state. Unless you're extremely disciplined in your usage of the tool it's a real challenge to fix bugs for old releases. And don't get me started on trying to backport bugfixes into several old releases when branches are managed for each file separately.
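To give a flavour, a config spec is a small rule file like this (simplified from memory); the last line is exactly the kind of fallback rule I mean, and it silently picks up whatever happens to be newest:

    element * CHECKEDOUT
    element * RELEASE_2.3_LABEL
    # fallback rule: anything not labelled just takes the latest version
    element * /main/LATEST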
This lack of a proper project wide history also makes it extremely challenging to migrate to a different tool without losing a lot of information.
> My software engineering career started after git got popular, but I had to work with ClearCase
Me too
I agree with you. The code I had inherited was built with clearmake and no one had any idea how to move it to a newer build system. It was multi-platform code and you had to compile all the C++ code on an HP-UX machine first. The build would fail the first time and then succeed the second time. Once it was built on HP-UX, it could be built on any other platform.
There was a dedicated "build engineer" who was the only one who knew how to fix build issues. Thankfully the project was finished by the time he had left the company :)
We previously used Clearcase at work. At one point a coworker ran a survey asking for people's experience with different VCS options. The best reply I saw was : "Clearcase is the single most productivity destroying tool I have ever worked with."
ClearCase is the reason why I will avoid all and any (and will encourage others to avoid) SW products by that famous three-letter company unless absolutely needed (but it never is)
ClearCase was just as delightfully terrible before IBM bought it, and before Rational bought it before that. Writing off all software from IBM because they sell some lame software just seems weird. It's a huge company. I'm sure some people who enjoy soup have written bad software, but it's saner to evaluate whether software is actually good or bad rather than checking if the authors like soup.
>This article makes it seem like ancient history, but there are still projects around that use CVS or a very similar VCS, for example ClearCase.
Don't remind me. I worked for a group in a top tech company that used cvs until 2014 and were very resistant to switching. They finally switched due to a department mandate to stop using cvs (they were not the only team in the department using cvs).
I have source sitting around that was maintained in CVS. Nothing I'm still using, but old archives I keep for nostalgia... But I feel really old now, when tech I used in the early part of my career is now considered ancient enough to justify "historical" accounts of how we worked in the dark ages.
But I do remember when CVS was considered a step up, so I suppose I'm getting old.
This is a really fun article. I started with CVS, graduated to Subversion, and now to Git. It makes me really curious what the next paradigm shift will be!
The way I see it, the evolution of these version control systems was driven by the precipitously falling price of disk space.
CVS only stored a tiny bit of metadata locally so almost every operation required contacting the server. That made using CVS very slow and branching expensive.
Subversion stored a whole separate copy of every file locally. This made the critical "show me just my local changes" operation an order of magnitude faster and made branching cheap, at the expense of doubling the disk usage.
Git stores the entire history of everything locally (compressed). This makes most operations an order of magnitude faster still, so much faster that some things that were completely impractical with the earlier systems are now routine, and branches are free.
It's possible to check out partially without hacking, so it should be in a better place than git in that respect. Other than that I'm not sure if there is any support (similar to LFS). It's also very much in its infancy, so performance is likely not a primary focus.
> ... makes me really curious what the next paradigm shift will be
I know this is HN and we need to know 100 git commands. We should do interactive rebases, we should cherry-pick changes, and all that jazz, but I really hope the next source control will be a whitelisted subset of git commands: push, pull, commit, amend and rebase, or something similar.
> I know this is HN and we need to know 100 git commands.
The ones that I normally use are fetch, merge, push, reset, rebase, clone, diff, status, branch, log and checkout. When I used svn, I had to use co, commit, add, delete, diff, status, log, copy, and update. Eleven versus nine commands, so the difference isn't really that much.
I followed pretty much the same path as you over the past 30+ years. I am curious how easily you adapted from the centralised to the distributed 'mental map'? I struggled with that a lot, especially with everyday things like 'checkout' and 'commit' doing subtly different things in SVN vs Git.
I actually started with CMVC, an IBM proprietary VCS which was horrible. CVS or pretty much anything was an improvement over that so I had no problem making that switch.
I was an early adopter and promoter of Subversion. I loved how much faster it was than CVS, although it took me a few years to fully understand some of the more complex things like the details of merge tracking.
I was very resistant to Git at first - I fundamentally just didn't get it. The whole concept of a distributed VCS seemed like anarchic nonsense. I basically had to be dragged there kicking and screaming, but of course now I'm very comfortable and would never go back to Subversion.
Everyone rightfully complains about Git being hard to use but it's totally worth it for the power. It seems unlikely to me that the next generation will be "works like Git but simpler to use". I think whatever comes after Git will have to be more significantly different than that.
I guess I’ve become an old greybeard while I wasn’t paying attention. When I started out in software development, CVS and later Subversion were the best we had.
The emergence of git and GitHub has transformed Open Source development, being able to just open a pull request or an issue and know you’ll get notified when things happen is great - I’ve submitted patches for many things which I just wouldn’t have bothered signing up to a mailing list to keep track of in the past.
Signed,
Github user 362 (from back before you could just sign up)
It also tells you how quickly software employment has exploded: so many developers (half? more?) have never used any version control tool, fundamental as it is, other than one that's been common for only a decade.
As recently as the first dot-com boom, Git didn't even exist. Even Subversion was brand new, and it was mind-blowing how much easier it was to work with than CVS.
One aspect of history that this article glosses over is that Git is not the only or even the first third-generation version control tool created. The earliest buzz I remember around DVCS was for darcs and bazaar, neither of which I've heard mentioned since about 2009. Mercurial and Git were released around the same time as one another, and were in a vim-emacs sort of grudge match for a few years before Git became the clear winner.
Mercurial seems to still be in use in some odd corners of both the corporate and open source worlds - probably a legacy of people choosing it for projects during that period before Git "won". When I first tried it it felt a lot like Subversion made distributed. Nowadays it feels incredibly clumsy next to Git.
Mercurial has exactly the same features as git - and maybe a few more - with a UX which is at least more consistent.
The main problem I see with mercurial is that its team stays too quiet: everything is so smooth that probably nobody feels the need to make much fuss about it.
The HN community has a high technical level, but during my career, out there in the world, I have seen a lot of different folks: people that constitute the bulk of the workforce often do not have a strong mastery of the tools they are required to use. Sometimes they just endure them. For these people, using an easier tool (and mercurial is a good candidate in my experience) could probably help them to really improve their skills and become more productive team members.
Back in the day, once I’d understood the value of decentralized revision control, I was convinced Mercurial would prevail, and that Git would become a footnote in history.
This is one moment in my past that was formative in helping me make good tech predictions. Namely, if I see a technology that I think is better, and I think it will win out over an inferior technology, I simply reverse my prediction and enjoy being correct.
Coming from Subversion, Mercurial made so much more sense than Git. I still think the CLI is more consistent. I keep hoping for a new shakeup in revision control systems, though I suspect such a change will be a long time coming.
To be blunt, I don't believe Mercurial has reason to still exist. Its feature advantage over git is very marginal at best, and it simply lost on popularity. If you ignore the inherent value in popularity (popularity means you spend less time training new hires who aren't new to the industry), then Fossil wipes the floor with Mercurial and Git in terms of features.
In my view, the legitimate choices for FOSS version control these days are git (good enough, and popular) and fossil (featureful and obscure.) Why you would ever pick mercurial instead (marginally better than git, but almost as obscure as fossil) is completely beyond me. It occupies an uncomfortable middle ground of mediocrity.
(The one caveat here is that Fossil may not be appropriate for very large scale decentralized projects, but frankly that's a problem git and mercurial are solving that most companies don't have.)
(Incidentally, editor extensions for git have almost eliminated my CLI interaction with git. The UX of vim-fugitive is great, and similar extensions exist for just about any modern text editor. I think CLI UX is becoming less and less relevant when it comes to version control.)
I feel that editor extensions for git are the future for 99% of git interactions. Git+[fugitive/etc] still may not be as slick a UX as Mercurial+[mercenary/etc] (I've not actually used any editor extension for mercurial) but it really comes close to closing the gap. To the extent the UX advantages of Mercurial really seem very marginal to me.
But when it comes to Fossil, nothing I've seen comes close. It's built on sqlite which I think turns off a lot of people who have prejudices against SQL, but sqlite is the furthest thing from a big 'enterprisey' RDBMS that give most users the shivers. It's a really tight piece of software, and in fact Fossil is created by the creators of sqlite and the sqlite project is managed with Fossil. So fossil's vanguard project (sqlite) has more deployments than even git's vanguard project (linux)! Mercurial has Firefox and Python, nothing to sneeze at; clearly it's a capable VCS, but I guess my point here is that Fossil doesn't get the attention it deserves whenever people talk about a FOSS alternative to git.
Don't forget monotone, which can be thought of as git's direct inspiration. The main difference being that monotone was painfully slow, while git was blazing fast.
> When I started out in software development, CVS and later Subversion were the best we had.
It might have been the best you had access to but commercial version control systems of various stripes were common. The first version control system I used for work was distributed and that was a decade before git. Version control systems with global locks, version control systems pretending to be a filesystem, version control systems fueled by the souls of the damned - it was like a Rule 34 of VCS - if you could think it, someone was selling it as a VCS.
> When people talk about Git being a “distributed” system, this is primarily the difference they mean. In CVS, you can’t make commits locally. A commit is a submission of code to the central repository, so it’s not something you can do without a connection.
Not just commits - log, diff, status, almost everything I can remember needed to go off to the remote repository for information. Not only was this annoying when you didn't have connectivity, it was slow when you did.
I do occasionally miss the ability to version files individually though.
CVS was really designed for multiple users logging in to the same server, so that repository access was working directly against the local filesystem. It was pretty fast in that setup.
Subversion, for all its improvements, was the same way. I remember jumping through a lot of ssh proxytunnel hoops just to be able to check svn history when I was working at a remote customer site in 2006.
The first time I ran “git commit” and it finished almost immediately blew my mind.
This is wrong: "you can't make commits locally" really means the commit cannot be local to any checkout. You can perfectly well have /some/directory on your computer be the CVSROOT.
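For example (paths made up), a purely local CVS setup is just:

    cvs -d /home/me/cvsroot init                     # create the repository
    cd ~/projects/myproject
    cvs -d /home/me/cvsroot import -m "initial" myproject me start
    cd /tmp
    cvs -d /home/me/cvsroot checkout myproject
    cd myproject
    # ...edit...
    cvs commit -m "committed without any network involved"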
I wonder whether something like pijul (https://pijul.org) might represent the next step forwards; I am no expert in these things but the patch based approach it takes sounds interesting and potentially very intuitive to work with. I might have to actually give it a go one of these days!
I feel bad for thinking that it's a terrible name, since it's probably a fine name in some other language, and I don't want to be an entitled American English speaker expecting literally everything software related to be in English.
And yet, it's a great tool that I suspect will never get traction for precisely that reason. The abundance of options enables people to be shallow enough that a silly-sounding name knocks it out of consideration.
I would like to see sub-tree checkouts, sub-tree history, storing giant asset files in the repo (without the git-lfs hack), more consistent commands, some sort of API where compilers and build systems can integrate with revision control, etc.
OpenBSD's involvement here is conveniently missing; arguably, without it GitHub may never have existed.
Also of note: despite the paper being presented in 1999, AnonCVS was operating as early as 1995. Other projects were still putting tarballs on FTP, with no read access to source history.
Others have mentioned the terror that was ClearCase, but I just want to highlight one of its bad features: everything was stored in a relational DB. And consistency was not guaranteed, so at one point all the code was corrupted. After recovery from the backup, it was retired with a vengeance, and everything went back to RCS and CVS.
When I wanted everyone to switch to Git or Mercurial (~2007) the main questions were about branch merges (MUCH easier in git than CVS) and the reliability of the version storage.
Many have now moved back to a centralised model of control (github), even if they have many partial copies. The thoroughly RDBMS-based (non-git) method of managing the metadata of github systems is very disappointing, but not surprising. If github is 4th gen, then I'm hoping for a 5th gen where all SE metadata is also available as a replicated database, which you can spin up with a local httpd.
There are nice features in fossil (and sqlite is fantastic), but it is not git compatible. If the code and artifacts were kept in git underneath, then exchanging code with other repos would be seamless. I don't understand why they rejected keeping tickets as files in a version control system (separate from the code), instead of as blobs. It just seems unfriendly to manipulation by other tools. Philosophically, I like that git can be extended with small tools and modules instead of being a monolithic executable.
Git is pretty agnostic to storage. There are implementations that use RDBMS for storage quite successfully (TFS/VSTS). It wouldn’t surprise me if a lot of queries can be significantly faster to perform over git repos stored this way.
https://www.richard-banks.org/2014/02/tfs-internals-how-does...
It is unclear from this whether the packfiles are actually stored only in the database; it sounds like he is saying they are stored in the file system (and accessed from there for git operations), but there is a copy in the DB too (updated after transactions have committed? Or in the commit but after the local file?). So it sounds like they are keeping repo metadata in tables to speed up queries, manage parallel access, etc. That is sensible, but quite different from keeping every code change as an SQL database entry.
My first professional coding job used SourceSafe which was lock-based so you couldn't even do concurrent versioning. Everyone had to take turns with the files.
SourceSafe had no real server. So if you faced a difficult merge because your colleague just committed something huge, you could just set your clock back and commit before him, walk over to his desk and tell him he broke the build.
Or just delete the source safe files from the share and have everybody in a panic (this happened where I worked in ~2010 for some legacy apps still in source safe...it was an untrained user trying to undo a change).
SourceSafe also allowed you to check out files in non-exclusive mode, which made it more like CVS. It also had a way to share files across different directories, a feature we emulated in CVS by symlinking files in the CVS_ROOT.
Also, fun fact: SourceSafe was the first cross platform VCS I used. There were Mac, Unix, and dos/windows versions. Then Microsoft bought it and axed everything except dos/windows. :-(
Our merge strategy with SourceSafe was to always do a 3 way merge (using Araxis) between:
- your latest changes
- version of the code before you made changes
- latest version of the code on the server
So yes, would have 3 full checkouts of the project locally to accomplish this. I guess it boils down to "patch" workflow, except you get to both create and apply the patch yourself. We used a real-world commit token (rubber duck IIRC) to make sure only one person was doing merges at a time...
(Tortoise)SVN was an easy sell when we discovered it.
Why people liked it or even tolerated it still puzzles me.
Edit: While I'm sure this is what I experienced, it might be down to how the organization I worked for configured it. But I doubt it, because I remember reading everything I could find about Perforce; I disliked it so much that I wanted to find out why everyone else seemed to like it.
Perforce has supported concurrent versioning since at least 2001 if not earlier. Individual files could be marked as requiring locks which is useful for binary files for which concurrent changes cannot be merged (one of the big reasons why Perforce is so popular in game development), but it's not either the default or only option. It sounds like your organization had severely misconfigured Perforce.
I started my software engineering career using CVS. I was using CVS as recently as 2014.
Why? Two reasons. I'd played with git but hadn't really understood the power of trivial branching (though I was one of those CVS power users who could branch, but tended to use my IDE to manage it). I remember thinking to myself, oh this is like CVS, because that is how I used it when I played with it.
The bigger reason is that I was managing a team of 2-4 developers that rarely worked on the same thing. We all worked in the same room. The codebase was relatively small (35k loc). I could see no good reason to make the change when CVS was "good enough". It was the same reason we used the same old crufty bug tracker: too many features to write to spend time upgrading infrastructure, unless it was a 2x efficiency improvement. We did add automated testing and scripting around deploys because the benefits were obvious.
Now I love git and the power to branch and stage commits but I am still not sure it's needed for colocated teams of that size.
I'll second your suggestion that centralized version control systems have advantages for small projects. I use SVN for most of my personal projects, because no one else is contributing. (Only one branch. In fact, I have never branched in SVN.) The simplicity is a major benefit.
The main downside is how condescending some Git users can be to SVN users. When I mention I use SVN, I often hear nonsense like "Oh, you must not understand tree structures." Actually, I do, and I see that they have no benefit for certain things I'm working on.
The one thing I'd like is the ability to commit without an internet connection, which distributed systems can easily do. But this hasn't been enough of an issue to motivate a switch.
And Subversion, being centralized, allows you to opt-in to a "lock, change, commit" workflow, on a file-per-file basis, or on basis of file-type. Which is great for binary files, as pretty much any binary file such as Excel sheets or Photoshop documents or even PNG assets can't be easily merged, so Git's "work independently, then merge later" workflow can never work with those files.
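In practice that is just the svn:needs-lock property plus svn lock (file names here are illustrative); you can also apply the property per file type automatically via auto-props in the client config:

    svn propset svn:needs-lock '*' assets/mockup.psd   # file shows up read-only in working copies
    svn commit -m "require a lock before editing the PSD" assets/mockup.psd
    svn lock assets/mockup.psd                         # now writable, and only I can commit it
    # ...edit in Photoshop...
    svn commit -m "updated mockup" assets/mockup.psd   # the commit releases the lock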
At least git stores its local copy of the repository compressed -- svn keeps a second uncompressed copy of the current revision, to be able to run a diff. So neither are great for large binaries, although at least subversion will only grab the latest.
But use of compression and binary deltas does mean that (for regular text-based code) a git checkout including all history can be smaller than a subversion checkout with just the latest version.
I don't find the maintenance burdensome. I keep my software up to date, including Git, so this isn't an argument against SVN. And I run a weekly backup script that first checks the integrity of the repositories. (This check was motivated by a hard drive failure which caused a small amount of data loss. Git probably would have helped in this case, but now I think I'm good.) I haven't done much anything else specifically for SVN in the past 5 years.
To be fair, most of my issues were with Apache, not SVN itself. The most egregious thing I remember was that Apache changed configuration directives twice during the time I maintained it in a way that broke my SVN server, but there were plenty of other papercuts.
This was also during SVN's heyday. I'm sure it's much more stable now. ;-)
My recollections of using Subversion, a few years ago, are mainly about terrible performance and frequent unrecoverable damage to working copies and repositories. "Atomic" commits appeared to be named for their ability to leave a radioactive wasteland behind them.
And of course, a policy of not branching because it's difficult and dangerous isn't really taking advantage of simplicity.
Managing branches is a pain with subversion compared to git, sure. But dangerous?
There are major projects out there still using subversion. GCC, for example. They have lots of branches. As for performance... well, svn is way better than CVS.
Sounds to me like you didn't know what you were doing with SVN. If I complain that Git caused some problems, Git users are likely (and often justified) to tell me that I was using Git wrong. But it's rare that the same reasoning is extended towards SVN by many Git users.
The worst I've had with SVN was a broken working copy, which is usually easily fixed. With Git, problems like that occur much more frequently. In the past 6 months I likely have completely wiped my local Git repository for a particular project alone more times than I've ever had to fix a broken SVN working copy. My experience is closer to the famous XKCD comic:
Now, perhaps you could argue that I don't understand Git well enough, and I wouldn't necessarily argue against that. I think Git has a terrible and confusing UI compared against other distributed systems. (And the simplicity of SVN makes its UI good, I think.)
As for performance, again, I use SVN with small projects, so performance hasn't been an issue. To be honest, I probably save time compared against Git from not having to type as much with SVN!
And I don't branch in SVN because I don't want to, not because it's difficult or dangerous. Branching would provide no benefit in my case. If I wanted to branch, I'd switch to a distributed system. My experience talking to some Git people is that they often branch as a habit without considering what could be gained from branching.
If network failures can leave working copies in pieces, in an unknown state, it isn't a matter of knowing what one's doing. Out of the box, Git and similar modern VCS systems offer better safety, and better auditing whenever something goes wrong.
> My experience talking to some Git people is that they often branch as a habit without considering what could be gained from branching.
There's just never a reason to not branch. It keeps ideas, efforts, and tasks separated really nicely, and doing so has essentially no cost. It lets me have a completely different environment to try things out, wreck, and abandon things without ever touching the branches that are important. When I'm done, a simple merge brings it all in at once.
> It lets me have a completely different environment to try things out, wreck, and abandon things without ever touching the branches that are important.
These are all completely valid reasons to branch. They also don't apply to most of my projects. And for the ones they do apply for, I use Git.
My small projects tend to be relatively simple, and often contain a lot that's not code. (Some contain almost no code at all.) For example, I've had people recommend branching to keep track of different versions of the same paper they're writing. But this immediately struck me as a waste of time. I'll be submitting only one version of the paper. Why keep multiple internal versions?
I am not convinced by the argument that I can have a branch for each person I ask to read the paper. Merging in the handwritten changes they provide me is not hard. Branching would just be extra work in this case.
It can be nice to try different organizational structures sometimes, but I've found it easier to simply have a different TeX file in that case. (Or better yet, multiple TeX files for each part, and then a set of master TeX files that organize the paper differently.)
If someone has a good argument for distributed version control in this use case, I'd be happy to hear it.
> I use SVN for most of my personal projects, because no one else is contributing. (Only one branch. In fact, I have never branched in SVN.)
Same here, I have used the SVN + Trac combination for my personal projects for almost 12 years now, it isn’t broken for me so why should I ever think of fixing it?
In both CVS and SVN you can just check out from a local directory. For a couple of years I used CVS for my private local repos, where CVSROOT was something like /my/cvsroot. The same is possible with SVN too.
The SVN documentation (used to) warn that concurrent access in this configuration could lead to data loss. Even though I'm the only human who accessed my repository I did use scripts, so I always avoided this setup because of that.
I resisted moving from SVN to Git for my personal projects for a long time, mainly because I was too lazy. But conflicts or repos getting into an inconsistent state that required fixing was too regular an occurrence, and eventually I snapped.
I agree that the mental model with SVN is much simpler than Git's, and I used SVN on Windows with AnkhSVN and TortoiseSVN, so maybe it works better on Linux.
It seems as though our profession has such a low barrier to entry, especially with open source, that a lot of tooling is seen as having no associated cost.
To draw an admittedly flawed comparison: I work at a contract engineering and manufacturing firm. There are some products that we produce by the tens or hundreds of thousands, and those benefit greatly from a lot of automation in assembly, testing, packaging, etc. We also do low-count production runs that quite simply don't get much automation because the per-unit cost would end up being astronomical. There's no reason to tool up for a 100,000-piece run if you're making 10 pieces.
In our field the barrier to entry seems free, though. So while git was designed to meet the needs of the Linux kernel, people also use it for their own personal 1kloc side projects. It doesn't stop there, of course; introductions for making a simple web app are often filled with tooling, frameworks, etc. that need to be included, configured, and used. Undoubtedly these make sense for large projects, but they are used for personal sites as well.
Sure, and now git is well enough known to be a good default. But if the choice is no version control or CVS, I'll pick CVS. (If the choice is no version control or RCS, I'll pick RCS, too.)
Note that I realize that you didn't argue for no version control.
I didn't have much contact with CVS, but I used SVN quite a bit. Things I remember from those days:
* There was no separation between commit and push. How weird.
* "svn log" or "svn blame" would take ages, because it had to talk to the server.
* Well-run larger projects had branching guides, because the built-in commands didn't track enough metadata to do merges safely later on.
* SVN made it trivial to check out only a subdirectory of a bigger repository (which I still sometimes miss in git; quick example below), so people often tracked different projects in separate directories of one repository.
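For instance, getting and inspecting just one subtree was as simple as pointing svn at that directory (repository URL made up):

    svn checkout https://svn.example.org/bigrepo/trunk/docs docs
    cd docs
    svn log .        # history of just that subtree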
The only thing I remember about CVS was that to clone something from CVS, you had to know some root directory (this presumably was the webroot), and sourceforge.net didn't show available webroots -- so there were tons of technically "open source" repositories that you still couldn't clone, because the webroot wasn't documented.
With cvs, specify just "." and it will check out all the modules in the CVS repository. That would include CVSROOT, which was the module containing hooks. But yes, it was sometimes annoying that the module name wasn't clearly documented.
I enjoyed skipping the part of the article that explains how CVS worked, because I lived it. :)
CVS at the time felt like an amazing upgrade to RCS, just like Git feels like an amazing upgrade to CVS.
I wonder though, have we reached the end? Is there anything beyond Git? When I used RCS, I would always lament, "it would be nice if two of us could work on a file at the same time". When I was using CVS, I'd lament, "It would be nice if two of us could work on a group of files at the same time and merge our changes".
But using Git, my only lament is, "I wish this were easier for new developers" and "it would be great if there were a standard workflow". Problem one has been somewhat solved by GitHub/GitLab, and problem two has been solved by some pretty standard git-flow tools. Neither one really demands a new paradigm in VCS though.
> Have we reached the end? Is there anything beyond Git?
The ability to split and merge repositories as easily as we can split and merge branches might open up some new use patterns.
The particular context I'm thinking of is scientific repositories. These tend to grow in size and scope in an unplanned manner. Pieces inevitably need to be split off for a collaboration, to be made public, or because someone is changing institutions and needs to take part of the project with them.
I remember importing all our RCS files into our first CVS repository, around 1991. It was back when CVS was primarily used for maintenance of SunOS. As a result, this is the first commit in the repo I spend most of my time in:
RCS was okay if you were a sysadmin, terrible for everyone else. CVS was okay, but still limited. SVN was more advanced, but buggy as hell. Git is more advanced and less buggy, but over-complicated and unintuitive.
Git was actually started in order to provide the core functionality and let someone else make the front-end for the VCS. But somewhere along the line people just decided they didn't need a user-friendly frontend, and now the core is what people use every day. 13 years later and it's still difficult to use. Unless someone comes up with a really slick universal frontend for it, it's probably time for a new VCS.
the funny thing about git is the conceptual model is clean. DAG of commits, content-addressable file-system. cool. what's insane is the CLI -- a random mishmash of commands, all of which do like 4 different only vaguely related things. it's a total clusterfuck.
A CVS lifesaver that you can put into your shell init file:
export CVS_RSH=ssh
This defaults to rsh, and figuring out that I had to set it like this took me a shameful amount of time (mostly because it produces such an unhelpful error message). I literally couldn't pull the OpenBSD src tree for months...
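Combined with the CVS_RSH line above, the rest of the incantation was roughly this (the mirror host is a placeholder; substitute a real anoncvs mirror):

    export CVSROOT=anoncvs@anoncvs.example.org:/cvs
    cvs -q checkout -P src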
I'm one of the few people who deliberately learned and used CVS (for a while) in recent times. I did not have any public repositories at the time and needed VCS for my configuration and some documents (Org mode mostly), and the model where I could have a central repository on a local directory which I could easily back up was compelling. Then I figured out a filesystem layout where I could back up all my work easily and this became useless, thus I switched to Mercurial. Nowadays I'm considering going just git, because it's what everybody uses, and Magit is a compelling piece of software.
I use RCS regularly along with Mercurial and Git nowadays. RCS is good for e.g. when I have a tree where most of the content is pdf files (papers), images, and other binary data that does not really need to be version controlled, together with an Org mode file for notes. I also have a pool of Elisp files which contain the personal bits of my Emacs configuration, and I use RCS on them because their histories are not related to one another. It's no good for projects anymore because it is essentially a tool from the era when people developed software on a single shared computer that they connected to with terminals, so they were all users of the same machine and the code was always in a fixed location.
One thing people tend to confuse with CVS or SVN is that they think it's a client/server model whereas it's actually a repo/checkout model. The repo is central and can totally reside in a local tree, and checkins from different checkouts go directly to that repo. This is akin to sharing one .git tree between all your checkouts of a single repository.
> Did you passive-aggressively rewrite a coworker’s poorly implemented function out of cathartic necessity, never intending for him to know? Too bad, he now thinks you’re a dick.
When did it start to be like this? Making code better is a dick move now? Who rewrites stuff passive-aggressively? What does that even mean?
Nice article, but the conclusion doesn't really follow from how the article is built up.
> It illustrates well why understanding the history of software development can be so beneficial—picking up and re-examining obsolete tools will teach you volumes about the why behind the tools we use today.
As the article re-examines this obsolete version management tool, it becomes clear it's pretty easy and straightforward and can do, to a certain degree, a lot of things that git can. On top of that it's dead easy to set up and use; in fact, its simplicity might be an indication that it's not all that obsolete and might be exactly the right fit for new small personal projects.
Those are some really optimistic takeaways. I hope you'll try CVS and report back (especially a merge, or looking at project history including deleted files). I don't think any of these statements will stand up to scrutiny.
I am using CVS (and even RCS) and for small personal projects, it's perfect. Maybe there are problems with merges or project history in a larger context, but as things are, at the end of the day, I type the single command `cvs commit' and have a magic backup with revision history as a bonus. Sometimes that's all a project needs.
I see a couple of other references so I know I'm not unique, but my first experience of version control was SCCS on SCO Xenix. Then RCS, CVS, SVN and Git. I never used Perforce or ClearCase. Knew some folks who used PVCS and StarTeam and Visual SourceSafe.
I played with Bazaar, Monotone, Mercurial and Darcs but not enough to really appreciate them.
As an aside, I met Larry McVoy at a Linux convention in 1999 and heard him speak about BitKeeper. Those were interesting times.
My first job in the US, ~2011, was on the design team at monster.com.
One of the engineers had made a SVN repository for all our design specs and had cooked up a simple intranet page where the latest version of a design could always be shared by a permanent URL but also a history of all earlier versions.
That was my first experience with version control and I remember thinking it was magic. I never found out who made that, so if you’re reading thanks for going the extra mile :)
Though I had experience with cvs and subversion mostly through open source (mostly in sourceforge), I remember installing Trac around the time it first came out, and using that as equivalent to what would be done with GitHub today. Of course you had to have one install of it per project (or per 'organization', depending on your repository setup), and run it on a server somewhere.
Trac was great though, especially for the time: subversion server, source and changeset browser, tickets, wiki, roadmap. Aside from my own personal stuff, I switched several open source projects to it, and got a couple companies on to it. It quickly became to me an essential part of the dev stack, and was a great way to get the full dev stack* up and running relatively quickly.
* Other than continuous integration, but for me that came later. I worked on a lot of php stuff that could be deployed from source and never really saw the need then. Now I think it's essential and don't work without it.
> If cvs commit is what you were used to, then I’m sure staging and pushing changes would strike you as a pointless chore.
But using only `cvs commit` would be the equivalent of using a single script with git that adds every single thing in the relevant dirs and then blindly commits it.
In other words, the people coming from cvs and svn and complaining that git added a step for them were either doing an impeccable job of keeping their source dirs clean at all times, or they were implicitly admitting that they weren't keeping track of what they were adding to their own repos.
I would guess there are old projects that fit the former description. But I know from experience there are old projects that clearly fit the latter.
At least with svn it was best practice to run `svn diff` before committing, just like `git add -p` is considered a better practice than `git commit -a`.
SVN also allowed you to commit only specific files, so if your working directory wasn't clean, you could mostly still commit just the parts you wanted to commit.
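For example:

    svn status                                  # see what changed
    svn diff src/foo.c                          # review just that file
    svn commit -m "fix the overflow" src/foo.c  # commit only that file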
This post does not do justice to just how awful CVS is.
OpenBSD still uses it, and it's the main reason I've only rarely contributed patches. CVS is just that crappy.
When I say it's awful I should admit that like FORTRAN it's not bad for its time. But 1986 was 32 years ago. It's not bad because it's old, but it's not 32 years of good either.
Commodore 64 was great for its time, but I'm not going to load my version control from a cassette player in 2018.
My professional career started in 2008, but I didn't personally use Git/Github until maybe 2010 or so, but it still took until 2015 for me to be able to use Git/Github at work!
Version control system timeline for me has been
IBM ClearCase -> CVS -> Perforce -> SVN -> Git
The non linear path is from switching jobs/working on legacy projects, but yeah.
Git has its problems, especially on usability, but it's much better than all the others in that list!
While CVS may have originated with Grune, I think the CVS everybody is actually familiar with is the version Brian Berliner wrote and maintained for years.
My temporal progression went CVS -> SVN -> Mercurial -> Git.
Every one of those was a major upgrade in functionality except the last. I never fought against Mercurial, and I rarely needed to Google "how do you do X with Mercurial" because it was so intuitive. But I spend about 30 minutes a day fighting with or googling git. Git is a horrible mess, but I, like everybody else, am forced to use it.
> So I invite you to come along with me on an exciting journey and spend the next ten minutes of your life learning about a piece of software nobody has used in the last decade.
I know of companies who only very recently moved from CVS, and I'd bet there are many that still use it too..
Assertions like this are always dangerous, because inevitably, someone somewhere is still using that tech you think is long dead ;)
RCS was god. I loved RCS. Firstly, unlike SCCS it stored the head state directly, so it was fast (SCCS recalculated the head from the history of changes from the base). Secondly, because it wasn't CVS, all that branch and remote crappe was irrelevant.
I never learned CVS. My peers at work hate me because I think branches are something you prune off a rose bush in winter.
This is a really interesting perspective for people who haven't grown up with Git. I'd not grokked a distinction between three generations of VCS before and it's strange to me to hear git's bad UI described as something other people think.
I work at a big bank that still uses CVS for most of their software. They use SVN for "modern" projects, and only recently started dabbling in git. The place is a tech museum.
Pure git isn't easier than CVS. Git just adds complexity over CVS (in parts just for the sake of being obtuse). CVS had its shortcomings, but complexity wasn't one of them.
Work on a code base with 40 plus developers and 30-40 commits a day. Having a clean commit history is a must versus "test commit", "fixed that thing", "added" etc.
Probably because Perforce scales to truly massive monolithic repos while git traditionally doesn't (the insanity going on at Microsoft notwithstanding.)
Personally I feel the best method is using lots of small repos, one for each service or library, that get stitched together by the build system. I know some large tech companies have created such systems and I have experience with one of them working very well (they migrated from perforce). But this is a big change from the monolithic repository model and institutional inertia is very real.
(Perforce will eventually start to hit a wall when you get to the point where money won't buy hardware big enough for Perforce to serve your repo fast, but that's a very long way off for most organizations and I believe there are some mitigations for it.)
> Personally I feel the best method is using lots of small repos, one for each service or library, that get stitched together by the build system.
But then your teams have to manage dependencies - or your release team has to do it for them. It's very easy to run into diamond-dependency problems or runtime classpath issues.
Then some user will copy a directory, check it in, and declare that s/he «has branched the code». Oh, and quietly pull in some other directory, which is clearly not branched. This is just broken IMHO.
I think one has to work a little without a tool to truly understand why the tool exists. Many new developers are forced to use git and they just don't understand why they have to go through this painful process.
I started without version control. I very quickly realised that it's very easy to break a project but forget how to undo your latest breaking changes. I discovered subversion and it was amazing. It was 2006 and I was the only person on my course to my knowledge who was using version control.
At around that time git came out and some people were trying it, but many people said it was completely unnecessary for most projects. I then tried to use svn for a project with more than just myself as a developer and it was a disaster. We had giant commits once a day that caused conflicts every time. It was horrible. Git was truly amazing. I agree the cli isn't great (I use magit), but you have to have lived without it to understand why it's so important.
SVN is one of the sanest and most robust VCSes there is. If even that leads to disaster, then throwing git at the team will only make things worse. You need to educate them about proper VCS use instead.
We've recently transitioned to git at work and a couple of weeks later I'm already stuck in a week-long repository cleanup project on one of the central repositories because people just created a phenomenally huge mess in it. They were experienced and happy SVN users before, but somehow the boss forced the git hype train on us and has to pay the price now.
Having worked in both git and SVN shops, SVN is adequate as a server-side VCS. Most of the time you don't need branches on the server side. However, using only SVN client-side is absolute misery - no interactive rebasing, local commits, local branches, stashing, or committing specific lines from a file. You absolutely have to use something like git-svn or hg-svn to be productive.
The other (and this is a major one) downside of adopting SVN for your org is the dearth of decent tools for code review and collaboration. At a previous company we used Fisheye/Crucible which is seriously not fit for purpose. At another SVN shop I worked at, we emailed patches to each other (seriously). And the lack of quality tooling is down to SVN's declining popularity - there's no market.
The lack of good tooling for SVN is because git is so utterly inadequate at doing its job. So this created a need for a ton of tools that pave over those flaws, and now the resulting ecosystem is a money-making machine.
You're saying git is so bad that an ecosystem sprung up to support it? But if it's so bad why would it have a large enough userbase to support that ecosystem? That logic seems backwards.
The initial users came because of an enormous hype fueled by "Linus made a VCS" and a weird quasi-religious belief in the community that it is better than anything else in that space. Crazy times back then, really.
You're better off releasing from trunk/head/master than cutting a release branch. And if you are cutting a release branch, you don't need to merge it so SVN branching is adequate, if still unpleasant and awful to use. The only other use-case for server-side branches is collaborating with team-mates on a long-lived feature. Feature flags work well enough for this and you should probably be using them anyway.
It is not sane. They created a low level versioned network filesystem and then never bothered to implement branching, tagging, or merging in a workable way. In 2018 I still get bogus merge conflicts if someone before me didn't commit things just so.
It might be helpful if you could spell out why the branching, tagging and merging of svn isn't workable for you. It's clearly worked for many people and large projects for a number of years.
Linus didn't set out to create git because of some missing features in svn, but because he wanted a fundamentally different tool. If you find yourself with merges that are clean in one version control system but create conflicts in another, which is entirely possible, you are likely doing something very special that isn't a great fit for either.
There might be more straightforward process to follow that doesn't end up with such difficult merges. Maybe it's just merging more often, maybe it's something else. But it's very easy to blame the tools when the processes are broken.
> Linus didn't set out to create git because of some missing features in svn, but because he wanted a fundamentally different tool.
Whoa whoa whoa, hold up there. Linus isn't some VCS visionary, he didn't magic the idea of distributed VCS out of thin air. There were plenty of other DVCS out there at the time, he just built another one.
SVN has quite workable branching and tagging, as opposed to git. Refs are just a nightmare to work with. The mental overhead that their utterly broken logic creates is astonishing.
I’ve built some big things on SVN. The same mistakes you make in SVN are totally portable to git as well. This isn’t talked about much because people are too ashamed to admit that it wasn’t the tools and was the process and management of the project that caused carnage.
I’ve seen this three times now.
The best and simultaneously the worst feature of git is the offline commit ability.