Mercurial is a fantastic VCS that helped me get into DVCS, and for that I am thankful. Even in 2013, it feels like it has the superior Windows interface in TortoiseHg, and I loved how Mercurial extensions were usually far more cross-platform than Git's, since they were typically just written in Python.
That said, once I got my head around the full ramifications of Git's lightweight branching, I still think it was the better choice over named branches and bookmarks. When learning Mercurial, SVN-minded me naturally gravitated towards branches for everything because of the name, and the distinction between when to use bookmarks and when to use branches wasn't exactly clear until I finally understood how Git's branches worked - at which point the thought of enshrining branch names in the history forevermore suddenly turned from an obvious "why not?" into a "why?". Git's interface can always get better, but Mercurial will always have branches that live forever in history and are named "Branches".
Still, in Bizarro-2013 where Git never existed or never got the boost from Github, I think Mercurial would have been just as serviceable, and I think that it's a great second option to present if you want to migrate to a DVCS but your teammates can't get their head around Git. Kudos to them on another release.
There is a good reason to have branch names tattooed on commits: it makes inspecting the history much easier. When you are looking back at history and want to know why you did something, unless you're meticulous about encoding this information in your commit message, it helps a lot to have it encoded in the commit's metadata instead.
There are also hg tools that help leverage named branches, e.g. revsets. Just today I wanted to figure out exactly which commit merged the gui branch of Octave into default. I ran the following:
hg log -r "children(parents(branch(default) and merge()) and branch(gui))"
This told me that the merge happened a year ago, and exactly which commit did it. This would not be possible if the history did not record the branch name.
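For anyone wanting to pick that query apart: it reads from the inside out, and each intermediate step is itself a valid revset you can run (a sketch, using the same branch names):

    hg log -r "branch(default) and merge()"
    # all the merge commits on the default branch

    hg log -r "parents(branch(default) and merge()) and branch(gui)"
    # the parents of those merges that sit on the gui branch

    hg log -r "children(parents(branch(default) and merge()) and branch(gui))"
    # back down one generation: the merge(s) that brought gui into default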
I like hg's way: bookmarks for lightweight changes, named branches for long-lived lines of development. Both have their place, and both are useful.
Just because Git branches are lightweight doesn't mean they MUST be lightweight and thrown away. In your use-case, the `gui` branch ref would point at the merge commit, so you could simply ask for it by name. Being an old branch, however, you could then move the branch into an 'attic' or a different namespace than the active branches so it doesn't come up every time you `git branch`. That sort of flexibility to move branches around when needed was what made me question Mercurial's approach to forever-branches in the first place.
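For example, something like this (a sketch - 'attic' is just whatever ref namespace you choose):

    git update-ref refs/attic/gui refs/heads/gui   # copy the ref into an 'attic' namespace
    git branch -D gui                              # drop it from the active branch list
    git log -1 refs/attic/gui                      # still addressable by name when you need it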
This is but one example. In Git, if there has been more back-and-forth merging between two branches, you can't later figure out which branch a commit was made on merely by inspecting history, unless you encode this information in commit messages.
Mercurial is great at many things, but surprisingly, it seems quite awkward to set up a shared repository on your network for multiple users. It doesn't come as standard with a simple, self-contained server, and they recommend against using a shared directory on a network drive directly, presumably because of the wicked data loss bugs they've had (or may still have, I haven't checked this recently).
Compare the Mercurial guidance on publishing repositories[1], which comes in at about 15 screens on my system, with the equivalents for say Git[2] or Bzr[3] that fit in a couple of screens, and you can see how striking the difference is.
The instructions you linked for git would basically work as stated for Mercurial, but without all the mucking about with making a bare repository.
Note that on the Mercurial wiki page you linked, it does mention ssh setup, although perhaps not as prominently as you'd like.
Also, hg /does/ come with a simple, self-contained server: hg serve. It doesn't support authentication, because there's a bunch of HTTP servers that can do that for you better and with fewer bugs than we'd inevitably have.
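For the record, it really is just this (path and port being whatever you like):

    cd /path/to/repo
    hg serve --port 8000    # now browsable and cloneable at http://<host>:8000/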
> The instructions you linked for git would basically work as stated for Mercurial, but without all the mucking about with making a bare repository.
Except that if you're running Windows rather than Linux, they don't. I have literally just tested it to make sure I'm not imagining things.
The difference is that with Git you probably also have Git Bash installed on Windows, which while somewhat clunky does at least provide a fairly standardised mechanism for setting up keys etc.
To my knowledge, there is no equivalent for Hg, and attempting to use hg with an ssh:// path to the repository seems to depend on what other software you have installed (Tortoise*, for example).
If you know better, my team and I would love to learn something. This has been bugging us for years and across multiple projects, and none of us has ever found a simple, effective way of doing it.
For the record, the mention of SSH setup on the page I linked to is literally just that: a mention, with no further details at all, and as noted above the obvious change of specifying an ssh:// path to a repository instead of a local one doesn't work by default on Windows. There's also a second entry for "Shared SSH", but that goes to a separate page describing half a dozen components that mostly aren't included with hg out of the box and again seem to lack much documentation in some cases.
[Edit: Yes, there is also hg serve, but as you point out it lacks even basic security checks, and even the main Mercurial web site doesn't recommend relying on hg serve for more than temporary purposes.]
The configurations I've seen are trying to use some sort of Linux-based server or NAS, and a variety of clients including some on Windows.
FWIW, I've just been told elsewhere in the thread that what we've been trying over SSH should have worked out of the box as long as Hg and TortoiseHg were both installed (and, I assume though we didn't state this, as long as Hg is installed and properly reachable on the server side).
So, while I can confirm that the simple SSH access doesn't work reliably here right now, it is starting to sound like we've hit some unfortunate case that isn't necessarily Hg's fault. If so, I apologise if my comments were unfairly harsh, though I would still suggest that Hg would be more user-friendly if it could handle SSH connections itself without relying on additional software that must be installed separately.
1. It does come with a simple self-contained server.
2. Why would you run anything off a network drive? Has no one learned from Visual Source Safe and the lessons of the definitely not glorious history of NFS file locking?
Now stop complaining, because I'm stuck with winzip and windiff as a VCS for the current thing I'm working on (an old and obsolete NT4 C++ behemoth that won't go away).
Yes. I use it daily, and so does my team, some of whom have also used it in other places as part of other teams. None of us knows a simple, effective, reliable way to set up a centralised server without going via one of the web server routes.
> Literally all you have to read is:
No, it isn't. If you're going to be patronising, please at least have the courtesy to read your own links before posting. There is exactly one place in that document that refers to using direct SSH access, and as I've noted elsewhere in this thread, it doesn't work out of the box if you're using a Windows client rather than say Linux, a fact I have just personally confirmed before posting here.
Install VisualSVN and use TortoiseSVN if you want a centralised server running on Windows. Seriously. Nothing works properly on Windows like this - you're going to end up hacking together UNIX things. Let someone else do it for you.
Thanks for the link. We'll check it out and see if there's anything in there we weren't already trying. The note about preregistering each server using plink before trying to connect to it using the hg client isn't one I remember seeing previously; maybe that explains the mystery delays/authorisation problems we've observed.
I don't think your proposal to use SVN instead is very helpful, though. We run a very heterogeneous network, working on a lot of different projects. "Just change your entire development platform" isn't exactly a constructive suggestion. We also routinely use Git on other projects, and we've never had any trouble setting up shared repositories in those cases. While it does rely on installing some UNIXy tools, and that is indeed more hassle than it needs to be, everything pretty much works once you've done that. Moving everyone onto real Linux workstations is a non-starter, because there are way too many professional software packages we use on Windows without anything in the same class available on Linux.
Agree entirely. My point was really directed at the situation we're all in regarding cross-platform dev tools. We settled on SVN as it's the only thing that's fairly easy to get running cross-platform, as we have both Windows and Linux machines online. We also need a centralised repo, due to the nature of our work, with strong authorisation against LDAP (our AD).
Now we're actually looking at git and TFS as a forward-looking solution.
Bugs, bugs, bugs galore. Not joking but I'm sure they have no tests. We regularly fall over trivial shit that should work but doesn't.
Page state problems everywhere. You eventually learn not to use the browser back button.
Upgrades are hell due to the tinkering you have to do with Java settings and the container to get it performing properly. We have to stick it behind an Apache mod_proxy setup because it falls over when handling SSL itself. In fact, their documentation says they won't support it if you use SSL (seriously!!).
It needs an 8-core Xeon with 32GB of RAM and 15k SAS disks to get reasonable performance out of it for 100 users barely doing anything (WTF).
Set it up to use InnoDB as the schema type in MySQL and it doesn't even add FK constraints reliably. Some are added, some are not. This results in random key violation failures that you have to go and manually fix or the ORM in it falls over and takes the entire JIRA instance out.
Plugins that you rely on because the basic feature set is rubbish suddenly start costing lots of money when you upgrade. There is no notice of this. Basically pay up fuckers (at least $93/plugin/year) or lights out.
We have just over 105 users, but that's over the 100-user limit, so we have to fork out for the 500-user version, which costs twice as much as the 100-user version. And it's not cheap. A basic one-off JIRA installation with GreenHopper/Crucible/FishEye costs us $20000 up front, and a bit less every year in maintenance, for which we receive broken crap.
Crucible is so slow it takes nearly a week to index our repository, which has to be done regularly because it craps itself reliably and corrupts the indexes. It doesn't even run as a service on Windows reliably, relying on some pile-of-crap documentation on Confluence that doesn't work.
Clean upgrades are a week-long project on average.
You have to reindex it regularly because minor process changes cause faults and anomalies everywhere. Reindexing (until recently) blocked the server entirely for up to an hour.
The crucible web interface is so slow it doesn't actually work properly. People have to wait up to a minute for a page hit on a good day. It has a giant lock inside it somewhere apparently that they can't get around.
You can't trust their OnDemand service either - they have admitted massive customer data loss from their previous platform. Google for reference.
The whole thing is a house of cards that I wouldn't go near.
To any Atlassian employees who will probably read this and start the marketing spiel: don't give me the "we're aware and are improving speech" because I've been promised that for 3 years and it hasn't happened. It's just got worse.
Also, to those who do the "it works for me": it worked for us long enough to get through the evaluation but it doesn't scale as promised, doesn't work at all well and is not fit for purpose.
To those who say "you've set it up wrong": we've had Atlassian on the case and they can't make it any faster.
In my experience, Atlassian have a reality distortion field like Apple's. They have great marketing, but that is all.
To be honest, I'd directly compare the space they're in with JetBrains (we use TeamCity as well). Nothing we've had from JetBrains is like this - it's orders of magnitude better in every way. It just works. We haven't tried YouTrack from them, to be honest, but I'd start with them if you're going to evaluate a product in a similar space. Either that or Trac, with which I've had precisely zero problems for 50 users on a SQLite database!
With TortoiseHG, ssh does work well out of the box on Windows. I only know the webserver routes, but I don't see the problem with that. We use nginx on a Windows server and host our repositories there; it's not too hard to set up. Even with no prior knowledge of the process, it might take an hour or two.
I'm not familiar with setting up a git server, but I doubt it is any easier than Mercurial.
> With TortoiseHG, ssh does work well out of the box on Windows.
FWIW, we've had problems with (correct) passwords not being accepted via TortoisePlink when trying that, but we already had other Tortoise* software for different VCSes installed, so I can't rule out some sort of unfortunate conflict.
If SSH really does work for most people once TortoiseHg is installed and we've just been unlucky, then I partially withdraw my criticism, given that in practice I expect a lot of Windows users of Hg do install TortoiseHg as well anyway.
I only have TortoiseHG installed, but I use a standalone pageant (not the tortoise one) and TortoiseHG doesn't seem to have any problems using the keys stored in there.
Even if you just use the command line client, I think it's worth installing tortoise for the ssh support.
> but surprisingly, it is by far the most awkward version control system I have ever used if you want to set up a shared repository on your network for multiple users.
I take it you have never tried darcs; it's the ultimate user experience nightmare.
> Compare the Mercurial guidance on publishing repositories...
I don't understand what you are getting at. If you have ssh access to the machine, then you can push or pull from a repo in that machine. It is that simple.
That page details ALL the possible ways you can publish your repos, and goes into much detail regarding the hgweb setup, setting user permissions, etc. That is why it is big. The other two pages do not go into as much detail; the Git page only describes the ssh method.
> If you have ssh access to the machine, then you can push or pull from a repo in that machine. It is that simple.
Unfortunately, as I've been discussing with others elsewhere in this thread, it's not really that simple at all in some cases. I'm typing this on a Windows machine that has Hg installed, and I routinely SSH from this machine into various Linux servers that also have Hg installed, yet for reasons we've yet to determine, trying to access any sort of ssh:// repo using the hg command line client fails.
Have you set the configuration option Mercurial uses to find the ssh program for ssh operations?
Also, hg requires the path to the repo to be relative to the user's home folder. I.e., if you are logging in as silhouette and your repo is located in /home/silhouette/projects/myproject, then the command to push to this repo would be:
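    hg push ssh://silhouette@yourserver/projects/myproject

(yourserver is a placeholder for whatever host you log into. Note the single slash: it makes the path home-relative, while a double slash - ssh://user@host//absolute/path - makes it absolute.)

As for the configuration option mentioned above, it lives in your hgrc under [ui]. On Windows it would be something like this, with the path pointed at wherever your plink/TortoisePlink actually lives (the path here is hypothetical):

    [ui]
    ssh = "C:\path\to\TortoisePlink.exe" -ssh -2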
As far as we can see, those things are all set up properly, but we aren't getting that far anyway. Something is failing at the authentication/authorisation stage when setting up the SSH connection, hence our current preference for the web-based alternatives.
> presumably because of the wicked data loss bugs they've had (or may still have, I haven't checked this recently).
I am not aware of any data loss bugs in that area, other than ones related to buggy network filesystems (in which case other VCSes would also be affected). The usual issue around a shared on-disk repo is with permissions (and for a long time now Mercurial has tried to be smart when creating new directories/files in order to propagate the permissions regardless of umask).
Cloning a repo hosted on a Linux-based NAS/server to another location on the same NAS/server from a Windows machine tends to set up the clone using hard links rather than a true copy by default.
That in itself is not a problem, but unfortunately the Windows hg client doesn't seem to detect this situation reliably (or at least didn't last time I checked, which is a few months ago now). That means if you then commit changes, you can unintentionally affect the common linked files rather than having them separated on demand first. The same was true for TortoiseHg the last time I checked, this again being a few months ago.
This is a particularly wicked bug, because it means even if you've cloned a repo elsewhere on your network drive with the intent of keeping an independent backup of everything, both versions will be corrupted, and the first you're likely to notice is when you run an hg verify and find the index data for your original repo (which as far as you know is untouched) suddenly has errors in it.
Incidentally, if you are using this sort of scenario for whatever reason, there is an option you can set at cloning time to force a full copy to be made.
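Specifically, that's the --pull flag, which copies via the pull protocol instead of hardlinking (paths here purely illustrative):

    hg clone --pull /mnt/nas/projects/myrepo /mnt/nas/backups/myrepo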
I've also heard of problems with file locking being unreliable if you're using NFS to access the server, but the only cases I've seen were set up a different way so I can't offer anything more than a general warning in that case.
If you're using NFS, CIFS or smbfs to host or synchronize repo data, You're Gonna Have A Bad Time (YGHABT?) regardless of which VCS you're using - it's only a matter of time.
Mercurial comes with a built-in "hg serve" command that you can use to serve repositories over HTTP. It creates a web server that you can access through any web browser. Unless you need authentication or you have a lot of users, you don't need to set up any external web server.
Otherwise, setting up Apache + Mercurial on Windows is not very hard. If you need help, please drop me a line.
Thanks for the offer of help, but we're OK with the web server side of things. I was just suggesting that one possible reason for Git's popularity compared to Hg's is that it isn't a walk in the park to set up a common repo for a team with Hg. If you've got someone who's familiar with setting up a web server anyway, you'll be OK, but with Git you don't need to do anything like that at all.
I'm not sure I understand what the problem is with setting up a basic Mercurial server. If you have TortoiseHg, you just open your repository, click on "Repository / Start Web Server" and you are done. If you have bare Mercurial, just cd to your repository and execute "hg serve".
Perhaps you have some other requirement (e.g. authentication) that I did not take into account?
I suspect hg serve is fine for temporary use, but it's not really designed as a stable, long-term solution. As you say, it lacks authentication, which isn't ideal (or allowed at all) in some circumstances. Also, it needs to be started manually, so it needs some sort of supervisor process/start-up script to be set up.
Obviously this isn't some horrific burden, but it's still more demanding than the basic server set-up for some other DVCSes. The original question was about the reasons for the relative popularity of different systems, and if we're talking about people who are making decisions about a DVCS for the first time, they're not experts already and this stuff probably does make a difference.
Absolutely, in the same way that some companies went for Mercurial simply because BitBucket (at the time) offered free private repositories and only supported Hg. Now, BitBucket support Git repos, and I've noticed a few companies I've worked at that are starting to move towards Git on BitBucket, probably because it's the more popular choice and more people will know it because of GitHub.
I use both Hg and Git at work, and to be honest for my basic workflow there are very few noticeable differences in day-to-day work.
I use git on bitbucket for all my personal stuff, because by signing up with a .edu email account, they gave me free unlimited private repositories. Sold.
My only concern would be that people are finding my mostly-bare github profile instead, since that rules the hivemind.
Ha, you're not alone there! Hell, the only reason I have projects on my GitHub profile are because I felt scared about having a bare public profile linked to my real name. Mine still looks a bit stupid, because I'm a .NET dev with three Java projects from uni, and a Python blog written for Google App Engine.
Totally agree with the first point in particular. The only reason I started using git is because it's required for github. Otherwise I'd use Mercurial exclusively.
It should probably be pretty trivial to roll together a TimestampMod equivalent using a methodology similar to git-notes. Hell, just store a git note for every commit in a non-standard ref; each note would have a printout of the tree and corresponding timestamps. To reapply timestamps, you'd just run touch over the contents of the git note, re-touching every file to have the appropriate timestamp.
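A rough sketch of the idea, assuming GNU stat/touch and a made-up 'timestamps' notes ref:

    # record the mtime of every tracked file into a note on HEAD
    git ls-files -z | xargs -0 stat -c '%Y %n' | git notes --ref=timestamps add -F - HEAD

    # later, reapply the recorded mtimes to the working tree
    git notes --ref=timestamps show HEAD | while read mtime path; do
        touch -d "@$mtime" "$path"
    done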
Like others have said though, I don't understand what a use case for this would be. The page for that Hg plugin seems to indicate that it is just meant to ease the transition to version control, for users who don't yet 'get it'.
I'm really curious about your use-case. Are you just looking to tweak the behavior of the VCS w.r.t. a timestamp-sensitive build system?
I will say that file timestamps invoke horrible, horrible memories of an employer that used Visual Source[un-]Safe -- a VCS that relied on accurate client system time for history integrity. I.e. the repo was easily corrupted if a client wasn't on synchronized time. Also, race conditions. AFAICT, it met none of the conditions casually denoted by the words "Version Control System".
I'm curious why you need to preserve the timestamps. I'm not saying it doesn't make sense, I just haven't ever had a time where that's been a requirement.
One company I worked for chose Hg instead of Git precisely because it had better Windows support.
One year and one acquisition later, every developer was on either Linux or OSX, and everyone was asking why we had gone for Hg if Windows was not a concern...
Linus doesn't work on git anymore, and I don't see what Linux has to do with anything. GitHub is to me the main reason for the success of git, which I've discovered is incredibly difficult to use without GitHub. BitBucket is good for me.
It is definitely not. As I wrote in another comment, there was a time when hg was leading in popularity (for the right reason: good UI), but git won for the right reason: everything else.
Could you outline what it is that you think makes Git better?
From my daily experience with both Mercurial and Git, I would say the exact opposite. I can't think of anything that Git does better than Mercurial. I can certainly think of things it does in a needlessly more complicated way - the handling of remote tracking branches and the use of the index are obvious ones. Git's fixation on branches over commits is a persistent, deep source of pain for users around me. And you can pry revsets from my cold, dead hands.
Now, there was a time when Mercurial didn't have bookmarks. During that time, if you wanted a good lightweight branching model, Git was your only option (er, well maybe bzr or Darcs were, who knows). I was never that excited about lightweight branching; the weight of heavyweight branches does not seem substantial. Today, Mercurial has bookmarks that work pretty much like Git branches, so this advantage has evaporated.
Also, in a world of worse-is-better defacto software standards I'd say the choice between hg and git almost doesn't matter. I mean thank god people pulled their heads out of their asses enough to actually let Subversion lose the trench holy wars inevitable with the kind of sea change DVCS represented.
When was the last time you looked at Mercurial? Recently the community put a lot of effort into performance, and it should be very close to git's performance. On the flexibility side, Mercurial has the histedit extension as well as rebase, which should give you most of the flexibility you want. Bare Mercurial is intended not to do much, but you have to enable a few of the shipped extensions in order to get all the functionality.
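Enabling them is just a couple of lines in your hgrc, e.g.:

    [extensions]
    rebase =
    histedit =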
For the record, you don't need to enable the transplant and graphlog extensions anymore. Those are now part of core mercurial:
- The transplant extension has become the "graft" core command (do "hg help graft" for more details)
- The graphlog extension functionality has been rolled into the core log command. You simply use the --graph (or the equivalent -G) flag.
Not quite - graft doesn't support remote repositories, transplant does. I didn't know about hg log --graph, I think the muscle memory I have for hg glog will take a little while to break :)
You are right. That is a small limitation of the graft command compared to the transplant extension. As far as I know that is the only difference between them. I have never needed to do that so I forgot to mention it, sorry!
The problem with histedit and rebase, though, is that they won't allow you to change published history. This means if I want to push my changes to a backup repository (not the main development repository), I lose the option of rebasing my feature branch before pushing it to the main repo :(
You have some ways you can work with this in hg using phases. One example is to mark the repo you're publishing to as non-publishing so that hg knows it's safe to edit history pushed to that repo. Another way is to just tell hg that you know better and to use "hg phase --force --draft" whatever commits you want to histedit or rebase.
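The non-publishing option is just a config switch in the backup repository's .hg/hgrc:

    [phases]
    publish = False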
Yes, I haven't used Mercurial too extensively (I use git), but Mercurial's commands make more sense to me. The only problem I have with it is that conflicts throw me into vim, and if I exit it to try and make sense of the merges, it thinks I'm done merging and leaves me in an unreasonable state, whereas git just marks the files and lets me resolve them at my leisure.
Using "hg help merge-tools" it describes some different merge tool settings that you might fine helpful. For example if you want it to merge all files it can, and mark the ones it can't fully merge you can try "internal:merge".
To set it for your user globally, in your .hgrc put "merge=internal:merge" under [ui].
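That is:

    [ui]
    merge = internal:merge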
You probably need to disable a merge tool. Check 'hg showconfig merge-tools' and you'll probably find something. If this is on a debian derived system, I think that package maintainer turns on a bunch of stuff by default, so you may need to find a file in /etc/hg or similar.
I feel Mercurial's commands are more intuitive than Git's, especially to former SVN users. Functionally, they're so similar it doesn't matter. This is why I love Kiln Harmony--I can work on a Git team and use Mercurial. Kind of like I can work on a dev team that uses Emacs while I use Vim. It shouldn't matter.
I work in a company that uses Git, plus Subversion for legacy stuff. I use Mercurial with hg-git and hgsubversion. There are occasional hiccups, but they mostly work like a charm. I'm happy because I get to use Mercurial, and my colleagues are happy because they have someone who can help them out by running complex revset queries!
Kiln Harmony was awesome, as I'm one of two developers in a team of eight who prefer Git over Hg. Then we decided to use sub-repos/submodules, and Kiln Harmony can't handle that, and our other team members simply WON'T use git :(
I've found that for general workflow they're about the same, except that Mercurial has a slower checkout process and doesn't resolve conflicts as nicely. I'm not sure if there are features it has that git is missing.
If by that you mean Windows: that's no longer the case. The Git Windows installer is vastly superior to hg's Windows experience currently. I'm using it daily and it integrates really well into the system now. On top of that, TortoiseGit is a very pleasant experience now.
I mainly mean Windows, but it is not the only non-POSIX system around.
I am using Git on Windows on a current project and it is still far from what Mercurial was offering on my previous project in terms of usability and resilience to infrastructure issues.
> I am using Git on Windows on a current project and it is still far from what Mercurial was offering on my previous project in terms of usability and resilience to infrastructure issues.
Which part exactly? Git seamlessly integrates with PuTTY/Plink/Pageant, and TortoiseGit does the same. Git also has excellent support for end-of-line conversions. Mercurial, last time I checked, required an extension to deal with line endings.
Git has had issues talking to the server: losing credentials already stored, asking for passwords multiple times, and, every now and then, a bunch of HTTP errors coming back when doing pulls/pushes.
The mercurial server is hosted on the same server and works without issues.
I like that Mercurial still tries hard to get new concepts into the core. There is something they call "phases", which is a boundary marking what was published and what wasn't. Phases then forbid people from rebasing changesets that were already published (unless you explicitly tell it to force it anyway), which is much more convenient than git's way. In addition, they are working on something called obsolete markers, which will also record how changesets moved on rebase etc., giving it much more flexibility and power than git.
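You can inspect and override phases directly, too; a quick sketch (rev number hypothetical):

    hg phase -r .                     # show the phase of the working copy's parent
    hg phase --public -r 1234         # mark a changeset public by hand
    hg phase --force --draft -r 1234  # or force it back to draft if you know better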
Whenever the Git vs Mercurial topic comes up I think this thread [1] about Facebook having massive slowdowns with their large codebase using Git is quite interesting.
Thanks for the link. Interesting to see how slow it gets with such a large codebase. From what I could tell, people seem to think the project should be split into a number of smaller projects to fix the issue.
This post brings up one of the major issues with DVCS: if you have a large single repo, it just doesn't play well with git/hg. This is even more apparent if you have a lot of binary objects.
If you try to hammer a nail with a crowbar, you are going to experience some difficulty.
Large projects like Android get along swimmingly on git because they have structured themselves such that git is the right tool for the job. If Android insisted on having one single git repo, despite that not being a git best practice, they would have trouble too.
Structure things right and a git based system will scale far beyond what centralized monolithic repos like perforce can handle. If your codebase is growing properly big, splitting your code is something that you'll need to do eventually anyway.
This sounds strange, seeing as the kernel is a single repository. Right? I would expect it to compete in size with most any other project. Is that not the case?
The kernel isn't actually that large. It should be sub-GB for the foreseeable future iirc.
Single git repos get stretched to their limits when companies try to put 10-20 years worth of sourcecode from every single project they ever had (often with large binary files for testing or whatever) into a single repo because they are trying to use it like they used perforce. We're talking anything from several gigabytes up to the unimaginably large.
I was thinking more in terms of total lines of code.
Is the "best practice" to archive code after a while? Would make some sense, but I'm not sure how that would work. Everyone has to make the switch at once, right?
LOC doesn't matter all that much; the main problems are the sheer number of git objects necessary (basic operations start to become very slow when working on a massive DAG) and the sheer size of the repo (making the initial git-clone a procedure you start before you leave work for the night...)
Git should be able to handle just about any "single project" with ease for the foreseeable future, though. If it can't, that is a strong indication that you need to start restructuring your projects into multiple separate "packages" or projects. You can do that slowly over time while you are still on your traditional VCS (just start breaking components out, both in source code and in organizational responsibility/hierarchy).
After this process is underway, if you are careful and a little clever, you can allow individual packages/projects to migrate themselves to git/hg. You will probably be in this stage (many people on the old monolithic VCS, many people using git/hg for their projects) for a while. Managing a concept of "packages" at a level higher than version control repos is somewhat important here, to abstract away exactly which VCS is being used by a particular project (Android's 'repo' is sort of an example of a higher-level concept above version control that facilitates Android development).
Ideally you would eventually give everybody using the old system a deadline to migrate to git.
Note that this sort of situation is really only something that large organizations (facebook, or larger) should ever find themselves in. If you're on a small team and you are running into these sort of problems, then you probably have a slightly different sort of problem: perhaps lots of checked in auto-generated code (suggested solution: stop doing that. work on caching in your build system if auto-generation takes too long to do it every time), or maybe too many checked in large binaries (suggested solution: if those files absolutely must be checked in, perhaps look into git-annex).
Edit: here is a slide deck that covers Perforce scaling at Google: http://www.perforce.com/sites/default/files/still-all-one-se... Note the page "Perforce at Google: Main Server". That is the sort of situation that you don't really want to get backed into, but after restructuring your codebase you can construct solutions with git that allow you to scale much further with lower operational cost.
The kernel argument still jumps out at me, though. There are many "subsystems" to the kernel that could have easily been split to separate repositories had that been the idea for the project. Indeed, this seems to be the entire monolithic kernel debate. They do not split things out, even if they could potentially do so.
Now, in general I think I agree with your points; your two specific ones in particular I actively promote at work and on school projects.
And I fully see how the large organizations you are referring to would hit this, especially when they essentially have independent projects in development. What I do not understand is where that line is drawn - to the point that I often find myself arguing the counter position at work when teams want to immediately start a project in 3 repositories because we may want some utilities used elsewhere.
Take GNOME. At face value, it seems like most of the core of GNOME could be in one repository. Instead, it is very highly split out and has a specialized build system to support it. Was this strictly necessary? Or is this more to support other infrastructure ideas at play? (Does this make sense?)
Facebook tried to use git but ran into some scalability limits due to git's design (frequent index updates), whereas mercurial is append-only and doesn't have this problem.
Hence, facebook has some legitimate technical reasons to not use git -- contrary to the common opinion that git is always "fast".
They definitely do. They migrated their repositories to mercurial a while ago. There are plenty of mentions of this fact on mercurial's development mailing list (mercurial-devel@selenic.com).
Facebook is a big user and backer of mercurial. I attended the last mercurial sprint in Facebook's London office (which was great, BTW). Facebook will also host the next mercurial sprint in New York. They have recently hired Matt Mackall, mercurial's creator. Several other mercurial core developers also work for Facebook and are paid to work on improving mercurial as their main job.
I believe at some point they considered both git and mercurial as their new VCS. They have some huge repositories (hundreds of thousands if not millions of commits) and a huge amount of people accessing those repositories. I think they found some scalability issues with git's performance with repos of that scale (http://comments.gmane.org/gmane.comp.version-control.git/189...). Apparently it was easier for them to improve mercurial's performance, perhaps because mercurial is written in python with some performance sensitive parts written in C. Over the last year they have made a lot of progress and mercurial's performance on huge repositories is now even better than it used to be.
I'm glad that the shelve extension (which is like stash in git) is included by default now. Even though that extension has been around for a while, it feels like it's considered a core function now, rather than an "extra." (Especially for git users, who are accustomed to having `git stash` out of the box. So now there's a little less friction when trying to do the same things in Mercurial.)
That said, with Git, my main use of stash is to put away uncommitted changes so I can pull updates from the origin. Stash, pull, pop. However, Mercurial is able to pull changes into a working copy which has uncommitted changes - it essentially just treats it as a merge. Thus, with Mercurial, I need to stash much less often.
It's still useful as a way of putting changes aside while you work on something else, though, so I am glad to have it. The alternative was doing things with patch queues, which were a bit scary.
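So in practice the two workflows look roughly like this:

    # git: put changes aside, update, bring them back
    git stash && git pull && git stash pop

    # hg: pulling and updating into a dirty working copy just works...
    hg pull -u

    # ...but shelve is there when you want to set work aside explicitly
    hg shelve
    hg unshelve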
> Mercurial is able to pull changes into a working copy which has uncommitted changes - it essentially just treats it as a merge. Thus, with Mercurial, I need to stash much less often.
JFTR: git 1.8.4 has --autostash as an argument to rebase.
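Which collapses the stash-pull-pop dance into roughly:

    git fetch origin
    git rebase --autostash origin/master   # stashes, rebases, then unstashes for you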
Is mercurial's merging significantly different than git pull --rebase? (which acts like "cvs update" or "svn update" with respect to uncommitted changes?)
Is git pull --rebase like cvs update with respect to uncommitted changes? If I issue a git pull --rebase (with Git 1.8.2.2) in a repo in which I have uncommitted changes, I get:
Cannot pull with rebase: You have unstaged changes.
Please commit or stash them.
From what I remember of CVS, if I update into a working copy with changes, it applies the update where it can, and leaves merge conflict markers (or runs your configured merge tool) where it can't. That is also what Mercurial does.
Ok, now I understand. Yes, git expects you to commit or stash before doing that, matching the philosophy that there should always be a commit you can go back to in order to undo things (even if you have to look at the reflog to see what that commit is).
Personally, I commit and stash all the time, and feel better off than in the old CVS days, when I manually stashed (by copying) the pre-"cvs update" state in case I needed to go back to it.
matsushiko above mentions that git now has --autostash; legit had "sync" which does stash-rebase-unstash - but I rarely ever needed that workflow.
I spent a few years developing tooling and processes around Mercurial, then moved to a new role some months ago where Git is used exclusively. I am afraid that I still prefer hg; while git is, as a rule of thumb, faster, hg takes better care of your data (much harder to lose changesets) and presents a cleaner UI out of the gate.
Much harder to lose changesets? The reflog keeps changesets for up to 90 days (default, can be configured); if you don't notice that changesets are missing for 90 days, then chances are they don't matter :/
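And the recovery itself is short once you know the reflog exists (the sha here is hypothetical - use whatever the reflog shows):

    git reflog                    # find where the branch used to point
    git branch rescued deadbee    # re-attach that commit to a real branch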
I am utterly flummoxed by the idea of a version control system that autonomously discards versions at all, even if it keeps them in a mysterious trashcan for 90 days.
Last week, some colleagues of mine 'lost' some commits because some other colleagues had misused a forced push. They could probably have got them out of the reflog, if someone had told them about the reflog in time; before I could, they recovered a diff from their console history. If they'd been using Mercurial, the situation just wouldn't have arisen; they would have had a branch with two heads, and they could have resolved that straightforwardly.
The fact that it is possible to recover from this situation using a command that one never has any other need to even know about is no excuse for the situation being possible in the first place.
> If they'd been using Mercurial, the situation just wouldn't have arisen; they would have had a branch with two heads, and they could have resolved that straightforwardly.
They could have avoided this situation in Git too. If the Git server was set up to only allow fast-forward pushes, then they wouldn't have been able (I'm a little uncertain here) to force it through either. Forcing something is never a good idea, even in Hg.
> The fact that it is possible to recover from this situation using a command that one never has any other need to even know about is no excuse for the situation being possible in the first place.
So... Git provides a feature allowing you to clean up history before merging (what it is mostly used for anyway). Git even provides an option to recover from these changes... And you're upset because colleagues of yours who apparently didn't know Git fucked up?
Atlassian Stash comes with a plugin that exposes a "Disable Force Push" toggle. I have that on for all the repositories it hosts. Nobody complains that force push doesn't work (~10 devs) because if you're doing force push you're probably doing something wrong.
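Plain git hosting can enforce the same thing without Stash, for what it's worth - server-side, in the shared repo:

    git config receive.denyNonFastForwards true   # reject non-fast-forward (forced) pushes
    git config receive.denyDeletes true           # and branch deletions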
>> some other colleagues had misused a forced push
> Git has a learning curve.
That sounds a bit like a euphemism for "is hard to use".
However, that's not what i'm getting at. The first pair of people misused a forced push because they're clowns. They are never going to learn to use any version control system properly, even if the learning curve was down a slide. If we disabled force push, they'd probably have reenabled it (they're sysadmins, so it would be hard to stop them).
The problem is that the second pair of developers, who are not clowns, but are also not Git power users, nor have any interest in becoming Git power users, got screwed by the first pair's mistake, and could only have got out of it by using an obscure feature known only to Git power users.
> and could only have got out of it by using an obscure feature known only to Git power users.
Obscure? I spent one afternoon reading the Pro Git book that is available free online. The first thing you learn, after learning to alter history, is the reflog.
Look, everything people in our business do is based on trust. It's fairly easy for anyone to mess up, and this includes Mercurial as well - like switching bookmarks around or messing with hg strip.
This is not an acceptable solution as a plan. It is, instead, a terrible hack to deal with expected error. Better is to plan for errors to be exceptional things.
git bills itself as the stupid content tracker. It's much better to add a small amount of smarts to ensure that you and your friends don't have to reach into the trash to get the changesets you erroneously deleted.
And it's not an error when the user tells git they want to remove or squash or otherwise rewrite some commits. Git gives you that flexibility, which a lot of people like, and at the same time gives users an undo history in case they mess up.
On the other hand, I've used the reflog once in the last four months. For me, the use of reflog really has been exceptional.