Things I wish everyone knew about Git (Part II)

mabbo · on July 20, 2022

I said this in another comment recently about git, but I find it odd that even after a decade of using git as a mandatory part of my professional life, I'm still learning new things.

That's maybe not a good sign about the usability of git.

That `@{'3 days ago'}` thing? That's incredible. Why didn't I know about that 8 years ago? Why wasn't it obvious, intuitive to me that this was possible?

git is brilliant and I love it. But there are very few affordances that make it obvious what to do next, what is possible. There are few patterns in it where I can apply what I already know.

The thing that replaces git will have most of the power of git, but an intuitive interface that makes learning everything about it easy.

js2 · on July 20, 2022

> I said this in another comment recently about git, but I find it odd that even after a decade of using git as a mandatory part of my professional life, I'm still learning new things.

If I can humbly offer a suggestion: read the man pages. A man page read once thoroughly is much more useful than the same man page skimmed 1000 times. The syntax in question is in the rev-parse man page (`git help rev-parse`).
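For anyone who wants to try the date syntax without waiting three days, here is a minimal sketch in a throwaway repo. It assumes git and bash are installed; the repo path, the identity, and the backdated commit are made up for illustration. The `@{...}` date forms are resolved against the local reflog, so they describe where your own copy of the branch pointed at that time, not when a commit was authored.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository; nothing here touches an existing checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"           # hypothetical identity
git config user.email "user@example.invalid"
git symbolic-ref HEAD refs/heads/main          # use "main" regardless of the configured default

# Backdate the first commit so the reflog has an entry older than three days.
# (Assumption for the demo: the reflog timestamp follows GIT_COMMITTER_DATE.)
GIT_AUTHOR_DATE="2020-01-01 12:00:00 +0000" \
GIT_COMMITTER_DATE="2020-01-01 12:00:00 +0000" \
    git commit -q --allow-empty -m "old state"
old=$(git rev-parse HEAD)

git commit -q --allow-empty -m "new state"
new=$(git rev-parse HEAD)

# Ask the reflog where main pointed three days ago.
three_days_ago=$(git rev-parse "main@{3 days ago}")
echo "main now:        $new"
echo "main 3 days ago: $three_days_ago"
```

In a real repo the same form works with other commands, for example `git diff "main@{3 days ago}" main`.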

These are tools we use every day. Look, I'm the first person to throw out those furniture assembly instructions and dive right in. But the truth is, we'd all be better off reading the documentation for our tools.

The git man pages are far from the best, but they contain tons of useful information.

And it's not just with git that developers seem allergic to reading documentation. I noticed a colleague the other day doing this from a shell script:

    BRANCH=$(python -c 'import os; print(os.environ["BRANCH"].split("/")[1])')

I asked what they were trying to do. They wanted to strip "origin/" from the front of BRANCH. I showed them you can do that directly in the shell:

    BRANCH="${BRANCH#origin/}"  # strip origin/ from front

Or:

    BRANCH="${BRANCH#*/}"       # strip */ from front

They'd never read the bash man page. Had no idea the shell could do this. So I pointed them at https://www.gnu.org/software/bash/manual/html_node/Shell-Par...
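To make the comparison concrete, here is a small sketch of the prefix and suffix forms of shell parameter expansion. The branch name `origin/feature/login` is a made-up example. One thing it shows is that the Python one-liner above is not equivalent for branch names that contain slashes: `split("/")[1]` would return only `feature`, while the `#` forms keep everything after the first slash.

```shell
#!/usr/bin/env bash
set -euo pipefail

BRANCH="origin/feature/login"    # hypothetical value

no_remote="${BRANCH#origin/}"    # remove the literal prefix "origin/"
after_first="${BRANCH#*/}"       # remove the shortest prefix matching */
after_last="${BRANCH##*/}"       # remove the longest prefix matching */
before_last="${BRANCH%/*}"       # remove the shortest suffix matching /*
before_first="${BRANCH%%/*}"     # remove the longest suffix matching /*

printf '%s\n' "$no_remote" "$after_first" "$after_last" "$before_last" "$before_first"
```

A single `#` or `%` removes the shortest match and the doubled form removes the longest, which is the part most people forget.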

I think we'd all be better developers if we RTFM, at least for the tools/languages we use every day. You simply don't know what you don't know. This way you at least know a little more what you don't know. :-)

mabbo · on July 23, 2022

I think rather than asking everyone to read the manual, the designers of software systems ought to all read Don Norman's "The Design of Everyday Things".

Apple products became what they are because no manual was needed. They were designed with patterns that were intuitive, affordances, and a deep understanding of human psychology.

What I am arguing is that git lacks those very things that make reading a manual not needed.

vasergen · on July 20, 2022

Agree with your point, but now I wonder how many people would understand the Python version and how many the bash version.

GrumpySloth · on July 20, 2022

I don't think the fact that you keep discovering features is a bad sign for a tool like git. It would be a bad sign if you kept discovering new footguns in the features you use (there is a fair bit of software like that).

I'm 25 and a couple weeks ago I discovered a new way to use a knife to prepare a fish, even though I've been doing that all my life. I don't think knives or fish have bad UX, even though I keep discovering new ways to use knives on fish. What's important is that I don't keep discovering new ways to hurt myself with knives.

atq2119 · on July 20, 2022

All those bones in fish are pretty bad UX, compared to some of the competition.

NKCSS · on July 20, 2022

Git is just the new regex. Everybody uses it, most have no clue how it works, and they just copy/paste stuff from other places or repeat the same little trick.

scrollaway · on July 20, 2022

No, that’s terrifyingly untrue.

Git is more like… the new car. It’s complicated and most people have no idea how it works, but they know how to steer very simply and can drive it with a bit of training. Experienced people however, who know how cars work, can do really cool things with them and have no problem repairing them on their own.

Still, cars are useful. A bike would be better of course.

foobarian · on July 20, 2022

I like it! I'll do you one further. Git is more like... someone ripped out all the controls from your car and installed a Boeing 787 dashboard. You'd probably figure out how to start it and make it go forward, left and right, but you'd miss out on a lot of functionality and likely destroy something while playing with unknown buttons.

stackbutterflow · on July 20, 2022

I find regex easier to write than to read.

I never copy them from stackoverflow because I'm worried it does some weird magic that I don't want. Instead I build them against one or more examples.

pif · on July 20, 2022

> I find regex easier to write than to read.

There is nothing new under the sun.

Joel, 22 years ago:

There’s a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. [...] The reason that they think the old code is a mess is because of a cardinal, fundamental law of programming: It’s harder to read code than to write it.

https://www.joelonsoftware.com/2000/04/06/things-you-should-...

MontyCarloHall · on July 20, 2022

Joel is 100% spot on. This is exactly why I don’t think things like Copilot will catch on. Making programming involve less writing code and more reading code actually makes the job more difficult.

Tijdreiziger · on July 20, 2022

And the corollary: since code is harder to read than write, steer clear of writing clever code where simple code would suffice, as the clever code will be much harder to read.

roeles · on July 20, 2022

A second argument is that debugging is harder than writing.

morberg · on July 25, 2022

AKA Kernighan's Law: https://www.laws-of-software.com/laws/kernighan/

dgfitz · on July 20, 2022

s/writing/reading

stackbutterflow · on July 20, 2022

I say that because there's a meme that nobody writes regexes and that everyone copy-pastes them from stackoverflow.

maicro · on July 20, 2022

I can highly recommend regexr.com for testing and developing regex patterns; paste a representative sample in the "Text" area and it'll show you how things match (and usually, crucially, don't match) as you change the regex pattern.

I'm not associated with them, and I understand you don't seem to have any real issues with regex, but thought this would be a good place to mention a useful tool.

layer8 · on July 20, 2022

Using such testing you can only prove a regex isn’t what you want for specific inputs, and make a number of plausibility checks, but you can’t prove it is what you want for all inputs. For that you need to do the reasoning on the expression as if you had built it yourself.

mgkimsal · on July 20, 2022

I've used that and regexbuddy and others over the years. Almost anything that gives you some visual rendering of what's going on is helpful. Personally, I've taken to using the regex stuff in the Jetbrains IDEs, mostly because I'm there already most of the time, and it's 'good enough'. But I'm not always at my own setup, and regexr and similar are always a good tool to have (and to share with others to show them how a regex is working).

If I did this all the time, I might not need tools like this. But complex regex are something I only dive in to a handful of times per year, and it's never the same problem twice.

lazide · on July 20, 2022

Nice! I’ve been writing (and reading) regexes for over 20 years now. I can usually ‘read’ them on sight, but I’ve been wanting this tool for when it all goes wrong.

Which of course always happens at least once or twice in any new dataset/usage.

Rediscover · on July 20, 2022

Thanks for pointing out regexr.com.

I really appreciate the availability of tools like that.

See also the Emacs built-in M-x re-builder (for elisp-style regexp).

ExtremisAndy · on July 20, 2022

Totally agree! But oddly enough, I have this problem with all code/languages… I always write code much more easily than I read it. I can write fairly complex applications just by following the logic of what the application must do in my brain (but with ZERO cleverness). But give me the source code of even the simplest Unix command and I’ll struggle to understand the purpose of every #define, the variable naming scheme, etc., etc. I’m sure this is telling me something interesting about how my brain works but I’m not sure what!

pif · on July 20, 2022

> I’m sure this is telling me something interesting about how my brain works

See my sibling comment. Joel says you are just normal.

layer8 · on July 20, 2022

Regex is much simpler than Git, but also the different regex implementations in use have a lot of subtle and less subtle differences, whereas there is roughly only one Git.

EastSmith · on July 20, 2022

I use 3% of git commands and that's it. I probably use 3% of regex rules too.

dotancohen · on July 20, 2022

  > even after a decade of using git as a mandatory part of my professional life, I'm still learning new things.

I still learn new things in Python every month. I still learn new things in Magento every _day_. I'll learn something new in VIM or PyCharm or PhpStorm at least once every few weeks. I'll learn something new in my native language every few weeks, and I'll learn something new in my other languages almost every day. I'll learn something new about Bash or Linux or CORS or Selenium or some useful Chrome or Firefox setting or extension at least every week.

Learning new things is not unusual in the software development industry.

snarfy · on July 20, 2022

It's vindicating to see the general response in this thread vs threads about git a decade ago, where everybody defended git and decreed all of us simpletons just don't understand it enough.

Oh we do, and we understand how most of it is a flaming trash pile.

mabbo · on July 20, 2022

Oh no, I think it's a beautiful piece of art. With sharp edges that keep cutting my hands as I try to handle it.

I love git. I just wish it was more usable.

valenterry · on July 20, 2022

git could also have just followed some standards. For instance, how am I supposed to know the syntax of "3 days ago" in general?

They could have used time spans from ISO 8601 standard, so for instance simply "P3D". Or "P3Y6M4DT12H30M17S".

git is powerful, but it demands more time investment than it would if it didn't reinvent the wheel and take shortcuts so much. It's easy to see that it is and was built mainly for power users.

nedbat · on July 21, 2022

Are you really suggesting that more people know what "P3D" means than "3 days ago"?

valenterry · on July 22, 2022

I think more people know how to define "3 days ago" when you tell them "provide it as an ISO 8601 time span" vs. "provide it in git's own format".

When it comes to presenting it nicely, then sure, use 3 days ago. No problem with that.

praptak · on July 20, 2022

I love this quote from Part I:

"Git has an elegant and powerful underlying model based on a few simple concepts:

  1. Commits are immutable snapshots of the repository
  2. Branches are named sequences of commits
  3. Every object has a unique ID, derived from its content

Built atop this elegant system is a flaming trash pile. "

eulenteufel · on July 20, 2022

The quote is kinda wrong in that points 1 and 2 are mixed up. Branches are just pointers to commits. Commits contain a reference to their history. It's only kind of incorrect because in practice branches are used to refer to a history (a sequence of commits). But it's also misleading once you have to do anything more complicated than just committing/merging.

When I started out using git I was working with the same assumptions, but I was perpetually confused. Git became a lot easier to use once that misunderstanding cleared up.

Maybe it's just like that for me but I think we might be doing newcomers to git a disservice by explaining the basics of git in this simplified manner.

sophacles · on July 20, 2022

Do you have an actual example of this? I've never encountered a situation where those three points are incorrect.

kleinsch · on July 20, 2022

If you're in a detached head state and create a new commit, you have a commit that isn't on a branch. The commit still has a parent, so it's not the branch that's tracking the sequence, it's the commit. Branches are just pointers that get updated as you add commits.

sophacles · on July 20, 2022

This feels like declaring squares aren't rectangles because you can make rectangles that aren't a square.

wonderbore · on July 20, 2022

Git is only commits. Branches and tags are references to the commits; one moves while the other one doesn’t. That’s it.

Commits link to their parent(s) until the initial commit. You can visualize a “branch” as a series of commits, but the same thing can exist without calling it a branch:

    A -> B -> C -> Initial

Is that a branch? No. How about:

    refs/heads/main = A

That’s a branch, and it looks exactly the same as a tag:

    refs/tags/v1 = A

Neither a branch nor a tag is “a series of commits”, even if you think of it as one.
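A rough way to see this for yourself, assuming git and bash and using a scratch repo with made-up names: create a branch and a lightweight tag at the same commit, add one more commit, and compare what each ref resolves to before and after.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch repository with a hypothetical identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git symbolic-ref HEAD refs/heads/main

git commit -q --allow-empty -m "A"
git tag v1                                   # lightweight tag at commit A

branch_before=$(git rev-parse refs/heads/main)
tag_before=$(git rev-parse refs/tags/v1)

git commit -q --allow-empty -m "B"           # committing moves the branch ref only

branch_after=$(git rev-parse refs/heads/main)
tag_after=$(git rev-parse refs/tags/v1)

echo "branch: $branch_before -> $branch_after"
echo "tag:    $tag_before -> $tag_after"
```

Both refs start out as the same hash; after the second commit only the branch has moved.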

sophacles · on July 20, 2022

> You can visualize a “branch” as a series of commits, but the same thing can exist without calling it a branch

You can visualize a "square" as a rectangle with equal sides, but the same thing can exist without calling it a square.

praptak · on July 20, 2022

2. is true or untrue depending on how deeply you interpret it.

A branch is named, points to a commit and under normal circumstances will track the sequence of commits made via itself (i.e. when you commit and HEAD points to branch b, b starts pointing to the child commit). So you could say it's a named sequence of commits.

On the other hand you can reset the branch to any commit in the tree, even ones which aren't even on the current sequence of parents. It is still technically a named sequence of commits, just a totally different one.

badpun · on July 20, 2022

A git branch is actually very similar to a git tag[1]. For example, there are commands to point a branch at a completely different commit in a different section of the commit graph, just as you can with tags.

[1] The main difference perhaps is that, if you create a new commit, the branch pointer will actually be moved to point to that latest commit (while tag stays fixed).

stonemetal12 · on July 20, 2022

Git supports history rewriting, so 1 isn't true. Git uses hashes for "unique IDs"; hashes aren't unique, they just have a low probability of collision, so 3 is also not true.

Having run into issues that appear to be caused by 3 not being true, I don't see that as a theoretical issue.

stu2b50 · on July 20, 2022

1 is true. When git is “rewriting” history it’s actually creating new commits and moving the branch pointer over.

Until the gc reaps them, you can absolutely git checkout the hash of any of the rewritten history and it’s still there, same as you left it. You can even move the branch pointer back, undoing the history rewrite.
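A quick way to convince yourself of this is a sketch like the following, run in a scratch repo (assumes git and bash; the file name, messages, and identity are invented for the demo). It amends a commit, then checks that the pre-amend hash still resolves and can be restored.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch repository with a hypothetical identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"

echo "hello" > file.txt
git add file.txt
git commit -q -m "original message"
original=$(git rev-parse HEAD)

# "Rewrite" history: this creates a new commit and moves the branch to it.
git commit -q --amend -m "rewritten message"
rewritten=$(git rev-parse HEAD)

# The old commit object is still in the object store under its old hash.
type_of_old=$(git cat-file -t "$original")

# Move the branch pointer back, undoing the rewrite.
git reset -q --hard "$original"
restored=$(git rev-parse HEAD)

echo "original:  $original ($type_of_old)"
echo "rewritten: $rewritten"
echo "restored:  $restored"
```

If you did not save the old hash, `git reflog` lists the previous positions of HEAD so you can find it again.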

stonemetal12 · on July 26, 2022

So when I do a pull I get their un-rewritten history? Just because they have a short undo buffer doesn't make modifiable history immutable.

pc86 · on July 20, 2022

> once you have to do anything more complicated than just committing/merging

If you really have to do more complicated things with git... why? I mean that seriously, if your workflow necessitates anything more complicated than committing or merging with any level of regularity, it sounds like you have a bad workflow. Committing and merging should be 99.9% of your activity within git, shouldn't it?

themulticaster · on July 20, 2022

It depends a little on how you use git. Apart from stashing (git-stash), I use interactive rebases a lot. Combined with autosquashing it's a very powerful tool to ensure a somewhat nice history.

Of course, this only matters if you care about your history. It feels like there are two camps of git users: One camp squashes every MR/PR together even if it's huge, hasn't heard of git-bisect, creates plenty of merge commits in both directions, and piles unrelated changes into a single commit. The other camp cleanly groups changes into self-contained commits, rebases often resulting in a clean patch-series-style MR/PR, and likes to use git-bisect.

atq2119 · on July 20, 2022

This observation about the two camps rings very true. The sad part is that while the second camp is the original user base of Git, at least GitHub basically acts as if they don't exist and only really caters to the first camp.

layer8 · on July 20, 2022

Items 1 and 2 are somewhat misleading.

Commits are only immutable in the sense that the same commit hash will (with huge likelihood) always refer to the same commit history. But they are not immutable in the sense that you couldn’t change the history of e.g. a branch. Other VCSs actually offer more immutability than Git here.

Branches are sequences of commits only up to the last merge, because they are really just labels on branch tips. In other VCSs, branches are truly linked with each commit of a sequence.

Meaning, when you’re familiar with such a VCS, reading items 1 and 2 may lead you to think “well, that’s what I know and expect from a VCS”, whereas really it is a bit different in Git.

stu2b50 · on July 20, 2022

> But they are not immutable in the sense that you couldn’t change the history of e.g. a branch.

That sentence doesn't make sense, though. Assuming from context that `they` = commits,

"[commits] are not immutable in the sense you couldn't change the history of a branch"

Branches and tags are mutable - they're just pointers to commits - but that doesn't change whether or not commits are immutable.

When you "change" the history of a branch, what git is really doing is creating another series of new commits, and then making the branch pointer point to the new commits. However, because commits are immutable, the old commits are still there, and at any time you can go back.

layer8 · on July 20, 2022

You are defining commits to be different by the mere fact that their contents (and history) are different. With that definition, how could a commit ever be mutable? Only by an indirection where the VCS allows the referencee to change. And that is exactly what Git allows with branches.

When rewriting history, the fact that the “new” commits are new is a tautology if you define commits to be different when their contents are different. This only matters when you use the commit hash as your reference point, which is what I was pointing out in the parent comment.

The expectation of immutable commits would be that history never gets lost and cannot be changed (without rewriting history into a different repository), but it certainly can in Git.

The real point about Git’s content-derived addressing is that it enables the distributed aspect, because it makes the path a commit has taken between repositories not matter. But that is already better expressed in point 3.

ajanuary · on July 20, 2022

> With that definition, how could a commit ever be mutable?

It can’t. Hence commits in git are immutable. It’s a function of git’s design that commits are immutable.

I could make my own VCS that doesn’t use content-derived addressing, but instead uses UUIDs for commit IDs, and lets you issue commands that change the contents of a commit. In that system, commits are mutable. But that isn’t git.

> The expectation of immutable commits would be that history never gets lost and cannot be changed

No it isn’t. The expectation of immutable commits is that a commit object can’t be changed. And in git that is true, at least it can’t be changed without also changing its identity, which is functionally the same.

As has been explained, you can make an alternative parallel history, and you can switch to using that new parallel history as your new truth, but you haven’t changed the original commits. They remain un-mutated, because they are immutable.

pmarreck · on July 20, 2022

So, like Nix, basically... ;)

brenainn · on July 20, 2022

A simple one is being able to write multiple messages (subject and body), e.g.

  git commit -m "subject" -m "a longer body"

or just running 'git commit' and using your text editor (subject first line, body after).
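As a small sketch of what the two `-m` flags produce (assuming git and bash; the repo, file, and messages are invented for illustration): the first `-m` becomes the subject, each further `-m` becomes its own paragraph in the body, and you can read them back separately with the `%s` and `%b` format placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch repository with a hypothetical identity.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"

echo "page_size = 20" > settings.txt
git add settings.txt
git commit -q \
    -m "Fix off-by-one in pager" \
    -m "The last page was dropped when the item count was an exact multiple of the page size."

subject=$(git log -1 --format=%s)   # first line only
body=$(git log -1 --format=%b)      # everything after the blank line

echo "subject: $subject"
echo "body:    $body"
```

The subject is what shows up in `git log --oneline`; the body is only visible in the full log.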

I've worked on too many repos with messages like "fixed the thing" where a few more sentences of context would've prevented headaches when trying to debug or change something.

rjmunro · on July 20, 2022

I always tell people the `-m` flag of `git commit` is only ever to be used in scripts. Let git open an editor and write your multi-line message there. Look at the comments telling you which files have been changed.

Even better than using `git add` and `git commit` is to use `git gui` to add and commit files. The (standard) GUI is not to help beginners, it is to make you a more effective user of git. Note there are other apps you can use with more features, for example I like gitx on macOS, but there are also terrible apps out there that do try to be easy for beginners, and they should be avoided. The most obvious is GitHub Desktop (I haven't tried it for a while - it might have slightly improved, but I still won't go near it).

pfarrell · on July 20, 2022

Good points. I occasionally use gitk to browse recent commits. A gui definitely has its place.

I like using the terminal, so my workflow is to use `git add -p`. That makes me look at each change, in small chunks, as I stage them. As a bonus, it helps me keep commits small, or decide if there’s some logical separation in my changes that would be better expressed in two or more commits.

cassianoleal · on July 20, 2022

What's `git gui`?

    $ git gui
    git: 'gui' is not a git command. See 'git --help'.

    The most similar commands are
      ci
      gc
      grep
      init
      lg80
      pull
      push

greggyb · on July 20, 2022

Your package manager seems not to include it. `git gui` is available in the standard Windows installer for git. The package on Ubuntu 20.04 seems to not include the gui.

As noted on git website, there are two GUIs that are in-tree -- native components of the project. https://git-scm.com/downloads/guis

cassianoleal · on July 21, 2022

Yeah it's packaged separately. I'm using git from Homebrew on macOS.

WorldMaker · on July 20, 2022

You may be able to get to it from `gitk` (it's under File > Start Git Gui). I've seen distributions that don't include the `git-gui` shortcut in PATH but do add `gitk` to somewhere in PATH.

cassianoleal · on July 21, 2022

Thanks for the tip. I was mostly curious about it as I had never used or heard of it. I have looked into gitk in the past, and also other GUIs but I find git easier to use via the CLI.

everybodyknows · on July 20, 2022

Alternative:

  git log --graph --all --stat

rjmunro · on July 22, 2022

That's more like gitk, not git gui.

On Ubuntu it's a separate install from the main git: https://launchpad.net/ubuntu/bionic/+package/git-gui

Try `apt install git-gui`, or similar.

chrismorgan · on July 20, 2022

> Look at the comments telling you which files have been changed.

And you can make this even better by adding the diff that’s being committed with verbose mode (see `git commit --help`), so that you can scroll down and easily see exactly what’s going into the commit:

  git commit --verbose

You can make it permanent like so:

  git config --global commit.verbose 1

I also recommend setting commit.cleanup = scissors as a related thing.

bityard · on July 20, 2022

Thanks for this tip. One of the things I always did in larger commits was compose the commit message in a separate text editor while reviewing the diff in my terminal. This puts the diff right where I can see it, although time will tell if I enjoy scrolling up and down to switch between reading the diff and editing the commit message.

chrismorgan · on July 21, 2022

See if your editor supports a split view; in Vim, for example, you have :split or :vsplit.

atq2119 · on July 20, 2022

In addition to 'git gui' I would also recommend 'tig'. It allows you to stage individual lines and hunks like 'git gui' but also has a history view. With a little bit of creative key binding, this makes creating --squash commits a breeze. (It also has mouse support which needs to be enabled manually in the config.)

afarviral · on July 20, 2022

I'd love to know how to do this in VSCode. It's infuriating that it yells at you for exceeding a 50 character limit.

WorldMaker · on July 20, 2022

The 50 character limit warning is just for the "top line" of a commit, the "commit subject line". It's general good advice to keep things like `git log --oneline` readable for people using fixed width command line terminals to view commit logs.

VS Code expands the box as you add newlines to allow you to type a longer body below the "subject line". The lines in the body also give you suggested warnings to stick to 72 characters or below, which also comes from general good advice to keep things readable in fixed width command line terminals for things like `git log` and `git show <commithash>`.

In some ways those warnings/general advice follow things like writing a plaintext email (or usenet posts or…) in pre-GUI days.

You don't have to follow those warnings' advice. You may not care how your commits look in fixed width command line terminals. Some GUI tools format commit messages poorly if you actually format it like an ancient email, and you can just write paragraphs and let the people with fixed width command line terminals use a pager tool that can better reflow the text for them.

The point is the advice comes from a good place, and there are git repos that are sticklers for plaintext commit formatting requirements which is why VS Code shows that advice. But you don't have to follow it.

Vendan · on July 20, 2022

do `export EDITOR="code -w"`

`git commit` will then open up the commit message as a temp file in vscode, you can write your message then save and close (cmd-s, cmd-w on mac, probably ctrl-s ctrl-w on windows and linux?) and git commit will continue on. `code -w <file>` is telling vscode "open this file for editing and don't return until the user closes it"

pie_flavor · on July 20, 2022

Hit enter. Line #2 starts the body.

pc86 · on July 20, 2022

The terminal in VSCode. Over the past 6 months or so I've been forcing myself to move as much of my workflow into the terminal as possible and I've found things have been continually getting easier and easier. A lot of arbitrary restrictions based around half-baked UIs get removed when there's no more half-baked UI.

MBlume · on July 20, 2022

"It is really hard to lose stuff"

It is really hard to lose stuff that you have ever committed. It is really hard to lose stuff as long as you commit early and often. Uncommitted work is pretty easy to lose with 'git reset'.

TheChaplain · on July 20, 2022

Depends. I use an IDE from Jetbrains, their "Local History" feature makes it fairly difficult to lose anything, even between changes on uncommitted files.

stu2b50 · on July 20, 2022

I don’t think that counts as a “depends” if it involves the use of another tool.

atom_arranger · on July 24, 2022

This has saved me many times. Memory is cheap these days, this should probably be a more common editor feature, on by default.

perlgeek · on July 20, 2022

... or with `git checkout` or with `rm` or `git clean` (for files not yet under version control).

tsimionescu · on July 20, 2022

... or git merge / git pull (but not git rebase) ...

rjmunro · on July 20, 2022

How do you lose files with git merge / git pull?

witcher01 · on July 20, 2022

merge and pull update the reflog just like rebase does, so both can be undone!

rob74 · on July 20, 2022

My thoughts on this:

- even after years of working with git, I'm still not familiar with all the dark corners I can get myself into by running the wrong command at the wrong time. Yes, most of the time my changes are still there, but the way back to the state I actually want for my repository may be long, complicated and require lots of googling/asking around.

- as the article mentions, git cares a lot about stuff that has been committed, but not so much about the files in your working directory. Because of that, IDEs that keep a local history of your file system changes (like the JetBrains products) are really helpful. "Commit early, commit often" may save you from this, but produces more noise in the repo (which you can reduce by squashing commits, but that's extra effort) and may make it harder to find things - and I guess that goes double if you follow the suggestion of committing changes automatically "every few minutes". The local history is there when you need it and gets out of the way when you don't.

colonwqbang · on July 20, 2022

I don't think it's so strange that we need to rework the patch series a bit before sending it for review. At least not if we see readability as a main goal.

Not many great writers can sit down and write a perfect script on the first try. They write a first draft and then rework it, move things around etc. In the same way the git user goes over the patch series with rebase and amend to massage it into something that flows nicely and is pleasant to read.

I find that it's actually easier to make a good patch series if you start from many small commits, rather than a few big ones. The tools in git that merge patches together are faster to use than those that pick patches apart.

rocqua · on July 20, 2022

FYI The committing every few minutes idea commits to a separate branch to prevent this clutter.

papito · on July 20, 2022

I can't tell you how many times IDEA's "local history" saved my ass. I use it extensively. Instead of stashing local changes and going around messing with my working tree, what I did 30 minutes ago is just a right-click away.

(Also, if you right-click on the top project folder itself, you can restore the entire project to a previous snapshot, if something was working before and you heroically went down the wrong path).

valenterry · on July 20, 2022

Yeah I agree. This is a huge productivity boost.

OJFord · on July 20, 2022

If I could just convince colleagues to `pull.rebase`, so CI isn't constantly building 'Merge branch master' on the master branch, I'd be happy enough.

planede · on July 20, 2022

Uh, couldn't disagree more. I just discourage the use of pull itself, but having it and defaulting to rebase is quite invasive.

formerly_proven · on July 20, 2022

Maybe it would have been good if the git defaults were changed from the defaults used for the kernel development workflow to the git workflow ~everyone else is using, even though git was created for the kernel people by the kernel people - most users are not kernel people.

"git pull" in the kernel workflow is for integrating downstream changes into your own upstream repository. Meanwhile, "git pull" in the everyman workflow is for synchronizing your local changes with upstream changes (you know, re-basing your patches on upstream or something).

Even the merge commits left behind by default git pull reveal this. "Merge branch master of ssh://yourorg-git/upstream" - completely backwards for what you are doing - you're merging the upstream into the downstream? What, your local repo is the boss now? Doesn't make any sense.

shikoba · on July 20, 2022

The idea of "merging into" has no meaning with git. You're merging two things into one. But saying that there is a master thing and a slave thing has no meaning.

andreareina · on July 20, 2022

git does privilege the first parent[1] in a merge in a bunch of contexts

[1] i.e. the branch you did the merge from

IshKebab · on July 20, 2022

Not true because commit parents are ordered. One of them is first - the main parent.

shikoba · on July 20, 2022

It's a detail. But for some people it looks like it changes everything.

donatj · on July 20, 2022

> It is really hard to lose stuff

Furthermore I recommend turning off automatic garbage collection. Turning it off makes it even harder to lose things, and unless you have an insanely massive amount of churn, you really don’t need it.

https://donatstudios.com/yagni-git-gc

Groxx · on July 20, 2022

Or committing large binaries frequently, but yes. Totally agreed.

GC manually unless you're dealing with scale where you know precisely how much it helps you. Otherwise it's safer and easier to not do it, and the disk space is utterly inconsequential (and trivial to find and `git gc` if it proves otherwise). You can afford a few more megabytes every year or so.

js2 · on July 20, 2022

> What if you can't find it?

Funny enough, I just wrote "git-lost-and-found" last week because I'd added a new file (`git add`) but before I created a commit I did a `git reset --hard` which deleted the file. It wouldn't have been in the reflog since there was never a commit created. But I knew it would be under .git/objects until the next time a `git gc` runs, so the trick was to just take a look at a few recent blob objects till I found what I was looking for:

https://gist.github.com/jaysoffian/f42f4b1806f65158b40408814...

Put that in your PATH as `git-lost-and-found` and then you can call it as `git lost-and-found` from inside any repo.

Edit: oh, MJD beat me to it by 6 years but I hadn't seen this when I wrote the gist above:

https://blog.plover.com/prog/git-reset-disaster.html

js2 · on July 23, 2022

I guess I could use `git fsck --lost-found` too.

https://git-scm.com/docs/git-fsck
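Roughly how that plays out, as a sketch in a scratch repo. The file names are made up, and this is not the gist's script, just the same idea done with `git fsck`:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

cd "$(mktemp -d)"
git init -q
echo keep > tracked.txt && git add tracked.txt && git commit -q -m "initial"

echo "precious work" > new-file.txt
git add new-file.txt      # the content is now a blob under .git/objects
git reset -q --hard       # the file disappears from the index and working tree

# Unreachable blobs are reported as "dangling blob <hash>".
lost=$(git fsck --lost-found 2>/dev/null |
    awk '$1 == "dangling" && $2 == "blob" { print $3 }')

for blob in $lost; do
    git cat-file -p "$blob"   # print each candidate until you spot your file
done
```

`--lost-found` also writes the blobs to `.git/lost-found/other/`, which is handy when there are many of them to look through.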

stinos · on July 20, 2022

> But the old commit is still in there, pristine, forever.

> (Git will eventually throw away lost and unused snapshots, but typically not anything you have used in the last 90 days.)

Without having explained what a 'snapshot' is and what 'typically' means, it's pretty unclear whether the first statement is affected by the second, and if it is, the first is incorrect. So I feel this clarification should be added to the list of things you want people to know about git.

dale_glass · on July 20, 2022

> Without having exlplained what a 'snapshot'

A commit is effectively a snapshot of the repository at a given point in time. This is unlike some other revision control systems where a commit is a diff. In git, a commit contains whole files, and the diff is generated if needed.
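You can see this for yourself. A rough sketch in a scratch repo (file names invented): the second commit only changes `a.txt`, yet its tree still lists `b.txt`, pointing at the very same blob as before.

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

cd "$(mktemp -d)"
git init -q
echo a > a.txt && echo b > b.txt
git add a.txt b.txt && git commit -q -m "first"
echo a2 > a.txt && git commit -q -a -m "second"

git cat-file -p HEAD           # tree, parent, author, committer, message: no diff inside
git ls-tree HEAD               # the full snapshot, including the unchanged b.txt
files=$(git ls-tree --name-only HEAD)

git diff --stat HEAD~1 HEAD    # the diff is computed on demand from two snapshots
```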

> and what 'typically' means

Pretty much everything is configurable, and you can invoke the cleanup process by hand if you really want to.

> it's pretty unclear whether the first statement is affected by the second, and if it were the first is incorrect

Yes, I think it's incorrect and shouldn't be relied on. Things floating around unconnected to a branch can be removed by git eventually. There's plenty time to fix any mistake, but it's not "forever".

pjerem · on July 20, 2022

Well, Part I has an entire chapter urging you to read Git From The Bottom Up [0] before continuing. So the author can expect that you know what a snapshot is.

[0] : https://jwiegley.github.io/git-from-the-bottom-up/

codingdave · on July 20, 2022

> It is really hard to lose stuff

If that stuff is committed.

Maybe I'm stating the obvious but conversations seem to revolve so much around finding stuff and merging stuff that my favorite use of git isn't talked about as much -- throwing stuff away. A commit is your save point - once made, you can basically code risk-free, knowing you can always go back to that save point. Try new things, thrash your code, don't stress even if you break everything - because you can always just revert back to that last commit.

So I'd say that it is really hard to lose the stuff you explicitly declared you wanted to keep. And easy to lose the rest.

lelandfe · on July 20, 2022

As a corollary: cheap branches are wonderful.

The workflow I teach the newbies at my job is to just do a quick `git checkout -b tempbranch` when trying out: complicated rebasing, cherry-picking commits, whatever.

Do your thing on that temporary branch. Did the thing work? Hell yeah. `git checkout - && git reset --hard tempbranch`. Presto magic, your old branch is now a perfect replica of tempbranch. It's as if you did the whole thing on your original branch to start.

Did it fail miserably? Oh well. `git checkout - && git branch -D tempbranch` (capital `-D`, since the branch is unmerged and plain `--delete` would refuse). It vanishes and no one has to know that the rebase fell apart. Your working branch is in pristine condition still.
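The happy path end to end, as a sketch in a scratch repo. An amend stands in for the "complicated rebase", and the commit messages are made up:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

cd "$(mktemp -d)"
git init -q
echo one > file.txt && git add file.txt && git commit -q -m "first commit"
echo two > file.txt && git commit -q -a -m "second commit"

git checkout -q -b tempbranch          # experiment on a throwaway copy
git commit -q --amend -m "second commit, reworded"   # stand-in for the risky operation

# It worked: go back and make the original branch match the experiment.
git checkout -q -
git reset -q --hard tempbranch
git branch -q -d tempbranch            # safe delete is enough now that it is merged

git log --format=%s
```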

stu2b50 · on July 20, 2022

Those are pretty much the same thing. Branches work for that purpose because commits are immutable - the branch just saves you from having to remember the hash. A tag would also work in this case.

lelandfe · on July 20, 2022

> just saves you from having to remember the hash

You're underselling it :) That is the entire point. Don't tell me you enjoy using reflog. This "just saves you" in the same way that a rebase "just saves you" from having to write out enormous cherry-pick statements.

While "the plumbing is the same" is a nice bit of trivia, the porcelain is all that we should care about when we're recommending behaviors.

lelandfe · on July 20, 2022

> A few things can be lost forever!

> The dangerous commands are git-reset and git-checkout

Don't forget about `git clean`, which will happily gobble up all your important ignored files (API keys, IDE configurations, etc)
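A dry run first is cheap insurance. A sketch in a scratch repo, where `.env` and `notes.txt` are made-up stand-ins for your real files:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

cd "$(mktemp -d)"
git init -q
printf '.env\n' > .gitignore
git add .gitignore && git commit -q -m "ignore .env"

echo "API_KEY=hunter2" > .env     # ignored, never committed
echo scratch > notes.txt          # untracked, not ignored

# -n only reports what would go; -x extends the sweep to ignored files.
preview=$(git clean -n -x)
echo "$preview"

# Without -x, ignored files survive the real thing.
git clean -q -f
ls -A
```

The `-x` (and `-X`) variants are the ones that eat ignored files, so those are the ones worth previewing with `-n` every single time.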

beeforpork · on July 20, 2022

That!

Also, 'git reset' is only dangerous in some modes, e.g., --hard. It's a user experience nightmare to have the same command be nice in one case and dangerous in another, depending on an option parameter. It is a very useful command, so it's likely that you accidentally lose your local changes the one day when you're a bit tired, maybe.

Also, 'git checkout' is used for many good things like, well, checkout. To abuse it also for reverting local changes is a user experience nightmare, too. It should have a different command for that functionality.

WorldMaker · on July 20, 2022

> It should have a different command for that functionality.

It does in today's git. Rather than using `git checkout` you can migrate to the two split commands `git switch` (switches branches, generally safe) and `git restore` (restores files, generally "dangerous"). `git restore` also has the benefit that some of the most dangerous commands now have similar names: restore, revert, and reset.

It will take a while for people's `git checkout` muscle memory to be replaced with `git switch` and `git restore` (and `git restore` is still marked as "experimental" despite being stable for quite a few git releases now, though that "experimental" suggestion seems to be mostly earmarked for people thinking to use `git restore` in automated scripts than it is for human users), but the command split is a very good thing for git user experience.

(I've mostly gotten used to `git switch` at least and have stopped using `git checkout` in my own work.)
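A rough side-by-side of old and new spellings, as a sketch in a scratch repo. It assumes a git recent enough to ship `switch` and `restore` (2.23 or later); the branch name `topic` is made up:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

cd "$(mktemp -d)"
git init -q
echo v1 > file.txt && git add file.txt && git commit -q -m "v1"
trunk=$(git symbolic-ref --short HEAD)

git switch -q -c topic             # was: git checkout -b topic

echo oops > file.txt
git add file.txt
git restore --staged file.txt      # unstage;            was: git reset -- file.txt
git restore file.txt               # discard the change; was: git checkout -- file.txt

git switch -q -                    # was: git checkout -
cat file.txt
```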

TacticalCoder · on July 20, 2022

But git clean refuses to run unless you pass `-f` (or `-n`/`-i`), unless you configured Git not to require that (`clean.requireForce`).

It even shows you which files are going to be deleted.

Now, on that topic of "important but (.git)ignored files", I don't know what the best practices are but I like to symlink my IDE config files / test API keys etc. that aren't to be committed to Git. So in case I'm not cautious I'm only deleting the symlink but not the file itself.

To me it's the best of all worlds: I don't need to take any special care to avoid backing up this info where it shouldn't be backed up, because I don't back up my Git repo with anything but Git (and they're .gitignore'd so I'm good); I don't risk losing them even if I engage in reckless behavior (like a git clean, or an "rm -rf ..." of the whole repo followed by a fresh git clone, you guys know what I mean); and I can centralize my secret / personal config files into a specific dir which I know is important and which I can securely back up (say, with encrypted backups).

If I delete the symlink (and git clean only deletes the symlink, as expected), I can just re-create it.

yogeshg · on July 20, 2022

["Git from the Bottom Up"](https://jwiegley.github.io/git-from-the-bottom-up/) changed my life. Someone recommended this to me, and I recommend this to anyone seeking more information about git. People often just want to know "which commands to run", which is a fair expectation, until they reach a threshold of what they can do without knowing the internals.

[When should someone read "Git from the Bottom Up"? What are your git recommendations?](https://forms.gle/aqFy52uvTp8CUvkF9)

emacs28 · on July 20, 2022

Below is a simple script I find useful: it commits the current changes, then automatically rebases and squashes them into the previous commit, preserving the previous commit message. It's a quick way to save current changes before switching branches, or to back up work before it's worth making a new separate commit:

  git add --all
  git commit --fixup=HEAD~1
  GIT_SEQUENCE_EDITOR="sed -i -re 's/^pick (.*)fixup! (.*)/fixup \1\2/'" git rebase -i --autosquash --autostash --keep-empty HEAD~2

rjmunro · on July 20, 2022

Isn't this the same as something like:

  git commit --all --amend

With maybe a --no-edit thrown in?

emacs28 · on July 20, 2022

You know what, you're right, the command

  git commit --all --amend --no-edit

seems to do essentially the exact same thing, thanks!

lelandfe · on July 20, 2022

HN italics markup is mangling your code – indent the code block to fix :)

    like so!

> save current changes before switching branches

Also: why this over a stash? Stashing fundamentally just creates a hidden little commit with all your unstaged work on it. It does have one advantage over your approach, which is easily applying the work to another branch.

emacs28 · on July 20, 2022

> indent the code block to fix :)

Thanks!

> Also: why this over a stash?

I just like this better because with stash you have to do one more command (stash pop) to get the changes back after switching back to the first branch, but this way the changes are already there when you switch back.

wdkrnls · on July 21, 2022

I wish that, instead of the API view of git that everyone seems to have run with, something akin to a plan9/fuse file server were the main interface everyone learned. I would imagine it would work like a souped-up version of RepoFS:

https://github.com/AUEB-BALab/RepoFS

There is a paper to go along with the package which showed some convincing examples that much of the git tooling could just go away in favor of already existing shell commands: e.g. no need for `git grep` when `grep` will do the job.

awaisraad · on July 20, 2022

A colleague accidentally force-pushed a stale branch to master, and a week's worth of work was potentially lost. That day git reflog saved our jobs, as we were contractors for a big Germany-based client.

papito · on July 20, 2022

Always disable force pushes to trunk (if that feature is available, as it is on GitHub).

LgWoodenBadger · on July 20, 2022

I hope you (they) are backing up the source code as well, which would have helped if reflog had not.

onpensionsterm · on July 20, 2022

>Some people automate this: they have a process that runs every few minutes and commits the current working tree to a special branch that they look at only in case of disaster.

This sounds nice. Does anyone have a concrete example?

I'd like to try with a cronjob but it would be useful to avoid having to create the job manually for each repo & I don't want a job repeatedly scanning all my disks. Is there anything like a global post-clone hook? Additionally, I'd be worried about race-conditions from saving work during the job.
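One possible shape for the per-repo part, as a sketch only. The `snapshot` function and the `wip-snapshots` branch name are made up for illustration. It builds the snapshot in a throwaway index, so your real index, HEAD and working tree are left alone. It does nothing about the race you mention: a file being written while the snapshot runs can be captured half-written.

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Hypothetical helper: commit the current working tree to refs/heads/wip-snapshots.
snapshot() {
    tmp=$(mktemp -d)
    # Start from a copy of the real index so the real one is never modified.
    cp "$(git rev-parse --git-dir)/index" "$tmp/index"
    GIT_INDEX_FILE="$tmp/index" git add -A          # respects .gitignore
    tree=$(GIT_INDEX_FILE="$tmp/index" git write-tree)
    rm -rf "$tmp"
    # Chain onto the previous snapshot, or onto HEAD for the first one.
    parent=$(git rev-parse -q --verify refs/heads/wip-snapshots || git rev-parse HEAD)
    commit=$(git commit-tree "$tree" -p "$parent" \
        -m "snapshot $(date -u +%Y-%m-%dT%H:%M:%SZ)")
    git update-ref refs/heads/wip-snapshots "$commit"
}

# Demo in a scratch repo.
cd "$(mktemp -d)"
git init -q
echo v1 > file.txt && git add file.txt && git commit -q -m "initial"
echo "work in progress" > file.txt
echo "brand new" > untracked.txt
head_before=$(git rev-parse HEAD)

snapshot
git show wip-snapshots:untracked.txt

# Hypothetical crontab entry (paths are placeholders):
# */5 * * * * cd /path/to/repo && /path/to/snapshot.sh
```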

jwilk · on July 20, 2022

Part I: https://news.ycombinator.com/item?id=31923429 (191 comments)

mg · on July 20, 2022

"It is really hard to lose stuff"

Indeed. This also means that garbage keeps piling up in git repos.

This is how I make sure I do lose stuff eventually:

https://github.com/no-gravity/git-gc-all-repos.sh

A script that goes through all my repos and performs garbage collection.

Arnavion · on July 20, 2022

I know your link says you don't want to do `gc --aggressive --prune=all`, but for anyone who does: a) that still leaves objects referenced by the reflog, so you may want to `reflog expire --expire=now --all` first if your reflog is full of rebases etc. that you don't need, and b) you may want to run gc twice because the second time compacts a tiny bit more than the first.
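A sketch of point (a) in a scratch repo. The `exists` helper is made up for the demo, and `--prune=now` is used here in place of the aggressive variant:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Hypothetical helper: report whether an object is still in the repository.
exists() { git cat-file -e "$1" 2>/dev/null && echo present || echo gone; }

cd "$(mktemp -d)"
git init -q
echo v1 > file.txt && git add file.txt && git commit -q -m "original"
old=$(git rev-parse HEAD)
git commit -q --amend -m "rewritten"    # $old is now reachable only via the reflog

git gc -q --prune=now
after_gc=$(exists "$old")               # the reflog still keeps it alive
echo "after gc alone: $after_gc"

git reflog expire --expire=now --all
git gc -q --prune=now
after_expire=$(exists "$old")           # nothing references it any more
echo "after reflog expire + gc: $after_expire"
```

This is the point of no return: once the reflog is expired and the objects pruned, the usual "it's still in there somewhere" safety net is gone.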

donatj · on July 20, 2022

Disk space is cheaper than lost data.

bloopernova · on July 20, 2022

True, but wouldn't the reduction in size make it a little quicker to transfer over slow networks?

Slade_ · on July 20, 2022

You might be interested in the `git maintenance` subcommand that was introduced in Git 2.31. It can do per-repository automatic background garbage collection on a configurable schedule, among other things.

joe8756438 · on July 20, 2022

Learning git is not easy.

There is one thing that will almost guarantee no work is ever lost: commit. If something is committed, even if it’s not pushed, it can be retrieved.

One other thing that has nothing to do with how it works, but makes all the difference in usability: write commit messages describing _why_.

To learn git is to use git, reading about it has not helped me.

shikoba · on July 20, 2022

> write commit messages describing _why_.

Could you provide a concrete example? I don't understand your point.

joe8756438 · on July 20, 2022

As in: "changed this property in the config _because_ it will extinguish all evil from planet earth" vs. "changed this property in the config".

I can see _what_ somebody did well enough from the code diff itself -- reading an uninterrupted stream of messages explaining _why_ in a git log is a wonderful experience.

shikoba · on July 20, 2022

Thank you, I think I get your idea.

raju · on July 20, 2022

> The dangerous commands are git-reset and git-checkout

Don't forget about `git-restore` which replaces `git reset -- filename`

WorldMaker · on July 20, 2022

It's also designed to replace `git checkout branch -- filename` and related complicated `git checkout` uses. (It was added alongside `git switch` and between `git switch` and `git restore` you can entirely avoid `git checkout`.)

wokwokwok · on July 20, 2022

Hrm… can you really recover branches and tags once you delete them?

I was under the impression commits were retained but removing labels (tags, branches, remotes) was unrecoverable and immediate.

shikoba · on July 20, 2022

You can't recover where your references were but all the objects (anything with a SHA1) are still there.

formerly_proven · on July 20, 2022

reflog tells you when you created/changed a label and what ID it had, so you can recover them.
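For instance, a sketch in a scratch repo (the branch name `doomed` is made up). HEAD's reflog still has the commit you were on before switching away, so the deleted branch can be recreated from it:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

cd "$(mktemp -d)"
git init -q
echo base > base.txt && git add base.txt && git commit -q -m "base"
trunk=$(git symbolic-ref --short HEAD)

git checkout -q -b doomed
echo x > x.txt && git add x.txt && git commit -q -m "work on doomed"
git checkout -q "$trunk"
git branch -q -D doomed              # the label is gone, the commit is not

git reflog                           # look for "checkout: moving from doomed to ..."
lost=$(git rev-parse 'HEAD@{1}')     # the entry just before that checkout
git branch doomed "$lost"            # put the label back

git log -1 --format=%s doomed
```

In real life you would read the reflog output and pick the right entry by hand rather than trusting `HEAD@{1}`.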

wdkrnls · on July 21, 2022

Has anyone written a tool that actually does this? I want to add that to my post-rewrite hook immediately, so that I can gain peace of mind that I will always be able to switch to a branch where I can study the previous reality after rebasing a new one.

arxanas · on July 21, 2022

My project https://github.com/arxanas/git-branchless does this. Use `git undo -i` to get a graphical view of where branches were at an arbitrary previous point in time.

It's worth noting that, in principle, there are cases where the reflog will have never observed a branch move. The reflog we usually look at is that of HEAD; if HEAD did not have the branch checked out when it moved, then that information will be lost. (For example, if you run `git branch -f my-branch abc123`, the branch will be forcibly reassigned without having been checked out.) See https://github.com/arxanas/git-branchless/wiki/Architecture#... for more details.

pif · on July 20, 2022

From Part I:

> "I don't need to know how it works. I just want to know which commands to run."

> with Git, this does not work.*

Well, so what is your post about? Make it work, then call back.

sdrwefgfvb · on July 20, 2022

Unrelated to git, but I LOVE how fast and user-friendly this page is.