
This best practice of committing often never made much sense to me. I find it typical that we as developers come up with such practices that start to control how we work instead of us focusing on getting work done. This practice makes even less sense when we make PRs that once merged are squashed into one commit?!

I try to keep my commit history nice, but I don't think the way people obsess about it is productive.




I work in a way similar to the author, treating my micro commits like a videogame quick save. But many videogames separate "quick save" and "real save" in the UI and so do I.

I have a local branch, usually just called "wip" for work-in-progress and commit to it frequently. Then when I want to push to origin I pull all the contents from that branch into my "real branch" and commit it as one commit. Then delete the wip branch and repeat.

If I'm working on multiple tasks at once, as my distracted brain loves to do, I'll have several wip branches: wip-new-hook-events, wip-update-sdk-version, wip-refactor-button-state. Sometimes the branches conflict with each other, but dealing with that is worth being able to drop a task for a few days while I'm stumped or blocked. I also don't tend to micro commit when tests pass, just whenever I feel a smaller idea is either done, or I don't know how to progress yet.


I find it easier to use rebase for this than having to manage branches.


Rebase works if you're comfortable with it, but it would make working on multiple tasks at the same time a bit harder, wouldn't it? I'll be honest, I'm not great with rebase.

If you do 'git checkout wip .' it pulls in all the changes committed to your wip branch into the staging area of your current branch. Then you can just 'git commit -m "My meaningful commit message."'
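For anyone who wants to try this, here is a minimal sketch of the wip-branch workflow in a throwaway repository. The branch names `main` and `wip`, the file `app.txt`, and the identity settings are assumptions for illustration only. One caveat, as far as I understand the default behavior: `git checkout wip .` overlays files from `wip`, so a file that was deleted on `wip` would not be removed on the real branch.

```shell
#!/bin/sh
# Sketch of "quick save" vs "real save" in a scratch repo.
# All names here (main, wip, app.txt) are made up for the example.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Quick saves: commit freely on a local wip branch.
git checkout -q -b wip
echo "step 1" >> app.txt
git commit -q -am "wip"
echo "step 2" >> app.txt
git commit -q -am "wip"

# Real save: back on the real branch, copy the wip files into the
# index and working tree, then record them as one commit.
git checkout -q main
git checkout wip .
git commit -q -m "My meaningful commit message."

# git does not consider wip merged, so a forced delete (-D) is needed.
git branch -q -D wip

git log --oneline main
```

The real branch ends up with the initial commit plus one commit that carries both quick saves.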

That's the cool thing about git though, it has a lot of compatible workflows so you can do what's most comfortable personally.


I consider myself a git noob, but I do find myself using rebase several times weekly. It seems to get easier the more I use it, along with all other git tactics.

Why bundle all changes into a single commit? How can the commit message be meaningful at that point? Why not just submit multiple atomic commits?


Sometimes I do that if it makes sense, but usually in that case those would be new wip branches with my workflow. If I'm at the point where my changes make for a good single atomic commit, they're ready to be committed to the main branch and the existing wip branch deleted.

If I'm working on really small tasks where I know I'll be done quickly and won't need to look back at my earlier changes, I don't bother making a wip branch.


Any chance you still have the link to the micro-commits post you mentioned below? Even better, do you have a post on your blog describing your workflow? I'd love to read a concrete example and apply it in my daily work.


Oh man, I'll have to try that workflow! You describe exactly how my thought process works as well, but I never thought of doing that.


I picked up the process when I first read a blog post about micro commits and wanted to try it but my company standards didn't allow tons of commits to clog the history like that. It feels like the best of both worlds to me.

I'll also add that sometimes I do break them into a few commits on "real branch" if it makes more sense to do so, and I use a GUI tool for merging if I need to solve conflicts. I also don't really worry about commit messages being useful in the wip branches. Oftentimes the commit message is just "wip".


Luckily my company doesn't have much prescription about commit process, so it's up to each team to determine what works for them and we've been pushing for small atomic commits (probably what you'd term micro-commits).

The idea is that each commit should be small enough to fit in one "page" of PR review, with source lines and test line diffs <150 each, and everything should be one integrated and independent change with unit tests all passing and deployable to prod, etc. Honestly, I can't see a downside to this, other than maybe taking a while to internalize this process and to properly decompose commits.

Have you considered advocating for a change of company standards? Sometimes that's a Sisyphean task, but if the only argument against it is "that's how it's always been done" it might be a worthwhile endeavor.


I think the difference in my case is that an atomic commit works in isolation and passes tests. The micro commits don't always do that. That's where I differ from the article author in my workflow.

I have a "micro commit" sitting in a branch right now that's just a single line change, for example. That's not super common, but I don't have any strict rules for myself about when to or not to quicksave.

> Have you considered advocating for a change of company standards?

Oh trust me I did. I'm at a better job now :p My current company's git policy sounds identical to yours. But I still like this workflow for myself.


I've always disliked it when code review tools are prescriptive about my commit history (e.g. preferring that each successive patch to a given CR still be a single commit rather than a separate commit per patch). I ended up writing a shell function that wraps whatever command I need to use to submit a CR: it first swaps to a branch called `xxx-temp-xxx`, then does an interactive rebase (in which I squash all the commits I've made into one), then invokes the CR command, and finally swaps back and deletes the `xxx-temp-xxx` branch. This was made easier when I found out about `git checkout -`, which switches back to whatever branch you just swapped from (like `cd -`).
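A rough sketch of what such a wrapper could look like, in case it helps someone. This is a guess at the shape, not the actual function: the name `submit_squashed`, the argument order, and the explicit base-branch parameter are all assumptions, and the CR command is whatever you pass after the base branch.

```shell
# Hypothetical wrapper; name, arguments and base-branch parameter are
# assumptions made for this example.
# Usage: submit_squashed <base-branch> <cr-command> [args...]
submit_squashed() {
    squash_base=$1
    shift

    # Work on a throwaway copy of the current branch.
    git checkout -q -b xxx-temp-xxx || return 1

    # In the editor, mark every commit after the first as "squash"
    # or "fixup" so that a single commit remains.
    status=0
    if git rebase -i "$squash_base"; then
        "$@" || status=$?          # submit the squashed commit
    else
        git rebase --abort 2>/dev/null
        status=1
    fi

    # Go back to the branch we came from and drop the temporary one.
    git checkout -q - && git branch -q -D xxx-temp-xxx
    return "$status"
}
```

The original branch keeps its full history; only the temporary branch is ever squashed.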


I think the focus should be on atomic, independently reviewable commits, not time. Sometimes, those commits are obvious and sometimes it takes a while to figure it out. I don't think you should feel compelled to commit on a rigid timeline (though if you go too long, that probably indicates an issue with scope, processes, etc).


It's far easier and less time-consuming to squash commits than it is to tease out logically distinct changes from a big mess into separate commits. The latter is particularly troublesome when you require preserving bisectability, i.e. every final commit builds and functions. Being able to bisect saves tons of time chasing bugs.
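To make the bisect point concrete, here is a small sketch in a scratch repository. The five commits, the file `app.txt`, and the `grep` check standing in for a real test suite are all invented for the example; the only requirement it illustrates is that every commit can be checked on its own.

```shell
#!/bin/sh
# Sketch: let git bisect find the commit that introduced a "bug".
# The repo contents and the grep-based check are illustrative only.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Five small commits; the third one introduces the "bug".
for n in 1 2 3 4 5; do
    if [ "$n" -eq 3 ]; then
        echo "bug" >> app.txt
    else
        echo "change $n" >> app.txt
    fi
    git add app.txt
    git commit -q -m "Commit $n"
done

good=$(git rev-list --max-parents=0 HEAD)   # the very first commit

# HEAD is known bad, the first commit is known good.
git bisect start HEAD "$good" > /dev/null

# The command must exit 0 for a good commit and non-zero for a bad one.
git bisect run sh -c '! grep -q bug app.txt' > /dev/null

first_bad=$(git rev-parse refs/bisect/bad)
git bisect reset > /dev/null 2>&1

git log -1 --format=%s "$first_bad"
```

If some intermediate commits did not build, the check would fail for unrelated reasons and the search would point at the wrong commit, which is why bisectability is worth protecting.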

This is one of those "ounce of prevention vs. a pound of cure" situations.


For the developer working on the change, or for the reviewer + everyone that comes after them looking at your commit (which can include your future self)?

We tend to discount everyone else's time but our own.

Might also be worth thinking about long term asymptotic effort vs marginal effort at a particular point in time--typically, the more often you do something (decomposing a commit into multiple independent parts, in this instance), the easier and less time-consuming it becomes.


> For the developer working on the change, or for the reviewer + everyone that comes after them looking at your commit (which can include your future self)?

Both. I don't really follow what you're getting at, I think we're in agreement? confused


Yes we are; I thought you were advocating for larger commits since they're too difficult to break apart into logically independent pieces, but it seems I just misunderstood what you wrote the first time I read it.


To me, the ability to bisect is a good reason not to squash (I assume you mean squash merge?).

My commits all stand alone; each is a distinct change that pretty much always is in a working state. Often it will be a series of refactors before the real change. Ironically, though it makes bisect easier, I find I rarely need to use bisect: just looking at commit messages (via either log or blame) is often enough to isolate a problem.

I accomplish this using interactive staging, and within my branch, and before I push, using any combination of rebase, squash, amend and fixup. I also prefix commits with the ticket to make referencing way easier.
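As one possible illustration of the fixup part, here is a sketch that folds a late correction back into the commit it belongs to before anything is pushed. The `TICKET-1` prefix, the branch names, and the files are hypothetical. Interactive staging (`git add -p`) is left out because it needs a terminal, and `GIT_SEQUENCE_EDITOR=true` is only there to accept the generated todo list without opening an editor.

```shell
#!/bin/sh
# Sketch: tidy a local branch with --fixup and --autosquash.
# Ticket id, branch names and file names are made up.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git checkout -q -b feature

# First a refactor, then the real change, each as its own commit.
echo "refactor" > helper.txt
git add helper.txt
git commit -q -m "TICKET-1: extract helper"
refactor_commit=$(git rev-parse HEAD)

echo "feature" >> app.txt
git commit -q -am "TICKET-1: add feature"

# A later correction that logically belongs to the refactor commit.
echo "refactor, fixed" > helper.txt
git add helper.txt
git commit -q --fixup "$refactor_commit"

# Reorder and squash the fixup into its target; "true" accepts the
# todo list exactly as git generated it.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

git log --oneline main..feature
```

After the rebase only the two ticket-prefixed commits remain on the branch, and the refactor commit already contains the correction.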

My failed experiments (a change that later is undone) don't usually make it to anyone else, unless it's something modified during a PR review. On a small handful of occasions I've changed something radically and will delete the remote branch, modify and interactive rebase my local one and push it again (or push as a new branch). For a feature I work on for a couple days, I'll commit a couple dozen times or more, rebase/squash to maybe 5-10 commits anyone else sees, and push to origin maybe 2-5 times.

When merging the branch (PR), I use merge --no-ff. This preserves commits and makes it easier to see later during a blame. And since the commit history is clean, it's not "noisy".
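A minimal sketch of what `--no-ff` does, using a scratch repository with invented branch and file names. Without the flag this merge would be a fast-forward and leave no merge commit at all.

```shell
#!/bin/sh
# Sketch: merge a feature branch with --no-ff so its commits stay
# grouped under a merge commit. Names are illustrative only.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "refactor" >> app.txt
git commit -q -am "Refactor before the real change"
echo "feature" >> app.txt
git commit -q -am "Add the feature"

# main has not moved, so a plain merge would fast-forward.
# --no-ff forces a merge commit that records the branch boundary.
git checkout -q main
git merge -q --no-ff -m "Merge branch 'feature'" feature

git log --graph --oneline
```

The graph shows both feature commits hanging off a single merge commit, which is what makes the grouping visible later in log and blame.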

One of the massive benefits of this is the ability to take a big PR and split it up for easier review: I can branch off halfway, make a PR there with just the refactors, merge it, then the actual change merges cleanly later, and preserves the parent commit relations, so you can visually see what happened. This also works great when you fix a bug on the way to adding a new feature, or want to ship the early part of a feature early before everything is done (but only decide that after work has already started on stuff you're not ready to ship).


Two concepts are bound up together in a commit. "Publish" and "Save".

In general our OSes aren't storing a complete history of a file, so we rely on git "quick saves" (as someone else put it).

The git log is as much an artifact of the development process as the ticket history or the code itself. You don't want to publish your quick saves for someone else to have to dig through in two years time - give them something a bit more polished than that.


That distinction should be part of Git. For now, we may rely on wip branches for "save" and merging them into master for "publish".


I thought that too, but in writing that comment I realised "save" should actually be part of the filesystem.


I find that having one commit per logically-grouped set of changes really helps me to focus on those changes and not get distracted. Before I got into that habit, for years I would try to cram so many features in at once, and would have a really hard time keeping track of each of their progress and what's left to be done for each, because I tried to do them simultaneously. No more.


> This practice makes even less sense when we make PRs that once merged are squashed into one commit

Up until the point of merging, other developers reviewing the PR can see your development path, and optionally examine individual commits. In GitHub at least, these individual commits usually make their way into the squashed commit's comments as a bulleted list of changes (after some cleanup), which encourages better final commit messages in general.


Rarely do I see people examine individual commits, perhaps because it is already established that PRs should be small and branches short-lived.

The point about the bulleted list of changes is a good one.


I examine commits regularly, sometimes going way back in time. Maybe not daily, but pretty frequently. Usually, I'm trying to figure out why something is the way it is (so I know if it can be changed, removed, or whatever), and a clean history makes this exponentially easier.


Our team has been pushing for and implementing small, atomic commits on mainline. Honestly, I'll keep fighting for it and never go back. PRs are easy to review, commits are easy to reason about, what's not to love?

Why squash commits to begin with? Unless you're getting too many commits per package per time period, in which case maybe that's an indication that you should modularize your codebase.


I always find practices about committing while working strange.

I usually have my changes planned out before I start writing code, including which commits I want to make. Figuring out how to go about making a change is a design time problem, rather than implementation time imo


I usually commit early and often locally specifically for the ability to bisect during feature development. Then the squash and merge is for the "public" commit history


I have a guy on my team that commits basically every 5 or so lines. He’ll have a 30 line bug fix spread across like 20 commits because he never goes back and cleans up those small commits that get changed afterwards, not to mention he’ll merge the development branch into it numerous times and never rebase. It’s absolutely infuriating reading through git logs that involve his stuff.

His commit messages are equally maddening - “change this variable to 5”, “fix broken spec”, “change variable to 7” ad nauseam…


This is kind of like me except I list the component, function or class I did the change in. We do a squash on each branch before merging anyway


You're lucky that you have him.


Not really, because none of his commits have any context on what any of his changes are actually for. He never describes why he does anything, just that he is changing something (which is obvious from the code).

It's the git equivalent of "useless comments" like `int x = 10; // set x to 10`


In my opinion, commits aren't a record of why. They are a record of what. Commit messages are a shorthand of what, for easier skimming.

Comments answer why (and even 'what' is fine if the code is difficult.)

But simple rules like that always have exceptions.


Assuming you squash the PR and view it holistically, what's the issue?


PRs don't get squashed, nothing gets rebased and only merge commits are allowed for anything non-local. It adds noise to trying to figure out why changes are made because the commit messages only explain what was changed, not the purpose or context.


If you're not squashing your PRs I can see the problem, but I also see the other problem.


Do you think squashing the PR is the superior choice? Why?


Because then you don't see all those minor little changes and can view the PR holistically. I don't typically find that viewing the changes in chronological order buys much; it's not the order I care about, it's the final changes themselves that have an impact on the system, especially when those changes could be transient (changing something in commit #1, changing it back in commit #3). There's a larger load to comprehension with a squashed commit, but it shouldn't be THAT large, and it removes all the temporal noise.

It also makes tracing the historical "why" of a change easier to understand. If you don't squash it's a random change in the log, if you do squash then it's "adding feature x".



