
You're lying about the path that you took to get to that point, and you're creating a lot of (public) nonsense commits on the way there (unless you're very careful, and/or overly squash-happy).

Whether you've published the true history earlier is irrelevant to that discussion.




Let me try again. I'm advocating that you use rebase to improve the quality of your changes that will be reviewed before merging or even before being reviewed at all.

If I make three commits and then realize that I should have included something in the first commit, I use rebase to create a new sequence of three commits that has the corrected version of the first commit. I haven't shared those commits with anyone, this is just work that I've done locally.

Are you seriously advocating that creating a pull request with: (A, B, C, A-fixup) is better than using rebase and then creating a pull request with: (better-A, B, C)?

You think that second case is "lying" because I didn't show the intermediate step that included the mistake?
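
For concreteness, the step from (A, B, C, A-fixup) to (better-A, B, C) can be modelled in a few lines. This is only a toy sketch, not git's implementation: commits are hypothetical (subject, changes) pairs, and `fold_fixups` is a made-up helper that loosely imitates what `git commit --fixup` followed by `git rebase -i --autosquash` does, matching each `fixup! <subject>` commit to the earlier commit with that subject. Conflicts are ignored entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical model: a commit is (subject, {path: new content}).
Commit = Tuple[str, Dict[str, str]]

FIXUP_PREFIX = "fixup! "


def fold_fixups(commits: List[Commit]) -> List[Commit]:
    """Fold each 'fixup! X' commit into the earlier commit whose subject is X."""
    result: List[Commit] = []
    for subject, changes in commits:
        if subject.startswith(FIXUP_PREFIX):
            target = subject[len(FIXUP_PREFIX):]
            for i, (existing_subject, existing_changes) in enumerate(result):
                if existing_subject == target:
                    # The fixup's content wins for any path both commits touch.
                    result[i] = (existing_subject, {**existing_changes, **changes})
                    break
            else:
                # No matching target: keep the fixup commit where it is.
                result.append((subject, changes))
        else:
            result.append((subject, changes))
    return result


history = [
    ("A", {"parser.py": "v1"}),
    ("B", {"lexer.py": "v1"}),
    ("C", {"cli.py": "v1"}),
    ("fixup! A", {"parser.py": "v1 + missing check"}),
]
cleaned = fold_fixups(history)
```

Here `cleaned` holds three commits, with the correction folded into A. Note that in real git the rewritten B and C are new commits as well, which is the point the other side of this thread keeps raising.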


Yes, it is lying.

You can mitigate most of the damage if it is convincing enough (for example, go through B' and C' and make sure everything still makes sense at each point), but realistically nobody is going to do that, because it's a pretty inefficient way to spend your time. And even then, you're still removing context (unless you're just fixing a typo).

> I'm advocating that you use rebase to improve the quality of your changes that will be reviewed before merging or even before being reviewed at all.

That was clear from the start. But the fact that X breaks Y doesn't imply that Y is a good idea when X doesn't apply.


Throwing the word "lying" into an argument like this counts as name-calling and flamebait in the sense that the site guidelines use these terms. It leads to distracting, shallow, and therefore more boring conversation. Would you mind reviewing the rules and please not do that? Let's stay focused on exchanging what we're curious about.

https://news.ycombinator.com/newsguidelines.html


I really don't understand why you are choosing to use the word "lying".

Us mere humans make mistakes all the time. Typos, omissions, false starts, and so on. What is the value of throwing that raw set of events at a reviewer or complicating the understanding of the changes when viewed in retrospect from the future? What is the reason you call curating the work into a more polished form "lying"? Why do you think the time spent being intentional about changes isn't valuable when compared to the time spent by a reviewer (or your future self) to sort through the flotsam and jetsam of your intermediate work?


But reviewers in most Git workflows mainly look at PRs. Then if as a reviewer you want to see how the sausage was made, you can zoom in on the commits, including all the messy reality of how the work was done. In some circumstances you might of course want to hide this, but in an open and safe collegial environment this lets the reviewer understand your thought and work process.


The reviewer also won't see all the things the author tried without ever committing these states. They don't need to see all the messy steps, or they would have needed to look over the author's shoulder all the time.

I'd rather review the final patch series with changes in logical order, not necessarily in the order the code was written, and without intermediate work that was later reverted or changed again. I do look at commits, because every commit message counts too and is supposed to explain the individual change.


Sure, it's not a perfect record. But as in most things, perfect is the enemy of good.

The messy reality is valuable when talking with your teammates about how the work was done and what kind of bumps were along the way. It's not about looking over their shoulders, it's about using data to develop together as a team, eliminating hindrances, etc., if you have the mutual trust to do that. And of course you yourself can go back and look for patterns of mistakes or problematic areas in code based on your history.

Like I said, to judge the change itself, the whole PR diff is usually the most useful unit of inspection when you just want to see what happens. And if it's a big PR, you can of course always merge child PRs or branches against the big PR/branch, and look at the merge diffs.


Another formulation of the "learning as a team" idea:

The science principle of publishing your experiments, including failed ones, has the same benefits in sw engineering: others can build on your failed attempts, or save time by not replicating them.


I'm using it to differentiate between summarizing (removing steps between A and B) and modifying (introducing new steps, reordering them, or editing them).

You can do it in a way that isn't harmful (as I mentioned earlier), but good luck getting a team to actually stick to that. It also doesn't help that pretty much no tooling encourages doing it properly.


Are you lying to your co-workers when you draft an E-Mail to them, read it over, and decide to delete a paragraph or write it again from scratch? If your E-Mail client automatically saves drafts that's basically the equivalent of "rebase".

I made a typo when writing this reply, and pressed backspace to correct it. Is use of the backspace key lying?

I think you're placing a value on "history" that doesn't map onto all users of "rebase", or E-Mail client drafts. A lot of advanced users use it as the equivalent of "save" in an editor; sharing all those intermediate states is more noise than value vs. crafting a sensible patch once you figure out what you want and what change to make.


No, if you only merge changes then you're just summarizing.

The true problems begin once you start creating commits that represent repository trees that you never tested or reviewed, for example by editing past commits (invalidating any testing you've done of commits after that point), deleting past commits (aside from squashing an unbroken sequence of commits, or deleting them if the squash would result in a no-op), reordering commits, or rebasing commits.


You're assuming that commits are tested before they're made, and that rebase invalidates this. I don't test most of my commits, just like I don't proofread an E-Mail after every word I've written. I do that later.

But yeah, the history you push to a canonical branch should generally be made up of commits that have all been tested in isolation. The rebase command doesn't make this worse, but better, e.g. with "rebase -i --exec='make test'".
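
What `--exec` buys you can be sketched roughly as follows. This is a toy model under stated assumptions, not git itself: each commit is represented as a hypothetical dict of file contents, `run_tests` stands in for `make test`, and `check_each_commit` is an invented name. The only point is that the check runs once per commit in the series and stops at the first failure, as `git rebase --exec` does when the command exits non-zero.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical model: path -> file content at one commit.
Snapshot = Dict[str, str]


def check_each_commit(series: List[Snapshot],
                      run_tests: Callable[[Snapshot], bool]) -> Optional[int]:
    """Run the tests on every commit in order.

    Returns the index of the first failing commit, or None if all pass.
    """
    for index, snapshot in enumerate(series):
        if not run_tests(snapshot):
            return index  # a real rebase would pause here so you can fix it
    return None


def run_tests(snapshot: Snapshot) -> bool:
    # Stand-in for 'make test': a tree counts as broken if any file has a TODO.
    return all("TODO" not in content for content in snapshot.values())


series = [
    {"app.py": "print('hi')"},
    {"app.py": "print('hi')", "util.py": "TODO"},           # broken intermediate
    {"app.py": "print('hi')", "util.py": "def f(): pass"},  # tip is fine again
]
first_bad = check_each_commit(series, run_tests)
```

Testing only the tip would pass here; checking every commit flags the broken one in the middle.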

I also prune out history of some false steps taken. Have you never written a program and done something like "I'll use a hash here <save><compile><test>, no actually a list makes more sense <save><compile><test> ...". Those intermediate steps are commits for a lot of advanced git users.

Sharing all your mistakes-as-you-go-along with the world doesn't help anyone, I'd typically be sending you a 100 patch merge request for some rather trivial change instead of 1-3 sensible commits.


That kind of rebase is just to refresh your work against updated masters etc. Of course you have to test against the refreshed (rebased) work again!! That doesn't mean you can't rebase. It means you can't randomly push untested work. If you know what rebase means you will understand that there are very likely new interactions with your code and you have to test your updated changeset. Exactly the same as if you merge. You have to retest the resulting tree.


A merge of a branch with N unique commits creates one new, yet-to-be-tested commit/tree. A rebase creates N. I doubt that it's common that people replay all the new history after a rebase and test each new commit/tree.
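
The counting argument can be made concrete with a toy model. Assumptions: a tree is just a frozenset of change labels, and the helper names below are invented for illustration; real git trees and conflict resolution are far richer.

```python
from typing import FrozenSet, List, Set

# Hypothetical model: a tree is the set of changes it contains.
Tree = FrozenSet[str]


def trees_along(base: Tree, changes: List[str]) -> List[Tree]:
    """Trees produced by applying changes one commit at a time on top of base."""
    trees, current = [], set(base)
    for change in changes:
        current.add(change)
        trees.append(frozenset(current))
    return trees


def new_trees_after_merge(base: Tree, mine: List[str], theirs: List[str]) -> Set[Tree]:
    existing = set(trees_along(base, mine)) | set(trees_along(base, theirs))
    merged = frozenset(base | set(mine) | set(theirs))
    return {merged} - existing


def new_trees_after_rebase(base: Tree, mine: List[str], theirs: List[str]) -> Set[Tree]:
    existing = set(trees_along(base, mine)) | set(trees_along(base, theirs))
    upstream_tip = trees_along(base, theirs)[-1]
    return set(trees_along(upstream_tip, mine)) - existing


base = frozenset({"A"})
mine, theirs = ["B", "C"], ["D", "E"]
merge_count = len(new_trees_after_merge(base, mine, theirs))
rebase_count = len(new_trees_after_rebase(base, mine, theirs))
```

With two local commits, the merge produces one tree that nobody has seen before and the rebase produces two. In this model the final tree is identical either way; the difference is the intermediate trees that the rebase adds.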


Maybe "lying" isn't the best term to use here, because it seems like you're using it to mean "not providing all information in a way that is morally bad." Of course you're not providing literally all information. Heck, I conceal a lot of information about my development process by testing and changing code before I even make a commit. But I think that's preferable to, for instance, providing a video screen capture of my entire development process for review.


Yea. Moreover, I would often ask my coworkers to rebase their code if there are commits like “oops, missed a comma” because it distracts from the main point when you read the commit history.


> realistically nobody is going to do that, because it's a pretty inefficient way to spend your time.

Of course you do! And it's not an inefficient use of your time, because it helps reviewers now, and yourself when you're bisecting later.

> And even then, you're still removing context (unless you're just fixing a typo).

You place that context in the commit message.


IME this is a very rare level of sophistication in use of rebase.

And how do you detect that you forgot or were too busy to do it, when you go back 6 months later? It's fragile, "fail-open".


You would be surprised. For example, this is a guide my colleague wrote to describe his git workflow:

https://github.com/tianocore/tianocore.github.io/wiki/Laszlo...


I disagree.

Say upstream is at A.

I clone it in my local work-space, and make a few commits over the course of a few days. So my local is A B C

During this time other changes have been merged into upstream, so upstream looks like A D E

I now have two options. I can try to merge from upstream or rebase off of upstream. Merging introduces a messy commit history that quickly becomes difficult to follow. Rebasing removes my local commits, applies the changes in upstream, and then re-applies my local commits.

So after rebasing my local is A D E B C. There are no messy merge commits. And ideally, I can squash my local changes into a single feature commit, so upstream ends up incredibly tidy.

At no place in this process is there any dishonesty or lying. I haven't changed the history upstream, which is the source of truth. What's the issue here?


> So after rebasing my local is A D E B C.

No, your local is now A D E B' C'. Commits aren't just a diff between two tree snapshots, they are tree snapshots.
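
A rough sketch of why B becomes B': in git, a commit's ID is a hash over its tree, its parent IDs, and metadata, so replaying the same change on a new parent yields a different commit with a different tree. The model below is a deliberate simplification (a hypothetical `commit_id`, trees as dicts, SHA-1 over a JSON dump rather than git's real object format).

```python
import hashlib
import json
from typing import Dict, Optional, Tuple

# Hypothetical model: path -> content.
Tree = Dict[str, str]


def commit_id(tree: Tree, parent: Optional[str]) -> str:
    """Toy commit ID: hash of the full snapshot plus the parent's ID."""
    payload = json.dumps({"tree": tree, "parent": parent}, sort_keys=True)
    return hashlib.sha1(payload.encode()).hexdigest()[:8]


def commit(parent_tree: Tree, parent: Optional[str], change: Tree) -> Tuple[Tree, str]:
    tree = {**parent_tree, **change}  # a commit records the whole tree
    return tree, commit_id(tree, parent)


tree_a, a = commit({}, None, {"core.py": "v1"})

# Local work on top of A.
tree_b, b = commit(tree_a, a, {"feature.py": "v1"})

# Upstream moved on: A -> D -> E.
tree_d, d = commit(tree_a, a, {"core.py": "v2"})
tree_e, e = commit(tree_d, d, {"docs.md": "v1"})

# Rebase: replay the same change on top of E, giving B'.
tree_b2, b2 = commit(tree_e, e, {"feature.py": "v1"})
```

The change is the same, but B' is a snapshot combining `feature.py` with `core.py` v2, a combination that did not exist on the author's machine before the rebase.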

Hopefully you test and sanity check C' before submitting for review, but it's very unlikely that you're going to give B' the same treatment, making it more difficult for people to understand the history in the future (as well as breaking `git bisect`).

And even if you do, are your coworkers going to? Consistently? No CI tool that I'm aware of will enforce this for you.

> There are no messy merge commits.

No, but the underlying messy workflow is still there. You've just swept it under the rug for the sake of aesthetics, at the cost of future comprehension.

> At no place in this process is there any dishonesty or lying. I haven't changed the history upstream, which is the source of truth. What's the issue here?

Those are completely orthogonal concerns. You're presenting a false version of the repository state.

The common mantra of "don't rewrite public history" is about not creating a mess of duplicate commits, it doesn't imply that rewriting history is fine as long as it's not public.


But you lie all the time, by that definition! If I write code, make a mistake and press Ctrl+Z before committing that code, I've just "rewritten" my history without my team mates being able to tell.

Your commit history is just a somewhat arbitrary recording of your code at certain points in time that you choose. Rebasing simply makes that less arbitrary, allowing you to document the way your code is built up in a structured way. Rather than having to decide on the spot whether a certain combination of code is a good candidate for a single, atomic commit, you can make that judgment with the benefit of hindsight.


How do you feel about deleting commits to avoid reverting them on such personal branches? Also, what about doing it to fix commit messages (maybe because they were accidentally written in a language that was not agreed on for the project)? What about splitting a commit with a generic "lots of semi-related things" commit message into multiple, more focused commits?


do you want to know all my wrong tries to make a thing work? why are you sure that all the commits i did are not nonsense? i look at the history as a way to 1) divide my work into reusable pieces of changes 2) document my changes to read for other developers.


> do you want to know all my wrong tries to make a thing work?

Yes. A failed attempt is still a useful signal that people shouldn't try to simplify back to that way in the future (and why not). It's also a useful starting point in case the reasons it failed no longer apply.


It's very rare that "failed attempts" are a useful signal. When it is the case, it's better to document it (eg. as part of the commit message, PR, or the dev documentation itself).

Commit histories littered with commits that get back-and-forth reverted are frickin unreadable though. Extremely annoying to bisect, painful to comb through when looking for changes, noisy in git blame, etc. There's a ton of downsides for what in practice is very rarely even an upside.
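
The bisect complaint can be illustrated with a toy binary search. This is a sketch, not `git bisect` itself: `first_bad` is a hypothetical helper and the history is a plain list of good/bad flags. Bisection assumes the history flips from good to bad exactly once; a back-and-forth revert violates that assumption.

```python
from typing import List


def first_bad(is_bad: List[bool]) -> int:
    """Binary search for the first bad commit.

    Assumes is_bad[0] is False, is_bad[-1] is True, and that the history
    flips from good to bad only once.
    """
    good, bad = 0, len(is_bad) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad[mid]:
            bad = mid
        else:
            good = mid
    return bad


# Clean history: the bug is introduced once, at index 3.
clean = [False, False, False, True, True, True, True, True]

# Noisy history: bug introduced at 2, reverted at 4, reintroduced at 6.
noisy = [False, False, True, True, False, False, True, True]

clean_result = first_bad(clean)
noisy_result = first_bad(noisy)
```

On the clean history the search lands on index 3. On the noisy one it lands on index 2, the first introduction, and never examines the reintroduction at index 6, which is the one still live at the tip.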


I agree with you, yet I don't understand why you were downvoted. I think git-rebase proponents haven't really ever worked in a professional environment, particularly with several co-workers. As a project manager I would not trust a "rebaser", and I would question the time he spent rewriting the git history. Granted, merge and rebase need the same amount of reading the code and merging the differences. However, with git-rebase there is the added cost of beautifying the history. That means at least two drawbacks: one is the cost, the other is more about memory. About cost: what is the point of rewriting the history when you have a (great) tool to janitor it? Then the git history automatically reflects the project history. About memory: if the git history is rewritten, how would people remember the order of commits in case bugs occur? When did the bug happen? Who should correct it? IMHO those questions are more fundamental than a straight line of bullets in gitk.


I can't help but feel like you don't understand a workflow that utilizes rebase in a responsible way. In particular you seem to think that the rebase is going to affect the history of a released version of your software ("When did the bug happen?").

Nobody is suggesting that rebase be used to change the history of a released or published branch (master, develop etc.). If that is your concern and the reason for you not trusting a "rebaser" then you are simply mistaken, you are arguing against an imaginary workflow for which no one is advocating.

Rebase should be used only to curate the commits on a feature branch and to keep the feature branch synchronized with the upstream branch.


I have a very simple solution to this: your feature is at most 1 or 2 commits (after squash) and they can be ff-merged when your branch is done. Or it’s too big.

The exception to this is when a feature becomes more involved and has several logical steps, or any kind of history worth providing. This should be rare and when it happens, use merge commits to preserve history.

Not polluting the history with N trivial branches for every 1 branch that needs historical context, is a benefit of this.


In that case I assume you have also removed the backspace key from your keyboard?


[flagged]


Personal attacks aren't ok here, regardless of how wrong or annoying another comment is. Would you mind checking out the site guidelines and taking the spirit of this site to heart? We'd be grateful, since that's the only way for it to remain interesting.

https://news.ycombinator.com/newsguidelines.html



