I'm a fan of whys. Even for yourself. I have code I have to maintain that I wrote 10 years ago.
But it can make for long commit messages. Why also includes what alternatives you considered, and why you didn't use those.
I also like to have a digital "worksheet" for each change I do, where all my thoughts and research go. So if all else fails, I can reference that. But no-one else can, so I like to transfer as much knowledge as possible to the commit message. At the same time, some of these go on for 5 or more pages. They also tend to be very messy. I'm not sure if it's all appropriate for a commit message.
It's an issue when dealing with management or clients. They can see long commit messages as a sign of someone with too much time on their hands.
The whole point of a version control system is that it contains everything related to the code. On lots of web dev and game projects you also commit the finalized assets (the generated JavaScript from CoffeeScript, the compressed 3D textures, etc.)
IMO most of your "why" ought to be code-comments, not commit-messages, because most of the time people say: "Why is this code this way" as opposed to "Why did this specific transition occur in the past".
There are many times when I have seen a crappy piece of code that I wrote a while ago, and decided to "tidy it up". Then I test it, then remember there is a strange edge case that required me to write it the "crappy" way rather than the clean way. It always gets commented the second time.
I'm looking at a commit message from one of my clients today, which is just "Not, new bins, cant stop bins" and a check-in of several hundred DLLs generated as build artifacts. Six days later another developer at the client checked in a commit that removed all of the DLLs, with the message "remove BIN folder".
Some other examples: "Fixes to make work", "Left Over", and their most common message " ".
They're nice guys to work with, but their VC habits are awful.
I have to admit, I've made a number of commits with "." as the message.
No excuse, beyond it usually being a minor change well documented in the code ( I always write comments ) and me being utterly buried under work... You know, start the day with 20 things to do, crack off 4 of them and have 23 things in queue at the end of your day.
The "." was sort of a placeholder for "Fuck This, I'm ready to quit." ( and I did eventually )
That's when I prefer to use `git commit --amend` or rebasing. I'll make "WIP doing things" commits, and then later squash a few together before making the code more public.
I could see the latter being something like "Apply coding standards", where somebody went in and fixed all the lines that went over the right margin or had wonky whitespace.
Of course, we can debate over and over whether mass-correcting existing files is actually helpful (one unofficial rule we have here is "only use the auto-formatter on code you've personally worked on or have taken over responsibility for", because it affects the blame history).
Also, I changed my name last year, and when I did that, I ran a mass find-and-replace to correct my credit in every Javadoc I'm credited on (I particularly detest my deadname, and I want it dead and buried), with the commit message being something like "Correct my credit to match my new name".
I can one-up you there: one of my "initial commit" messages was a dedication to my cat, who passed away a couple of days before I checked the code in. Better yet, we released that project as open-source, so that commit message is still floating around Assembla for all to see.
I've made that commit before. It was actually an initial import of a from-scratch rewrite that happened to have utility libraries and such in common. Delete everything, drop in the new files you've been working on elsewhere, make that a commit. Not everything changes, because some things are the same by coincidence. Otherwise, it's just a discontinuity.
Eh, uninformative commit messages aren't a sin, they're a trade-off. Right now, most of my commit messages are one word. Why? Because at the moment nobody cares about my commit messages because nobody cares about my project because it doesn't yet do anything useful. Getting to the point where it does something useful has higher priority than writing long messages nobody will read. Once the commit messages have an audience, then I'll put work into writing better ones.
It's a self-fulfilling prophecy though. If you want to be important you have to act important.
Yes, nobody cares about your commit messages until there's a bug or a major refactor. By then, if the commit messages suck, it's too late to do anything about it.
"Programs are meant to be read by humans and only incidentally for computers to execute"
Is never more true than when you're trying to fix bugs or make improvements. A project you can't improve is a dead project.
I once saw a coworker commit a reshuffling of a project's directory structure with the message "YOLO". We made fun of him for a while for that.
(edit: He made this change after a lot of discussion with a bunch of people, and he also sent out a mass email describing the changes to the structure, so he wasn't totally being irresponsible there.)
This choice makes no sense but.... hmm, really? I think I'd rather have good commit messages that explain the "whys" that went into the code.
Code typically isn't rocket science. It's the human knowledge that goes into it that's irreplaceable.
Example 1: OK, you're using a third-party CSV parser instead of the one built into the standard library. Why? If your code is crap but well-documented, I can read what you were thinking: "Using non-standard CSV parser because the standard one chokes on files bigger than 2gb" At that point I can refactor your code, or perhaps see that this issue has been fixed in a newer version of the standard library. Or maybe I realize that you confused gigabits and gigabytes and you made a bad choice in the first place, and I realize I can safely remove this dependency. But if your code is tight but undocumented... I would have no idea why this third-party library is being used unless I do some painful trial-and-error that still might not definitively answer the why.
Example 2: You inherit Mary's code. It calculates commissions for our salespeople. The code is sloppy and convoluted, because the sales guys change the commission formulas every month... and these changes have been happening for over ten years, often on very short notice, often contradicting basic assumptions made when the software was originally architected. But Mary documented every change. Which is good, because the fucking sales guys sure don't. Her code is literally the only coherent record of the commission process in the entire company. Remove her comments and commit messages, and none of the code would make sense, even if it was tightened up into a sounder codebase of seven modules with 300 LoC each instead of ten modules with 500 LoC each.
So yeah. Totally fictional choice but I'll take documentation every time. Code is just code, I can fix it.
(Both those examples are fictional, but I've been coding professionally for nearly twenty years and I've seen variations of them countless times...)
In both of your examples, I believe comments in the code are the real winning strategy, rather than the individual commit-messages.
99% of the time what you want is to understand the current code--or at least code at a specific past point in time--as opposed to every transition that occurred.
For the CSV parser, a comment ("/* We use this for >2gb support */") or a test case ( testOverTwoGigsParseable() ) would be a lot more useful than any level of discipline over commit-messages.
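To make the comment-plus-test idea concrete, here's a rough Python sketch. Everything in it is made up for illustration: `count_rows`, `synthetic_lines` and `check_large_input_parseable` are hypothetical names, the stdlib `csv` module stands in for whatever parser is actually in play, and the check runs on a small input by default so it finishes quickly. A real version of testOverTwoGigsParseable() would feed it something above 2 GB.

```python
import csv
from typing import Iterable, Iterator

LINE = "id,name,amount\n"  # hypothetical sample record


# WHY: rows are streamed rather than loaded in one go, because inputs can be
# larger than 2 GB (the assumed requirement from the example above). If that
# requirement goes away, this comment says what is safe to simplify.
def count_rows(lines: Iterable[str]) -> int:
    """Count CSV records without holding the whole input in memory."""
    return sum(1 for _ in csv.reader(lines))


def synthetic_lines(total_bytes: int) -> Iterator[str]:
    """Lazily yield identical CSV lines until total_bytes have been produced."""
    produced = 0
    while produced < total_bytes:
        yield LINE
        produced += len(LINE)


def check_large_input_parseable(total_bytes: int = 1_000_000) -> int:
    """Scaled-down stand-in for a 'more than 2 GB parses fine' test."""
    expected = -(-total_bytes // len(LINE))  # ceiling division
    actual = count_rows(synthetic_lines(total_bytes))
    assert actual == expected
    return actual
```

The point is that both the comment and the check travel with the current code, so whoever reads the file sees the "why" without digging through history.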
For Mary's commission-calculator, it sounds like nobody has access to good "whys" anyway, because they boil down to "salesguy X insisted on it". Instead, the commits are functioning as an auditing/blame tool.
That statement makes no sense. It's not like anyone ever has to make a choice between making sound decisions while coding and making sane commits and commit messages.
I had a local git repo with a month's worth of commits. I hadn't pushed any of them so they were all _just_ local (though the head was constantly being uploaded to an online store).
I got a new computer and when I was getting rid of my old one, it didn't occur to me to push all the commits or save the git repo. So, my next commit consisted of all the changes for that entire month.
Sounds like he lost the local repo, and only the actual final state of the files was backed up. Thus, a single new commit representing everything at once.
Depends on how you have things set up. I once worked at a company with SVN/Trac integration. If you put "Refs #27" in a commit message, it would actually add a link to the commit as a comment on the ticket. If you put "Fixes #27", it would do that and close the ticket. Our system was also set up so you had to ref or fix a valid ticket in order to commit.
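For anyone curious what a check like that might look like, here's a minimal Python sketch. It is not the actual SVN/Trac hook we had; the regex, the `TicketAction` type, and the `valid_tickets` set are all assumptions for illustration, and a real setup would look the ticket numbers up in the tracker.

```python
import re
from typing import List, NamedTuple, Set

# Matches "Refs #27" or "Fixes #27". The keywords a real Trac install accepts
# may differ, so treat this pattern as an assumption.
TICKET_REF = re.compile(r"\b(Refs|Fixes)\s+#(\d+)\b")


class TicketAction(NamedTuple):
    ticket: int
    close: bool  # True for "Fixes", False for "Refs"


def ticket_actions(message: str) -> List[TicketAction]:
    """Extract every ticket reference from a commit message."""
    return [
        TicketAction(int(number), verb == "Fixes")
        for verb, number in TICKET_REF.findall(message)
    ]


def is_acceptable(message: str, valid_tickets: Set[int]) -> bool:
    """Accept only messages that reference at least one valid ticket."""
    actions = ticket_actions(message)
    return bool(actions) and all(a.ticket in valid_tickets for a in actions)
```

With something like this in a pre-commit check, `is_acceptable("Fixes #27: handle empty bins", {27})` passes, while a message of just "." gets rejected.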
Scratch that, on MOST projects that's the case.