Collisions definitely do matter for git security: many people pin explicit git hashes for their dependencies, and thus they can be tricked into running malicious forks. This requires placing a chosen commit in the git repo (so unlike a second-preimage break it does not mean that you could attack repos you have no control over) but that's not an unrealistic threat model overall.
1bis) You can use an existing collision to create new collisions. People seem to think you need to generate all the work again from scratch; this is not true. See PoC||GTFO for proof by example.
1cis) The files do not need to be gigantic. See PoC||GTFO for proof by example.
2) You can do the collision in advance, and publish the malicious version later. The result is that "this Git hash unambiguously specifies a revision" no longer holds: two revisions share the hash, and one of them can be malicious.
3) The standard should be "obviously safe beyond a reasonable doubt", not "not obviously unsafe to a non-expert". By the latter standard, pretty much any random encryption construction is fine. (The examples I gave use MD5, not SHA-1, but that's a matter of degree.)
4) SHA-256 was published years before git was first released.
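To make point 1bis concrete, here is a minimal sketch using a toy hash, not MD5 or SHA-1. It assumes a Merkle–Damgård design (which MD5 and SHA-1 both are) with a deliberately tiny 16-bit state so a collision can be brute-forced in milliseconds. The names `toy_hash`, `compress` and `find_block_collision` are made up for illustration. It shows one property that allows reuse: once two equal-length blocks collide, appending any common suffix gives a new colliding pair for free.

```python
import hashlib
from itertools import count

BLOCK = 8                 # bytes per message block
STATE = 2                 # bytes of internal state (tiny on purpose)
IV = b"\x00" * STATE      # fixed initial state

def compress(state: bytes, block: bytes) -> bytes:
    # Toy compression function: truncate SHA-256 down to STATE bytes.
    return hashlib.sha256(state + block).digest()[:STATE]

def toy_hash(msg: bytes) -> bytes:
    # Merkle-Damgard padding: 0x80, zero fill, then the 8-byte message length.
    padded = msg + b"\x80"
    padded += b"\x00" * (-(len(padded) + 8) % BLOCK)
    padded += len(msg).to_bytes(8, "big")
    state = IV
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state

def find_block_collision():
    # Birthday search over single blocks; at most 2**16 + 1 tries are needed.
    seen = {}
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        s = compress(IV, block)
        if s in seen:
            return seen[s], block
        seen[s] = block

a, b = find_block_collision()            # the expensive step, done once
suffix = b"any payload you like, chosen later"
print(a != b, toy_hash(a) == toy_hash(b))
print(toy_hash(a + suffix) == toy_hash(b + suffix))
```

The search runs once; every later pair costs nothing extra. Real attacks are far more involved, but the suffix trick carries over because the internal state after the colliding blocks is identical.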
It's just a subdivision; it might as well have said 1a, 1b... -- but "bis" and "cis/tris" (and possibly tetrakis) tend to emphasize that they're addenda, not equal points.
The possible attack is to prepare 2 versions of a commit, both resulting in the same commit id. Then later on, after the project is successful/etc, swap out the commit with the second version, while keeping the other commits intact.
Granted, the file that the commit touches would need to be not touched in other commits. That's not out of the question in a typical software project - maybe a file in the utils folder which is only written once and never changed?
> I don't think it's possible to create a collision that's also executable code
You can include an unreadable binary blob in the commit. Tweak the blob to find the collision while keeping the code the way the attack requires.
What's the method for doing this? Does a "git push" replace objects with identical hashes on the remote? Or a "git pull" replace identical hashes on the local repo?
I suspect finding a hash collision is only the first difficult part of actually pulling this off. You may need direct write access to the file system of the target. And even then, everyone else who has already fetched the repo may not be affected. At that point the collision becomes moot, because you can rewrite the entire git history however you want.
History teaches us: if a system isn't hardened against something, we can assume it's possible. If a Git server isn't specifically hardened against this, it might still be tricked into updating the file by an adversarial client. Or an attacker could temporarily add hooks that replace the file on the server. Or an integration testing system might have write access to the server repo.
> Granted, the file that the commit touches would need to be not touched in other commits.
That's not how git works. The commit contains the entire tree. You could prepare two separate repositories such that `git checkout deadbeef0001deadbeef` in one checks out the linux kernel and in the other checks out ILOVEYOU.exe.
You're right. The commit id points to a commit object, which points to a tree object and from there to individual blob objects. That makes it significantly harder: you need to find a collision between two blob objects, both of which are executable and don't look suspicious.
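For anyone following along, the commit -> tree -> blob chain can be sketched in a few lines. This assumes the classic SHA-1 object format (`<type> <size>\0` followed by the body) and a tree containing only plain files; the author line and file name are placeholders.

```python
import hashlib

def git_object_id(obj_type: str, body: bytes) -> str:
    # Git names an object by SHA-1 over "<type> <size>\0" + body.
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries):
    # entries: (mode, name, hex object id). Sorting by name is enough here
    # because this sketch only handles plain files, no subdirectories.
    out = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1]):
        out += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)
    return out

def commit_id_for(content: bytes) -> str:
    # One-file repository: blob -> tree -> commit, each named by its hash.
    blob_id = git_object_id("blob", content)
    tree_id = git_object_id("tree", tree_body([("100644", "hello.txt", blob_id)]))
    commit = (
        f"tree {tree_id}\n"
        "author A U Thor <author@example.com> 0 +0000\n"
        "committer A U Thor <author@example.com> 0 +0000\n"
        "\n"
        "initial commit\n"
    ).encode()
    return git_object_id("commit", commit)

# Same id as `echo 'hello world' | git hash-object --stdin`:
print(git_object_id("blob", b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
print(commit_id_for(b"hello world\n"))
```

The commit body only contains the tree id, and the tree only contains blob ids. So two blobs with the same object id give the same tree id and the same commit id, which is why a blob-level collision is all the attacker needs.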
The files don't need to be gigantic. You could, for example, have a binary config file which in one colliding version encodes a potentially dangerous debugging setting as "allow_unauthenticated_rpc = false" but in the other sets it to "true".
The part about the 2nd pre-image chances being effectively zero may be true, but the nasty cases described upthread don't need a 2nd pre-image. (You could do a lot worse with one, granted!)
In addition to the attack described in a sibling comment, when a hashing algorithm has been broken in some way, it is safe to assume that other, more advanced collision attacks will soon be discovered.