Collisions definitely do matter for git security: many people pin explicit git h...

y4mi · on Oct 12, 2021

What is the threat model though?

I don't think it's possible to create a collision that's also executable code which adds a security hole or anything.

So what exactly would they achieve with the collision?

And how do they push these gigantic files that have the hash collisions to a server? The upload time would be significant.

lvh · on Oct 12, 2021

1) People systematically underestimate the possibility of creating collisions that still do something "interesting", like being polyglots (files that can be interpreted in multiple formats, executable or otherwise). See PoC||GTFO, specifically anything by Ange Albertini, for examples; grep https://github.com/angea/pocorgtfo/blob/master/README.md for "MD5". I specifically recommend this writeup: https://github.com/angea/pocorgtfo/blob/master/writeups/19/R... .

1bis) You can use an existing collision to create new collisions. People seem to think you need to generate all the work again from scratch; this is not true. See PoC||GTFO for proof by example.

1cis) The files do not need to be gigantic. See PoC||GTFO for proof by example.

2) You can do the collision in advance, and publish the malicious version later. What it accomplishes is that the concept of "this Git hash unambiguously specifies a revision" no longer works, and one of them can be malicious.

3) The standard should be "obviously safe beyond a reasonable doubt", not "not obviously unsafe to a non-expert". By the latter standard, pretty much any random encryption construction is fine. (The examples I gave use MD5, not SHA-1, but that's a matter of degree.)

4) SHA-256 was published in 2001, years before git's first release in 2005.
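Point 1bis (reusing an existing collision) can be sketched with a toy hash. This is only an illustration, not MD5 or SHA-1: `toy_md_hash`, `compress`, the 8-byte block size, and the 16-bit chaining state are all assumptions chosen so a collision can be brute-forced instantly. What it shares with MD5 and SHA-1 is the Merkle-Damgård structure: once two equal-length prefixes reach the same internal state, appending the same suffix to both keeps the hashes equal.

```python
import hashlib
from itertools import count

BLOCK = 8        # toy block size in bytes (assumption for this sketch)
STATE_BYTES = 2  # 16-bit chaining state, tiny on purpose
IV = b"\x00" * STATE_BYTES

def compress(state: bytes, block: bytes) -> bytes:
    # Toy compression function: truncated SHA-256 of state + block.
    return hashlib.sha256(state + block).digest()[:STATE_BYTES]

def toy_md_hash(msg: bytes) -> bytes:
    # Merkle-Damgard: pad, append the message length, then chain blocks.
    padded = msg + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK)
    padded += len(msg).to_bytes(BLOCK, "big")
    state = IV
    for i in range(0, len(padded), BLOCK):
        state = compress(state, padded[i:i + BLOCK])
    return state

def find_block_collision():
    # Brute-force two distinct single blocks with the same chaining state.
    seen = {}
    for i in count():
        block = i.to_bytes(BLOCK, "big")
        state = compress(IV, block)
        if state in seen:
            return seen[state], block
        seen[state] = block

a, b = find_block_collision()
for suffix in (b"", b"benign payload", b"something else entirely"):
    # Same length, same internal state after block one, same suffix:
    # everything hashed after the colliding blocks is identical.
    assert toy_md_hash(a + suffix) == toy_md_hash(b + suffix)
```

The expensive search happens once, in `find_block_collision`; every new colliding pair after that costs only a concatenation.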

whoisburbansky · on Oct 12, 2021

What do you mean by the `bis` and `cis` suffixes to your entry labels?

lvh · on Oct 12, 2021

It's just a subdivision; it might as well have said 1a, 1b... -- but "bis" and "cis/tris" (and possibly tetrakis) tend to emphasize that they're addenda, not equal points.

schoen · on Oct 13, 2021

It should normally be "bis" and "ter".

The Latin for "once, twice, thrice, four times, five times" is "semel, bis, ter, quater, quinquies". ("Bis" and "ter" are the only really short ones.)

It's moderately common in European standards and bureaucracy to use "bis" and "ter" for "version/revision 2" and "version/revision 3", respectively. For example https://en.wikipedia.org/wiki/List_of_ITU-T_V-series_recomme...

lvh · on Oct 13, 2021

Huh, good point; I wonder if my mind mixed up the org chem with the numbers (likely) or if that's some kind of unique Belgian affectation.

whoisburbansky · on Oct 12, 2021

Ah, gotcha. I thought I recognized the prefixes from Organic, but I don't think I've seen it used like this here. Neat!

noway421 · on Oct 12, 2021

The possible attack is to prepare 2 versions of a commit, both resulting in the same commit id. Then later on, after the project is successful/etc, swap out the commit with the second version, while keeping the other commits intact.

Granted, the file that the commit touches would need to be not touched in other commits. That's not out of the question in a typical software project - maybe a file in the utils folder which is only written once and never changed?

> I don't think it's possible to create a collision that's also executable code

You can include an unreadable binary blob in the commit. Tweak the blob to find the collision while keeping the code the way the attack requires.

deckard1 · on Oct 12, 2021

> swap out the commit

What's the method for doing this? Does a "git push" replace objects with identical hashes on the remote? Or does a "git pull" replace identical hashes on the local repo?

I suspect finding a hash collision is only the first difficult part of actually pulling this off. You may need direct write access to the file system of the target. And even then everyone else who has already fetched the repo may not be impacted. At which point collisions become moot because you can rewrite the entire git history however you want.

rini17 · on Oct 13, 2021

History teaches us: if a system isn't hardened against something, we can assume it's possible. If a Git server isn't specifically hardened against that, it might still be tricked by an adversarial client into updating the file. Or an attacker can temporarily add hooks that will replace the file on the server. Or an integration testing system might have write access to the server repo.

sirclueless · on Oct 13, 2021

> Granted, the file that the commit touches would need to be not touched in other commits.

That's not how git works. The commit contains the entire tree. You could prepare two separate repositories such that `git checkout deadbeef0001deadbeef` in one checks out the linux kernel and in the other checks out ILOVEYOU.exe.

noway421 · on Oct 13, 2021

You're right. A commit id points to a commit object, which points to a tree object and, through it, to individual blob objects. So it is considerably harder: you need to find a collision between 2 blob objects, both of which are executable and don't look suspicious.
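The commit → tree → blob chain can be sketched in a few lines. This is a minimal sketch of git's SHA-1 object ids, assuming a flat tree of regular files; real trees have extra sorting rules for directories, and real commits carry parents and timestamps. The helper names and the `author` placeholder are made up for illustration.

```python
import hashlib

def git_object_id(obj_type: str, body: bytes) -> str:
    # git hashes "<type> <size>" + NUL + body to get an object id.
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

def blob_id(content: bytes) -> str:
    return git_object_id("blob", content)

def tree_id(entries) -> str:
    # entries: (mode, filename, hex blob id); flat directory only.
    body = b"".join(
        f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(oid)
        for mode, name, oid in sorted(entries, key=lambda e: e[1])
    )
    return git_object_id("tree", body)

def commit_id(tree: str, message: str,
              author: str = "A U Thor <author@example.com> 0 +0000") -> str:
    # Simplified root commit: no parent line. The author is a placeholder.
    body = (f"tree {tree}\nauthor {author}\ncommitter {author}\n\n"
            f"{message}\n").encode()
    return git_object_id("commit", body)

for content in (b"print('hello')\n", b"print('pwned')\n"):
    blob = blob_id(content)
    tree = tree_id([("100644", "util.py", blob)])
    print(blob, tree, commit_id(tree, "add util"))
```

The tree body contains only the blob's id, never its bytes. Two blobs with the same id would therefore produce byte-identical tree and commit objects, which is why a blob-level collision is enough to make a pinned commit id ambiguous.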

DSingularity · on Oct 12, 2021

That one is nasty.

madars · on Oct 12, 2021

The files don't need to be gigantic. You could, for example, have a binary config file which in one colliding version encodes a potentially dangerous debugging setting as, e.g., "allow_unauthenticated_rpc = false" but in the other sets it to "true".

OJFord · on Oct 12, 2021

A denial of service of sorts? (Something broken and unusable is delivered instead, as distinct from something usable but maliciously so.)

Surely the chances of ever getting a second pre-image that not only makes sense, but does so in some malicious way, may as well be zero?

lvh · on Oct 12, 2021

The part about the 2nd pre-image chances being effectively zero may be true, but the nasty cases described upthread don't need a 2nd pre-image. (You could do a lot worse with one, granted!)
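The gap between a collision and a second pre-image can be made concrete with a toy. This sketch assumes a hash truncated to 24 bits (`toy_hash` and `BITS` are illustrative, not anything git uses) and plain brute force. Published attacks on MD5 and SHA-1 use cryptanalysis that is cheaper than the generic bound shown here, but the asymmetry points the same way: the attacker who gets to choose both messages has a much easier job than one who must match a fixed target.

```python
import hashlib
from itertools import count

BITS = 24  # toy digest size; real SHA-1 output is 160 bits

def toy_hash(data: bytes) -> int:
    # Keep only the top BITS bits of SHA-1.
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - BITS)

def find_collision():
    # Attacker picks BOTH messages: birthday search, about 2**(BITS/2) tries.
    seen = {}
    for i in count():
        msg = b"msg-%d" % i
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg

def find_second_preimage(target: bytes, limit: int):
    # Attacker must match a FIXED message: about 2**BITS tries on average.
    want = toy_hash(target)
    for i in range(limit):
        msg = b"alt-%d" % i
        if msg != target and toy_hash(msg) == want:
            return msg, i + 1
    return None, limit

a, b, tries = find_collision()
print("collision after", tries, "tries:", a, b)
```

Calling `find_second_preimage(b"some existing file", 2**24)` on the same toy hash typically takes thousands of times more work than the collision search, and the gap widens exponentially with digest size.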

occamrazor · on Oct 12, 2021

In addition to the attack described in a sibling comment, when a hashing algorithm has been broken in some way, it is safe to assume that other, more advanced collision attacks will soon be discovered.