
Well let's look at how this actually played out.

  - Defendant was in fact sending CP through his gmail.
  - gmail correctly detects and flags it based on hash value
  - Google sends message to NCMEC based on hash value
  - NCMEC sends it to police based on hash value
Now police are facing the obvious question: is this actually CP? They open the image, determine it is, then get a warrant to search his gmail account, and (later) another warrant to search his home.

The court here is saying they should have got a warrant to even look at the image in the first place. But warrants only issue on probable cause. What's the PC here? The hash value. What's the probability of hash collisions? Non-zero but very low.

The practical upshot of this is that all reports from NCMEC will now go through an extra step of the police submitting a copy of the report with the hash value and some boilerplate document saying 'based on my law enforcement experience, hash values are pretty reliable indicators of fundamental similarity', and the warrant application will then be rubber stamped by a judge.

An analogous situation would be where I send a sealed envelope with some documents to the police, writing on the outside 'I believe the contents of this envelope are proof that John Doe committed [specific crime]', and the police have to get a warrant to open the envelope. It's arguably more legally consistent, but in practice it just creates an extra stage of legal bureaucracy/delay with no appreciable impact on the eventual outcome.

Recall that the standard for issuance of a warrant is 'probable cause', not 'mathematically proven cause'. Hash collisions are a possibility, but a sufficiently unlikely one that it doesn't matter. Probable cause means 'a fair probability' based on independent evidence of some kind - testimony, observation, forensic results, or the like. Even a shitty hash function that's only 90% reliable is going to meet that threshold. In the 10% of cases where the opened file turns out to be a random image with no pornographic content, it's a 'no harm no foul' situation.

For reference, a primer on hash collision probabilities: https://preshing.com/20110504/hash-collision-probabilities/

and a more detailed examination of common perceptual hashing algorithms (skip to table 3 for the collision probabilities): https://ceur-ws.org/Vol-2904/81.pdf
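If you want to plug numbers into the first link's math yourself, the birthday-bound approximation it walks through is a few lines of Python (a rough sketch; the bit lengths and image counts below are purely illustrative, not any provider's actual parameters):

  import math

  def collision_probability(n_items: int, hash_bits: int) -> float:
      """Approximate chance of at least one collision among n_items values
      drawn uniformly from a space of 2**hash_bits, via the birthday bound:
      p ~= 1 - exp(-n(n-1) / (2 * 2^b))."""
      space = 2.0 ** hash_bits
      return 1.0 - math.exp(-n_items * (n_items - 1) / (2.0 * space))

  # Illustrative only: a 64-bit hash over a billion images vs. a 256-bit one.
  print(collision_probability(10**9, 64))   # ~0.027 - collisions are plausible
  print(collision_probability(10**9, 256))  # ~0.0   - effectively impossible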

I think what a lot of people are implicitly arguing here is that the detection system needs to be perfect before anyone can do anything. Nobody wants the job of examining images to check if they're CP or not, so we've outsourced it to machines that do so with good-but-not-perfect accuracy and then pass the hot potato around until someone has to pollute their visual cortex with it.

Obviously we don't want to arrest or convict people based on computer output alone, but how good does it have to be (in % or odds terms) in order to begin an investigation - not of the alleged criminal, but of the evidence itself? Should companies like Google have to submit an estimate of the probability of hash collisions using their algorithm and based on the number of image hashes that exist on their servers at any given moment? Should they be required to submit source code used to derive that? What about the microcode of the silicon substrate on which the calculation is performed?

All other things being equal, what improvement will result here from adding another layer of administrative processing, whose outcome is predetermined?




> Recall that the standard for issuance of a warrant is 'probable cause', not 'mathematically proven cause'. Hash collisions are a possibility, but a sufficiently unlikely one that it doesn't matter. Probable cause means 'a fair probability' based on independent evidence of some kind - testimony, observation, forensic results, or the like. Even a shitty hash function that's only 90% reliable is going to meet that threshold. In the 10% of cases where the opened file turns out to be a random image with no pornographic content, it's a 'no harm no foul' situation.

But do we actually know that? Do we know what thresholds of "similarity" Google and others are using, and how many false positives they trigger? Billions of photos are processed daily by Google's services (Google Photos, chat programs, Gmail, Drive, etc.), and very few people actually send such stuff via Gmail, so what if the reality is that 99.9% of the matches are actually false positives? What about intentional matches, like someone intentionally creating some random SFW meme image that (when hashed) matches some illegal image hash, and that photo is then sent around intentionally... should police really be checking all those emails, photos, etc., without warrants?
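To make the base-rate worry concrete (all numbers below are invented for illustration, nobody outside Google knows the real ones):

  def match_precision(prevalence: float, hit_rate: float, false_match_rate: float) -> float:
      """Fraction of flagged images that are actually illegal, via Bayes' rule:
      P(illegal | match) = P(match | illegal) * P(illegal) / P(match)."""
      p_match = hit_rate * prevalence + false_match_rate * (1.0 - prevalence)
      return hit_rate * prevalence / p_match

  # Hypothetical: 1 in 10 million uploads is illegal, the hash catches 99% of
  # those, and falsely matches 1 in 100,000 innocent images.
  print(match_precision(1e-7, 0.99, 1e-5))  # ~0.01 -> roughly 99% of matches would be false positives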


Well, that's why I'm asking what threshold of certainty people want to apply. The hypotheticals you cite are certainly possible, but are they likely?

> what if the reality is that 99.9% of the matches are actually false positives

Don't you think that if Google were deluging the cops with false positive reports that turned out to be perfectly innocuous 999 times out of 1000, that police would call them up and say 'why are you wasting our time with this?' Or that defense lawyers wouldn't be raising hell if there were large numbers of clients being investigated over nothing? And how would running it through a judge first improve that process?

> What about intentional matches, like someone intentionally creating some random SFW meme image [...]

OK, but what is the probability of that happening? And if such images are being mailed in bulk, what would be the purpose other than to provide cover for CSAM traders? The tactic would only be viable for as long as it takes a platform operator to change up their hashing algorithm. And again, how would the extra legal step of consulting a judge alleviate this?

> should police really be checking all those emails, photos, etc., without warrants?

But that's not happening. As I pointed out, police examined the submitted image evidence to determine if it was CP (it was). Then they got a warrant to search the gmail account, and following that another warrant to search his home. They didn't investigate the criminal first; they investigated an image file submitted to them to determine whether it was evidence of a crime.

And yet again, how would bouncing this off a judge improve the process? The judge will just look at the report submitted to the police and a standard police letter saying 'reports of this kind are reliable in our experience' and then tell the police yes, go ahead and look.


> Don't you think that if Google were deluging the cops with false positive reports that turned out to be perfectly innocuous 999 times out of 1000, that police would call them up and say 'why are you wasting our time with this?' Or that defense lawyers wouldn't be raising hell if there were large numbers of clients being investigated over nothing? And how would running it through a judge first improve that process?

Yes, sure... they send them a batch of photos, thousands even, and someone from the police skims the photos... a fishing expedition would be the right term for that.

> OK, but what is the probability of that happening? And if such images are being mailed in bulk, what would be the purpose other than to provide cover for CSAM traders? The tactic would only be viable for as long as it takes a platform operator to change up their hashing algorithm. And again, how would the extra legal step of consulting a judge alleviate this?

You never visited 4chan?

> But that's not happening. As I pointed out, police examined the submitted image evidence to determine if it was CP (it was). Then they got a warrant to search the gmail account, and following that another warrant to search his home. They didn't investigate the criminal first; they investigated an image file submitted to them to determine whether it was evidence of a crime.

It's as if they first entered your home illegally, found a joint on the table, and then got a warrant for the rest of the house. As pointed out in the article and in the title... they should need a warrant for the first image too.

> And yet again, how would bouncing this off a judge improve the process? The judge will just look at the report submitted to the police and a standard police letter saying 'reports of this kind are reliable in our experience' and then tell the police yes, go ahead and look.

Sure, if it brings enough results. But if they issue 200 warrants and get zero results, things will have to change, both for police and for Google. This is like saying "that guy has long hair, he's probably a hippy and has drugs, let's get a search warrant for his house". Currently we don't know the numbers, and most people (you excluded) believe that police shouldn't search private data of people just because some algorithm thinks so, without a warrant.


The idea that police are spending time just scanning photos of trains, flowers, kittens and so on in hopes of finding an occasional violation seems ridiculous to me. If nothing else, you would expect NCMEC to wonder why only 0.1% of their reports are ever followed up on.

> a fishing expedition would be the right term for that

No it wouldn't. A fishing expedition is where you get a warrant against someone without any solid evidence and then dig around hoping to find something incriminating.

> You never visited 4chan?

I have been a regular there since 2009. What point are you attempting to make?

> It's as if they first entered your home illegally, found a joint on the table, and then got a warrant for the rest of the house. As pointed out in the article and in the title... they should need a warrant for the first image too.

This analogy is flat wrong. I already explained the difference.

> most people (you excluded) believe that police shouldn't search private data of people just because some algorithm thinks so, without a warrant.

That is not what I believe. I think they should get a warrant to search any private data. In this case they're looking at a single image to determine whether it's illegal, as a reasonably reliable statistical test suggests it to be.

You're not explaining what difference it makes if a judge issues a warrant on the exact same criteria.


As many others have said, Google isn’t using a cryptographic hash here. It’s using perceptual hashing, which isn’t collision-safe at all.
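For anyone unfamiliar with the distinction, here is roughly what a toy perceptual hash looks like (an average-hash sketch using Pillow; not Google's actual algorithm, which isn't public, but it shows why different images can land on the same or nearby hashes):

  from PIL import Image  # pip install Pillow

  def average_hash(path: str, size: int = 8) -> int:
      """Toy 64-bit perceptual hash: shrink to 8x8 grayscale, then set one bit
      per pixel according to whether it is brighter than the image's mean.
      Images with similar overall structure produce identical or nearby hashes."""
      img = Image.open(path).convert("L").resize((size, size), Image.LANCZOS)
      pixels = list(img.getdata())
      mean = sum(pixels) / len(pixels)
      bits = 0
      for p in pixels:
          bits = (bits << 1) | (1 if p > mean else 0)
      return bits

  def hamming_distance(a: int, b: int) -> int:
      """Bits that differ; matchers flag pairs below some similarity threshold."""
      return bin(a ^ b).count("1")

  # Hypothetical file names: two unrelated photos can still fall within the
  # match threshold, which is exactly the collision risk under discussion.
  # print(hamming_distance(average_hash("cat.jpg"), average_hash("meme.jpg")))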


Did you read the whole thing?

> and a more detailed examination of common perceptual hashing algorithms (skip to table 3 for the collision probabilities): https://ceur-ws.org/Vol-2904/81.pdf

And there was a whole lot of explanation of how probable cause works and how it's different from programmers' aspirations to perfection.


The table only proves the point. The lowest probability in the table is 1 in 100,000. Most others are 1 in 100.

28 billion photos are uploaded every week to Google Photos[1]. Even at the best-case 1-in-100,000 rate, that's at least 280k false positives per week.

Should we really be executing roughly 30 search warrants on innocent people per minute?

[1] https://blog.google/products/photos/storage-changes/
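Spelling out the arithmetic behind those figures (same upload number and Table 3 rates as above; the per-minute figure is just the weekly total spread evenly):

  uploads_per_week = 28e9  # Google Photos uploads per week, per [1]
  collision_rates = {"best case (1 in 100,000)": 1e-5,
                     "more typical (1 in 100)": 1e-2}

  for label, rate in collision_rates.items():
      per_week = uploads_per_week * rate
      per_minute = per_week / (7 * 24 * 60)
      print(f"{label}: {per_week:,.0f} false matches/week (~{per_minute:,.0f}/minute)")

  # best case (1 in 100,000): 280,000 false matches/week (~28/minute)
  # more typical (1 in 100): 280,000,000 false matches/week (~27,778/minute)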


Do you have any evidence that this is happening? You don't think someone would have noticed by now if it were?

And as I pointed out, we're not talking about a search warrant on a person, we're talking about whether it's necessary to get a search warrant to look at a picture to determine if it's an illegal image.



