
How long will it take to hash all 4 billion IPv4 addresses with your salt? I guess if you tune scrypt to 100 ms per hash, that would be 400 million seconds, or a bit over 100,000 hours of machine time. If you buy machine time at 1 cent per hour, that would be about $1,000; spread across 1,000 machines, it will take about 4 days total. Not good.
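For anyone who wants to plug in their own numbers, here is a rough sketch of that estimate. The function name and parameters are made up for illustration; the defaults are the round figures from the comment (4 billion addresses, 100 ms per hash, 1 cent per machine-hour, 1,000 machines).

```python
def brute_force_estimate(
    address_space=4_000_000_000,   # roughly all IPv4 addresses
    seconds_per_hash=0.1,          # scrypt tuned to about 100 ms
    cost_per_machine_hour=0.01,    # 1 cent per hour
    machines=1000,
):
    """Return (machine_hours, dollars, wall_clock_days) for hashing every address."""
    total_seconds = address_space * seconds_per_hash
    machine_hours = total_seconds / 3600
    dollars = machine_hours * cost_per_machine_hour
    wall_clock_days = machine_hours / machines / 24
    return machine_hours, dollars, wall_clock_days


if __name__ == "__main__":
    hours, dollars, days = brute_force_estimate()
    print(f"{hours:,.0f} machine-hours, ${dollars:,.0f}, {days:.1f} days on 1000 machines")
```

With the defaults this lands a little above the rounded figures in the comment (about 111,000 machine-hours rather than 100,000), which doesn't change the conclusion.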

Log anonymization is hard. I wonder if a probabilistic approach would be better, because it's deniable. Something like a Bloom filter. You can tune the false positive rate to match that of a human admin making an error, so the business impact is less pronounced.
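A minimal sketch of what that could look like, assuming a plain in-memory Bloom filter sized with the textbook formulas. The class and its parameters are hypothetical, not anything the site actually runs, and a real deployment would also want a secret salt mixed into the hashes.

```python
import hashlib
import math


class BloomFilter:
    """Set membership with a tunable false positive rate and no false negatives."""

    def __init__(self, expected_items, false_positive_rate):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2, k = (m / n) * ln 2
        self.num_bits = max(8, math.ceil(
            -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions by prefixing the item with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item.encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


if __name__ == "__main__":
    seen = BloomFilter(expected_items=1000, false_positive_rate=0.01)
    seen.add("203.0.113.7")
    print("203.0.113.7" in seen)   # an added address is always reported as present
```

A "yes" from the filter only means "probably seen", which is where the deniability comes from; a "no" is always definite.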

Yet another option I'm thinking about is an external service that takes a 32-bit IP address and returns a 256-bit handle, which you would then use for logging. The service would be rate-limited to prevent enumeration, so those 100k hours would have to be spent sequentially, turning 4 days into 4 thousand days.
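Roughly what I have in mind, as a sketch only. I'm assuming the handle is an HMAC-SHA256 of the address under a key that only the service knows, and that the rate limit is a simple minimum interval between requests; `HandleService` and its parameters are invented for illustration.

```python
import hashlib
import hmac
import ipaddress
import time


class HandleService:
    """Maps an IPv4 address to a 256-bit handle, at a capped request rate."""

    def __init__(self, secret_key, max_per_second, clock=time.monotonic):
        self._key = secret_key
        self._min_interval = 1.0 / max_per_second
        self._clock = clock                 # injectable so the limit can be tested
        self._next_allowed = clock()

    def handle_for(self, ip):
        now = self._clock()
        if now < self._next_allowed:
            # Refusing (rather than queueing) is what forces enumeration to be slow.
            raise RuntimeError("rate limit exceeded")
        self._next_allowed = now + self._min_interval
        packed = ipaddress.IPv4Address(ip).packed   # the 32-bit address as 4 bytes
        return hmac.new(self._key, packed, hashlib.sha256).digest()


if __name__ == "__main__":
    service = HandleService(secret_key=b"\x01" * 32, max_per_second=10)
    print(service.handle_for("192.0.2.1").hex())
```

At 10 requests per second, walking the whole IPv4 space takes on the order of a decade, no matter how many machines the attacker rents, as long as the key never leaves the service.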



Not good against a serious agency, but it stops the lower-level ones that aren't going to go to that trouble.

We use these hashes to fight off spammers. A Bloom filter won't do us any good because it will produce false positives.

An external service will be legally subject to subpoenas, so pawning it off doesn't solve the problem either.

You're right. Log anonymization is hard.

The only really good way to solve this is to throw this information away, which is also legal. Because of the spam considerations, we need this for now. I might decide it's not worth it though and just stop storing even these hashes.


> An external service will be legally subject to subpoenas, so pawning it off doesn't solve the problem either.

How about a distributed onion route of external corporations, such that N subpoena hops must be followed to get to an unhashed IP? Serious investigations would do the legwork required and catch a serious bad guy, but nuisance clowns would be stopped from going too far. Kind of like a legal scrypt().

Instead of services, it could just be a reciprocal arrangement where everybody participating holds some of everybody else's data.


You don't even have to chain them, you can execute requests to 128 different services in parallel, and then XOR the result to obtain a single IP address "handle" for logging purposes.
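A sketch of the parallel version, with the remote services stood in for by local HMAC functions under different keys. The names here are hypothetical; in practice each callable would be a network request to an independent party.

```python
import hashlib
import hmac
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def make_local_service(key):
    """Stand-in for one external service: returns a 256-bit value for an address."""
    def service(ip_bytes):
        return hmac.new(key, ip_bytes, hashlib.sha256).digest()
    return service


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def combined_handle(ip_bytes, services):
    # Query every service concurrently, then XOR the answers into one handle.
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        parts = list(pool.map(lambda service: service(ip_bytes), services))
    return reduce(xor_bytes, parts)


if __name__ == "__main__":
    services = [make_local_service(bytes([i]) * 32) for i in range(128)]
    print(combined_handle(bytes([192, 0, 2, 1]), services).hex())
```

Because the final handle depends on every response, someone trying to reverse a handle needs the cooperation (or the keys) of all 128 services, not just one.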


If your scheme becomes prevalent, someone will create a service that for $10k will brute-force $1k worth of hashed IP addresses. $10k and 4 days is comparable to a legal bill, so you're not deterring anyone but the most casual snooper.


But if it isn't prevalent (which is likely to remain the case for the near future), and the regional agencies issuing subpoenas are technically pretty clueless (which I think they mostly are), then it will be effective. That's pretty near the best you can do, as long as you need to log IPs in some form for spam prevention.

Incidentally - a slightly more effective solution might be to put the 'did this IP visit recently' function in a secure microprocessor, rate limit it so it can't be brute-forced at more than a modest rate (in case of a DDoS you can always temporarily stop using it), and throw away the keys to reprogram it. That really will stop everyone but the NSA, but it's about a million times more difficult and expensive...


There are already services that do this: https://cloudcracker.com/ (WPA, not scrypt, but hopefully you see my point)


> How long will it take to hash all 4b ip addresses with your salt?

Go IPv6 only?


100ms would be a bit much to spend on every single HTTP request. Even cached...


Make it longer and do it async? Then maybe the first request gets through but the rest are dropped. #justthinkingoutloud



