If they're using the same method that they use with Google, then this is not the case: https://developers.google.com/safe-browsing/v4/urls-hashing

The URLs are canonicalized first and then the full URL is hashed.
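For illustration, here's a minimal Python sketch of that hashing step, assuming the 4-byte prefix length the v4 docs describe as the most common (real canonicalization involves many more rules, e.g. percent-decoding and path normalization, than shown here):

    import hashlib

    def hash_prefix(expression: str, prefix_bytes: int = 4) -> bytes:
        # Assumes `expression` is already canonicalized per the v4 rules.
        # 4 bytes = 32 bits, i.e. one of 2^32 possible buckets.
        return hashlib.sha256(expression.encode("utf-8")).digest()[:prefix_bytes]

    print(hash_prefix("example.com/path/").hex())

In the prefix-lookup flow, the server sees only this short prefix, not the full 32-byte hash.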



That's true, but if you scroll down further it mentions that they try several permutations of the URL, up to 30. So while grabbing 1 of the 2^32 buckets won't say much, grabbing 5 buckets (one for each permutation) in a specific order may very well indicate that you browsed a specific URL.
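Concretely, the docs combine up to 5 host suffixes with up to 6 path prefixes, so at most 30 lookup expressions per URL. A rough Python sketch of that expansion, assuming an already-canonicalized URL without a scheme and ignoring the special IP-address handling:

    from urllib.parse import urlsplit

    def url_expressions(url: str) -> list[str]:
        parts = urlsplit("http://" + url)  # scheme added only so urlsplit parses the host
        host, path, query = parts.hostname, parts.path or "/", parts.query

        # Host suffixes: the exact host, plus up to 4 suffixes formed by
        # dropping leading labels (never reduced to the bare TLD).
        labels = host.split(".")
        hosts = [host]
        for i in range(max(len(labels) - 5, 1), len(labels) - 1):
            suffix = ".".join(labels[i:])
            if suffix != host:
                hosts.append(suffix)

        # Path prefixes: the exact path with query, the exact path without
        # it, then root-anchored prefixes ("/", "/1/", ...), at most 6 total.
        paths = []
        if query:
            paths.append(path + "?" + query)
        paths.append(path)
        segments = [s for s in path.split("/") if s]
        prefix = "/"
        paths.append(prefix)
        for seg in segments[:-1][:3]:
            prefix += seg + "/"
            paths.append(prefix)
        paths = list(dict.fromkeys(paths))[:6]  # dedupe, e.g. when path is just "/"

        return [h + p for h in hosts for p in paths]

    # Prints the 8 expressions the docs list for this example URL:
    for expr in url_expressions("a.b.com/1/2.html?param=1"):
        print(expr)

Each expression is hashed separately, and it's that correlated set of prefixes, all arriving together, that leaks more than any single lookup does.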


That’s on the client, not on the server. The server would just provide the responses for the specific hash prefix. While I agree that, given enough time, such a dataset could get good enough to categorize people, all it would take is a change in the hash and most of the old model would be trash.
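As a toy sketch of what that server side amounts to (assuming 4-byte prefixes and 32-byte full hashes; the class and method names here are made up, not the real API):

    from collections import defaultdict

    class PrefixServer:
        def __init__(self, blocklisted_full_hashes: list[bytes]):
            # Bucket the full 32-byte hashes by their 4-byte prefix.
            self.buckets: dict[bytes, list[bytes]] = defaultdict(list)
            for h in blocklisted_full_hashes:
                self.buckets[h[:4]].append(h)

        def find(self, prefix: bytes) -> list[bytes]:
            # Per request, the server learns only that some URL expression
            # on the client hashed into this one-in-2^32 bucket; the client
            # compares the returned full hashes locally.
            return self.buckets.get(prefix, [])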


>That’s on the client, not on the server.

But that's required for the system to work. Otherwise you have to choose between really poor granularity (if you only check the domain) or really large blacklists (if you only check the whole URL). Point is, there's no way for this system to work effectively and be anonymous.
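Back-of-the-envelope, assuming 32-bit prefixes and treating the hashes as independent and uniform, the odds that an unrelated URL reproduces the same buckets:

    one_bucket = 2.0 ** -32           # chance of matching one given 32-bit prefix
    five_buckets = one_bucket ** 5    # chance of matching all five observed prefixes
    print(f"1 bucket : 1 in {1 / one_bucket:.2e}")    # ~4.3e9
    print(f"5 buckets: 1 in {1 / five_buckets:.2e}")  # ~1.5e48

So five correlated lookups are, for all practical purposes, a fingerprint of one specific URL.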

>all it would take is a change in the hash and most of the old model would be trash

Why would the hash change?



