Password checkup: from 0 to 650k users in 20 days (elie.net)
73 points by ebursztein on April 3, 2019 | 15 comments



Troy Hunt already nailed the perfect service and API. Google's solution here is not documented, and clearly not as usable as Troy's [0] (anyone with openssl(1)+curl(1) can check if they've been pwned right from the CLI [1])

[0] https://haveibeenpwned.com/API/v2#PwnedPasswords

[1] https://gitlab.com/moviuro/pass-hibp/blob/master/hibp.bash
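For anyone who would rather not read bash, here is a rough Python sketch of the same kind of check. It assumes the Pwned Passwords range endpoint at api.pwnedpasswords.com/range/ (SHA-1 the password, send only the first 5 hex characters, match the remaining 35 locally). The function names and the User-Agent string are mine, not from the linked script.

```python
import hashlib
import urllib.request

# Assumed endpoint: the Pwned Passwords k-anonymity range API.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password):
    """Return (5-char prefix, 35-char suffix) of the uppercase SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix, body):
    """Scan a range response ('SUFFIX:COUNT' per line) for our suffix."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def pwned_count(password):
    """Query the range API; only the 5-character prefix leaves the machine."""
    prefix, suffix = split_sha1(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "hibp-range-sketch"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return count_in_range(suffix, response.read().decode("utf-8"))
```

Calling pwned_count("some password") returns how many times that password appears in the corpus, or 0 if the suffix is not in the returned bucket.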


Why not team up with haveibeenpwned, which has been around for years? Seems like this is doing the same thing.


[flagged]


They try to do a decent job. They don't send your password to Google. Give them credit where it is due. Unlike, say, another SV company that decided to ask for the user's email password to auto-verify emails.


This is a reference to Facebook https://news.ycombinator.com/item?id=19559617


How does this compare to https://haveibeenpwned.com/Passwords ?


It does seem very similar to the implementation of HIBP's Pwned Passwords by Junade Ali[0]. I haven't delved into the nitty gritty details/differences between the two, but they do seem to use similar techniques to guarantee k-anonymity.

A key difference, at a glance, is the inclusion of usernames to be paired with the leaked passwords.

[0] Junade Ali's write-up https://blog.cloudflare.com/validating-leaked-passwords-with...


Wow..that was the longest article I have ever seen that has absolutely nothing to do with the title.


First they create a lookup table of encrypted (blinded) hashes of each (username, password) that they've found on the darknet, and index this table by the first two bytes of the unencrypted hash of the username and password.

   // H = Argon2(username + password); b = Google's secret key
   // H^b means "H blinded with b" (not XOR)
   Lookup[H[0:1]] = H^b;
The hash function is Argon2 run with time cost 3 and RAM cost 256MB. They claim a 100M record database took 1200 compute days to process, and the actual database is 4 billion records (40 times larger), so presumably they spent 48,000 CPU-days initializing the database (but they don't state that explicitly). One run of Argon2(3, 256) takes about 1 second. [1]
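A quick sanity check of those numbers, as a sketch. The inputs are the figures quoted above; the linear scaling to 4 billion records is my assumption, not something the blog post states.

```python
SECONDS_PER_DAY = 86_400


def seconds_per_record(compute_days, records):
    """Average hashing cost per record implied by a total compute budget."""
    return compute_days * SECONDS_PER_DAY / records


def scaled_compute_days(compute_days, records, target_records):
    """Linear extrapolation of the compute budget to a larger dataset."""
    return compute_days * target_records / records


# Figures from the comment: 1200 compute days for 100M records, 4B records total.
print(seconds_per_record(1200, 100_000_000))
print(scaled_compute_days(1200, 100_000_000, 4_000_000_000))
```

The first figure comes out very close to one second per record, which lines up with the benchmark in [1]; the second is the 48,000 CPU-day estimate.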

H[0:1] is a 16-bit hash prefix of the (username, password) which is used to query the dataset. That means each query selects approximately 1 in 2^16 records, so for a 4-billion-record set it should be expected to return about 60,000 results. They state the query actually returns ~1MB of data for each lookup.
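The bucket-size estimate can be worked out as below. This is only a sketch that assumes hashes are spread uniformly across prefixes. The 20-bit row applies a longer prefix to the same 4 billion records purely to show how selectivity changes; it is not HIBP's real dataset size.

```python
def expected_bucket_size(total_records, prefix_bits):
    """Expected number of records sharing one prefix, assuming uniform hashes."""
    return total_records / 2 ** prefix_bits


TOTAL_RECORDS = 4_000_000_000  # dataset size quoted above

for bits in (16, 20):
    size = expected_bucket_size(TOTAL_RECORDS, bits)
    print(f"{bits}-bit prefix: about {round(size):,} records per query")
```

With 16 bits the bucket holds roughly 61,000 entries, so each query hides among that many candidates; every extra prefix bit halves that crowd.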

[Side Note: I like that they have chosen to use just a 16-bit prefix, versus the HIBP prefix which is 20 bits, which to me is a little too selective if someone is pre-screening an online attack].

Here's where it gets a little neat.

You send the first two bytes of the hash of your username and password, along with an encryption of your full hash, call it 'H^a'. The server returns all the H^b for your prefix, along with your (H^a)^b which is H^ab.

When you get back your H^ab along with all the cracked H^b, you can unblind H^ab back to H^b using your 'a' (which is random, ephemeral) because the EC encryption is commutative. So cool. Now you have your hash (which Google may have never seen before!) in the form of H^b, and Google never saw the plaintext.

Essentially Google remotely encrypted your plaintext with their key, and you never saw their key, and they never saw your plaintext. But this allows you to now check if your hash (in the form of H^b) is in the set of ~60,000 H^b that Google returned. If it is, then they have your username and password in their dataset.
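Here is a toy sketch of that blind-and-unblind round trip. To keep it short and runnable it swaps in stand-ins that are NOT what Google uses: SHA-256 instead of Argon2, and modular exponentiation modulo the Mersenne prime 2^127 - 1 instead of elliptic-curve operations. All names are hypothetical and the parameters are far too small to be secure; the point is only the commutativity.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # toy modulus (a Mersenne prime); stand-in for the EC group


def hash_credentials(username, password):
    """Return (2-byte prefix, group element). SHA-256 stands in for Argon2."""
    digest = hashlib.sha256(f"{username}\x00{password}".encode()).digest()
    element = int.from_bytes(digest, "big") % (P - 1) + 1  # in [1, P-1]
    return digest[:2], element


def random_exponent():
    """Random exponent that is invertible modulo the group order P - 1."""
    while True:
        e = secrets.randbelow(P - 3) + 2
        if math.gcd(e, P - 1) == 1:
            return e


class Server:
    def __init__(self, leaked):
        self.b = random_exponent()  # server secret, never sent to clients
        self.buckets = {}
        for username, password in leaked:
            prefix, h = hash_credentials(username, password)
            self.buckets.setdefault(prefix, set()).add(pow(h, self.b, P))

    def query(self, prefix, blinded):
        """Return every H^b in the bucket plus the client's value raised to b."""
        return self.buckets.get(prefix, set()), pow(blinded, self.b, P)


def client_check(server, username, password):
    prefix, h = hash_credentials(username, password)
    a = random_exponent()  # ephemeral blinding key
    bucket, h_ab = server.query(prefix, pow(h, a, P))  # server only sees H^a
    h_b = pow(h_ab, pow(a, -1, P - 1), P)  # strip 'a': (H^ab)^(1/a) = H^b
    return h_b in bucket
```

The server only ever sees the prefix and H^a, and the client never learns b, yet the client ends up holding H^b and can test membership locally.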

This is different than HIBP because Google is taking the risk of holding the actual username, password tuples in the form of... essentially... a keyed hash. Troy was explicitly not willing to take that risk.

This also lets anyone in the world essentially perform an online attack against Google's 4 billion record database, as fast as they can run Argon2 on any candidate {username, password} values that they might want to test, plus or minus any additional rate limiting. The blog post says they rely on Argon2 for the rate limiting.

But I was just able to use this to make 20 guesses against my dad's password (based on his email address) before finding a match, meaning that password has been leaked at some point, and may still be in use. Of course it had my brother's name in it.

If services like this become pervasive, it may be a valid argument for not using per-service usernames, e.g. plus addressing, assuming the canonicalization doesn't strip that out, because it lets attackers use the service to target not just your (username, password) but effectively (site, username, password) directly. However, their dataset is, after all, leaked/cracked passwords, so your creds are already up for grabs at that point, and I'm not sure why attackers would use Google's service versus just building their own database from the available sources.

[1] - https://gist.github.com/Indigo744/e92356282eb808b94d08d9cc6e...


What's the benefit of doing it this way instead of how Troy does it?


Troy tells you that someone, somewhere, used that same password and it was cracked.

Google tells you that your username password combination specifically was cracked.

One is a yellow caution flag. The other is a 3 alarm fire. Both are useful!


This post right here is why I still read HN. Thanks!


If anyone wants to do something like this offline in an AD environment at work, Safepass[1] does some pretty cool things to get better password coverage.

[1] - https://safepass.me/


Question is, how do you turn this into a profitable outcome without alienating users?


I think in general Google wins the better (more trustworthy and lower friction) the internet experience of random users is. They can profit from increased internet usage and trust driving additional traffic to paid-search and AdSense sites.

Seems it could be an overall win for users and Google.


This is a story that 1) has nothing to do with its title, and 2) is about a team who WORKS AT GOOGLE getting 600k Chrome extension users in the first few weeks, and how they “secure” the app from their own company learning your password.



