The Legitimisation of Have I Been Pwned (troyhunt.com)
277 points by weinzierl on March 21, 2018 | 69 comments



Seeing it being recommended by government officials in multiple countries made me realize that the website isn't localized.

I think the website should be translated, together with the domain name. Actually, "pwned" already sounds like something only those who "live and breathe tech", to quote the article, would be expected to be familiar with.

This unfortunately raises a serious question of trustworthiness with regard to phishing. When password.com, mypassword.net and passphrase.org are all valid domains, how would you know whether to trust mypassword.com?

The bandwidth costs could also be prohibitive. To alleviate this issue, I would like to see the code opened up (for performance improvements), distributed hosting encouraged, and the databases entrusted to a few trustworthy entities, with everything sponsored by a nonprofit. This doesn't solve the question of whether the various entities can be trusted, though (unless this is done on governments' websites?).

Of course, there is always the possibility of a rogue government/ISP/hosting service MITMing the connection, even issuing false certs... But a rogue actor might as well MITM the target web site directly.


> I think the website should be translated, together with the domain name. Actually, "pwned" already sounds like something only those who "live and breathe tech", to quote the article, would be expected to be familiar with.

Not sure about that. I feel like translating the domain would make the phishing/trustworthiness issues more difficult. As is, there is a single canonical domain name which, at this point, has been well publicised. Verifying that any particular translated domain among the myriad possibilities is real seems tricky, especially when each may be less widely publicised and used because the userbase is smaller or fragmented.

I'm also not sure how much of an issue "pwned" is - a lot of successful services use words that _no one_ would be familiar with or that are entirely made up.


Yeah, Pwned is fine. I mean what's a Google? That isn't a word!


Well strictly speaking it's a typo of a word.

https://en.wikipedia.org/wiki/Googol


It's entertaining that Google are and always have been typo-squatters, but "pwn" has been a typo since the very first time it was typed.

Now since there is no red squiggly line I am wondering whether I added "pwn" as a custom addition to my local dictionary...


Seems to be standard in the macOS spellcheck dictionary. Some traditional dictionaries now contain it as well.


I was amused by the aside late in the article that Troy has been collecting weird misspellings of pwned, mostly thanks to media articles, and buying those domains for redirects to be helpful. Personally, I'm now in love with https://haveibeenprawned.com; it's like a tiny modern internet reinterpretation of Kafka's Metamorphosis.


> Actually, "pwned" already sounds like something only those who "live and breathe tech", to quote the article, would be expected to be familiar with.

I work in academia and a recent weekly newsletter included an article about password security and included a link to HIBP. A few hours after the newsletter came out, we received an email from a faculty member asking what it meant that he had been "pawned" because the site said he had been.

While it got a few private chuckles from us, I felt obligated to point out that outside of our circles, most people have probably never heard the techie-slang term "pwned" in their lives.


If the use case is only checking YOUR OWN email address, then why not require verification of the email before providing the results? E.g. you enter your email and get a results link sent to this same email address. Because right now HIBP appears to be an interesting OSINT tool for gathering information about people, specifically for finding out which "shady" services they have ever registered at.


You need to register to see the leaks from "shady" services.


Shadiness is relative. I felt quite ashamed of having an account on some of those sites that HIBP considers "non-shady".


If someone has your email address, it's often not very hard for them to determine whether or not you have an account on a particular site. (Try to register with that email, and if the sign-up form complains, then you probably do have an account there.)

That said, if you're still concerned there's always https://haveibeenpwned.com/OptOut


Would be neat if one could opt out a whole domain. E.g. I have a domain that's essentially a catch-all for signing up with per-site unique emails.


Can't hurt to send Troy an email. Probably would be a useful feature for companies, too. Could be achieved relatively easily w/ DNS validation.


There is a big difference between checking every service one by one and scanning them all at once.

OptOut is indeed what I needed, thank you for the idea.


If you are looking for an "offline" way of checking your (windows) passwords against the HIBPv2 dataset, I encourage you to check my current pet project out: https://safepass.me

It's an Active Directory password filter that is free for home use and dirt cheap otherwise.


https://github.com/mihaifm/HIBPOfflineCheck - KeePass plugin doing the same thing


Also https://jacksonvd.com/checking-for-breached-passwords-ad-usi... (found in https://haveibeenpwned.com/API/Consumers) offers Active Directory integration, both offline and online.


You're not comparing the same thing. One is a commercially supported product... that ships HIBPv2 in a ~400MB bundle, the other one a PoC... that suggests that doing a binary search over a 30GB+ dataset each time there is a password change is a sane thing to do.

https://github.com/JacksonVD/PwnedPasswordsDLL/blob/master/P...

I wouldn't recommend that anyone seriously consider deploying that code in production!


I don’t think GP was offering a comparison so much as a related item to check out.

Why would you not recommend code that checks the hashed password against a local DB?

See also the “pwned-passwords-django” process here: https://www.b-list.org/weblog/2018/mar/06/two-new-projects


> Why would you not recommend code that checks the hashed password against a local DB?

I would. In fact that's why I have created a product to do exactly that in an efficient way...

Doing the checks online has too many drawbacks:

- availability: what do you do when the service/API is down or you can't reach it?

- determinism: what works today might not tomorrow

- security/privacy/anonymity: ...

I am just uncomfortable with naive code that makes it barely practical:

- if the dataset isn't pre-processed properly, binary searching through it won't lead to the expected results (and that's not always obvious)

- distributing a 30GB+ file on all the DCs

- binary searching through the dataset at runtime means seeking through 30GB... with an O(log n) complexity... in practice that means a very slow response time that gets exponentially worse with load.

If you pre-process the dataset you might as well do it "properly" and make it usable :p
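To make the binary-search point concrete, here is a rough Python sketch of a lookup over a pre-processed dataset. It assumes the hashes have been converted to raw 20-byte SHA-1 digests, de-duplicated and sorted, which is not the format of the published download; `build_index` and `contains` are hypothetical names, not anything taken from the projects linked above.

```python
import hashlib
import io

RECORD_SIZE = 20  # a raw SHA-1 digest is 20 bytes


def build_index(passwords):
    """Pre-process: hash, de-duplicate and sort into fixed-width records."""
    digests = {hashlib.sha1(p.encode("utf-8")).digest() for p in passwords}
    return b"".join(sorted(digests))


def contains(index_file, password):
    """Binary search a seekable file of sorted fixed-width digests."""
    target = hashlib.sha1(password.encode("utf-8")).digest()
    index_file.seek(0, io.SEEK_END)
    lo, hi = 0, index_file.tell() // RECORD_SIZE
    while lo < hi:
        mid = (lo + hi) // 2
        index_file.seek(mid * RECORD_SIZE)
        record = index_file.read(RECORD_SIZE)
        if record == target:
            return True
        if record < target:
            lo = mid + 1
        else:
            hi = mid
    return False


# Tiny in-memory stand-in for the on-disk dataset (assumption: a real
# deployment would open the pre-processed file in binary mode instead).
demo = io.BytesIO(build_index(["password", "123456", "qwerty"]))
print(contains(demo, "qwerty"), contains(demo, "correct horse"))  # True False
```

Each lookup costs one seek per halving step, and the search only gives correct answers if the file really is sorted, which is the pre-processing caveat from the list above.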


I see the API and I get why charging for access would be bad but what about donations?


you can donate on HIBP: https://haveibeenpwned.com/Donate


I wish there was a way to prove ownership of a domain and be able to search on *@example.com for those who use unique emails for every service to track accounts that got hacked or companies that sold my personal info.



So I just tried this, it's a very different proposition than normal HIBP, and much more useful to people like me who don't just have a single address and give that out everywhere.

HIBP reports that zero of my addresses have been pwned. Which is weird because those addresses include several that have been pwned in well known cases, and several more where there's no public "We had a data breach" type report but it's clear that somebody did in fact lose all my data.

That's pretty disappointing, it suggests HIBP doesn't have very much of the breach data that really matters, and so many unsophisticated users who get to the site are probably misled.


Did you try some of those emails in the standard HIBP? There is a difference between a site being hacked and your details being posted publicly, and a company you signed up with selling your information. The former is on HIBP; the latter is not.


Thanks!


Troy: you might want to use the standard emails (eg, admin@ rather than dns-admin@) for domain ownership - I'd worry that dns-admin@ might be available to users at some domains.

Also, maybe apply for a .well-known URI for the HIBP file-upload verification?

(It's Mike from CertSimple here BTW)


dns-admin@ is probably from the whois record of the domain you checked. The predefined list is only the last 4 addresses that are shown: security@, hostmaster@, postmaster@ and webmaster@.


Nope, the domain's whois doesn't use dns-admin. From the Baseline Requirements for domain control:

> 3.2.2.4.4 Constructed Email to Domain Contact

> i) sending an email to one or more addresses created by using 'admin', 'administrator', 'webmaster', 'hostmaster', or 'postmaster' as the local part
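As a small illustration of the quoted rule, the constructed addresses for a domain could be generated like this in Python. The local parts are the five from the quote; the function name is hypothetical.

```python
# Local parts listed in the quoted section 3.2.2.4.4.
CONSTRUCTED_LOCAL_PARTS = (
    "admin",
    "administrator",
    "webmaster",
    "hostmaster",
    "postmaster",
)


def constructed_addresses(domain):
    """Return the candidate verification addresses for a domain."""
    domain = domain.strip().lower().rstrip(".")
    return [f"{local}@{domain}" for local in CONSTRUCTED_LOCAL_PARTS]


print(constructed_addresses("Example.com"))
```

Note that dns-admin@ and security@ are not on this list.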


If your personal info is sold though there’s nothing really actionable you can do about it, the damage is done. Not like changing a password.


I mostly use myname+someservice@gmail.com, is it possible to search for those somehow, without listing them all?
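Not an answer to whether HIBP supports it, but for illustration, this is how a plus-tagged address can be reduced to its base mailbox in Python. It assumes the provider treats everything after the first `+` in the local part as a tag, which is provider-specific; `split_plus_tag` is a made-up helper.

```python
def split_plus_tag(address):
    """Split 'name+tag@host' into ('name@host', 'tag'); tag is '' if absent."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    base, _, tag = local.partition("+")
    return f"{base}@{domain}", tag


print(split_plus_tag("myname+someservice@gmail.com"))
# ('myname@gmail.com', 'someservice')
```

A leak would presumably contain the full tagged address, so an exact-match search on the base mailbox alone would not find it.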


I have found this site really useful for my addresses and had set them to be alerted. Just a couple of days ago my Skype account was 'hacked' (probably because I'd stupidly used an old password and not changed it). What concerns me is just how little companies care - Microsoft have been awful at responding to it and keep pushing me to an automated system, which isn't working for me to get recovered.


I have always been wary of entrusting Have I Been Pwned with my email address, fearing some nefarious use.

Am I being overly cautious?


Every entity you've ever emailed has your email address. As well as any entity that has hacked/compromised any one of those recipients.

Don't expect it to be even semi-secret.


I have seen similar comments here before that suggest all email addresses are "public information". I am not sure I understand or agree with this perspective.

What if the address was only used to receive mail, not to send it?


If you can receive email at an address, then the sender knows your address. Either you had a "blonde moment" or else I'm misunderstanding what you mean wrt sending vs receiving.


The parent comment referred to "every" recipient knowing one's email address and "anyone who has hacked those recipients". The idea seems to be that all email addresses become "public". But this seems to assume things about the user and how they use the email address.

What if a user has an email address that she does not use to send mail? She does not use this account to send mail. She does not need to send reply mail. She has other email accounts she can use if she needs to send mail. If she only shares this address with one sender, then is this not at least a "semi-private" email address?

Back in the early 1990s when email accounts started to become easy to obtain, I routinely had accounts where I only received mail. I only gave these addresses to one or a few senders. I did not use these accounts to send mail.

I do not use email as avidly as I did back then, and maybe things have changed, but I would be a little surprised if today when a user sets up a new email account she starts immediately getting email [1] because "all email addresses are public", before she has given this address to anyone. The address is "semi-private" unless and until the user decides to disclose it widely. She may choose not to do that.

[1] Except maybe a welcome email from an email provider.


I’d say for your email that yes, you’re being overly cautious. For Troy’s recently launched https://haveibeenpwned.com/Passwords I probably wouldn’t use that myself, but he makes a reasonable case for why it’s secure.


There's a 100% safe API for password search where you SHA-1 your password and send only the first 5 characters of the hash.

Even if you don't trust that form to behave as promised, you can do the query yourself.

https://haveibeenpwned.com/API/v2#SearchingPwnedPasswordsByR...
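For anyone who wants to do the query themselves, here is a minimal Python sketch of the range lookup described above. The endpoint URL `https://api.pwnedpasswords.com/range/<prefix>` and the `SUFFIX:COUNT` response line format are my understanding of the API, so treat both as assumptions and check the linked docs; the helper names and the User-Agent string are made up for illustration.

```python
import hashlib
import urllib.request

# Assumed endpoint for the range search; verify against the API docs.
RANGE_URL = "https://api.pwnedpasswords.com/range/{prefix}"


def split_hash(password):
    """Return (prefix, suffix) of the upper-case SHA-1 hex digest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_response(suffix, body):
    """Look for our suffix in a response body of 'SUFFIX:COUNT' lines."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def pwned_count(password):
    """Query the service; only the 5-character prefix leaves the machine."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL.format(prefix=prefix),
        headers={"User-Agent": "hibp-range-sketch"},  # hypothetical UA string
    )
    with urllib.request.urlopen(request) as resp:
        body = resp.read().decode("utf-8")
    return count_in_response(suffix, body)
```

Calling `pwned_count("password")` sends only `5BAA6` over the wire; the comparison against the returned suffixes happens locally.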


I consider the security of the 5 char API acceptable, but it still leaks 20 bits of information about the password to HIBP, so it's certainly not 100% safe.
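The 20-bit figure follows from the prefix length: each hex character of the SHA-1 digest encodes 4 bits. A quick check in Python (the variable names are just for illustration):

```python
PREFIX_HEX_CHARS = 5   # characters of the SHA-1 hex digest sent to the API
BITS_PER_HEX_CHAR = 4  # one hex digit encodes 4 bits
SHA1_BITS = 160        # total length of a SHA-1 digest

leaked_bits = PREFIX_HEX_CHARS * BITS_PER_HEX_CHAR
possible_prefixes = 16 ** PREFIX_HEX_CHARS
hidden_bits = SHA1_BITS - leaked_bits

print(leaked_bits, possible_prefixes, hidden_bits)  # 20 1048576 140
```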


20 bits of the password hash. I can't think of a way to use that information maliciously, can you?


It depends on whether the hash is actually in the database or not. If it is, then one of the hashes returned corresponds to your password. But the hashing is done by HIBP itself, so a hypothetical evil Troy could determine the actual values of those passwords. If he determined who you are, perhaps by correlating requests with email submissions on the main HIBP site, he could then try to access your account on another site with each of those passwords, in the hope that you reused the same password on multiple sites. The docs say:

> On average, a range search returns 478 hash suffixes

which is low enough that one could potentially try them all in a reasonable amount of time, even taking rate limiting into account.

...However, the leaks that go into the database typically contain username/password pairs, not just passwords. So if your password is in the database because your account was pwned (as opposed to the account of someone who happened to pick the same password as you), and the username is reasonably identifiable, anyone who downloaded the original leak could do the same thing, except knowing exactly which password to try rather than having to go through 478 of them!

And of course, the whole point of the password lookup is to inform you that your password is compromised and you need to stop using it. If you’re diligent, evil-Troy would only have a rather limited window to attack you before you changed your password on the relevant sites following a positive result. That is, assuming the API is honest and returns all the hashes it knows… In theory it could hold some back.


I probably trust Troy with my passwords more than I trust myself with them.


That's how low trust in passwords has gotten nowadays, which brings up the question: why are we still using passwords?


What is a good alternative?

A device of any kind isn't good enough (eg. at border crossings or when lost otherwise). Biometrics can be a substitute for a user name but never for a password.


I did write a Whitepaper and Demo: UX for Authenticated & Verified ERC20 Payments Using MetaMask and EthSigUtil.

This can be applied to identity and digital ownership of any property. This solution moves security to the edges; that is, the sole owner of the property holds the keys to sign away their rights to the data through digital signatures.

This removes the need for a central authority entirely.

https://steemit.com/ethereum/@emmonspired/whitepaper-and-dem...


Most sites are either high enough value that they should do the hard things, or they should outsource identification in some way. It is ridiculous that low-value sites use passwords, especially where cookies by themselves would suffice.


Outsourcing authentication is tricky. Some of the most likely sources are ones that many of us don't particularly trust.


Sure, outsourcing is tricky, but apparently so are passwords? One failure mode it doesn't have is "a bunch of passwords got pwned and my company is at fault".

I do agree that the current schemes such as OAuth2 have odious implications. I don't think that the design space is completely explored, however. By relying solely on emailed and browser-stored tokens with judicious lifetimes, a site could outsource to email providers in a pretty secure way.
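A rough Python sketch of what such an emailed token could look like: an HMAC-signed blob carrying the address and an expiry time. The token format, the `SECRET_KEY` handling and the function names are all assumptions for illustration, not an existing scheme.

```python
import base64
import hashlib
import hmac
import secrets
import time

SECRET_KEY = secrets.token_bytes(32)  # hypothetical per-site signing key


def _sign(payload):
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()


def issue_token(email, lifetime_seconds=900, now=None):
    """Build a token to embed in a login link emailed to the user."""
    issued = int(time.time() if now is None else now)
    payload = f"{email}|{issued + lifetime_seconds}".encode("utf-8")
    return base64.urlsafe_b64encode(_sign(payload) + payload).decode("ascii")


def verify_token(token, now=None):
    """Return the email if the signature is valid and not expired, else None."""
    try:
        raw = base64.urlsafe_b64decode(token.encode("ascii"))
    except ValueError:
        return None
    mac, payload = raw[:32], raw[32:]
    if not hmac.compare_digest(mac, _sign(payload)):
        return None
    try:
        email, expires = payload.decode("utf-8").rsplit("|", 1)
        expires = int(expires)
    except ValueError:
        return None
    current = time.time() if now is None else now
    return email if current < expires else None
```

The site stores no password at all in this sketch. One caveat: a stateless token like this can be replayed until it expires, so a real design would probably also track used tokens server-side.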


Yeah. It's an email address, not a credit card number. Don't give the site your password(s) of course, but I wouldn't consider an email address something that needs keeping safe and hidden.


Is leakedsource actually gone?

https://leakedsource.ru seems to be up and running - or is it different people?

I mean, I sort of hate both these and prefer to just download raw data dumps straight from their source. But maybe that is legally questionable?


For the sites monitoring for outside breaches of their user accounts, doesn't that indicate that they are likely keeping cleartext passwords so they can rehash them to match the hacked databases?


I'd presume they are checking passwords that leaked in plain text or with weak hashes, or potentially just matching user names and erring on the side of caution. I can't imagine most companies proactive enough to check leaked data would choose to store in plain text.


No, they only match against leaks where the plaintext passwords were available (either directly or due to weak hash algs).


The page makes my Firefox miserably crash… (v59.0.1 stable, linux)


One of the examples he showed was a breach of a site that used "vBulletin" as the password encryption. I looked up what the hell this was and found this: https://www.vbulletin.com/forum/forum/vbulletin-sales-and-fe...

It's basically MD5(MD5(password)+salt). It was unclear if the salt was global to the entire DB or different for every record.

Surprise, yet another case of the blind leading the blind in PHP-land.
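For reference, the construction as described above can be written out in a few lines of Python. This is only to illustrate why it is weak (two fast MD5 passes); whether the inner digest is hex-encoded before the salt is appended is my assumption, and `vb_style_hash` is a made-up name.

```python
import hashlib


def md5_hex(data):
    """Hex MD5 digest of a text string."""
    return hashlib.md5(data.encode("utf-8")).hexdigest()


def vb_style_hash(password, salt):
    """MD5(MD5(password) + salt), with the inner digest as a hex string."""
    return md5_hex(md5_hex(password) + salt)


# Illustration only: MD5 is far too fast for password storage.
print(vb_style_hash("hunter2", "abc"))
```

If the salt is global, identical passwords still produce identical hashes across the whole DB; a per-record salt at least prevents that.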


This may surprise some people, but doing stuff like MD5(MD5(password)+salt) was seen as a rather good way of storing passwords back in the day, and probably more secure than 99% of the password storage in 2005.


As I understand it, the problem is that many people use the same password on every forum. When they do this, an unscrupulous webmaster can use the hash to login as these members on other forums they post at. The double-hash in vB eliminates that potential issue and adds another layer of protection to forum members.

...


That's from 2005, and it's one of many forum software packages available at that time.

> There's a new release of PHP that adds several features, including a new sodium extension making PHP the first programming language to adopt modern cryptography in its standard library.

Surprise, yet another case of the blind leading the blind outside of PHP-land.

http://www.i-programmer.info/news/98-languages/11368-php-72-...


that thread was from 2005... 13 years ago... would you like to quote something a little more recent???


Sure. MtGox, a PHP site, got hacked just a handful of years ago because its creator decided to write his own encryption software.


got a source on that?



I think the author's exuberance and self-congratulatory sentiments seem misplaced. From the article:

>"HIBP is Becoming the "Go-To" Resource for Protecting Accounts"

No, "protecting accounts" is something that is done internally, at the source, by companies storing user data. HIBP is offering a service that lets companies do CYA (cover your ass). Companies can subsequently claim that they "notified" customers by telling them to check an external website. Responsibility is transferred to the customer and a third party.

>"What that means for the industry is "a rising tide lifting all boats"; it's becoming more legitimate for all those doing the right thing with the data."

No, doing the right thing with data is either protecting it or not storing it in the first place. HIBP is reactionary at best. The danger is that these companies with poor to no security practices continue to make no structural changes themselves. They simply use HIBP as a crutch to do the least amount of actual work to protect data. This is supported by the following statement:

>"Oftentimes, the first a company knows of a data breach is when I send them their data."

It's hard to read that sentence and believe the author's assertion that "the industry has cleaned a lot."


This kind of service should ultimately be provided by the government as part of consumer protection or police services.


Which government?


It is a great service but I really wish there was a good way of keeping it honest so we don't have to rely on trust so much. I wouldn't trust it with my passwords.



