I am selling a database with ten billion phone numbers. It's a 1.25 GB file with each number compressed to a single bit. You can compare the Clubhouse database against mine to determine which numbers are not in their set.
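For what it's worth, the 1.25 GB figure checks out: ten billion possible 10-digit numbers at one bit each is 10^10 / 8 bytes. A minimal sketch of such a bitmap, assuming each number is a 10-digit integer used directly as a bit index (the `PhoneBitmap` class and its method names are made up for illustration):

```python
def bitmap_size_bytes(digits: int = 10) -> int:
    """Bytes needed for one bit per possible `digits`-digit number."""
    return (10 ** digits + 7) // 8


class PhoneBitmap:
    """Bit i is set when the number i is in the set (hypothetical helper)."""

    def __init__(self, digits: int) -> None:
        self.capacity = 10 ** digits
        self.bits = bytearray(bitmap_size_bytes(digits))

    def add(self, number: int) -> None:
        self.bits[number >> 3] |= 1 << (number & 7)

    def __contains__(self, number: int) -> bool:
        return bool(self.bits[number >> 3] & (1 << (number & 7)))

    def missing_from(self, other: "PhoneBitmap") -> list[int]:
        """Numbers set here but not set in `other`."""
        return [n for n in range(self.capacity) if n in self and n not in other]


print(bitmap_size_bytes(10))  # 1250000000 bytes, i.e. 1.25 GB

# Tiny 3-digit demo so it runs instantly: "mine" holds every number.
mine = PhoneBitmap(digits=3)
for n in range(mine.capacity):
    mine.add(n)
theirs = PhoneBitmap(digits=3)
for n in (5, 42, 999):
    theirs.add(n)
not_in_theirs = mine.missing_from(theirs)
print(len(not_in_theirs))  # 997
```

With every bit set, "mine" carries no information at all, which is the joke: the comparison just returns the complement of their set.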
Knowing which numbers are capable of receiving SMS and which aren't has some value.
Especially in a world of number portability where you can't just say "oh, that's an old number, it must be POTS".
But I guess, here, if a number is from your contact list, it may still be POTS.
But at least you have higher assurance that it's an active user. If you wardial one day, you quickly find out how many numbers never lead to a human for various reasons. In theory, some of these are trap numbers and quickly flag the caller as suspicious, but I doubt it.
"Knowing which numbers are capable of receiving SMS and which aren't has some value."
This isn't difficult - I wrote a shell script named "lookup" that will give me background info for any phone number I feed it and tell me what kind of number it is, what carrier it is, who it belongs to, etc.:
The Local Routing Number provides this value in the USA, and multiple carriers (e.g. Twilio) offer daily deactivation reports from the cellular carriers so you can tell which numbers are unroutable.
Great. It’s the weekend and I can theoretically now stop thinking about software, and yet here I am thinking of ways to efficiently compress lists of phone numbers.
The Kolmogorov complexity of the set of all phone numbers is pretty low. All phone numbers with a few missing is also pretty low.
In fact, I now wonder if you can even compress the 3.8b phone number set to less than 1 bit per phone number. It should be pretty doable since a significant chunk of the number space is not valid.
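A rough way to put a number on this is the binary entropy bound. The sketch below assumes, purely for illustration, that the 3.8 billion numbers sit in a 10-digit space of 10^10 slots and are scattered uniformly at random. Real numbering plans are not like that (lengths vary by country and whole ranges are invalid), and that structure is exactly what a smarter encoding would exploit.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a yes/no event that happens with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


SPACE = 10 ** 10           # assumed number of possible slots
MEMBERS = 3_800_000_000    # claimed size of the set

p = MEMBERS / SPACE
bits_per_slot = binary_entropy(p)           # cost per possible number
bits_per_member = bits_per_slot / p         # cost per number actually in the set
min_size_gb = bits_per_slot * SPACE / 8 / 1e9

print(f"{bits_per_slot:.3f} bits per slot")
print(f"{bits_per_member:.2f} bits per member")
print(f"{min_size_gb:.2f} GB lower bound under the uniform assumption")
```

Under the uniform assumption this lands at roughly 0.96 bits per slot (about 2.5 bits per number actually in the set), so the plain bitmap is already close to the bound. Getting well below it depends on how much of the space can be ruled out up front.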
I have even better: for every country, just cover all of their operators' prefixes and then the 99999-9999999 numbers in that range. Definitely the biggest dataset around, and bigger is always better, right?
The 3.8B figure is really meaningless in isolation. This is the problem of plenty: 10K numbers with a very specific profile might be a lot more valuable. The real worry would be the info on the relationships between the numbers (which number is connected to whom). This leak seems to have a count of relations rather than the actual connections.
Well, the Facebook data that was published everywhere earlier this year could hold some value when combined with this one: while the Facebook data is somewhat outdated, I'm pretty sure you'd get millions of people with relevant and up-to-date information.
According to the Tweet, the leaker provides a claimed data sample that is a list of phone numbers without any additional information.
A list of 3.8 billion phone numbers that simply exist is useless. The leak would only have value if the numbers were associated with some identifying information.
If it’s really only phone numbers, I wonder if it’s a leak or if someone brute-forced all possible phone numbers against a ClubHouse API that leaked information about whether or not the number existed in their database.
Because they encourage users to upload their contacts so they can connect them on the platform. At one point when it was invite-only these uploaded contacts were the only way to invite friends.
Last I heard, they had around 10M users. Since they employ what I would consider the dark pattern of heavily encouraging folks to upload their contact list, that comes out to an average of 380 contacts per user. Given the Clubhouse user base demographics, I find this at least plausible.
I'd say it's even more of a dark pattern than that. They didn't encourage me to "upload my contact list" but rather "give access to my contacts" (or something like that). Perhaps the difference is trivial in how it's coded, yet even though I've removed their access to my contacts, they still have my contacts. I think they should have to delete them whenever I remove their access, or not even upload them in the first place but just read them when necessary.
Also, some apps seem to do this with photos, asking for access. Does anyone know if these apps also upload all of one's photos once the user grants permission on iOS?
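The back-of-the-envelope math, spelled out. Both inputs are the figures quoted in this thread (the claimed 3.8B numbers and the roughly 10M users), not verified data:

```python
claimed_numbers = 3_800_000_000   # size of the alleged dump
reported_users = 10_000_000       # "around 10M users", as heard

avg_per_user = claimed_numbers / reported_users
print(avg_per_user)  # 380.0
```

This treats every number as unique to one uploader and every user as having uploaded. Contact lists overlap and not everyone shares theirs, so the average list among people who did upload would have to be larger than 380.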
> does anyone know if these apps also upload all of one's photos once the user grants permission on iOS
That would eat up a lot of bandwidth. I suspect someone would notice it. An app could extract a lot of information from the metadata though, assuming it had access (I'm not sure how permissions on iOS work currently). It could also potentially run facial recognition algorithms locally (not sure how well that would work in practice though).
I really like that point about the bandwidth and also about the metadata and facial recognition.
I guess I just wish we had more insight into what info companies take and how. Permissions on iOS and Android seem to be getting more granular and yet still seem quite broad to me.
I’m particularly fond of iOS’s new “selected photos only” setting, but apps really don’t support it well in general (so I chose not to use them anymore). Instagram used to support it decently well, but in a recent update they removed the “select more” button and my usage of Instagram has dropped dramatically since.
I mean, I like it in theory, however I find it can be really cumbersome. I don't see why they can't just have me open my "pick a photo" browser on iOS without needing access to the photos. Seems odd that choosing photos from the OS can't just be the default option.
When an app first requests access to photos, it’s one of the options listed in the system permissions dialog, so it’s virtually the default. The problem isn’t that, it’s that once you’ve picked the “selected photos only”, apps can choose to make it a pain to pick additional photos if they don’t add a UI element for it. Given that Instagram had it before and then removed it, I can only assume that the real reason is to try to coerce users into granting all access (nice try FB, but not going to happen for me!).
Oh wow I didn't know this. From what I see on iOS, IG still lets me Manage>Select more photos, whereas WhatsApp has a tiny "You've given WhatsApp access to only a select number of photos. Manage" at the top.
So now I've set all to Selected Photos and will just click manage and add extra photos when I need them. So much easier than I had thought, thank you!!
> From what I see on iOS, IG still lets me Manage>Select more photos
Weird! That option has been missing from mine for a few weeks now when doing a normal post. The Stories picker gives me the option to “Manage”, but nowhere can I find the option for normal posts as of the last app update. Would you mind sharing a screenshot? I’d love to see if our UIs are different in some way. My contact info is in my profile here if you prefer to share privately.
> I'm pretty sure that would qualify as the number being "made up".
Not necessarily. For example, if there’s other metadata included with a specific contact list entry, it would be valuable to have duplicate numbers, as that extra metadata could then potentially be leveraged.
they didn't "validate" anything, they just opened the csv. also i'd be interested in their take on the second column, that looks like clubhouse's scoring system (which they ran without telling anyone, likely for marketing purposes, according to this* article). if so, you can in fact tell which numbers are more significant than others.
Hmm, so the "highest" numbers would be publicly-knowable numbers anyway (because they are the numbers to dial and contact the government/customer service of a private company).
If this is only a list of numbers and their relative popularity, the best you can do is an accusation of adultery (and even then, you could say that you're "popular" because coworkers also store your number).
Enough phone numbers for half the population of the world? Cool story, bro.
I refer here to the aspiring salespeople, not the person reporting it. I suspect this list will be available for free on the dark web within a couple of months. Much as I like to collect interesting data this doesn't seem useful.
I wonder how feasible a business model it is to collect all the data from all leaks which make their way to the internet, massage the data a little bit, and sell it as a brand new "hack" of some popular service. You can probably do this a few times a year without a problem.
Why fake a new data leak at all? It's likely to be illegal either way. Depending on the quality of your work I suspect it would be easy to find buyers for aggregated and cross validated data sets on the black market.
For that matter, I have to assume that the shadier businesses silently make use of publicly available leaks. The data is just too valuable to ignore depending on your business model.
Clubhouse does the classic “share your contacts with us to find your friends here” thing, but it sounds like they just upload your entire list into their database instead of doing anything remotely privacy aware. I’m mostly curious how much else they uploaded with the numbers - is this name + number + email etc? And if this dump is just numbers, do Clubhouse have the rest somewhere else?
Are people really that stupid to give some mobile app company access to their contact list? On iPhone you have to explicitly give permission, I presume on Android as well. I find it hard to believe everyone is doing it.
Many apps will refuse to work if you don't allow access to your contacts, so people just give in and allow it.
Google is the biggest abuser in this area, just grabbing all your contacts and linking them to your Google account once you add any Google account (like Gmail or YouTube) to your Android device.
What's the point of that? You don't need to exchange phone numbers for Telegram, just the @username, and only one side needs to know the other's username.
And once you have a chat with someone, both can share their own contact directly in the chat with 2 clicks and add it with 2 clicks as well.
(which is still rather useless because there is no real benefit from adding someone as a contact. But I guess if you want to store the number then this is easy)
You're thinking like a technically enlightened person -- if not an engineer -- who prioritizes efficiency and control.
You're not thinking like a "normie" goal-oriented user, who doesn't care about understanding the system, and for whom the shortest path to achieving their goal generally passes through saying "sure, whatever" to any requests the app makes.
You put the @username (the @ is optional) in the search field, then click on the user and send a message. Once a message is sent the chat will obviously stay in the chat list. Same for public groups/channels.
Alternatively you can share/click a link with the format t.me/username to skip the search part.
Afaik WhatsApp (on Android at least) requires you to give access to your contacts. So roughly speaking a huge chunk, probably the majority, of smartphone users shared their contact list with at least one company, which strictly speaking might not even be legal in many cases.
After all, that's how WhatsApp populates its contact list: it looks at which users have each other's phone numbers. That way it doesn't need a user login and friend/contact requests, but in return you give up your privacy.
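To make the mechanism concrete, here is a minimal sketch of server-side contact matching in general. It is not WhatsApp's or Clubhouse's actual code; the function names and the 555 sample numbers are invented for illustration.

```python
def normalize(raw: str) -> str:
    """Keep digits only so differently formatted numbers compare equal."""
    return "".join(ch for ch in raw if ch.isdigit())


def discover_contacts(address_book: list[str], registered: set[str]) -> list[str]:
    """Return the address book entries that belong to registered users."""
    registered_norm = {normalize(n) for n in registered}
    return [n for n in address_book if normalize(n) in registered_norm]


registered_users = {"15550100001", "15550100002"}
uploaded_book = ["+1 (555) 010-0001", "+1 555 010 0099", "1-555-010-0002"]

matches = discover_contacts(uploaded_book, registered_users)
print(matches)  # ['+1 (555) 010-0001', '1-555-010-0002']
```

The privacy cost is visible in the signature: the service receives the whole `address_book`, including the entry that matched nobody, and nothing forces it to discard that afterwards.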
Everyone doesn't have to. If one person with your number gives up their contact list, they have yours. I'd guess about 10-12% of the populace would have to cooperate.
Actually now that I look into it again, it looks like since the middle of March of this year it's even possible to invite others without sharing your phonebook.
If the seller doesn’t get caught due to the purchasing methods and general routine OPSEC, then it's just another example of the Fed reliably monetizing everything, meaning there will always be a buyer and everyone should sell more.
That's what law enforcement does all the time: when there are illegal goods for sale, and a chance to catch the seller, they will go in, make the purchase and arrest the seller.
Sorry for the stupid question, but isn’t it illegal to buy illegal stuff? How do the police get away with that?
For instance, in Denmark it is technically illegal to buy stolen goods, even if you genuinely aren’t aware of them being stolen. I'm sure this applies to most countries.
LEOs often seem to be exempt when acting in an official capacity. I’m not sure what the restrictions are—do they need a court order in a situation like this?—but LEOs are definitely allowed to break laws and buy illegal wares.
Illegal is defined by law, and laws apply to a subset of people.
What do you think the police do with illegal substances? Not confiscate them because "owning" them is illegal? No, the police do not take ownership, the state does, and the laws do not apply to the state. There is nothing out there in the world that is illegal for everyone to handle. Not drugs, not nukes, not illegal media, etc. Someone has to have the right to handle it somehow.
This would not be a classic sting operation. The seller already committed the crime(s) by offering it.
Sting operations usually create the opportunity for someone to commit a crime by offering bait.
Let's play devil's advocate here and assume I am the dude selling the list.
I would ask for Monero and would not care if the FBI is the buyer. The most they can do is to watch exchanges where Monero is exchanged for dollars or other cryptocoins. Then do this a few times over and start buying goods with those, then sell the goods on Amazon/eBay for hard $$$. Keep the amounts small, and even at 50 cents on the dollar it is still worth it for one person.
I've wondered about the feasibility of using state run lotteries for laundering in a cash based criminal enterprise. The known odds of low cost/return scratch-offs and the need to only account for claimed winnings would make it tempting... if it wasn't so labor intensive.
I don't think it would be a good idea, given that you'd have to claim the winnings. It might work once or twice but not over and over again.
Additionally, in most cases I'd think the lottery return would be worse than what traditional laundering costs you (smurfing, going through crooked banks, using cash-based businesses like taxis, etc.), especially if you have to pay people to buy tickets.
> It might work once or twice but not over and over again.
Except for when it does: there are a bunch of people who have repeatedly jackpotted state lotteries, they're usually described as 'reclusive mathematicians'. But that isn't what I'm talking about. I just checked the TX Lottery Commission's site and it looks like scratchoffs would run, worst case, a 30% return. I can't be bothered to calculate the upper bounds, but I'd expect it to be 40%-ish. That seems good to me, I especially like that you can skip the part where you have to drive out to some hotel to meet an undercover Secret Service agent pretending to be a Wells Fargo employee responding to your help wanted notice in Soldier of Fortune.
I learned a long time ago that the most effective way to correct a vice is to play it against another vice, sloth being an easy go-to. But in this case... I'm not a drug dealer, so I don't need to launder large amounts of small bills. But... if I wanted to launder a bunch of public ledger based crypto: instead of using a loud and proud "bitcoin tumbler", I'd use something like satoshibet. Of course, that is likely why the original no longer exists - and I imagine anyone standing up a replacement (without a sufficiently invasive KYC implementation) would face similar hostility. Anyway, I expect that'll change when a state run satoshibet eventually emerges.
How realistic would it be to send (anonymous) mass SMS messages with phishing or other malicious links to those numbers? I’m occasionally getting SMS messages with bogus sender info (I cannot reply, nor get contact info), and I always wonder how spammers pull that off so easily.
As a challenge, I try to take these things down by reporting them to Google Safebrowsing, their SSL provider (if they have one), their host, their URL shortener, etc.
Though in Canada, I'm seeing them apply some cloaking measures so they don't get removed as quickly.
I think there are two streams of this:
1. a crooked telecom that has low-level access
2. buy a bunch of SIM cards and dump them into one of these aliexpress machines that has 16 wireless modems in them that let you do whatever you want:
I’ve been getting this since the FB hack (by “hack” I mean the recent bulk enumeration of 500m phone numbers that Facebook facilitated for an unknown party).
It's funny how the hacker who is selling stolen private data is also complaining about GDPR compliance and privacy. On the one hand, he's right that Clubhouse (if this is true) has done something bad, but the hacker is much worse.
They are done for this time. Leaking the numbers of people who haven't even signed up yet because of their economy flame approach for literally anything, oh boy...
If you have enough cash and time you can legally create your own list of all possible numbers in the world. Pick a number, dial and see if it exists. Hang up to prevent further charges.
> create your own list of all possible numbers in the world. Pick a number, dial and see if it exists.
Let’s say you had the ability to do that 1,000x a minute using an automated dialer. In the US alone that would take you over a year to complete, and how many of the numbers you verified would have changed active/disconnected status during that time?
(PS, I didn’t downvote you, just pointing out a problem with your theory)
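To put rough numbers on the dialer estimate: the sketch below uses the 1,000 calls per minute from the comment and two assumed sizes for the US number space. One is every 10-digit string; the other is the NANP-style NXX-NXX-XXXX shape, where area code and exchange each start with 2-9. That second count is still an upper bound, since it ignores reserved codes.

```python
MINUTES_PER_YEAR = 60 * 24 * 365.25


def years_to_dial(number_count: int, calls_per_minute: int) -> float:
    """Years of non-stop dialing needed to try every number once."""
    return number_count / calls_per_minute / MINUTES_PER_YEAR


all_ten_digit = 10 ** 10                      # every 10-digit string
nanp_shaped = (8 * 10 ** 2) ** 2 * 10 ** 4    # NXX-NXX-XXXX with N in 2-9

for label, count in (("all 10-digit", all_ten_digit), ("NANP-shaped", nanp_shaped)):
    print(f"{label}: {years_to_dial(count, 1000):.1f} years")
```

Either way it comes out to well over a decade of continuous dialing at that rate, so "over a year" is a generous understatement, and the staleness problem only gets worse.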