
I did a very non-scientific and possibly irrelevant test. Together with a few other people, I decided to "test" some smart assistants and phone+app combinations to see if any of the data they might collect leaks. We came up with some random but easily identifiable keywords that we'd normally never use in day-to-day conversation and consistently dropped them into fake conversations. I tested an Echo Dot with "Whirlpool washing machine". Others tested Google Home or the Android Facebook app.

Bottom line is that Amazon and Facebook somehow decided to show us exactly the random stuff we fed them. None of the stuff fed to Google, Microsoft, or Apple ever made it back out in a way obvious enough for us to notice.

It's not something to draw a solid conclusion from, and I'm sure the experiment had plenty of flaws, but the degree of suspicion it raised was way above the noise floor, and that was enough for me personally.




Are you certain that nobody searched for those keywords or entered them anywhere?

If I search for something, it immediately shows up in the Facebook feed of the person I live with. We aren’t even friends on Facebook.

However, we share an IP address via NAT, and I’m sure enough location data has leaked to correlate us.

I have yet to come across an example of this where there isn’t some correlating variable that explains it without the always-listening mic theory.


I'm as confident as I can be that the chosen keywords were never spoken in casual conversation (especially since none of us speak English around the house) or searched from that IP. Every mobile device in the household is permanently connected to my network via VPN. All Amazon usage in the household goes through 2 browsers on 2 different PCs. The Echo was used just as an experiment (it came as a freebie) and spent most of its time hearing me "randomly" drop the "Whirlpool washing machine" stuff into conversation. As far as I was told, everyone else controlled the experiment just as I did.

We even tried to make sure the terms were "plausible" given all the other data the companies may have had on us: age, social status, etc. We picked things where we're comfortably but not too obviously in the target audience (no "energy drink for student gamer" type thing).


All the devices being on the same network would be enough to correlate them.

I can’t tell from your description whether this was adequately controlled or not.


The idea was that different households picked different devices/services/companies and fed each a random but plausible keyword to see if it is actually covertly picked up and ends up coming back to us on some other channel later on.

I had an Echo that I installed in a spare room in the house and used for a short time exclusively to have these made-up conversations next to it, talking about a "Whirlpool washing machine" without ever using the Alexa hot-word. The keyword really couldn't leave that room except via the Echo. After a short time, to my surprise, I started seeing it on Amazon. I have no doubt that the Echo is (at least occasionally) listening and sending information without any indication that it does.

My friends tested their own stuff in their own households with their own keywords in much the same way that I did: Google Home, Apple HomePod/Siri, Facebook, Microsoft Cortana. The only 2 people who saw their keyword pop up again were me with Amazon and one other with Facebook.

I can't conclude that Google does not do this; maybe they just do it smarter. But I can certainly say I cannot under any circumstances give Amazon the benefit of the doubt.


Sounds good - I wouldn’t give Amazon the benefit of the doubt either.

So - are you absolutely certain that nobody in your household used the term ‘whirlpool’ in any text-based online interaction?

For example - is it possible that you emailed someone while you were coordinating these tests and you or they are Gmail users?


I am as certain as I can reasonably be. Most of the "coordination" was actually done over a picnic when we came up with the idea, and then tweaked over several other social meetings, but the actual keyword was picked in my head. There's only one Amazon account in the household, using a dedicated email address; the Echo was in an unused room; and the language of the household isn't English. I picked whirlpool because it doesn't sound like anything in my native language and I have no interest in the actual appliances. I'm sure that, quite literally, the only explanations for that word being "served" back to me on Amazon are the most incredible of coincidences or the Echo leaking it.



