Brokers use ‘billions’ of data points to profile Americans (washingtonpost.com)
129 points by eplanit on May 27, 2014 | 46 comments



Quick google search found this document detailing the various groups and clusters that one of these brokers uses.

http://reference.mapinfo.com/software/anysite_segmentation/e... [pdf]

Which cluster do you fit in?


Slightly tangential, but I feel like a few people on HN would appreciate this Vienna Teng song, named for the company that wrote the document you posted.

http://www.youtube.com/watch?v=0mvrKfcOnDg&feature=kp


I would assume that document to be satirical if it weren't so thorough.


Oh man, search for "married sophisticate" and read the "day in the life of" section for the group.

I wanted to kick someone's ass so badly after reading that.


For your convenience, the section parent refers to (on page 22 of the aforementioned pdf) reads as follows:

---

Name of protagonist: “Maria”

Wakes up... grabs the remote and flips on alternative rock radio. She gets up, goes to the kitchen, grabs a bottle of water and goes to the third bedroom that doubles as the workout room. She runs on the treadmill for 30 minutes.

Spends the day... researching a big liability case for the law firm where she works as a paralegal. She calls her husband and reminds him that it’s his parents’ anniversary and that it is a good idea to at least offer to take them out to dinner that week or cook a dinner for them in their gourmet kitchen.

Talks about weekend plans to... go camping to get away from it all. The weather is supposed to be perfect for it, finally cooling down after that heat wave.

Has a meeting with... the group of attorneys that is working on the case to discuss the fact that they were not going to have any case based on precedents of two similar cases.

Spends the evening... driving to the athletic club and playing tennis with her group of friends. Meets her husband for a late dinner at Jay’s Pizza. She has soup and salad with a Corona Light beer. Later they spend time online shopping for a new TV, hopefully one of those nice flat screens with HDTV and a Blu-ray player.

Goes to bed at... 11:30 p.m. after a half hour of pilates, watching... Scrubs reruns

---


There's something really unsettling about those sections but I can't quite put my finger on it


Maybe I'm wrong, but I think it's the way her life is built around buzzwords, brands and hopeless consumerism.


It's not her life. It's her life through a marketer's lens, or to be more precise: it's the kind of life a marketer is trying to 'sell' to his clients :-)

e.g.

--

I woke up this morning at 7 am. I made breakfast using fruit bought from the local supermarket. My preference for fruit shows that if I had a chance to get fresh ones, I would probably pay more for them.

Then I started studying on my macbook air. Using my iPad now and then.

At 12:00 I went to the university near my house to eat, because the meals there are healthy and extremely cheap, even compared to fast food. I'm probably poor, but then again I order expensive food online every weekend, spending at once what I'd spend in a week eating at my university. That's when the university is closed and I have no other option, so it's probably a choice that could be 'channeled' somehow.

At 17:00 I went for a run wearing an iPhone, a pair of Nikes and an old shirt. I will probably need a new shirt soon.

I talked to my girlfriend via Skype and then continued studying until 21:00 on my computer. Then I started browsing computer-related websites and a couple of mailing lists. I'm interested in programming, politics and sports. I tweeted; I don't have a Facebook account. A quick look at my blog will show that online privacy matters to me.

Late at night I downloaded a book on my iPad and read for 45 minutes before finally going to sleep at 00:15.

--

... That's not my life :-)


I'm not sure, I got the same feeling from the other profiles.


That's because the vast majority of "western civilization" (and not only) revolves around marketing lies.

We're all living in Amerika.


Ist wunderbar!


Wow, that's incredibly interesting -- lots of info to digest. I had to download that and plan to parse some of it, there's actually a lot there.

Any idea where this data/info/rhetoric comes from, or what its main purpose or intended usage is?

Thanks for the link!



> “Consumers can’t manage this process by themselves,” Brill said. “It’s too big. It’s too complex. There are too many moving parts.”

As someone who works in ad-tech, for a company that uses some (anonymized) data from brokers: I'm constantly astonished by the complexity of the industry. It's certainly too complex for a consumer to figure out. I often have a hard time remembering who talks to who and which companies go where. Just take a look at this notorious (in ad-tech) infographic produced in 2010:

http://www.adexchanger.com/wp-content/uploads/2010/09/LUMA-D...

That's probably missing half the players and data flows, and it's also 4 years out of date. Things haven't got simpler in that time.


It certainly gives more examples of data brokers than the article.


You're going to throw about 200 brands at me and call it complex? Is this a joke? Have these people heard of computers?

It's a flowchart. It looks like you haven't even tried to create a list of players. If it's less complex than SQL double-hop then it's not complex.


I didn't create the chart. The point I was trying to make is that each of these companies is a player in the industry and either has tracking data of its own or is connected (directly or indirectly) to various data brokers, or both. The chart greatly simplifies reality: if it represented all the interconnections between individual companies it would be vastly more complex. It can't do that because most of these connections are known only to the companies involved.


I don't understand the downvotes.

Anyone of average intelligence will have a pretty good understanding of the process after reading a 2-page summary[1]. The infographic the "grandfather" posted is designed as an inside joke. Yeah, there are a lot of companies you've never heard of in the market, but all of them perform 5 or so different services. The market is very young, thus very fragmented, and everyone comes up with their own nomenclature to differentiate themselves.

[1] Econsultancy and Adobe have good whitepapers on the subject.


“The extent of consumer profiling today means that data brokers often know as much – or even more – about us than our family and friends, including our online and in-store purchases, our political and religious affiliations, our income and socioeconomic status, and more,”

And this is where the NSA/FBI will go to collect data in the future. If, miracle of miracles, we manage to get the government out of our daily lives, they'll just write NSLs to these guys and in one swoop pick up tons of information from dozens of sources -- including data about people we associate with, since no warrant will be required.

These guys were doing "interesting" things when I last looked five years ago. Who knows how bad it's gotten since then.

ADD: I've thought this thing through several times, and the only way forward I see is either outlawing any collection and aggregation of private data outside of that directly needed to provide service, not ads -- or making all information collected from any source publicly available to all. I don't see the middle ground. (Sadly)


I'm more worried about the people who don't need NSLs to get access to this data: banks, credit card companies, employers, etc. There's a huge potential for abuse that's more likely to affect your average person who isn't a political dissident.


It's good that we can worry about different things, but you're kind of making my point for me -- if the information is available commercially, all you have to do is 1) have a business relationship with the broker, and 2) punch in a credit card number to get it, no matter who you are.

What we're seeing is that the brokers aren't very keen on having everybody and their brother coming by to check out folks, so it's a quasi secret type of information. That's the worst of both worlds.

By the way, I keep hearing that phone companies have already set up e-com sites for Law Enforcement to retrieve phone tracking records (which do not require a warrant). Punch in your department's credit card number, type in the guy's phone number, and get a record of everywhere he's been. Pretty cool stuff if it's true, but, as I note, folks who do this stuff aren't very keen on letting the rest of us in on the action.


I'm not disagreeing with you on the meta-point. I'm just saying, even if you're not afraid of the government you have reason to be afraid. As an aside, I think HN gets this message backwards. I think it strikes a stronger chord with people to talk about what their boss could do with this data than what LE could do.


Don't know, don't care.

I find debating whether the ship struck the iceberg on the left-hand side or the right-hand side a bit of a distraction from the effort underway to board the lifeboats.


It's relevant to convincing passengers the ship has a hole in it in the first place.


But it's not an either-or choice. In fact, the more approaches, the better. :)


Don't worry, it's already happening in big ways.

http://krebsonsecurity.com/2014/05/experian-breach-tied-to-n...


> I see is either outlawing any collection and aggregation of private data outside of that directly needed to provide service, not ads

That's close to what Germany does, if I understand their data privacy laws correctly.

On the other hand I can imagine how it would go in the libertarian-style U.S.: "If statistical data analysis is outlawed then only outlaws will be able to crunch numbers!"

Er, they'll probably think up a better catchphrase. But I think you see the point, it's hard to enforce laws which are inherently unenforceable without massive intrusion.


I think what you mean is "If selling private information is outlawed, then only outlaws will sell private information."

Since I don't need private information about you to defend my life, I'm OK with that.


I think there is a case to be made that a posture of restricting the use of this sort of data can be worse than encouraging a free for all. At least with a free-for-all kids can build interesting things no one would think of. The status quo essentially enforces that it can only be used by the powerful.


I agree.

In a free-for-all, people would be aware of what was being collected and they would take action. In what we have now, the fact that the information exists, its extent, and what happens to it remain secret. I say get it out there.

In fact, it's wrong that a person can sit in a far-away city and spend ten bucks getting all the information they ever wanted to know about my neighbor because he shared it freely online, yet that same type of public knowledge is not available to the rest of the neighborhood for free. Freely-shared information has always been assumed to be local in nature -- you share it with somebody you are in proximity to. This is a good thing: keeps you from over-sharing.


It's almost like we're trying to ignore the Streisand effect when it comes to our personal communications.

It's not a perfect parallel, but you'd think internet-savvy people should understand how hard it is to keep the cat in the damn bag!


> It's almost like we're trying to ignore the Streisand effect when it comes to our personal communications.

There's a difference between trying to suppress published information and trying to prevent the collection of that data in the first place. The Streisand effect applies to the first of those.


Right, I guess I'm basically inferring a corollary to the Streisand effect. The Streisand effect is that once information is out, it cannot be suppressed; I'm going for something more like, the information will eventually get out, it cannot be forever suppressed.


“You’d think if there was a real problem, they’d be able to talk about something other than potential.”

When the lobbyist starts talking in these terms I know they are worried. These sorts of 'prove we abuse it' challenges are common in the intelligence community.

That said though, what exactly would you want to be true here?


You know what I would love to see as a solution to dealing with this threat:

A clearinghouse where everyone could register to receive every report from every company collecting information on you, then a regulation that requires all companies collecting and brokering information about you to send you all the information they have on you, keep you updated on changes, and provide a report profiling the customers that bought that info on you.

I doubt this market is going away, so the least we could do is require this industry to keep us informed of the information on us and how it's used.


I don't see exactly what the threat here is. How am I being harmed by someone choosing to show me an ad for something relevant to my interests rather than something that's not?

I'm not saying that I'm not being harmed, just that I don't see it. Are companies quoting me higher prices than if I was in the lowest-income bucket or hiding lower-cost alternatives?


Evidon [1] is what you are looking for. It's not exactly the way you describe it; more like an alpha stage of what you talk about. There are little boxes in the corner of website banners (served from the more reputable companies) which lead you to a page telling you why you were served that ad and giving you a list of companies who have data on you. You can go through them and choose to opt out. They have a timeframe to delete that data. Currently everything is entirely voluntary. Google has its own variation of this called "ad choices"; you can find it on YouTube and other AdSense properties.

[1] https://www.ghosteryenterprise.com/

PS: They are a commercial entity and develop the popular Ghostery plugin, which does the same, but proactively.


I recall reading somewhere that the leading data point predicting car accidents was the driver's credit report, but this was decided to be private information in that case so car insurance companies were forbidden from using it. Where do you draw the line for this sort of information? It seems that PII (or Personally Qualifying Information in this case) is quite easy to find if there is someone trying hard enough to find it.


In the US it's common to use credit report information to price insurance, definitely not forbidden in any way. I'm not sure about the rest of the world. See

(1) https://www.statefarm.com/about-us/company-overview/company-...

(2) http://www.allstate.com/about/credit.aspx


Credit score isn't allowed in pricing of insurance for a lot of states, including California. It depends state to state though.


I think I'm more worried about erroneous correlations from collected personal data than the leak of the data in the first place.


A better way to say that would be: erroneous causation assigned from mass correlation of personal data.

The automation to rapidly sift data and identify correlations is coming along nicely, but I fear that the capability outstrips the ability or desire to assess the correlations and figure out if they actually point to any sort of causality.


I once had a peek at a slide deck from another marketing services company, with very imaginative cluster names.

If you're in Australia you should wonder if you'd be filed under "Bogan Dreams". Or "Guns and Trucks" if you're in the US.


The different segments mentioned in the article reminded me a LOT of what Cozzano's campaign managers in Neal Stephenson's Interface [1] did when segmenting the population to target the campaign messages.

As a consumer, it's a bit terrifying how little control I have over this data. It sounds like they have deeply personal details on people who have absolutely no idea what has been collected and sold to marketers.

[1] http://en.wikipedia.org/wiki/Interface_(novel)


This is no different than what the President's Organizing for America did:

The tool kit was custom-built for the 2012 Obama re-election campaign. It digitally linked data on millions of American voters, including their email addresses, through Dashboard as well as through social-media sites such as Facebook and Twitter, to an army of staff and volunteers knocking on doors in the key swing states.

http://www.theguardian.com/world/2012/may/14/obama-digital-c...

http://www.theguardian.com/world/2012/feb/17/obama-digital-d...


> the system of commercial surveillance that draws on government records, shopping habits and social media postings

Also, your cable television viewing habits. Probably your Internet viewing habits (as seen by your ISP) and telephone habits as well. "Consent" in those cases is called "terms of service" in common parlance.



