Facebook Sues Data Geek, but That Doesn't Solve Its Privacy Problem (fastcompany.com)
23 points by cookiecaper on April 3, 2010 | 20 comments


The 'innocent data compiler' is actually a HN semi-regular. *waves hello*

I've got lots of thoughts on all this, obviously, but I'm trying to collect them all into a considered blog post. I'm happy to answer any questions I can though.

And if you're interested in related code, you can check out my Google Profile crawler over at GitHub:

http://github.com/petewarden/buzzprofilecrawl


"... Facebook doesn't look totally evil though. According to Warden: 'From my conversations with technical folks at Facebook, there seems to be a real commitment to figuring out safeguards around the widespread availability of this data.' ..."

Hi Pete, I've been following your threads [0] with interest, seeing how the social graph can be interpreted in code. The above quote in particular stood out though. I'm not sure the "no evil" tag applies when anyone can access the official Fb API and developer program and we are then expected to believe good will come out of it. [1], [2]

  “That pulls the rug out from a whole policy &
   technology perspective that the point is to give
   you control over your information - because you
   don’t have control over your information.”

   Hal Abelson

It appears that while Fb is trying to tighten up public leaked information, at the same time they allow access to the social graph API, which is just as damaging to individuals, and potentially more so.

[0] http://news.ycombinator.com/item?id=1199821 & http://news.ycombinator.com/item?id=1106859

[1] PJF, "Dark Stalking on Facebook", http://pjf.id.au/blog/?position=590

[2] http://www.boston.com/bostonglobe/ideas/articles/2009/09/20/...


TOS != legal right to block access to public data. Especially when it's crawled, where one is not expected to read the TOS. It's equivalent to having a document posted in the bathroom of a restaurant that says that by walking into the building you forfeit your right to see.

Too bad it didn't go to court; he could've countersued then. There could be definite personal damages because of all this, and because FB is flexing legal might it doesn't have.

And I cry bullshit on FB trying to protect privacy. Why then is everything public by default when a new feature comes out? At every step of the embiggening process, FB has royally screwed over their users' privacy.


"And I cry bullshit on FB trying to protect privacy."

Absolutely. That's why I rarely use it; I've read their privacy policy (especially the privacy policy concerning facebook applications) and realized that if I wanted something to be just between me and my friends, facebook is not the place to do that.


I too am dubious of Facebook and avoid posting anything especially private there.

But being on Facebook, I also know that most of my friends are much less careful about this and I am happy that Facebook makes some effort to protect their rather foolish trust. It would be better for them to protect themselves but still...


By picking something interesting, which anyone can still do (and far more invasive uses as well), and making a big deal of it? This is a publicity stunt, more likely, because his work spread quickly, and normal people started to notice that this said things about them.


"TOS != legal right to block access to public data."

You seem to be arguing that Facebook should only be able to choose between "everything goes" or "hidden behind a contractual login wall".

However in the U.S. the Computer Fraud and Abuse Act requires computers to be used only with the owner's authority. The key is that authority can be conferred by means other than a formal contract, and instant monetary damages apply only to access control violations.

This is good! It lets you use public services without a barrier to entry or legal risk. If the provider does not like it, they have to suck it up until they find out who you are. Then they can order you to stop, and if you stop the matter is over.


Yes, because making him delete the data definitely makes the problem go away.


Well it might make the problem go away if the problem is well-intentioned research whose results are made available publicly.


I don't have the complete answer to this problem, but no discussion of Facebook opening up their data set is complete without mentioning that there has been a lot of work done on uniquely fingerprinting people from stunningly small amounts of data. In Facebook's position, I would in all seriousness say that I see no reliable way for Facebook to release this data in any form with reasonable certainty (by legal standards) that the data will not be used in a privacy-infringing manner. I can imagine some ways, but I sure wouldn't be willing to guarantee any of them. It is possible, and in some sense perhaps even likely, that you cannot have a data set that is both nontrivially useful and privacy-respecting. Information theory is a harsh mistress.
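
To make the fingerprinting point concrete, here is a minimal sketch, not anyone's actual method. It computes how many bits are needed to single out one person in a population, and what fraction of records in a data set become unique once a few innocuous fields are combined. The field names (`zip`, `birth_year`, `gender`) and the five toy records are assumptions invented purely for illustration.

```python
import math
from collections import Counter


def bits_to_identify(population):
    """Bits of information needed to single out one member of a population."""
    return math.log2(population)


def unique_fraction(records, fields):
    """Fraction of records whose combination of `fields` appears exactly once."""
    keys = [tuple(record[f] for f in fields) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(records)


# Hypothetical toy records; no real data is used here.
records = [
    {"zip": "02139", "birth_year": 1980, "gender": "F"},
    {"zip": "02139", "birth_year": 1980, "gender": "M"},
    {"zip": "02139", "birth_year": 1975, "gender": "F"},
    {"zip": "94103", "birth_year": 1980, "gender": "F"},
    {"zip": "94103", "birth_year": 1980, "gender": "F"},
]

# Each extra field can only split groups further, so uniqueness never drops.
by_zip = unique_fraction(records, ["zip"])
by_zip_year = unique_fraction(records, ["zip", "birth_year"])
by_all = unique_fraction(records, ["zip", "birth_year", "gender"])
```

On this toy set nobody is unique by `zip` alone, but three of the five records are unique once all three fields are combined. A world population of a few billion needs only about 33 bits to pick out one person, which is why a handful of low-entropy fields can be enough.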


If it is from public profiles though, it doesn't matter; you could already break privacy just by visiting a person's public profile.


A sufficiently large convenient aggregation of otherwise not-easily-obtainable public data becomes a privacy hazard. (I choose the word "hazard" with care.) Knowing that theoretically one could go find all fans of $PERSON_OF_INTEREST with enough work is one thing, being able to type one query into your data set and get the answer back in two seconds is another.
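
As a rough illustration of why aggregation changes things, the sketch below builds an inverted index from a pile of per-user records. The user and page names are hypothetical placeholders, and this is not the crawler discussed in the thread; it only shows that once the data sits in one place, "find all fans of X" becomes a single lookup instead of a visit to every profile.

```python
from collections import defaultdict


def build_fan_index(profiles):
    """Invert {user: [pages]} into {page: {users}} so fan lookups are one step."""
    index = defaultdict(set)
    for user, pages in profiles.items():
        for page in pages:
            index[page].add(user)
    return index


# Hypothetical aggregated data: each entry alone is just one public profile.
profiles = {
    "user_a": ["page_x", "page_y"],
    "user_b": ["page_y"],
    "user_c": ["page_x", "page_z"],
}

index = build_fan_index(profiles)

# One dictionary lookup now answers a question that would otherwise
# require checking every profile individually.
fans_of_x = sorted(index["page_x"])
```

Building the index costs one pass over the data; every later query is then effectively instant, which is the "two seconds" half of the comparison above.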


Meta: can we change the title? It is different than the title of the actual article, and "does evil" is a pretty subjective term.


Am sad to have seen this fulfilled.

Original headline was "Facebook does evil to innocent data compiler".


Whenever I do crawling, I do it from AWS and I set the User-Agent to Google's.

Also, if you're up against someone that threatens to sue, the latency introduced by Tor might not be too harmful.


If you're trying to publish your data in a scholarly work, you aren't going to be able to conceal your identity or the origin of your data...


And after you publish your work, Facebook can't threaten to sue you if you don't delete your data. They just have to do it, which they probably won't. (And if they do, the world benefited from your work already, so the damage they can inflict is minimal.)

The idea is to keep Facebook from knowing what's going on until the last possible moment, so they can't interrupt you in the middle of something. Once you've published your paper, then they can know.


unless your scholarly work involves AGW ;) ... just joking


Where does it say they actually sued him? It just says they threatened to, and he complied with their requests.


It's possible that Facebook shut this guy down because they don't want competition when selling their data. The notion that this information is publicly and legally accessible may take away a revenue stream from them.



