Yelp University Dataset (yelp.com)
48 points by adi92 on Sept 23, 2011 | 8 comments



Why not make it available to everybody?

There's a lot of data mining talent in academia, but there's also a lot of us in the "real world" who follow what they do closely and add our own special twists because we've been doing data analysis for decades rather than teaching about it for decades.

Why keep the data under wraps? I can't see the data being all that valuable to Yelp's competitors, unless somebody wants to make a niche out of having stale data about university towns.


It might not be valuable to competitors, but I wonder if this data set would be useful for someone trying to write fake reviews that get past their filtering mechanisms.


Even making it available to institutions outside the US would be a start. I'm actually doing some work on ratings in the UK at the moment, and this dataset would have been really useful to test some theories I've been developing.

I agree totally on making it available to anyone who is willing to sign up to the terms and conditions.

I wonder why they didn't use Kaggle?


If I had to guess, it's because of the same guys who screwed over Netflix, cost them $10M, then posted the smarmy bullshit below:

http://33bits.org/2010/03/15/open-letter-to-netflix/

In particular, note this clause from Yelp's agreement:

"5. Restrictions

You agree that you will not, and will not encourage, assist, or enable others to:

A. display, perform, or distribute any of the Data, or use the Data to update or create Your own business listing information;

B. Use the data in any manner or for any purpose that may violate any law or regulation, or any right of any person including, but not limited to, intellectual property rights, rights of privacy and/or rights of personality, or which otherwise may be harmful (in Yelp's sole discretion) to Yelp, its providers, its suppliers, end users of this website, or Your end users;"

http://www.yelp.com/html/pdf/Dataset_Agreement.pdf

That said, reading the dataset agreement, I'm not sure I'd care to participate even if I could.

edited for formatting



I think the dataset you mentioned only contains the Yelp social graph, i.e., it doesn't have the reviews and stuff.


Doesn't look like it.


> you'll need to be associated with an academic institution to qualify for access



