There's a lot of data mining talent in academia, but there's also a lot of us in the "real world" who follow what they do closely and add our own special twists because we've been doing data analysis for decades rather than teaching about it for decades.
Why keep the data under wraps? I can't see the data being all that valuable to Yelp's competitors, unless somebody wants to make a niche out of having stale data about university towns.
It might not be valuable to competitors, but I wonder if this data set would be useful for someone trying to write fake reviews that get past their filtering mechanisms.
Even making it available to institutions outside the US would be a start. I'm actually doing some work on ratings in the UK at the moment, and this dataset would have been really useful to test some theories I've been developing.
I agree totally on making it available to anyone who is willing to sign up to the terms and conditions.
In particular, note this clause from Yelp's agreement:
"5. Restrictions
You agree that you will not, and will not encourage, assist, or enable others to:
A. display, perform, or distribute any of the Data, or use the Data to update or create Your own business listing information;
B. Use the data in any manner or for any purpose that may violate any law or regulation, or any right of any person including, but not limited to, intellectual property rights, rights of privacy and/or rights of personality, or which otherwise may be harmful (in Yelp's sole discretion) to Yelp, its providers, its suppliers, end users of this website, or Your end users;"