There are also legal and privacy concerns. I've worked on a few research papers where exactly one researcher had access to the data, under a very strict NDA. Even that researcher did not get full access to the raw data: they could only run vetted code against it, plus work with some subsets for development.

This is because the datasets were subscriber logs from mobile operators. They are both highly privacy-sensitive and full of sensitive business knowledge. There is no way they will ever get published, even in some anonymized form.

Ultimately it always comes down to trust. You need to convince your peer reviewers that you have correctly done what you claim to have done. Of course, even when you publish datasets, you still need to convince the peer reviewers that you didn't fake the data.



