
It looks like the blog author completely missed the point of the statistical significance discussion going on. Most first-tier journals in the social sciences have an acceptance rate of about 5%. At the margins, the difference between acceptance and rejection can come down to having one more statistically significant result in the table than the paper submitted right before or after yours.

The problem with a 0.048 versus a 0.052 is not a mathematical one but one of interpretation. Reviewers are conditioned to be very skeptical of non-significant results and to use “underpoweredness” as grounds for rejection. As a result, we get publication bias and p-hacking.
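
To put a rough number on how small that gap is (my own back-of-the-envelope sketch, assuming scipy is available, not anything from the article): the z-statistics behind p = 0.048 and p = 0.052 differ by only a few hundredths of a standard error, far less than ordinary sampling noise.

  # The two p-values straddle the 0.05 line, but the underlying
  # test statistics are nearly identical.
  from scipy.stats import norm

  p1, p2 = 0.048, 0.052
  z1 = norm.isf(p1 / 2)   # two-sided p-value -> |z| statistic
  z2 = norm.isf(p2 / 2)
  print(z1, z2)           # roughly 1.98 vs 1.94
  print(z1 - z2)          # ~0.03 standard errors apart -- pure noise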




You point out p-values can be troubling because there are all sorts of bad incentives that lead to p-hacking and publication bias. The author points out that even if those bad incentives didn't exist, p-values aren't all that useful to begin with. That's not "missing the point", it's just pointing out a different aspect of the situation.


I think you should reread the article, because that's exactly what the blog author says. The blog author, btw, is Andrew Gelman, not just some random guy on Medium; his blog is well worth reading. Fighting bad stats in science is kind of his hobby/life mission.


Yeah, he played a major part in starting the stats discussion that is going on. I doubt that he has missed its point.


There is a whole group of people in the social sciences pushing to abandon null hypothesis significance testing. For the reason you mentioned, I cannot believe that changing the test (or the threshold of the test) would solve this issue. Ask people to find something significant or to fit a model to data, tell them that's what matters, and they'll do it, either intentionally or unintentionally.

Machine learning has the same issue when the only thing that matters is hitting a certain level of accuracy given your model and data. This has been observed in Kaggle competitions over and over: ask a group of people to find the best fit, and they will, by learning your train, validation, and test datasets.
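
As a toy illustration of that last point (my own sketch, not from the thread): let enough purely random "models" compete on one fixed test set and keep the leaderboard winner, and the winning score looks well above chance even though every model is noise. That's test-set selection doing the "learning".

  import numpy as np

  rng = np.random.default_rng(0)
  n_test, n_models = 1000, 2000

  y_test = rng.integers(0, 2, size=n_test)              # true binary labels
  preds = rng.integers(0, 2, size=(n_models, n_test))   # 2000 coin-flip "models"

  accs = (preds == y_test).mean(axis=1)
  print("typical model:", accs.mean())       # ~0.50, as it should be
  print("leaderboard winner:", accs.max())   # ~0.54, by selection alone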

As mentioned, the problem is not the p-value or null hypothesis testing; the problem is journals that promoted the wrong incentive, and educators who were not aware of the consequences and propagated the wrong incentive (interpretation) to students.


> Most first-tier journals in the social sciences have an acceptance rate of about 5%.

Assuming the null hypothesis, that is precisely our expectation of finding something significant under the p < .05 rule. (That is, if every paper is trying to reject a true null hypothesis, we expect 5% of them to succeed and get published.)
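
A quick simulation of that parenthetical (my own sketch, assuming numpy/scipy): run many two-group studies where the null really is true, and about 5% of them clear p < .05 anyway.

  import numpy as np
  from scipy.stats import ttest_ind

  rng = np.random.default_rng(1)
  n_studies, n_per_group = 10_000, 30

  hits = 0
  for _ in range(n_studies):
      a = rng.normal(0, 1, n_per_group)   # both groups drawn from the
      b = rng.normal(0, 1, n_per_group)   # same distribution: null is true
      if ttest_ind(a, b).pvalue < 0.05:
          hits += 1

  print(hits / n_studies)   # ~0.05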



