Oh, I see, I jumped the gun. The first result seemed close to the headline of the paper, but it's actually a bit different. The actual headline finding is from further down.
I'd still ignore it though. You see how the first three are under "pre-registered analyses" and the significant finding is under "exploratory analyses"? That's another way of saying their experiment failed to find anything interesting, so they combed the rest of the dataset to try and find something they could publish. Basically just classic p-hacking. Probably not malicious or intentionally deceiving, but p-hacking nonetheless.
If you throw your hands up and decide to check what's interesting on a 14x14 correlation matrix, you've just tested roughly 90 hypotheses (91 distinct pairs) without realizing it. If your significance threshold is 0.05, you should expect around 5 false positives in there already.
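To put numbers on that, here's a rough sketch in Python. It assumes independent tests, which the real variables won't be, so treat it as a ballpark figure rather than anything exact:

```python
from math import comb

n_vars = 14
alpha = 0.05

# Distinct off-diagonal entries in a 14x14 correlation matrix
n_pairs = comb(n_vars, 2)                 # 91 pairwise correlations

# Expected number of "significant" pairs under the null, assuming independence
expected_false_positives = n_pairs * alpha

print(n_pairs)                    # 91
print(expected_false_positives)   # ~4.55, i.e. roughly 5 spurious hits expected
```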
I already knew you'd ignore it - in your rush to dismiss it, you missed that the section of preregistered analyses did generate statistically significant results.
Exploratory analysis is not inherently p-hacking. Publishing exploratory analyses as such is the proper action. There's nothing in their methodology that suggests they analyzed a large number of possibilities and discarded the high p-values. (How would that even work in this case? Run through all possible conversation topics? All possible time divisions of speech?) The exploratory topics extend naturally from their preregistered hypotheses, from speech and income to speech relating to income and speech on calendar days when income is an issue.
Your cynical take on publishing negative results is unhelpful, as are the accusation of bad faith and the straw man.
> I already knew you'd ignore it - in your rush to dismiss it, you missed that the section of preregistered analyses did generate statistically significant results.
I'm open to being proven wrong, but I've reread this section a couple of times and I'm pretty sure all four tests in the primary pre-registered section and both tests in the secondary pre-registered section are non-significant.
(Looks like there are actually two studies here; I've only read the first.)
> Exploratory analysis is not inherently p-hacking. Publishing exploratory analyses as such is the proper action. There's nothing in their methodology that suggests they analyzed a large number of possibilities and discarded the high p-values.
Exploratory analysis is p-hacking. p-hacking is not (always) an evil, unprincipled scientist trying to push a story. It's usually a scientist without a lot of statistical knowledge trying to see what's interesting and then letting their personal biases confirm coincidences as they appear, because they want to find SOMETHING. You can publish these results, but you'd better be very clear that, you know, they're not very good. The authors are relatively conservative here, as they should be in the scheme of things, and that's good. But look at the article and the discussion it's generated. Clearly some people think it's a trustworthy "scientific" result.
> How would that even work in this case? Run through all possible conversation topics? All possible time divisions of speech?
You take your data, you generate a correlation matrix of everything you've got, and point your finger at the values that look high. Then you test those hot spots and find significance. Very easy. You're implying here that in order to p-hack you need to check out every possibility. Not true.
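Here's a minimal simulation of that workflow, using pure noise so every "hot spot" is a false positive by construction. The 14-variable, 200-subject setup is just illustrative, not the paper's actual data:

```python
from itertools import combinations

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_subjects, n_vars = 200, 14

# Pure noise: no real relationship exists between any pair of columns
data = rng.standard_normal((n_subjects, n_vars))

# "Explore" the full correlation matrix and keep whatever looks significant
hits = []
for i, j in combinations(range(n_vars), 2):
    r, p = pearsonr(data[:, i], data[:, j])
    if p < 0.05:
        hits.append((i, j, round(r, 3), round(p, 4)))

n_tests = n_vars * (n_vars - 1) // 2
print(f"{len(hits)} 'significant' correlations out of {n_tests} tests")
for hit in hits:
    print(hit)
```

Run it a few times with different seeds and you'll reliably fish out a handful of publishable-looking correlations from nothing.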
They already tested ~6 ideas. Assuming they're independent of each other (to be fair, they're not), the likelihood of finding something significant with a p-value below .05 purely by chance is already 1 - 0.95^6 ≈ 26%. That is to say, naively, if you run 4 of these studies, one of them is expected to get a positive result even if they're all bogus. If they're allowed to do exploratory tests, they're now getting the freedom to cherry-pick additional options. Remember, you have to count both the things they actually test AND the things they decided not to test after seeing the correlation matrix.
If they consider an additional 3 ideas (9 tests total), the chance of at least one spurious hit rises to about 37%. At 15 extra ideas (21 total) you're at about 66%, and at about 40 extra ideas (46 total) you have a 90% chance of finding something significant. If everyone's doing this, then your entire field is probably bogus.
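Those percentages are just the family-wise error rate under the independence assumption, 1 - (1 - 0.05)^k for k tests. A quick sketch to check the arithmetic (the helper function is mine, purely for illustration):

```python
def chance_of_false_positive(k, alpha=0.05):
    """Probability of at least one p < alpha result across k independent null tests."""
    return 1 - (1 - alpha) ** k

# 6 pre-registered ideas, then 3, 15, and 40 extra exploratory ideas on top
for k in (6, 9, 21, 46):
    print(k, round(chance_of_false_positive(k), 2))
# 6  -> 0.26
# 9  -> 0.37
# 21 -> 0.66
# 46 -> 0.91
```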
It is extremely easy to unintentionally cheat and write off 40 insignificant ideas by glancing at a correlation matrix. Pre-registering ideas is critical.
And I'm harping on correlation matrices because that's probably a lot of people's first idea, but there are plenty of other ways. That is the whole point of exploratory analysis.
You can see, for example, that one of their exploratory analyses is "what happens to the relationships when we group by income?" Well, that's another handful of ideas to test right there, I can guarantee you that much.
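As a toy illustration (the counts here are made up, not taken from the paper): splitting k hypotheses across g subgroups quietly multiplies the number of implicit tests to k * g:

```python
n_hypotheses = 6      # hypothetical number of ideas already on the table
n_income_groups = 3   # hypothetical split, e.g. low / middle / high income

implicit_tests = n_hypotheses * n_income_groups
chance_of_spurious_hit = 1 - 0.95 ** implicit_tests

print(implicit_tests)                     # 18
print(round(chance_of_spurious_hit, 2))   # ~0.60
```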
> The exploratory topics extend naturally from their preregistered hypotheses, from speech and income to speech relating to income and speech on calendar days when income is an issue.
That's not remarkable at all. That's just the kind of information that existed in their dataset.
> Your cynical take on publishing negative results is unhelpful, as are the accusation of bad faith and the straw man
Well, I did say "Probably not malicious or intentionally deceiving", so if anyone's pushing a bad-faith interpretation here, I'd say it's you.