Statistically speaking, 'a significant negative impact' doesn't mean what it means in everyday English. It says little about how large the difference is, only that there is enough data to show that there _is_ a difference. So when the abstract adds that the difference wasn't much to talk about, that means a lot more than you give it credit for.
Now, judging from their test setup, and also from the very low number of testers, I find it very hard to agree with what you seem to be taking away from this article. Static typing has more benefit in large code-bases, with multiple programmers, and for avoiding hard-to-find bugs related to dynamic typing. None of that seems to be well reflected in the setup they had.
Also, I don't think you should repeat the same comment throughout the HN thread; you are not replying to individuals, but to general readers.
In the first case, the use of a statically typed programming language had a significant negative impact
Considering the size of the sample, I'd guess the difference has to be rather large to be considered significant. Eyeballing the numbers, it seems to be around 25%. I'm a bit shocked, in fact, because in my own experience the difference is much larger, but this experiment controls for language and my experience doesn't.
Yes, I was able to read that in both your comments, as well as in the paper.
I suspect you are not understanding my point, or what I said about the meaning of the word "significance" as used in statistics.
edit:
Say you flip a loaded coin that is 50.1% likely to come up heads. Now you want to test whether it is loaded, so you flip it a certain number of times and count the outcomes. If the number of flips is too low, you won't be able to say the coin is 'significantly loaded': it might be either way, you simply don't have enough data. If you flip it enough times, you will eventually have enough data to show that it _is_ significantly loaded.
However, in vernacular English, you would still say that the difference isn't very significant. Who cares if it is 50% or 50.1%.
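To make the coin example concrete, here is a minimal sketch in Python (assuming scipy is available; the 50.1% bias and the flip counts are just made-up numbers from the thought experiment):

    # Minimal sketch: the same tiny bias (50.1% heads) is statistically
    # invisible with few flips, but highly "significant" with many.
    from scipy.stats import binomtest

    TRUE_P = 0.501  # the coin's actual bias - tiny in practical terms

    for n in (1_000, 10_000_000):
        heads = round(n * TRUE_P)            # the expected outcome, for illustration
        result = binomtest(heads, n, p=0.5)  # H0: the coin is fair
        print(f"flips={n:>10,}  heads={heads:>9,}  p-value={result.pvalue:.3g}")

With 1,000 flips the p-value is near 1; with 10,000,000 it drops far below 0.05. Same coin both times - that's the gap between statistical and vernacular 'significance'.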
There is no reason to quarrel about the meaning of 'significance' in statistical testing vs. in natural language. Just look at figs. 4 and 5 in the paper, and see that the mean times spent on the scanner task were:

  dynamically typed: ~5 hours
  statically typed: ~8 hours

And that the difference is statistically significant (p=0.04, Mann-Whitney U-test). Whether 5 vs. 8 hours is significant in the natural-language sense, everyone can decide for themselves.
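For the curious, this is all a Mann-Whitney U-test is in practice. A minimal sketch in Python, with hypothetical completion times standing in for the paper's raw data (which isn't reproduced here; only the choice of test matches the paper):

    # Sketch of a Mann-Whitney U-test on two groups of task times.
    # The hours below are made up for illustration.
    from scipy.stats import mannwhitneyu

    dynamic_hours = [4.2, 5.1, 4.8, 5.5, 4.9, 5.3, 4.6]  # hypothetical
    static_hours = [7.8, 8.4, 7.5, 8.9, 8.1, 7.9, 8.6]   # hypothetical

    stat, p = mannwhitneyu(dynamic_hours, static_hours, alternative="two-sided")
    print(f"U={stat}, p={p:.4f}")  # small p: the groups likely differ

It's a rank-based test, so it makes no normality assumption about the times - a sensible choice for small samples like this.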
Yes, but I'd wager that a 25% deviation is significant even when the sample is 49 students. The smaller the sample, the larger the deviation must be to come out significant, but 25% is quite a difference (the quick simulation below illustrates this).
Also, it's worth noticing that they controlled for language - they used the same language in two flavors - to isolate the effect of the type system. It's not a Lisp vs. C thing.
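Here's that quick simulation - a sketch with assumed numbers (an 8-hour baseline, a 2-hour spread, 49 students split roughly in half), checking how often a true 25% difference comes out significant:

    # Sketch: how often does a true 25% difference reach p < 0.05 with
    # ~49 subjects split into two groups? All numbers are assumptions.
    import numpy as np
    from scipy.stats import mannwhitneyu

    rng = np.random.default_rng(0)
    n_a, n_b = 25, 24            # 49 students, two groups
    mean_a, mean_b = 8.0, 6.0    # hours; 6 is 25% less than 8
    sd = 2.0                     # assumed spread of completion times

    trials, hits = 2_000, 0
    for _ in range(trials):
        a = rng.normal(mean_a, sd, n_a)
        b = rng.normal(mean_b, sd, n_b)
        _, p = mannwhitneyu(a, b, alternative="two-sided")
        hits += p < 0.05

    print(f"significant in {hits / trials:.0%} of simulated experiments")

Under these assumptions the difference is detected in the large majority of runs; shrink the sample or the gap and that fraction falls, which is exactly the trade-off in question.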