"If the population size is 140 then a 'sample' of 140 gives you a complete description."

That would not be what we think of as a sample, but rather a census.

"If the population size is 1.4e10 then a non-random sample of 140 is going to give very low confidence."

If the sample is non-random then the conclusions are unreliable, regardless of the population size.

I don't understand most of the rest of your comment, except as angry sputtering designed to avoid having to correct yourself. As for my playing 20 questions: since I cannot imagine how, in general, the population size could be relevant to statistical conclusions drawn from a random sample, I was wondering whether those who imply the opposite could explain it to me. It seems obvious that if you want to measure the salinity of the ocean, you can scoop up a cup of water from it and analyse that. You don't have to use a different size cup for different size oceans, or even know how big your ocean is. But if I'm missing something, I'd love to learn what it is.




> If the sample is non-random then the conclusions are unreliable, regardless of the population size

That's not true. You may still get meaningful data, but it's less meaningful (i.e. the required interval to achieve a given level of confidence becomes wider, perhaps significantly so).

But either way, I mentioned "non-random" to pile onto the ridiculously low sample size. Even a random sample of such a small size would have given a low-confidence result.
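
To put a rough number on "low-confidence", here is a minimal sketch. It assumes the quantity being estimated is a proportion, takes the worst case p = 0.5, and uses the usual normal approximation at 95% confidence (z = 1.96). None of those specifics come from the study under discussion, and the function name is just illustrative.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    Assumes a simple random sample from a population much larger than the sample.
    proportion=0.5 is the worst case (widest interval); z=1.96 corresponds to 95%.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

# Hypothetical case: a random sample of 140 from a very large population
moe_140 = margin_of_error(140)
print(f"n=140: +/- {moe_140 * 100:.1f} percentage points")
```

Under those assumptions a random sample of 140 gives an interval of roughly plus or minus 8 percentage points, which is wide if you care about small differences.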

> Because it seems obvious that if you want to measure the salinity of the ocean, you can scoop up a cup of water from it and analyse that. You don't have to use a different size cup for different size oceans, or even know how big your ocean is.

You're describing a "population" parameter that doesn't actually exist. There's no such thing as "salinity of the ocean"; it changes depending on where (and at what depth!) you are. Sampling the salinity of the water in the cup tells you, at best, about the water where you took it.

Now you could probably talk about things like "mean salinity of the oceans", but to determine good bounds for that you would have to sample. And to figure out how much you must sample, you do indeed have to have an idea of the total population size, even if it's just to determine that the population size is so much larger than the sample size that you can ignore the population size and simply use the standard error formula.

If the population size is not much greater than the sample size then there is an adjustment you should make (the finite population correction).
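
A small sketch of that adjustment, assuming simple random sampling without replacement and the textbook correction factor sqrt((N - n) / (N - 1)). The function names and the sample standard deviation of 1.0 are placeholders for illustration, not anything measured.

```python
import math

def fpc(population_size, sample_size):
    """Finite population correction factor: sqrt((N - n) / (N - 1))."""
    if population_size <= 1:
        raise ValueError("population_size must be greater than 1")
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    return math.sqrt((population_size - sample_size) / (population_size - 1))

def standard_error(sample_sd, sample_size, population_size=None):
    """Standard error of the mean, optionally corrected for a finite population."""
    se = sample_sd / math.sqrt(sample_size)
    if population_size is not None:
        se *= fpc(population_size, sample_size)
    return se

# Hypothetical sample standard deviation of 1.0, sample size 140 throughout
for N in (140, 280, 1.4e10):
    print(N, fpc(N, 140), standard_error(1.0, 140, N))
```

With N = 140 the factor is exactly 0 (a census leaves no sampling error), with N = 280 it is about 0.71, and with N = 1.4e10 it is indistinguishable from 1, which is the case where you can ignore the population size and use the plain standard error formula.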



