
"Bayesian" and "frequentist" are descriptions of interpretations of probability more than of methods of inference.

Briefly and inexactly: A frequentist says that a probability is the answer to a question of the form "if we repeat this situation many times, what fraction of the time will this happen?"; a Bayesian says that many other things, such as (idealized) subjective degrees of belief, behave like probabilities -- i.e., they obey the same mathematical rules -- and that anything that obeys those rules deserves to be called a probability.

There is such a thing as Bayesian inference, but its opposite isn't "frequentist inference" but something like "classical hypothesis testing". If you take the frequentist view of probability then you will reject questions like "how likely is it that this treatment works?" because either it works or it doesn't -- there's nothing for a probability to be a long-run frequency of. But of course that really is the kind of thing you want to know, so you'll look for other similar questions that do make sense, such as "If the treatment doesn't work, how likely is it that we'd get results as impressive as these?". Asking that question leads to "Neyman-Pearson hypothesis testing": form a "null hypothesis" (the treatment doesn't work; the two groups of people are equally intelligent; the roulette table is not rigged by the casino; ...), do some measurements somehow, figure out how likely it is, if the null hypothesis is right, that you'd get results as unfavourable to the null hypothesis as you actually did, and if that's very unlikely then you say "aha, the null hypothesis can be rejected". Here, "very unlikely" might mean probability less than 5%, or less than 1%, or whatever.
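The recipe above can be sketched in a few lines. A toy experiment, with numbers invented for illustration: we suspect a coin is biased toward heads, flip it 20 times, and see 16 heads; the null hypothesis is that the coin is fair.

```python
from math import comb

n, observed_heads = 20, 16

# Under the null hypothesis (a fair coin),
# P(exactly k heads in n flips) = C(n, k) / 2^n.
def prob_heads(k, n):
    return comb(n, k) / 2 ** n

# p-value: the probability, if the null hypothesis is right, of results
# at least as unfavourable to it as what we actually saw (>= 16 heads).
p_value = sum(prob_heads(k, n) for k in range(observed_heads, n + 1))
print(f"p = {p_value:.4f}")  # ~0.006, so the null is rejected at the 1% level
```

The "p < 5%" or "p < 1%" threshold is chosen before looking at the data; here p is well under 1%, so a classical tester would reject "the coin is fair".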

If you've ever seen an academic paper in science or economics or whatever that says things like "eating more white bread is associated (p<0.01) with being a Mahayana Buddhist", that "p<0.01" thing is that same "probability of results as unfavourable to the null hypothesis"; in this (made-up, of course) case it would be something like "probability of the chi-squared statistic being as large as we found it, if bread consumption and Mahayana Buddhism were independent".
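Here's a hedged sketch of where a "p<0.01" of that kind comes from, using an invented 2x2 table of counts (white-bread eaters vs. not, Mahayana Buddhists vs. not); the numbers are made up purely for illustration.

```python
from math import erfc, sqrt

#               Buddhist  not Buddhist
table = [[30,     70],    # eats more white bread
         [10,     90]]    # doesn't

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)

# Chi-squared statistic: sum of (observed - expected)^2 / expected,
# where the expected counts assume the null hypothesis of independence.
chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (observed - expected) ** 2 / expected

# A 2x2 table has 1 degree of freedom; with 1 d.o.f. a chi-squared
# variable is the square of a standard normal, so the tail probability is:
p = erfc(sqrt(chi2 / 2))
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

The printed p is exactly the "probability of the chi-squared statistic being as large as we found it, if the two traits were independent" described above.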

Bayesians, on the other hand, are perfectly happy talking more directly about the probability that Mahayana Buddhists eat more white bread. However, then another difficulty arises. Obviously the experimental results on their own don't tell you that probability. (Simpler example: you know that a coin was flipped five times and came up heads every time. How likely is it that it's a cheaty coin with two heads? You'll answer that quite differently if (a) you just pulled the coin out of your own pocket or (b) some dubious character approached you in a bar and invited you to bet on his coin-flipping.) The probability after your experimental results are in is determined by two things: those results, and what you believed beforehand.
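The coin example can be worked through directly with Bayes' theorem. The prior probabilities below (0.001 for "coin from my own pocket is double-headed", 0.5 for "dubious stranger's coin") are invented for illustration; the point is only how strongly they shape the answer.

```python
def posterior_cheat(prior, flips=5):
    """Probability the coin is two-headed, given all-heads in `flips` flips."""
    p_heads_if_cheat = 1.0          # a two-headed coin always lands heads
    p_heads_if_fair = 0.5 ** flips  # a fair coin does so with prob 1/2^flips
    numerator = prior * p_heads_if_cheat
    return numerator / (numerator + (1 - prior) * p_heads_if_fair)

print(posterior_cheat(0.001))  # coin from your own pocket: ~0.03
print(posterior_cheat(0.5))    # dubious character in a bar: ~0.97
```

Same experimental results, wildly different conclusions -- which is exactly the claim above: the posterior is determined by both the data and what you believed beforehand.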

On the other hand, here's a possible advantage of the Bayesian approach: Instead of just saying how confidently you reject (if you do) the null hypothesis, you can talk about the probabilities of various different extents to which it could be violated. (Imagine two medical treatments. One is more confidently known to have some effect than the other -- but the effect the second one might have is ten times bigger. You might prefer the second treatment even though the risk that it doesn't really work is bigger.)

So the Bayesian statistician might end up saying something like this: Here's a reasonable "prior distribution" -- i.e., a reasonable assignment of probabilities to the various possibilities we're interested in, before the experimental results. Now here's what our probabilities turn into if we start there and then take account of the experimental results.

Or they might just describe the impact of the experimental results, and leave it up to individual readers to combine that with their own prior probabilities.

Here's the recipe that's at the heart of Bayesian inference: Take the prior probability for each possibility. Multiply it by the probability of getting exactly the observed results, if that possibility is the case. The result is proportional to the final ("posterior") probability -- the products generally won't add up to 1, so you rescale them all by whatever factor it takes to make them add up to 1.
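That recipe in miniature: infer a coin's heads-probability from seeing 5 heads in 5 flips, over a coarse grid of hypotheses. The grid and the uniform prior are illustrative choices, not the only reasonable ones.

```python
hypotheses = [0.1, 0.3, 0.5, 0.7, 0.9]          # candidate heads-probabilities
prior = [1 / len(hypotheses)] * len(hypotheses)  # uniform prior over them

heads, flips = 5, 5

# Steps 1-2: prior times the probability of the observed results
# (5 heads out of 5 flips) under each hypothesis.
unnormalized = [p * h ** heads * (1 - h) ** (flips - heads)
                for p, h in zip(prior, hypotheses)]

# Step 3: rescale so the posterior probabilities add up to 1.
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

for h, p in zip(hypotheses, posterior):
    print(f"P(bias = {h} | data) = {p:.4f}")
```

After five straight heads, almost all the probability mass has shifted onto the high-bias hypotheses, but none of them is ruled out -- the prior just gets reweighted by the evidence.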




Thanks for the great explanation. It definitely made things clearer.


In short:

Bayesian inference gives you a subjective imprecise answer to a question you need to answer.

Classical testing gives you an objective precise answer to a different question from the one you need to answer.



