That's a nice filter. (Of course, I'm a former mathematician as well.) Here's ho...

orp · on Jan 2, 2014

For the interested, a link to Bayes' theorem: http://en.wikipedia.org/wiki/Bayes'_theorem

Useful if you want to know (or need a good way to explain) what a posterior probability is and how it differs from a prior probability.

kalid · on Jan 3, 2014

I vastly prefer Bayes' theorem as ratios rather than percentages (plug: I wrote about it here: http://betterexplained.com/articles/understanding-bayes-theo...).

I like a "factor label" style approach where you can see the prior probability (Fair: Biased), the information about the flips, and the posterior probability (the revised chances after the new information is taken into account):

Prior * Information = Posterior

so

(Fair : Biased) * (10 Fair heads : Fair) / (10 Biased heads : Biased) = 10 Fair heads : 10 Biased heads

Plugging in, we'd have:

(999 / 1) * (1 / 1024 ) / (1 / 1) = 999 / 1024

The odds are ever so slightly in favor of a biased coin. So we can mentally guess ever so slightly above 3/4 for the chance of another heads. To be exact, we could write (999/2 + 1024) / (999 + 1024) on the whiteboard: the fair share of the odds contributes heads half the time, the biased share every time.
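For anyone who wants to check the whiteboard arithmetic, here is a minimal Python sketch using exact fractions. The function names are made up for illustration, and it assumes the setup discussed in the thread: 999 fair coins, 1 double-headed coin, 10 heads observed.

```python
from fractions import Fraction

def posterior_weights(prior_fair, prior_biased, n_heads):
    """Return (fair, biased) weights after seeing n_heads heads in a row."""
    like_fair = Fraction(1, 2) ** n_heads  # fair coin: each flip is heads with probability 1/2
    like_biased = Fraction(1)              # double-headed coin always shows heads
    return prior_fair * like_fair, prior_biased * like_biased

def prob_next_heads(fair_w, biased_w):
    """Weighted chance of heads on the next flip."""
    return (fair_w * Fraction(1, 2) + biased_w) / (fair_w + biased_w)

fair_w, biased_w = posterior_weights(999, 1, 10)
odds_fair_to_biased = fair_w / biased_w   # Fraction(999, 1024)
p_heads = prob_next_heads(fair_w, biased_w)
print(odds_fair_to_biased, p_heads, round(float(p_heads), 4))
```

The result reduces to 3047/4046, which is the same number as (999/2 + 1024) / (999 + 1024), or about 0.7531.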

phamilton · on Jan 2, 2014

Why is it 1/999? Shouldn't it be 1/1000 since there are 1000 total coins?

gjm11 · on Jan 2, 2014

Odds, not probability. Probability p means odds of p:(1-p) or, if you prefer writing it as a fraction, p/(1-p).

(Note 1. The odds of a thing are the ratio Pr(thing) : Pr(not thing). You can generalize this to any mutually exclusive and exhaustive set of things: the odds are the ratio of the probabilities. The fact that there may therefore be more than 2 such things is the reason why I prefer not to turn odds into fractions as above.)
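(A small Python sketch of the conversion, in case it helps. The helper names are hypothetical, not from any library, and `normalize` is just one way to handle more than two hypotheses.)

```python
from fractions import Fraction

def prob_to_odds(p):
    """Probability p corresponds to odds p : (1 - p), written here as a fraction."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Inverse of prob_to_odds."""
    return odds / (1 + odds)

def normalize(weights):
    """Turn an odds vector over exclusive, exhaustive hypotheses into probabilities."""
    total = sum(weights)
    return [Fraction(w) / total for w in weights]

p_biased = Fraction(1, 1000)           # 1 double-headed coin out of 1000
odds_biased = prob_to_odds(p_biased)   # 1/999, i.e. odds of 1 : 999
print(odds_biased, odds_to_prob(odds_biased), normalize([999, 1]))
```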

(Note 2. Bayes' theorem is, as others have mentioned, much nicer when you work with odds rather than probabilities for your prior and posterior. If you're comfortable with logarithms, it's nicer still when you work with logarithms of odds. Now you're just adding the vector of log-likelihoods to the prior log-odds vector to get the posterior log-odds vector. Which is how I think of the question above, at least if I'm allowed to be sloppy and imprecise. You start with almost exactly 10 bits of prior prejudice for "fair" over "two-headed", then you get exactly 10 bits of evidence for "two-headed" over "fair", at which point those cancel out almost exactly so you should assign almost equal probabilities to those two possibilities.)
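(The bit-counting can be sketched in a few lines of Python. The variable names are mine, and the sign convention, positive meaning "favors fair", is an arbitrary choice for this illustration.)

```python
import math

# Log-odds in bits, with positive values favoring "fair" over "two-headed".
prior_bits = math.log2(999 / 1)             # just under 10 bits of prior for "fair"
evidence_bits = math.log2(1 / 0.5 ** 10)    # exactly 10 bits of evidence for "two-headed"
posterior_bits = prior_bits - evidence_bits # the two nearly cancel

# Convert the posterior log-odds back to a probability for "fair".
posterior_odds_fair = 2 ** posterior_bits
p_fair = posterior_odds_fair / (1 + posterior_odds_fair)
print(prior_bits, evidence_bits, posterior_bits, p_fair)
```

The posterior comes out slightly negative (a small fraction of a bit in favor of "two-headed"), so the probability of "fair" lands just under 1/2.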

phamilton · on Jan 2, 2014

That makes sense. I've never dealt with odds as a fraction before.

gallamine · on Jan 2, 2014

Can you elaborate on the posterior calculation of (2^10)/999?

madcaptenor · on Jan 2, 2014

Sure. In terms of odds, Bayes' theorem says

(posterior odds) = (prior odds) * (likelihood ratio)

The prior odds are 1/999, so we need to show that the likelihood ratio is 2^10.

The likelihood ratio is the probability of seeing 10 heads from a double-headed coin divided by the probability of seeing 10 heads from a fair coin, which is 1/((1/2)^10) or 2^10.
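As a rough check, here is a Python sketch of the same update for any number of observed heads. The function name and its defaults are assumptions for illustration only, matching the 1-in-1000 setup from the thread.

```python
from fractions import Fraction

def posterior_odds_double_headed(n_heads, prior_odds=Fraction(1, 999)):
    """Posterior odds (double-headed : fair) after n_heads heads in a row."""
    # P(n heads | double-headed) = 1, P(n heads | fair) = (1/2)^n
    likelihood_ratio = Fraction(1) / Fraction(1, 2) ** n_heads
    return prior_odds * likelihood_ratio

odds_after_10 = posterior_odds_double_headed(10)   # (2^10)/999
p_double_headed = odds_after_10 / (1 + odds_after_10)
print(odds_after_10, p_double_headed)
```

With zero observed flips the function returns the prior odds unchanged, and each extra head doubles the odds in favor of the double-headed coin.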

11001 · on Jan 2, 2014

0.7531 if you don't assume that 2^10 = 999