Hacker News new | past | comments | ask | show | jobs | submit login

If I'm understanding correctly, you suggest that this is an example of the prime numbers not behaving randomly. I disagree.

Here's (more or less) the simplest possible random model of the primes. (It's called the Cramer model.) Take P to be a random subset of the natural numbers that includes each n>1 independently with probability 1/log(n). Then P is kinda like the prime numbers, though it lacks some structure it should have (e.g., it doesn't have the property that almost all elements are odd).

If you do this, then the expected number of ways to write a large n as the sum of two elements of P is on the order of n/log(n)^2, which is far enough from 0 that I bet (though I haven't checked) that Pr(every number >= 4 is the sum of two elements of P) is positive, and in fact probably quite big. This is kinda analogous to the Goldbach conjecture; note that it isn't only about even n, since in this model there can be lots of even "prime numbers".

I did a quick computer experiment, and it looks as if the probability is about 20%. In other words, if the primes were chosen at random according to the Cramer model, then about 1/5 of the time the appropriately mangled Goldbach conjecture would be true. (And the "odd Goldbach conjecture", which is what Helfgott has just proved, would be true substantially more often than that.)

Now, of course the Cramer model is way too simple. In particular, it doesn't know that almost all prime numbers are odd. So let's make it just a little smarter by declaring that 2 is always in P, the other elements of P are always odd, and that any odd n>1 is in P with probability 2/log(n). And now we should switch back to the original form of the Goldbach conjecture, looking only at even n. Well, according to my computer experiment the probability that the Goldbach conjecture holds if we use P instead of the primes is about 96%!

[EDITED to add: and if we force P to get 2 and 3 "right" instead of just 2, the probability goes up to about 99.5%. I fixed a few other things in this edit too.]

In other words: if the primes are basically random, then we should expect the Goldbach conjecture to be true.




Of course, your computer simulation completely disregards the possibility of some very large (but rare) numbers (possibly of density zero) not being expressible as such a sum. You merely verified it for low finite integers. I am very unconvinced your percentages are accurate. In fact, they may be completely wrong, as this issue is the entire crux of the Goldbach problem!


My simulation disregards it, but I don't. In this sort of random model, the probability of any (specific) large number not being the sum of two "primes" is extremely small (and decreasing rapidly as the number increases), which means that the total probability of any (at all) large number not being the sum of two "primes" is also extremely small.

In other words: in this model, with probability very close to 1 all large enough even numbers are sums of two "primes".

Let's put a little flesh on those bones. I'll stick with the unmodified Cramer model for simplicity; the tweaked versions are fiddlier but not fundamentally different. The probability that n=a+b is a "good" decomposition is 1/log(a)log(b); this is smallest, for fixed n, when a=b; so the probability that any given decomposition "works" is at least 1/log(n/2)^2. There are n/2-2 of these, and they're all independent, so the probability that none of them works is at most (1-1/log(n/2)^2)^(n/2-2), which is approximately exp(-n/(2 log(n/2)^2)). So the expected number of n beyond, say, 1000 for which that happens is the sum of this for n from 1000 up. Unfortunately neither the sum nor the obvious approximation as an integral has a closed form, but we can compute it numerically; it's about 0.000279. That's the expected number of failures above n=1000, and of course that's an upper bound on the probability of at least one such failure.

So, the probability (in this model) of any Goldbach failures above n=1000 is at most about 0.0003, and therefore the probability of any Goldbach failures at all is at most that much bigger than the probability of any Goldbach failures up to n=1000.

Note that although this is a statement with probabilities in, the probabilities aren't fractions of the integers for which Goldbach fails.

The situation is analogous to the following. Suppose you flip a coin 10 times, then 20 times, then 40 times, then 80 times, etc., and for each group of flips you "win" if you get at least one head. Then, with probability about 99.9%, you always win: the probability of failure in successive groups drops so fast that the contribution to the overall failure probability from all groups other than the first is tiny.

Similarly, with the Cramer-model analogue of Goldbach's conjecture, the probability that the conjecture fails "at n" decreases so rapidly with n that almost all the probability of there being any failure at all comes from the possibility of failures for small n.

And that's why you can get good estimates of the failure probability from computer simulations that look only at finitely many n.


Taking the Cramer model, the probability that a number N is not expressible as the sum of two primes is

product(x=1...N-1) (1 - 1/log(x)log(N-x))

< product(x=1...N-1) (1 - 1/log(N))

= (1 - 1/log(N)) ^ (N-1)

Which is eventually below

(1 - 1/log(N)) ^ log(N) ^ sqrt(N)

Which converges to

(1/e) ^ sqrt(N)

And so is eventually below

(1/2) ^ sqrt(N)

The -log of the probability that a number is expressible as the sum of two Cramer numbers is thus at most

-log(1 - (1/2) ^ sqrt(N))

Which is eventually below

2 * (1/2) ^ sqrt(N)

You can show that this is a convergent series, and if the -log of the probability converges then the probability that ALL integers (minus perhaps the first few) have this property must converge, to something above zero. Thus, there is a nonzero chance that a set of Cramer numbers will have the property that all a finite number of positive integers will be expressible as the sum of two of them. Thus, prime numbers do not need any magical properties for some version of the Goldbach conjecture to be correct on them - the magical properties are only necessary to actually prove it.


You've argued that if the the primes were basically random, we would expect Goldbach to hold, but as I understand it, we would expect to not be able to prove that it holds.

The fact that we can prove the ternary Goldbach conjecture is (necessarily) at least as surprising as it being true: in this case it seems to be significantly more surprising. Is the significance of this result in fact the techniques used, rather than the result itself?


This is really interesting. Have any theorems been proved or at least conjectured based on a probabilistic model for primes? Do you have any suggested resources for learning more about the Cramer model?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: