
Big numbers are fun. Here's some stupid stuff I thought of while reading about this.

The number is 2^74,207,281 - 1. Naively stored like a normal integer in memory, that's roughly 75 MB of RAM just to store the number. Compare that to your 64-bit time_t and imagine how long a duration of time could be represented by such a large number. Stored on 3.5" floppy disks for some reason, that's roughly 50 disks just to store the number.

Edit: Duh, bits, not bytes. I leave my mistake here for posterity.
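For what it's worth, the corrected arithmetic (assuming standard 1.44 MB floppies), as a quick Python sanity check:

  p = 74_207_281         # the number is exactly p bits long
  print(p / 8 / 1e6)     # ~9.3 MB, not 75 MB (the bits-vs-bytes slip)
  print(p / 8 / 1.44e6)  # ~6.4 floppies, not ~50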




Not unless you gzip it. Since the number is just the binary 1 repeated a bunch of times, it should gzip pretty well.


gzipped it's a 9043 byte file when the source is the number expressed exactly as bits. gzipping that 9043 byte file yields a 129 byte file, which gzip can't compress any further. Base64 of the file:

  H4sICBPHnlYCA2EuYmluegDt0LENQQEUQNHnK7xINF8rLIAVbKH88VU0ZlATK4gt1DQ6vURMoBID
  iN4CknOa29/BNrN72U+LZj2eL1fxPDU6EXE+3CbvUTEMAAAAAAAAAODfPfq9/DZfx6petw2B32Z5
  XZT31mYXH9vDv0RTIwAA


Neato! Verified with:

  echo "H4sICBPHnlYCA2EuYmluegDt0LENQQEUQNHnK7xINF8rLIAVbKH88VU0ZlATK4gt1DQ6vURMoBIDiN4CknOa29/BNrN72U+LZj2eL1fxPDU6EXE+3CbvUTEMAAAAAAAAAODfPfq9/DZfx6petw2B32Z5XZT31mYXH9vDv0RTIwAA" | base64 --decode | gzip -d | gzip -d | xxd


I see some redundancy there. The letter "A" appears many times in a row.

Also it could be compressed even further. All you need is a computer program that prints "1", 74 million times.


It's already been concisely expressed as 2^74,207,281 - 1, so you don't need a computer program that prints "1" 74 million times.

You just need a computer language that can interpret bignum math equations.
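In Python, say, the whole "decompressor" is one expression:

  n = 2 ** 74207281 - 1  # or (1 << 74207281) - 1, which is effectively instant
  print(n.bit_length())  # 74207281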


That would take ages to compute. Printing one is faster. And might produce the smallest file size if it's done in assembly.


but the metric was the most concise way to store the data


So I tried it and it's 72135 bytes using tar czf (created by `fs.writeFileSync('/tmp/file', Buffer.alloc(74207281, '1'))`)

Not that it's the most efficient representation at all; 74207281 fits in 4 bytes.


This thread is officially dumb. One guy miscalculates 75 Mbit as ~75 MiB, another guy fills a ~70 MiB file with ASCII "1"s.


Strange but very useful things could emerge from just mucking about and poking around...


Hehe, I didn't mean to sound negative! Agree with the other poster that it's quite fun.

Oh, and I also forgot to mention that the guy used `tar` for compressing one file. I'll see myself out.


I'm glad I started this stupid thread, and this is my favorite comment in it.


dumb, but fun


M49 (for the 49th Mersenne prime) fits in 2 bytes! Or even better just 49! 6 bits! Assuming your decompression algorithm is optimized for Mersenne primes. :)


It can fit in a single bit if your algorithm is optimized for "primes that are equal to M49" :)


You could optimize the single bit away by assuming that there would be a bit.



Great, I've just discovered the 50th Mersenne prime, M50. Decompression takes a lot of time.


Repeated exactly 74,207,281 times, to be precise...


No, the number 5 is prime, yet in binary it is 101. The new prime number could have many zeroes interspersed with ones, so you don't know how many 1 digits it has.

EDIT: I have realized my mistake: a Mersenne prime is 2^n - 1, so its binary representation is all 1s. However, I will leave this here for posterity.


It's a Mersenne prime, so it is 1 less than some power of 2, which means every digit of its binary representation is a 1.


But 5 is not 2^n-1


5 is a prime number, true, but not a Mersenne prime. The Mersenne prime 31 = 2^5 - 1 has 5 as its exponent n [1].

[1] http://mathworld.wolfram.com/MersennePrime.html


Turns out run-length encoding it is more or less what you get if you store it as 2^74,207,281 - 1. I imagine this is probably how they actually store it; note that 2^n == 1 << n, so you can just store the exponent and the subtracted 1 as the pair (74207281, 1) and then do normal two's complement math.


If you're talking about storing it in computer memory during calculation, then no, it's written out. The fast MOD operation you list isn't used; it's an FFT using a weight factor that automatically does the mod operation while squaring a number. You start with S[0] = 4 and iterate S[i+1] = S[i]^2 - 2 mod M_p. If S[p-2] = 0, then M_p = 2^p - 1 is prime.
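For reference, here's that recurrence in plain code: a minimal Python sketch relying on native bignums, nothing like the FFT-weighted squaring GIMPS actually uses:

  def lucas_lehmer(p):
      """True iff M_p = 2**p - 1 is prime, for an odd prime exponent p."""
      m = (1 << p) - 1
      s = 4                    # S[0] = 4
      for _ in range(p - 2):   # p - 2 squarings
          s = (s * s - 2) % m  # S[i+1] = S[i]^2 - 2 mod M_p
      return s == 0            # M_p is prime iff S[p-2] == 0

  print([p for p in (3, 5, 7, 11, 13) if lucas_lehmer(p)])  # [3, 5, 7, 13]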

So it's 74,207,279 squarings using FFT, each likely working on more than 75 MiB, since there will likely be lots of leading zeros (we're probably close to the maximum number of digits double precision allows; round-off errors dominate unless you start using really small bases and even greater-length FFTs).

That's not counting the two or three compositeness tests that GIMPS uses before starting the Lucas-Lehmer test to prove primality.

For several years I've been working on a program to test the primality of Generalized Fermat numbers, using NTT instead of FFT. NTT is exact (integer math), and the GFNs are likely more "dense" with primes than the Mersenne numbers. They also have a faster form to test, although the primality test has a 50% chance of a false negative, so you have to run it a few times to be sure a number isn't prime. That's where Mersenne numbers are easier to test.


It's the sort of thing that's so repetitive you end up gzipping the gzip, because the "repeat N times" code doesn't support going that high (DEFLATE back-references cap out at a match length of 258 bytes).


75 megabits: about 9.5 MB, seven 3.5" floppies, or 5-10 selfies.


You're right! Thanks! I told you it was stupid :P


How many t-shirts?


Someplace used to print Mersenne primes as posters. IIRC they were using a font of around 2 points. Probably given up by now...


The person died, and the art with him :( http://www.mersenneforum.org/showthread.php?t=19100


I remember those. They either came with a loupe, or mentioned explicitly to the buyer that you needed one to see individual numbers. Always wanted to get one. It looked like an all grey poster if you didn't know what it was.


75 MB compresses neatly to under 12 bytes, which I say because the string "2^74207281-1" is totally unambiguous (Wolfram Alpha will evaluate it for you no questions asked, no guesswork, though obviously it only approximates the output). So the Kolmogorov complexity of the 75 MB is 96 bits or less.


Looks like you just communicated it in 16 bytes though.


It is far too many bits to be used for time representation. 512 bits is all that's needed to store every Planck time (10^-44 seconds) from the Big Bang until the heat death of the universe (10^100 years).
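A back-of-the-envelope check of that bit count (taking the Planck time as ~5.4 x 10^-44 s):

  import math

  seconds = 1e100 * 365.25 * 24 * 3600  # 10^100 years in seconds, ~3.2e107
  ticks = seconds / 5.39e-44            # Planck times elapsed, ~6e150
  print(math.log2(ticks))               # ~501 bits, so 512 is plenty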


Sure, that's fine if all you want is to measure the time from the Big Bang until the last matter in the universe has evaporated into photon soup. But what if you want the time until those photons are evenly distributed? An article I read a while back estimated that duration as 10^10^78 years, which is absolutely mind-boggling and remains the largest number I have ever seen with a real-world physical application.

(It may never happen, though; according to Wikipedia, random quantum fluctuations may be sufficient to produce another Big Bang in a mere 10^10^56 years.)


That makes sense for durations, but a time_t is actually a timestamp. A proper time can have the same units as a duration, but is only useful in conjunction with information about the trajectory of the clock measuring the time. If space is continuous, the amount of information to specify the trajectory is limited only by the capabilities of the measurement apparatus. Of course, for all we know at this point spacetime is quantized; so we should probably hold off on standardizing the time format to end all time formats.


Or you create a new compression scheme just for Mersenne primes, where you store only the exponent. You'll be able to compress the entire number into just 32 bits for several decades at least!
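Something like this hypothetical codec, sketched in Python (the names `compress`/`decompress` are made up):

  import struct

  def compress(p):
      """Store a Mersenne prime as its 32-bit exponent."""
      return struct.pack('<I', p)       # 4 bytes

  def decompress(blob):
      (p,) = struct.unpack('<I', blob)
      return (1 << p) - 1               # rebuild 2^p - 1

  assert decompress(compress(74207281)) == 2 ** 74207281 - 1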


Since the exponent is prime, you should be able to compress it further than that by simply noting that n is the kth prime number. Supposedly, as n goes to infinity, the probability that n is prime drops to zero, so that should compress handily.

Decompressing it, on the other hand, would be pretty slow. Are primality tests O(n) or O(n log n)? That puts decompression up to O(n^2) without a precomputed lookup table.


https://en.wikipedia.org/wiki/Prime_number_theorem#Approxima...

So with the kth prime ~= k log k, the density of primes is 1 / log k. Yes, that drops to zero, but very slowly.

You can use the upper bound on the kth prime to build a sufficiently large sieve, which can be done in O(n).
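A sketch of that decompressor in Python, sizing the sieve with the bound p_k < k(ln k + ln ln k), which holds for k >= 6:

  import math

  def kth_prime(k):
      """Recover the k-th prime by sieving up to a PNT-based upper bound."""
      limit = 15 if k < 6 else int(k * (math.log(k) + math.log(math.log(k)))) + 1
      sieve = bytearray([1]) * (limit + 1)
      sieve[0] = sieve[1] = 0
      for i in range(2, int(limit ** 0.5) + 1):
          if sieve[i]:
              sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
      count = 0
      for n, is_prime in enumerate(sieve):
          count += is_prime
          if count == k:
              return n

  print(kth_prime(25))  # 97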


If you don't care about computation time, then why stop there?

You can encode the Mth Mersenne prime with the integer M.


But then I have to keep patching the code every time a new one comes out. Too hard!


You don't. You can write a program that converts between m and M_m. It just takes a long time to convert.
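A sketch of such a converter in Python; it terminates, but the Lucas-Lehmer step makes it hopeless beyond small m (`lucas_lehmer` repeats the sketch from earlier in the thread):

  def lucas_lehmer(p):
      m = (1 << p) - 1
      s = 4
      for _ in range(p - 2):
          s = (s * s - 2) % m
      return s == 0

  def nth_mersenne_prime(m):
      found, p = 0, 2
      while True:
          # Lucas-Lehmer needs an odd exponent, so special-case p = 2 (M_2 = 3)
          if p == 2 or lucas_lehmer(p):
              found += 1
              if found == m:
                  return (1 << p) - 1
          p += 1
          # the exponent of a Mersenne prime must itself be prime
          while any(p % d == 0 for d in range(2, int(p ** 0.5) + 1)):
              p += 1

  print(nth_mersenne_prime(5))  # 8191 = 2^13 - 1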


I think there's a related mathematical paradox (the Berry paradox) about the smallest number that cannot be described by an English phrase of fewer than 93 characters.


That's 75 megabits, not megabytes.



