The largest number representable in 64 bits (tromp.github.io)
111 points by tromp on Nov 27, 2023 | 105 comments



The problem, as stated, provably has no answer. Assume such a number exists. Call it n. Now define a new 64-bit numbering scheme such that each number x is interpreted as n+x. n+x > n, which contradicts the hypothesis. There need to be more constraints for this to be interesting, like the largest number representable where the representation is closed under addition and multiplication, or something like that.


> There need to be more constraints for this to be interesting.

Scott Aaronson's quote in the article provides this constraint:

> Precisely because the Turing machine model is so ancient and fixed, whatever emergent behavior we find in the Busy Beaver game, there can be no suspicion that we “cheated” by changing the model until we got the results we wanted.

Your "each number x is interpreted as n+x" is a clear example of the cheating that makes for an uninteresting scheme.


That doesn’t provide a constraint. The quote you mentioned is about why to model computation as a Turing machine vs. some other model, not about how to “find the largest number representable in 64-bits”, which doesn't have an answer.


I gave an answer for the largest number known to be representable in 64 bits with a scheme that is clearly not cheating.


Define “cheating”.


I believe tromp implies a constraint of “existed before we started answering the question”.


[flagged]


This is the kind of thing you learn in the first lecture of a discrete mathematics class that's required for every single CS program in the world.

It's very obvious if you think about it. If the encoding in memory is n, that doesn't mean it conceptually represents the value n. It can be n+x for any value of x. For example, you can declare that 0 represents $0, 1 represents $0.01, 2 represents $0.02, and so n represents n cents. This is pretty common in tax calculation applications. Then, you can also say n represents n+99^99 so that what GP said holds. They just provided a formal argument for it which proves it by induction, as we call it.
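
A minimal sketch of that point (hypothetical Python, not from the comment): the same raw integer n can be read under whatever interpretation the surrounding code agrees on, e.g. as cents or as n plus a huge fixed offset.

    # the bits don't fix the meaning; the agreed-upon interpretation does
    OFFSET = 99**99                        # any externally agreed constant

    def as_dollars(n: int) -> str:
        return f"${n / 100:.2f}"           # 0 -> $0.00, 1 -> $0.01, ...

    def as_shifted(n: int) -> int:
        return n + OFFSET                  # every representable value now exceeds OFFSET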


> They just provided a formal argument for it which proves it by induction, as we call it.

This is a proof by contradiction, I think.

Statement 1: n exists and is the largest number representable in 64 bits.

Statement 2: the numbers in the n+x scheme are larger than n.

These two statements contradict each other, and since statement 2 is true, statement 1 must be false. The addition in n+x is not induction, just a way to define a scheme.


I think that answer is pretty obvious with some basic mathematical training.


The computer science version is: how much data can you compress to 64 bits? Well, you can come up with any payload, create a compression algorithm that returns 0 if the input matches that payload and 1 ++ payload otherwise, then compress that payload. If the “number” is the binary number expressed by the payload, that is your maximum number. You can go as high as you like.
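
A sketch of that trick in Python (PAYLOAD is a hypothetical stand-in for the bit string of whatever huge number you pick):

    PAYLOAD = "1" * 1000                     # stands in for an arbitrarily huge number

    def compress(bits: str) -> str:
        return "0" if bits == PAYLOAD else "1" + bits

    def decompress(bits: str) -> str:
        return PAYLOAD if bits == "0" else bits[1:]

    # int(decompress("0"), 2) is as large as you like, but the scheme only
    # "wins" on the one payload baked into the decompressor.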


That's why you should include the size of the decompressor (AKA using a self-extracting archive). That's what the Hutter Prize does, for example http://prize.hutter1.net

The encodings in this article are self-extracting archives; although they don't use a bloated, arbitrary format like e.g. x86_64 Linux ELF executables. Instead they use Binary Lambda Calculus, which has a variety of interpreters (the smallest being 206 bits, written in BLC itself)


this was my thought as well


Posits, formerly unums, are amazing; the best solution I've seen. Look up John Gustafson. This article doesn't mention him or posits. They can stretch to a huge dynamic range at reduced precision, or keep extra precision near 1, making the best use of the bits depending on the application.


IEEE754 64-bit representation already has infinity:

    uint64_t x = 0x7ff0000000000000ULL;
    double d;
    memcpy(&d, &x, sizeof d);   /* type-pun via memcpy to avoid strict-aliasing UB */
    printf("%f\n", d);
output:

    inf
But you could use a representation where 0 is 0, and 1 is infinity, saving 63 bits...


I don't consider infinity to be a number though. Especially not in a largest number contest.


Let 0 correspond to zero, and 1 correspond to Rayo's number. Crisis averted!


I find Loader's number [1] more interesting, as it is actually computable, yet far far larger than other famous computable numbers, like Friedman's TREE(3) or SCG(3). I'm looking forward to one day programming it in the lambda calculus, and seeing how much smaller than the existing ~500 bytes of C-code it can be.

[1] https://www.youtube.com/watch?v=q6Etl4oGL4U&list=PL-R4p-BRL8...


Let all values encode Rayo's number. 64 bits saved!


> But you could use a representation where 0 is 0, and 1 is infinity, saving 63 bits...

Reminds me of the hilarious and brilliant: http://tom7.org/nand/


Floating format 1:1:0:1's 8 possible values:

    000: +0
    001: +1 ( denormal: (-1)^0 * 0.5 * 2^(-0+1) )
    010: +inf
    011: +qnan
    100: -0
    101: -1 ( denormal: (-1)^1 * 0.5 * 2^(-0+1) )
    110: -inf
    111: -qnan
=== Floating point crib sheet ===

--- Format ---

Sign:exponent:stored explicit mantissa leading bit:mantissa fraction:

       binary16 = 1:5:0:10
       bfloat16 = 1:8:0:7
    TensorFloat = 1:8:0:10
           fp24 = 1:7:0:16
       binary32 = 1:8:0:23 
       binary64 = 1:11:0:52
           8087 = 1:15:1:63
      binary128 = 1:15:0:112
--- Interpretation ---

leading bit = (exponent != 0) ? 1 : 0 when implicit (not stored)

bias = 2^(exponent bits - 1) - 1

value = (-1)^sign * 0 when zero

value = (-1)^sign * {{leading bit}}.{{mantissa fraction}}b * 2^(exponent - bias) when normal

value = (-1)^sign * 0.{{mantissa fraction}}b * 2^(-bias+1) when denormal

--- Classification ---

zero = exponent == 0 && mantissa fraction == 0

denormal = exponent == 0 && mantissa fraction != 0

normal = exponent != 0 && exponent != ~0

inf = exponent == ~0 && mantissa fraction == 0

nan = exponent == ~0 && mantissa fraction != 0

snan = nan && msb(mantissa fraction) == 0

qnan = nan && msb(mantissa fraction) == 1

PS: It often takes fewer gates to implement a simpler microcoded microarchitecture than a single hardwired macroarchitecture. Microcoded architectures are theoretically slower than hardwired ones, but this is often not the case in practice, because the costs of gate fanout and the extra gates for clock distribution erode the gains of a fully specified, fully decoded hardware implementation.
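
For concreteness, here is a rough Python sketch (mine, not the parent's) that mechanically applies the crib sheet above to implicit-leading-bit formats; explicit-leading-bit formats like the 8087 are out of scope.

    # layouts from the crib sheet: (exponent bits, fraction bits), sign = 1 bit
    FORMATS = {
        "binary16": (5, 10),
        "bfloat16": (8, 7),
        "binary32": (8, 23),
        "binary64": (11, 52),
    }

    def classify(raw: int, fmt: str):
        ebits, fbits = FORMATS[fmt]
        frac = raw & ((1 << fbits) - 1)
        exp  = (raw >> fbits) & ((1 << ebits) - 1)
        sign = (raw >> (fbits + ebits)) & 1
        bias = (1 << (ebits - 1)) - 1
        if exp == (1 << ebits) - 1:                          # exponent all ones
            if frac == 0:
                return "inf", (-1) ** sign * float("inf")
            return ("qnan" if frac >> (fbits - 1) else "snan"), float("nan")
        if exp == 0:
            if frac == 0:
                return "zero", (-1) ** sign * 0.0
            return "denormal", (-1) ** sign * (frac / (1 << fbits)) * 2.0 ** (1 - bias)
        return "normal", (-1) ** sign * (1 + frac / (1 << fbits)) * 2.0 ** (exp - bias)

    # classify(0x7ff0000000000000, "binary64") -> ("inf", inf)
    # classify(0x3c00, "binary16")             -> ("normal", 1.0)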


That doesn't satisfy the explicitly stated condition of "a largest (finite) representable value".


Fair, but also an uninteresting answer.


It's about as interesting as the other answers proposed in TFA, and it gets to the meat of it: they are all just representations invented by people, and there's nothing stopping us from inventing our own representations that fit into 64 bits (or 1 bit).


Not really. The specialness of TFA's constructions is that the interpreter does not need to have any knowledge of the large numbers a priori.


Alternative constructions don't have to either, e.g. "the bit 1 represents running the algorithm from TFA".


That's precisely identical to declaring the number in the interpreter.


No, it doesn’t. The question is “what is the largest non-trivial number you can represent with some constraint on size of its expression”. It’s a really old question, and saying “infinity” as an answer misses the point. Saying you can invent an arbitrary number system also misses the point by simply not answering. If you need to spend a bunch of bytes explaining your new system, did you really use 8 bytes?

It just feels really bad faith.


What is "really bad faith" about saying "An ON bit indicates the value 'googolplex'?"

Computing is fundamentally about decoding bit strings as different arbitrary representations that are meaningful to humans.


Even the word "googolplex" is quite a bit longer than the lambda calculus program in question...


The actual bit is just 1 though, the word "googolplex" is in the accompanying documents for interpreting the bit.

The course on reading and using lambda calculus is similarly longer than the actual lambda calculus expression.


> The course on reading and using lambda calculus is similarly longer than the actual lambda calculus expression

I'm not sure what a "course on reading and using" has to do with description complexity? In any case, it takes 206 bits to implement a binary lambda calculus interpreter (that's Theorem 1 in http://tromp.github.io/cl/LC.pdf )


I originally wrote this article back in March on the googology website, and it received some discussion at https://news.ycombinator.com/item?id=35677148

Since some commenters pointed out how awfully spammy that website is (which I had failed to notice due to my browser's adblockers), I recently decided to slightly rewrite and expand the article to host it on my newly formed personal blog.


I thought it sounded very familiar, so much so I checked the date twice.


The article discusses different encodings of numbers to give non-dense representations of numbers exceeding (2^64)-1. (These representations are inherently non-dense by the pigeonhole principle.) But I feel like this is missing a key point. A 64-bit string that we agree represents only numbers can represent 18446744073709551616 different numbers. The choice of what numbers are represented is completely up to us. If we want certain properties (dense integers) we end up with the highest being 18446744073709551615. If we want other properties (nearly logarithmic distribution, signedness, and good mapping to hardware for arithmetic) we might end up with FP64 with a maximum value around 10^308. And if we want no interesting property constraints except being dual to a program on a Turing machine, we end up with a busy beaver number. But... remember, we can choose any 18446744073709551616 values we want to be representable. There's no restriction on the interpretation of these strings; or, equivalently, the amount of external information required to explain the interpretation of these strings is unbounded. As a result, we can choose any computable number to be encoded by the string 64'b1, or by any other string, and exceed any desired bounds.


> we can choose any computable number to be encoded by the string 64'b1

First of all, every finite number is computable by definition.

And second, your encodings will, unlike those in the lambda calculus, be completely arbitrary.

PS: in my self-delimiting encoding of the lambda calculus, there are only 1058720968043859 < 2^50 closed lambda terms of size up to 64 [1].

[1] https://oeis.org/A114852


> every finite number is computable by definition

Every finite integer is computable. We often represent non-integer numbers.

> your encodings will, unlike those in the lambda calculus, be completely arbitrary

Well, they /may/ be completely arbitrary. They may not be. The key is to choose encodings that are useful for the problem domain. Admittedly if the problem domain is "winning at the schoolyard game of saying a bigger number," those encodings may look arbitrary for most other purposes.

But there's actually an intent to my original comment besides being pedantic. The most general way to think of a 64-bit numeric representation is as a list of 2^64 numbers, feeding into a 2^64:1 mux, with the 64-bit string being the select bits of the mux. (Or, equivalently, a 2^64 entry x arbitrary width ROM with one number per entry, with the 64-bit string being the address input of the ROM. Same thing.) The two questions you must answer, then, are (a) which 2^64 numbers are most useful in your problem domain; and (b) are there hardware optimizations to reduce the (ridiculous) scale of the mux/ROM model that are so valuable that you're willing to make compromises on which numbers you select?
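
A toy Python illustration of that mux/ROM view (scaled down to 2 select bits, since a literal 2^64-entry table is absurd):

    # a 2-bit "format" is just a table from select values to whatever numbers we chose
    TABLE = {0b00: 0, 0b01: 10**100, 0b10: 3.14159, 0b11: float("inf")}

    def interpret(select_bits: int):
        return TABLE[select_bits]   # the 64-bit case is the same idea with 2**64 entries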


>First of all, every finite number is computable by definition.

You likely mean every integer or rational is computable (although not by definition). There are finite real numbers that are not computable, in fact most of them are not.


Right; I meant finite integer.


What is the 100th busy beaver number? I believe this is an integer, and I also believe it is not computable


As a number, N=BB(100) is trivially computable by the program "output N". What you really mean is that the function BB(.) itself is uncomputable.


I believe you are correct, however I am struggling to see how.

If you have the time to help me with my understanding then I would appreciate it.

I'm looking at wikipedia's formal definition, which says that for x to be computable, you need to provide a function from naturals to integers such that if you pick a denominator of a fraction (n), this function can give the numerator such that (f(n)-1)/n and (f(n)+1)/n end up straddling the value which is computable.

So, for an integer N, you make f(x) = xN then (f(n)-1)/n = N-(1/n) and (f(n)+1)/n = N+(1/n).

Therefore, for any integer N, it is computable.

Now, what is stopping me from doing something similar with a real?

If I say: f(x) = floor(xN)

Now (f(n)-1)/n = floor(n*N)/n - (1/n)

It is at this point where I realise I need to go to bed and sleep. If you see this and have the time to explain to me where it falls apart with reals, then I will be most happy. To be clear - I'm quite sure I am wrong, and this isn't me being passive aggressive about it.


You were looking at the definition of a computable real number [1]. The issue there is whether you can approximate the number arbitrarily well. An issue that is nonexistent in integers, as you note in

> Therefore, for any integer N, it is computable.

> If I say: f(x) = floor(xN)

For many definitions of a real, it's not at all clear whether you can compute this f(x). The ability to compute f already amounts to being able to approximate f arbitrarily well. For example, for Chaitin's number, you cannot compute your f() except for a few small values of N.

[1] https://en.wikipedia.org/wiki/Computable_number


Thanks for your reply - and I probably would have missed it except by complete chance.

I still don't think it has properly clicked, because the function "f(x) = xN" still needs me to know what N is - despite it being an integer.

For example, let's suppose that you manage to have a conversation with God and you discover that BB(100) has the value of 42, and Chaitin's number is 1/2.

Does Chaitin's number suddenly become computable? My intuition is that it remains uncomputable, but maybe my intuition is wrong... I guess by the definition, it is computable, but you can't prove it.

I'm also struggling with the reciprocal of BB(100). This is rational, so maybe it too is computable.

I guess I am struggling with the lack of a constructive proof and what that means - it is like we are saying "there exists an algorithm to do X, but the algorithm can't be known", and maybe that is the core of me being wrong - we can prove that such an algorithm exists, and so 1/BB(100) is computable just like BB(100) is computable - but every time I write this down, I still can't see how this logic doesn't also break down with actually non-computable numbers, e.g. "there is a function f(x) which returns an integer, I can't tell you what the integer is, but it is the one that results in showing that Chaitin's number is computable".

Anyway, if you notice this and do reply, then very much thank you - and apologies if your reply goes unread


> BB(100) has the value of 42

You wouldn't be able to express BB(100) explicitly. My article is about a lower bound on the much smaller BB(63), and it's already very hard to give a sense of how large that is. God would of course be able to give you the 100-bit program for it.

> Chaitlin's number is 1/2

We know that Chaitin's number is not computable. So it cannot be 1/2. It has an infinite binary expansion of which we can only ever compute a fixed number of bits.

> I guess by the definition, it is computable

It's not.

> I'm also struggling with the reciprocal of BB(100). This is rational, so maybe it too is computable.

A number x is computable if-and-only-if its reciprocal is. Any fixed integer, like N=BB(100) is computable (with the trivial program print N), and therefore so is its reciprocal. What is not computable is the function BB(.)

> there exists an algorithm to do X

What is X ?

If you want to discuss this further, please just send your reply by email.


> For example, let's suppose that you manage to have a conversation with God and you discover that BB(100) has the value of 42, and Chaitin's number is 1/2.

An uncomputable number can not be expressed as a finite sequence of digits in any computable base. So Chaitin's constant must consist of an infinite number of digits regardless of what base you choose, so long as the base is computable.

So "God" or an oracle can never actually produce Chaitin's constant in any finite representation, all an oracle can do is behave as a function where you give it an integer N, and it returns the N'th digit of Chaitin's constant.


It's interesting that you are counting how many lambda calculus terms can fit into 64 bits given a particular encoding, but you're not making room in the 64 bits for an explanation of how your encoding works (let alone how lambda calculus works!). Of course, any encoding you used to include an "explanation" in the 64 bits would itself require some explanation. I guess what I'm getting at is that I'm not sure that your approach is any less arbitrary than any other approach.


Before I can understand your comment, could you please add an explanation about how the English language works? And for whatever language you use to explain that (obviously not English), please provide an explanation for that language as well... sorry; couldn't resist /s


I don’t think it’s possible to provide an infinite chain of such explanations, nor am I attempting to define a problem like “find the largest number representable in English with n characters.”


I think I got this one, just let me finish reading Vol 4 of Principia and I'll type it up.


The lower bound on the information content of a string is the length of the shortest program that can output the string (and halt); this is called its Kolmogorov complexity. Lambda calculus is (one of) the most compact ways to encode programs and requires the least context to interpret. Thus it's totally fair to say that the largest number encoded in LC in some number of bits is a much more fundamental and less arbitrary candidate than just randomly deciding that "1" now refers to some externally defined large number.


Something like SKI calculus seems a lot less arbitrary than lambda calculus in terms of representation, perhaps.


The author has also implemented Binary Combinatory Logic. I also find combinators more "natural" than lambda calculus (although the latter is certainly more useful day-to-day!); however, I still find lambda calculus a reasonable foundation since the Binary Lambda Calculus self-interpreter is shorter than the Binary Combinatory Logic self-interpreter (e.g. see the author's page at http://tromp.github.io/cl/cl.html )


Writing programs in lambda calculus is definitely more enjoyable than in eg SKI calculus.

I think if you express your lambda calculus bindings in terms of De Bruijn indices, it might seem less arbitrary, too.


I don't think your day and my day are quite the same.


Lambda calculus is present in most "modern" programming languages, i.e. those which use lexical scope rather than global or dynamic scope; and those which call functions to return results (especially first-class functions) rather than jumping to subroutines for their effects. It's why Python functions are written with the keyword `lambda`, and it's why Amazon's function-as-a-service product is called Lambda.

For example, say you're refactoring some code and come across:

    def foo(x):
      return bar(x)
You decide to simplify this definition to:

    foo = bar
Congratulations, you've just performed η-reduction! https://en.wikipedia.org/wiki/Lambda_calculus#%CE%B7-reducti...


As an aside: alas, η-reduction is not something that's properly supported in Python.
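
One concrete reason, as a sketch of my own (not the parent's exact point): Python resolves the name inside the wrapper at call time, so the original and the η-reduced form can diverge if the name is later rebound.

    def bar(x): return x + 1

    def foo(x): return bar(x)   # looks up `bar` on every call
    baz = bar                   # the η-reduced form captures the current object

    def bar(x): return x + 2    # rebind the name

    print(foo(1), baz(1))       # prints "3 2": the two forms now disagree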


came here and searched for "busy". good job :)

To the sibling comment about arbitrariness, we could use a hybrid where we trade off some bits from IEEE FP to introduce far reaches and also some precision there.. so like, keep 32 bits or 54 bits for IEEE compatibility, then switch to "extended" ranges for e.g. BB numbers, higher alephs, etc..

There was this one system for calculation with infinities that avoided the Hilbert Hotel problem.. can't find it but was called smth like Infinioid or some other play on the name. Would be neat to bolt those on too :)

Edit: "grossone" is the calculus for infinities.. love this work! https://www.theinfinitycomputer.com/


> To the sibling comment about arbitrariness, we could use a hybrid where we trade off some bits from IEEE FP to introduce far reaches and also some precision there.. so like, keep 32 bits or 54 bits for IEEE compatibility, then switch to "extended" ranges for e.g. BB numbers, higher alephs, etc.

This sort of trick/hack is the reason why theorems in (algorithmic) information theory involve constant factors. For example, we can define an image compressor which outputs a single `1` when given the Lenna test image[0], and otherwise acts exactly like PNG except prefixing its output with a `0`. To decode: a `1` decodes to the Lenna image, and anything starting with `0` gets decoded as PNG without the leading `0`. This gives perfect compression with no loss of quality, when tested on that Lenna image ;)

[0] https://en.wikipedia.org/wiki/Lenna
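
A sketch of that compressor in Python (zlib standing in for PNG, a placeholder blob standing in for the Lenna bytes, and a whole prefix byte instead of a single bit, for simplicity):

    import zlib

    SPECIAL = b"placeholder for the Lenna image bytes"

    def compress(data: bytes) -> bytes:
        if data == SPECIAL:
            return b"\x01"                      # "perfect" compression on the test image
        return b"\x00" + zlib.compress(data)    # everyone else pays a constant prefix

    def decompress(blob: bytes) -> bytes:
        return SPECIAL if blob == b"\x01" else zlib.decompress(blob[1:])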


Furthermore this method can be scaled up to cover more and more test images, with a modest increase of prefix bits


anyone really interested, i'm just rereading an earlyish (2010) paper by Sergeyev.. it's amazing

https://www.theinfinitycomputer.com/wp-content/uploads/2020/...


Came here to say that.

People sometimes mistakenly think that numbers or data in computers exist in some meaningful way.

Everything in a computer is a model or record or representation that serves a purpose. All bugs and limits are features.


> People sometimes mistakenly think that numbers or data in computers exist in some meaningful way.

That's basically Platonism. I think it's a reasonable position for some things, e.g. Booleans (two-valued logic), natural/integer/rational numbers, tuples, lists, binary trees, etc. I think it's meaningful to talk about, say, the number 2, separately from the way it may be encoded in RAM as a model of e.g. the number of items in a user's shopping cart.

This position gets less reasonable/interesting/useful as we consider data whose properties are more arbitrary and less "natural"; e.g. there's not much point separating the "essence" of an IEEE754 double-precision float from its representation in RAM; or pontificating about the fundamental nature of a InternalFrameInternalFrameTitlePaneInternalFrameTitlePaneMaximizeButtonWindowNotFocusedState[0]

The question in the article is whether lambda calculus is "natural" enough to be usefully Platonic. It's certainly a better candidate than, say, Javascript; although I have a soft spot for combinatory logic (which the author has also created a binary encoding for; although its self-interpreter is slightly larger), and alternatives like concatenative languages, linear combinators (which seem closer to physics), etc.

[0] https://web.archive.org/web/20160818035145/http://www.javafi...


> InternalFrameInternalFrameTitlePaneInternalFrameTitlePaneMaximizeButtonWindowNotFocusedState

In Plato's allegory of the cave, this was the true monster casting shadows on the wall.


If you say "well ha ha, you could put a function in that represents more than 64 bits!", then you may as well just say it contains a pointer to the biggest number ever.

It's sort of a word game, being smart.


I'm going with '9↑↑9', which is 64 bits of UTF-8. (See https://en.wikipedia.org/wiki/Knuth's_up-arrow_notation)
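
Quick sanity check in Python:

    len("9↑↑9".encode("utf-8")) * 8   # == 64 (two 3-byte arrows plus two ASCII digits)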


This is much less than the Ackerman value ack(9,9) that is discussed in the article, which equals 2↑↑↑↑↑↑↑12 - 3.


The article's result is larger than that


Awww, rats. I managed to miss seeing Knuth already mentioned in there. :facepalm:


I have no idea offhand how this compares to any of those, but my try would be 0xf090a8802dcf8900, which when interpreted as a UTF-8 encoded C style string is 𐡀-ω, which is the supremum of the set {A₀, A₁, A₂, ...} (Pretend all those 'A's are '𐡀's, because I couldn't figure out how to get a subscript to the right of an 𐡀. The 𐡀 symbol is from a right-to-left alphabet and whenever I'd try to add a subscript it would get moved to the left!).

If 𐡀-ω is too cheesy, my try would be 0xf090a880e2828900, which is 𐡀 subscript 9.


Can you explain what these A₀, A₁, A₂, ... are, and how their supremum is a finite number?


Oh, I didn't realize we were only doing finite numbers. I had thought we just needed numbers, not necessarily finite ones.

The A's are the aleph numbers, just using A instead of 𐡀 because I don't know how to put a subscript on the right side of an 𐡀.

They are the cardinalities of well-ordered infinite sets. Aleph-0 is the cardinality of the natural numbers, aleph-1 is the cardinality of the smallest infinite set that is larger than the natural numbers, and it goes on from there.


Yeah; I was doing finite numbers. Sorry if that wasn't clear.


You just need posits! John Gustafson will teach you how. Amazing concept. It would be amazing to see it in more hardware. It's especially good for AI and numerous other applications. Puts floats to shame. Of course this guy starts with integers, which are great, but for floats posits are amazing especially if you can control the bits dedicated to each 'fraction'.


Their range (excluding infinity) is slightly higher than floats, but still abysmally tiny compared to the BLC encodings used in the article.

PS: I still prefer type 1 unums, due to their fallback to intervals :)


Not really. I don't think you've read his papers. The implementations in hardware lack range, but soft posits can expand to ridiculous ranges, and comparing the math, posits are superior(imo, granted, they're more or less similar) to BLC, depending on how they're implemented, of course.


> posits are superior(imo, granted, they're more or less similar) to BLC, depending on how they're implemented, of course

I'm very confused that you say posits are "more or less similar" to Binary Lambda Calculus. Posits are an inert data encoding: to interpret a posit as a number, we plug its parts (sign, regime, exponent and fraction) into a simple formula to get a numerical value. Those parts can have varying size (e.g. for soft posits), but the range of results is fundamentally limited by its use of exponentials.

In contrast, BLC is a Turing-complete, general-purpose programming language. To interpret a BLC value as a number, we:

- Parse its binary representation into lambda abstractions (00), function calls (01) and de-Bruijn-indexed variables (encoded in unary)

- Apply the resulting term to the symbol `one-more-than` and the symbol `zero`

- Beta-reduce that expression until it reaches a normal form. There is no way, in general, to figure out if this step will finish or get stuck forever: even if it will eventually finish, that could take trillions of years or longer.

- Read the resulting symbols as a unary number, e.g. `one-more-than (one-more-than zero)` is the number two

Posit-style numbers can certainly be represented in BLC, by writing a lambda abstraction that implements the posit formula; but BLC can implement any other computable function, which includes many that grow much faster than the exponentials used in the posit formula (e.g. this has a nice explanation of many truly huge numbers https://www.reddit.com/r/math/comments/283298/how_to_compute... )
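
A rough Python sketch of that recipe (mine, and it cheats on the beta-reduction step: instead of symbolic reduction it maps terms to Python closures, so it inherits Python's eager evaluation and recursion limits):

    def parse(bits, pos=0):
        """00 = abstraction, 01 = application, 1^i 0 = de Bruijn variable i."""
        if bits[pos:pos+2] == "00":
            body, pos = parse(bits, pos + 2)
            return ("lam", body), pos
        if bits[pos:pos+2] == "01":
            f, pos = parse(bits, pos + 2)
            a, pos = parse(bits, pos)
            return ("app", f, a), pos
        i = 0
        while bits[pos] == "1":
            i, pos = i + 1, pos + 1
        return ("var", i), pos + 1

    def evaluate(term, env=()):
        kind = term[0]
        if kind == "var":
            return env[term[1] - 1]                      # index 1 = innermost binder
        if kind == "lam":
            return lambda x: evaluate(term[1], (x,) + env)
        return evaluate(term[1], env)(evaluate(term[2], env))

    # Church numeral 2 = λf.λx. f (f x), encoded as 00 00 01 110 01 110 10
    two, _ = parse("0000011100111010")
    print(evaluate(two)(lambda n: n + 1)(0))             # prints 2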


That's fun. I understand all of that and exactly where the error lies. But you don't. And I'm not gonna tell you. :)


Posits would have been good if it was invented back when there was no standard floating point format. But it is not a tremendous upgrade from IEEE 754, which has lots of issues but so do posits. A customizable IEEE 754 would fare more or less equally to customizable posits, and AI workloads (especially for inference) shy away from floats nowadays because they need much finer control than what either IEEE 754 or posits can offer.


All you gotta do is look at the graphs to see posits are more accurate and precise for pretty much any given scenario, assuming the bits are distributed 'wisely'. Unfortunately, it's not easy to do in hardware and the hardware implementations suffer from rigidness.

I don't really want to get into the nitty gritty, as John will answer emails regarding this stuff. I've personally done so, and he's very polite and informative. I was using them for fractals, but using them in software, which was unfortunately very slow, but the results were amazing. I've read through his papers on them and it took me a while to really 'get it', but I did and oh man, even basic unums put floats to shame. While perhaps not a tremendous upgrade, I much prefer the distribution and accuracy and how there's far less overlap, NaNs, infinities, etc.


I don't think you need to explain everything again because I am aware of posits' strengths over IEEE 754, that's why I acknowledged that first. But as an incremental improvement from IEEE 754, posits are just not enough to justify the switch. When people need something better served by posits, they don't use posits---they use non-standard variants of IEEE 754 (e.g. FTZ/DAZ, bfloat16).


I guess we just disagree. I find the distribution of posits very logical, and I can see your points, but I digress.


Meaningless question. If you allow arbitrary number formats you can just define 1 to be an arbitrarily large number.


This is pedantic. Even mathematicians - famous pedants - embrace the subjective concept of "non-trivial" answers.


To be pedantic, calling it pedantic is pedantic.

If the number representation is encoded outside the 64 bits then you have removed the 64 bit restriction. Of course it is hard to calculate how many bits of information are required to define the type. But "uint64" is pretty small and just requires our current human context (quite a few more bits of information) to make sense!


> If the number representation is encoded outside the 64 bits then you have removed the 64 bit restriction.

The explanation for how to interpret the 64 bit string is always outside of the 64 bit string. It's going to be tough to compare how big two explanations are since that's going to depend a lot on what each person considers to be a sufficient explanation.

I'm personally quite rusty on lambda calculus, and from glancing at the author's papers about lambda calculus encodings I suspect it will take a rather large number of bytes to represent an explanation that will make it through my thick skull!


But the lambda calculus is far from an arbitrary format. It's about the oldest and least arbitrary formalism for computation.


Perhaps the lambda calculus proper is the oldest, but the choices made in encoding the binary lambda calculus are somewhat arbitrary: for instance, why use de Bruijn indices for symbols, instead of some other scheme, such as the index of the lambda in the string? And why encode de Bruijn indices in unary, and not in any other 1-prefixed self-delimiting format? And why use a self-delimiting format in the first place? Obviously, these choices are quite reasonable and make for a simple implementation, but they're totally distinct from the formalism that Church first described. (And for that matter, Church originally only used lambda notation in the context of an overarching logical formalism that later got struck down.)

Indeed, if we look at Church's paper, we'd find that the term is 80 symbols written fully, {λt[{{{{{t}(t)}(λhfn[{{{n}(h)}(f)}(n)]]])}(t)}(t)}(t)]}(λfx[{f}({f}(x))]]), or 44 symbols when abbreviated, {λt t(t, (λh λf λn n(h, f, n)), t, t, t)}(λf λx f(f(x))), which aren't too impressive given the alphabet of 11 or 12 symbols.


> why use de Bruijn indices for symbols, instead of some other scheme, such as the index of the lambda in the string?

It's pretty natural to use de Bruijn indexes, since their meaning is local and their composition matches the language semantics (they have a denotational semantics, which gives expressions an intrinsic meaning). For example, with de Bruijn indices the expression `λ0` is always the identity function. If we tried a different scheme, like "the index of the lambda in the string", then expressions like `λ0` would have no intrinsic meaning; it depends on what "string" they occur in, which requires some concrete notion of "the string", which complicates any interpreter, etc.

Whenever I see an implementation of lambda calculus (or some derivative) which doesn't use de Bruijn indexes, I require some justification to convince me why not :) (There are such justifications, e.g. performance optimisations, use of a host language's naming facilities, etc.; but I can't think of any reason a self-contained, minimalist approach should avoid them.)

> why use a self-delimiting format in the first place?

There are various technical reasons to prefer self-delimiting languages when working in algorithmic information theory. For example, prefix-free (binary) encodings are paths through a (binary) tree, with programs as leaves. This lets us assign well-defined probability measures to the set of all programs (assign 1 to the root node, and divide each node's probability between its (two) children). It's also more natural when programs are executed before they're fully known; e.g. reading one bit at a time on-demand (say, generated by a coin toss)
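
A tiny illustration of that measure (my own toy example): each program in a prefix-free set gets probability 2^-length, and the Kraft inequality keeps the total at most 1.

    progs = ["0", "10", "110", "111"]             # a toy prefix-free code
    print(sum(2.0 ** -len(p) for p in progs))     # 1.0; never exceeds 1 for a prefix-free set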


Yeah, I know that de Bruijn indices and self-delimiting formats make a lot of sense, and are probably among the most natural options. My questions were mostly rhetorical; my main point is to challenge the idea that BLC and its transition function in particular should be considered as a fundamental measure of complexity, since it clearly incorporates a fair few design decisions external to the lambda calculus itself.


> but the choices made in encoding the binary lambda calculus are somewhat arbitrary

I made the simplest choices I could that do not waste bits.

> And why use a self-delimiting format in the first place?

Because a lambda term description has many different parts that you need to be able to separate from each other.

> And why encode de Bruijn indices in unary

I tried to answer that in more detail in this previous discussion [1].

[1] https://news.ycombinator.com/item?id=37584869


"18 Billion Trillion" is the approximation I use when referring to the absurdly large "standard" subnet size of /64 in IPv6


Only off by a factor of 1000


Note that 'billion' and 'trillion' themselves have multiple meanings [1] [2], even though most people use only the short scale.

[1] https://en.wikipedia.org/wiki/Billion

[2] https://en.wikipedia.org/wiki/Trillion


So maybe "18 thousand million billion"


You know you've been working with computers for a while when reading the title immediately evokes 18446... in your mind.

I was expecting the article to at least mention Kolmogorov complexity, or perhaps the demoscene (there are, surprisingly enough, 8-byte entries there.)


I mention Kolmogorov complexity in the form of "shortest description lengths" at

> This property mirrors a notion of optimality for shortest description lengths, where it’s known as the Invariance theorem:

with the latter linking to https://en.wikipedia.org/wiki/Kolmogorov_complexity#Invarian...


I know the purpose of this article, but let me be "that person"---how would you do arithmetic? Or compare them? ;-) The proposed scheme is essentially a very compact version of constructive real number, which has a well-known caveat of not always being comparable to other numbers (because two real numbers can be made arbitrarily close to each other without being equal). It would be interesting to design a constructive real number format that can avoid those pedantries.


The standard way of comparing these unfathomably huge numbers is by placing them in the Fast Growing Hierarchy. In fact, the article does this with both wCubed and Graham's number to show that the former is much larger.


I mean, as others pointed out, a number format typically needs much more than just a representation. If there is a bit-size limitation, any operation has to round any excess back into the limit. A computational representation based on Turing machines or lambda calculus---again, which is not very different from a constructive real number in usage---does not provide an easy way to do that. That wouldn't have been an issue if there was no bit-size limitation.

Okay, let's ignore arithmetic and just allow comparison. As you've said, a common practice is to normalize it into some standard notation with a well-founded ordering. But there is no mechanical way to convert (or even bound) a computational representation to such notation---the general approach is therefore to compute a difference and check its sign. Not really good when it can continue even after the heat death of the universe...

Frankly speaking, I rather expected to see some improvement over Level-Index number systems [1], but it turns out that this post is completely unrelated to number formats. Otherwise it is good, hence my mild frustration here :S

[1] https://www.mrob.com/pub/math/altnum.html#li_sli


I will go with 0-1 in whatever numerical system you have implemented, assuming only positive numbers.


Well if you can redefine what the bits mean you can make any number. "let the value of the 8 bytes be tree(n), where n is the unsigned integer value of the bytes". There, I just blew the article's big number out of the water.


See my reply to the top comment. Cheating makes it uninteresting.


Well, you can certainly count on HN commenters to deliver the most pedantic, point-missing "ackshually" comments to articles like this!


The article feels like an excuse to talk about six different tangentially related topics at the same time. That's not a criticism of the article -- it's like listening to someone's wandering train of thought out loud, which is equally as interesting as the thoughts themselves. The article even ends with an ellipsis. I think it's a fine article, but the mental wandering could have gone any other way too.

So I'm confused about what point you think people here are missing. There basically wasn't a point to the article, so people are either adding their own thought-wandering musings (totally appropriate, given the style of the post), or they're responding to the title, which is the only time the author really attempts to actually make a clear point. But the title of the article is flawed, and the author's musings do not adequately address the title. That's fine to point out. Nobody is missing any points.


The point of the article is that lambda calculus makes for a nicer Busy Beaver function that shows its particular strength (of surpassing the most famous huge number) at the standard 64 bit boundary for representing numbers.





