Another revelation: lambda calculus can be reduced to 4 primitive operations[0].
I had this revelation after a pilgrimage into the land of Binary Lambda Calculus[1], a binary encoding of lambda calculus that represents variables with numerical (de Bruijn) indices[2] in unary.
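To make the encoding concrete, here is a tiny sketch (the tuple representation and the `encode` helper are just my illustration, not any official tooling): abstraction is 00, application is 01, and a variable with de Bruijn index n is n ones followed by a zero.

  # Minimal sketch of the BLC encoding; terms use 1-based de Bruijn indices.
  def encode(term):
      kind = term[0]
      if kind == "lam":   # abstraction: 00 <body>
          return "00" + encode(term[1])
      if kind == "app":   # application: 01 <function> <argument>
          return "01" + encode(term[1]) + encode(term[2])
      if kind == "var":   # variable n: n ones followed by a zero (unary)
          return "1" * term[1] + "0"

  # The identity, lambda x. x, i.e. lambda 1 in de Bruijn form:
  print(encode(("lam", ("var", 1))))            # 0010
  # lambda x. lambda y. x, i.e. lambda lambda 2:
  print(encode(("lam", ("lam", ("var", 2)))))   # 0000110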
I was more directly inspired by Lisps, but I do prefer the original M-expressions and the syntactic choices that REBOL and Red make.
I think placing the operator before the opening bracket better emphasizes its special significance and can reduce nesting for constructs like `f[x][y]` (vs. `((f x) y)` in Lisps). Square brackets somehow seem more aesthetically pleasing to me. And there is a practical reason to prefer them, especially if your syntax uses only one kind of bracket -- square brackets are the easiest to type on an average keyboard.
So REBOL-like syntax is nicer, as were M-expressions. They probably didn't catch on because they were not minimal enough compared to S-expressions. And maybe because S-expressions were fully implemented first.
What if we could have minimal redundancy and still be capable of error detection?
Check this[0] out:
[
  id [0003]
  type [donut]
  name [Old Fashioned]
  ppu [0.55]
  batters [
    batter [
      [
        id [1001]
        type [Regular]
      ]
    ]
  ]
  topping [
    [
      id [5004]
      type [Maple]
    ]
  ]
]
Now if we delete `id`, we will get a syntax error. And yet: no commas, no colons, no quote marks! Only square brackets. Minimal redundancy.
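One way such a check could work (a minimal sketch of my own, not a spec): within any single bracket, either every subtree has a non-empty name (a map) or none do (a list); deleting `id` leaves a nameless value among named ones, which the check rejects.

  # Rough sketch: parse the square-bracket syntax into (name, subtree) pairs,
  # then require each bracket to be all-named (a map) or all-unnamed (a list).
  def parse(src, i=0):
      pairs, text = [], ""
      while i < len(src):
          c = src[i]
          if c == "[":
              sub, i = parse(src, i + 1)
              pairs.append((text.strip(), sub))
              text = ""
          elif c == "]":
              return (pairs, text.strip()), i + 1
          else:
              text += c
              i += 1
      return (pairs, text.strip()), i

  def check(node):
      pairs, _leaf_text = node
      named = [name != "" for name, _ in pairs]
      if any(named) and not all(named):
          raise SyntaxError("mixed named and unnamed values in one bracket")
      for _, sub in pairs:
          check(sub)

  good, _ = parse("[id [0003] type [donut]]")
  check(good)                                  # passes
  bad, _ = parse("[[0003] type [donut]]")      # `id` deleted
  check(bad)                                   # raises SyntaxError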
For a bit more error checking through redundancy, we could analyze indentation (one reason why I recommend C-style rather than Lisp-style formatting) and warn if we detect any inconsistencies.
Ergo: through design magic we can get rid of a lot of the redundancy and gain desirable properties, without giving up much on the positive side.
As long as we are on topic of JSON alternatives suitable for configuration, I've been tinkering with various minimal syntaxes and formats for a long time, most notably Jevko[0].
One format based on that I've been conjuring up recently would represent the first example from the RON README like so:
GameConfig[ optional struct name
  window size [[800] [600]]
  window title [PAC-MAN]
  fullscreen [false]
  mouse sensitivity [1.4]
  key bindings [
    up [Up]
    down [Down]
    left [Left]
    right [Right]
    Uncomment to enable WASD controls
    ;[
      W [Up]
      A [Down]
      S [Left]
      D [Right]
    ]
  ]
  difficulty options [
    start difficulty [Easy]
    adaptive [false]
  ]
]
I put a syntax-highlighted version and some more details in this gist[1].
I wonder what you guys think about such a minimal alternative.
> With XML, the complexity is the baseline, and it only goes up from there. With JSON, the complexity is just an option, the baseline is pretty simple.
Very well put. And we could lower the baseline substantially towards simplicity, even from JSON.
It's pretty clear that a lot of people think this way. Some even seriously try to figure out what such a baseline of simplicity would look like.
There are lots of simple indentation-based designs (similar to YAML) such as NestedText[0], Tree Notation[1], StrictYAML[2], or even @Kuyawa's Dixy[3] linked in this thread.
There seem to be fewer new ideas based around nested brackets, the way S-expressions are. Over the years, I have developed a few in this space, most notably Jevko[4]. If there is ever another lowering of the simplicity baseline, I believe something like Jevko is the most sensible next step.
Hey, just want to say that BLC is a remarkable artifact -- to me it's an art piece in computational minimalism.
I actually got so obsessed with it last year that I worked out a variant of lambda calculus[0] in which, with some trickery, a port of the BLC interpreter could be squeezed into 194 bits.
Which would be only 2 bits more than the intriguing conjecture from your paper[0] says is optimal:
> We conjecture that any self-interpreter for any binary representation of lambda calculus must be at least 24 bytes in size, which would make E close to optimal.
I wonder what the assumptions behind this conjecture are. Surely my trickery broke some of them.
LAST is an interesting variation that is in essence identical to the oddly named "Real Fast Nora's Hair Salon 3: Shear Disaster Download" language [1].
Instead of L A S T, it names the 4 tokens LAMBDA APP ONE_MORE_THAN ZERO.
I noticed that using two separate tokens for variable handling allows BLC to interpret LAST in only 193 bits.
Still, I suspect that for most programs, the savings from S-optimization do not quite make up for the (n-1) extra bits needed for every occurrence of variable n. What would for instance be the length of the shortest LAST program for the prime number character sequence, which takes 167 bits in BLC?
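To spell out that accounting (assuming each LAST token costs 2 bits in the quaternary encoding, versus BLC's unary 1^n 0 for variable n):

  # Rough bit accounting; the 2-bits-per-token figure is the assumption here.
  def blc_var_bits(n):    # BLC: de Bruijn variable n is n ones and a zero
      return n + 1
  def last_var_bits(n):   # LAST: (n - 1) ONE_MORE_THAN tokens plus one ZERO token
      return 2 * n
  # last_var_bits(n) - blc_var_bits(n) == n - 1 extra bits per occurrence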
> I wonder what the assumptions behind this conjecture are.
I chose 24 bytes because it's a nice round number (3 * 2^3 * 2^3 bits) that sat a seemingly comfortable 14 bit margin below my best effort.
The conjecture assumes a binary input, that must be read bit-by-bit. How long is your LAST self-interpreter with a binary rather than quaternary input?
> The other encodings for variables would certainly increase parsing complexity, so self-interpreters for these BLC variants would be much longer than the original.
Indeed, this is the reason why I don't find those alternatives too interesting. In most practical programs the variable index frequencies are reasonably well approximated by an exponential distribution, for which the very simple unary encoding is optimal.
> In the end, as we relax the restrictions on the encoding of the input, it seems that we can decrease the length of the self-interpreter almost arbitrarily -- down to λm.m(λx.x)(λx.x) [Mogensen] (as pointed out by @sargstuff in this thread) or even λm.m(λx.x) [Brown and Palsberg]
Those are of a very different nature; mine has to tokenize and parse from a binary stream; theirs already has the whole term in a higher order abstract syntax tree.
> While obsessing about this and letting the mind wander I noticed the nice direct correspondence between the quaternary encoding of LAST and the DNA/RNA nucleobases
Yep; I didn't fail to notice that link either :-)
> Now this is a pretty cool coincidence that can provide some individuals with peculiar interests with even more peculiar entertainment, such as going through all possible permutations of AGCT, looking for the longest valid LC program embedded in random mosquito DNA.
The only way to not be valid is to run out of DNA or to not be closed. If you find a long repetition of a single nucleotide and make that L, then the latter is unlikely to happen soon...
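For the peculiarly entertained, here is a rough sketch of that check (my own; the A/C/G/T-to-token assignment is arbitrary -- pick any bijection):

  # Read a LAST token stream and find where a term starting at position 0 ends,
  # failing if we run out of tokens or the term is not closed.
  TOKEN = {"A": "L", "C": "A", "G": "S", "T": "T"}   # arbitrary assignment

  def term_end(tokens, i=0, depth=0):
      if i >= len(tokens):
          raise ValueError("ran out of DNA")
      t = tokens[i]
      if t == "L":                   # abstraction: L <body>
          return term_end(tokens, i + 1, depth + 1)
      if t == "A":                   # application: A <function> <argument>
          return term_end(tokens, term_end(tokens, i + 1, depth), depth)
      index = 0                      # variable: S ... S T, index = #S + 1
      while i < len(tokens) and tokens[i] == "S":
          index, i = index + 1, i + 1
      if i >= len(tokens):
          raise ValueError("ran out of DNA")
      if index + 1 > depth:
          raise ValueError("free variable: not a closed term")
      return i + 1

  dna = "ACAATT"                     # just an example strand
  print(term_end([TOKEN[c] for c in dna]))   # 6: a closed term spans the strand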
> Thank you for coherently writing down your thoughts for anybody to build upon what you have figured out. It's fun, it's inspiring, and it works!
Thanks for all the effort in creating and explaining LAST. I found it intellectually stimulating!
Snarky conjecture: the 7-bit ASCII in-RAM version of "λm.m(λx.x)(λx.x)" is `mxx`, which, if hardware supports bit addressing, only uses 21 bits. The smallest ASCII Lisp is 14 bits though.
The article mentions SICP, aka the wizard book. In it, there are a lot of metaphors pertaining to magic and spirits, which IMO make the book much better than it would be without them. One of the authors of the book, Hal Abelson, famously said:
> There's a good part of Computer Science that's like magic. Unfortunately there's a bad part of Computer Science that's like religion.
This points to a perhaps useful distinction to be made here between religion (as in tribalism, which leads to holy wars) and mysticism/spirituality/magic (as in a deep fascination in search of ultimate truth).
Another example would be Lisp. And yeah generally, fair point.
In Elm, though, it would cause some ambiguity, because function call syntax doesn't require parentheses. For example:
[foo bar baz]
Is this a List with three elements, or is it one element whose value is a function call on foo with the arguments bar and baz? Who knows. Sure, you could have implicit commas on line breaks, but that would be weird.
The line breaks don't change the tree structure; they are inserted purely for readability.
There are no implicit commas or anything.
Each subtree is also potentially a name-value pair. If you insert names before the opening brackets:
[x [foo] y [bar] z [baz]]
You have 3 pairs.
The names themselves are also pieces of text (whitespace is not a separator), so you can have names like:
[my x [foo] your y [bar] their z [baz]]
Still 3 pairs.
Now to turn trees like this into something useful, for example XML/SVG, you'd do a little bit of processing on them, e.g. by transforming the names.
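For instance (just an illustrative convention of mine, not a rule of the syntax), you could turn each name into an XML tag by trimming it and dash-joining its words:

  # Illustrative only: map already-parsed (name, value) pairs to XML elements.
  def to_xml(pairs):
      out = []
      for name, value in pairs:
          tag = "-".join(name.split())     # "my x" -> "my-x"
          body = value if isinstance(value, str) else to_xml(value)
          out.append(f"<{tag}>{body}</{tag}>")
      return "".join(out)

  print(to_xml([("my x", "foo"), ("your y", "bar"), ("their z", "baz")]))
  # <my-x>foo</my-x><your-y>bar</your-y><their-z>baz</their-z>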
Now I intend to write specifications that codify conventions like this into different formats based on this fundamental syntax of square brackets.
It can be useful for all kinds of things. Its advantage is extreme simplicity and flexibility.
BTW, for clarity I have to say that the format I used here: https://news.ycombinator.com/item?id=32995047 does a few more transformations -- it actually sometimes treats whitespace as a separator (e.g. in `svg width[391]` the space is a separator). That allows for extreme conciseness, but is not necessary and introduces complexity.
Ultimately, 0 and 1 is all we need. ;)
[0] https://jevko.github.io/writing/2023-07-23-revelation.html -- the pun in the article (I'm the author) fits perfectly into the OP ;D
[1] https://tromp.github.io/cl/Binary_lambda_calculus.html
[2] https://en.wikipedia.org/wiki/De_Bruijn_index