Syntax Design (2014) (lmu.edu)
220 points by memorable on Oct 18, 2022 | 72 comments



Very nice little article! Learned some new terms.

To anybody dabbling as I do in syntax design, who may be looking for an extremely minimal representation for trees (even more minimal than S-exprs!) I would like to introduce my little project called Jevko: https://djedr.github.io/posts/jevko-2022-02-22.html

It is pure distilled treeness. Its grammar fits into one line, if compressed well:

  Jevko = *("[" Jevko "]" / "`" ("`" / "[" / "]") / %x0-5a / %x5c / %x5e-5f / %x61-10ffff) 
This took me years of syntax golfing to figure out. I think it's turned out pretty nice. It's complete, formally defined, with a few simple parsers written, except it has no users. ;D
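
To illustrate, here is one way a minimal recursive-descent parser for that grammar could look in TypeScript (just an illustrative sketch, not the exact code from the linked repos; a real implementation would also track line/column for error messages):

  // Sketch only: a parse tree is a list of (prefix, subtree) pairs plus a suffix.
  type Jevko = { subjevkos: { prefix: string; jevko: Jevko }[]; suffix: string }

  const parse = (src: string): Jevko => {
    let i = 0
    const jevko = (): Jevko => {
      const subjevkos: { prefix: string; jevko: Jevko }[] = []
      let text = ''
      while (i < src.length) {
        const c = src[i]
        if (c === '`') {                 // escape: ` followed by `, [ or ]
          const e = src[i + 1]
          if (e !== '`' && e !== '[' && e !== ']') throw Error(`invalid escape at ${i}`)
          text += e; i += 2
        } else if (c === '[') {          // nested Jevko
          i += 1
          subjevkos.push({ prefix: text, jevko: jevko() })
          if (src[i] !== ']') throw Error(`missing ] at ${i}`)
          text = ''; i += 1
        } else if (c === ']') break      // closes the enclosing Jevko
        else { text += c; i += 1 }       // plain character
      }
      return { subjevkos, suffix: text }
    }
    const result = jevko()
    if (i < src.length) throw Error(`unexpected ] at ${i}`)
    return result
  }
For example, parse('greeting [hello] name [world]') yields two subjevkos whose prefixes are 'greeting ' and ' name ', each holding a subtree whose suffix is the bracketed text, with an empty top-level suffix.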

To relate back to the article, an interesting and AFAIK original feature of this syntax is that newlines or other whitespace are neither significant nor insignificant nor "possibly significant" in Jevko. I'd call it whitespace-agnostic. Various whitespace rules can be laid on top of it, producing for example a Lisp-like language with native multiword identifiers with spaces, e.g.:

  define [sum primes [[a][b]]
    accumulate [
      [+]
      [0]
      filter [
        [prime?]
        enumerate interval [[a][b]]
      ]
    ]
  ]
here "sum primes" and "enumerate interval" are two double-word identifiers. It's the only right_solution to the identifierWars, I-tell-you!


I thought “what a weird name”, then silently pronounced it and my Polish ear heard “drzewko”, meaning “little tree”. What a fitting name. :)


:)

For a long time I couldn't find the right name that would express the generic nature of it.

An earlier prototype was called TAO, as an acronym for Tree Annotation Operator (it had an extra feature called operators), and as a reference to the ancient Chinese concept, in essence nameless and by design hard to pin down -- this seemed to fit perfectly.

However, there are about 2^42 cowznofski potrzebie things called TAO (kind of ironic, as the original idea was that the Tao would be distinct from the countless named things), so it turned out to be a bad name after all. So I decided to find a more unique one and here we are.

The amount of time spent thinking about this and the lengths I went to are better left untold.

In other words, naming is hard.


I just want to tell you that Jevko is quite a fun little language you've built. I just threw together a parser for it in TypeScript (I'll upload it to Github eventually haha). It was super easy. I'm going to recommend it to my friends who aren't as excited about parsers and things as me for an introduction to parsing programming languages.


Thank you, I'm very happy to hear that. :)

Ease of implementing is a goal, so that there is minimal friction when porting into any language.

This reminds me that I should finish this tutorial: https://github.com/jevko/tutorials/blob/master/parser.md

I hope you can get your friends excited! ;)

I'd really love to see folks put Jevko to use!


This reminds me of Breck Yunits' Tree Notation (https://treenotation.org/). Both seem to have a ~totalizing energy. Maybe some common cause. :)


Indeed, it's close. Obviously mine and Breck's levels of appreciation for indentation/brackets are very different. ;) Although independent, the paths we have taken to arrive at these are somewhat similar (somewhere early in there are experiments with visual programming). As are the tools of thought (minimalism). We were thus taken to similar places.

Before I was aware of the existence of Tree Notation I put my syntax online at tree-annotation.org (now defunct), so even naming converged. I was initially very confused myself. :D

Ultimately I think that the existence of multiple incarnations of this idea suggests that there is (perhaps a very niche) need for a minimal syntax like this. Something like S-exps, but general-purpose. Trying to satisfy that need is the common cause.

The way I imagine it is that it would be supported across programming languages, like JSON. It could be a universal format for (tree) structured data. There is this piece of the Unix philosophy which says that text streams are the universal interface. That's true on a certain level. On another level not far below binary streams are the universal interface. On another level not far above... there was nothing universal until XML. But that was overkill, so JSON displaced it. But that's still overkill, so...


I agree that it feels like multiple projects are converging on something that is ripe (or close).

I have done some deep digging into markup languages and come across more than one project in this space. (I've added Jevko to my list; https://twitter.com/abathur/status/1582492437984837632)

You may have already seen it as well, but you might also find https://github.com/teamtreesurf/link interesting.


Nice list! I haven't seen any of these before.

Link Text is particularly ~~spooky~~ interesting to me, as it quotes Tao Te Ching. :O

See https://news.ycombinator.com/item?id=33251715

Maybe it should become recommended reading for minimalist syntax designers. :D

The spirit that lives in the computer must be the same one as the one in the Chinese classics.

The universal interface below text streams was also somewhat inspired by one:

https://en.wikipedia.org/wiki/Binary_number#Leibniz_and_the_...

Pretty fun to think about.


So is it a syntax for n-ary trees? Nice!

  newtype Jevko = Jevko ([(String, Jevko)], String)


Indeed that's a useful way to look at it and a type definition that'll do the job of storing Jevko parse trees.

See also: https://xtao.org/blog/rose.html


This is very beautiful, nice work. I wonder if I should use it for something...


Thank you. :)

> I wonder if I should use it for something...

I'd be honored!

A couple of ideas:

How about a simple configuration format? https://gist.github.com/djedr/681e0199859874b3324eaa84192c42... (I should make a library out of this)

Or you can put it in your query strings to make them more humane: https://github.com/jevko/queryjevko.js

Or make up a markup DSL: https://github.com/jevko/markup-experiments#asttohtmltable

Or serialize game objects in your indie game. Or make it the interface of your experimental app. Or use it to shave a few unnecessary characters off your data: https://jevko.github.io/compactness.html

No parser in your favorite language? A basic one should be only a couple dozen lines! https://github.com/jevko/parsejevko.js / https://github.com/jevko/specifications/blob/master/spec-sta...


Could you write that example as an s-expression? I can't quite parse it.


Sure. This is translated straight out of SICP 3.5.1[0]:

  (define (sum-primes a b)
    (accumulate +
                0
                (filter prime? (enumerate-interval a b))))
If you have an S-exp parser built into your optical nerve (to paraphrase L. Torvalds), it will indeed be hard to parse.

The translation rules for an S-exp are roughly:

1. switch to square brackets

2. surround all symbols except the first in brackets

3. move the first symbol outside the opening bracket

4. replace dashes in symbols with spaces

5. leading and trailing whitespace doesn't count as part of a symbol.

That's basically it.

Besides that, indentation and closing brackets are written C-style rather than Lisp-style.

Now the real confusion might begin if you have an S-exp with an S-exp as the first element like:

  ((fnr) 1 2)
This would translate to:

  fnr[][[1][2]]
which is... interesting.
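
If it helps, the rules above can be sketched in TypeScript roughly like this (illustrative only; the names are made up, and whitespace/indentation is simply omitted, which Jevko doesn't mind):

  // An S-exp is a symbol (string) or a list of S-exps.
  type SExp = string | SExp[]

  const toJevko = (e: SExp): string => {
    if (typeof e === 'string') return e.replace(/-/g, ' ')   // rule 4: dashes become spaces
    const [head, ...rest] = e
    const args = rest
      .map(x => typeof x === 'string' ? `[${toJevko(x)}]` : toJevko(x))   // rule 2
      .join('')
    return `${toJevko(head)}[${args}]`   // rules 1 + 3; a nested head gives the fnr[]... case
  }
Running it on the sum-primes S-exp above reproduces the bracketing of my earlier example, just without the whitespace.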

Anyway, this is just a theoretical example to demonstrate the whitespace-agnosticism. I have not implemented this particular language. But it's one way to arrange these trees into a Lisp-like language, if one was so inclined.

The cool thing is the simplicity and identifiers with spaces. What's not cool, besides unfamiliarity, is the verbosity compared to regular S-exps (more brackets).

[0] https://mitp-content-server.mit.edu/books/content/sectbyfn/b...


I like the idea of [] rather than {} or ().

Saves using the shift key!


Glad you like it. That's the intention. ;)


I find the section on "syntactic salt" interesting:

> The opposite of syntactic sugar, a feature designed to make it harder to write bad code. Specifically, syntactic salt is a hoop the programmer must jump through just to prove that he knows what’s going on, rather than to express a program action.

This is perhaps an uncharitable way to describe it, but the concept does ring a bell. Rust's unsafe {}, C++'s reinterpret_cast<>(), etc - all slightly verbose. More important than jumping through hoops, the verbosity helps when reading code to know that something out of the ordinary is going on.


And then there's Perl's "syntactic syrup of ipecac".

https://en.wikipedia.org/wiki/Syrup_of_ipecac


This reminds me of React's "dangerouslySetInnerHTML" prop.

https://reactjs.org/docs/dom-elements.html#dangerouslysetinn...


> syntactic salt

I feel like this describes Rust's lifetime annotations pretty well too.


I disagree about lifetime annotations being syntax salt. At least with my interpretation of what syntax salt is.

From a syntax perspective, lifetime annotations are almost as short as possible, assuming you want to explicitly convey this information at all ('a is just two symbols, and one of them is the identifier). The alternative of not specifying it at all comes with major tradeoffs in either memory safety (as in C) or runtime performance (as in most programming languages with dynamic memory management). In theory there is a third option of the compiler fully deducing lifetimes, but that's far from trivial, has its own costs, and would realistically narrow down even further what programs the compiler considers valid while increasing compilation time.

There are strong similarities with typing strategies. Just because there are programming languages with dynamic typing doesn't mean that explicit static typing is salt. Dynamic typing has a performance cost, and inferred static typing has worse self-documentation properties and a slightly bigger compilation-time cost, since you can't process each function independently.

On the other hand, reinterpret_cast<Foo*>(expr) doesn't provide the compiler with any extra information that couldn't be conveyed with simpler, less verbose syntax like R(Foo*)expr or (expr as Foo*). Same with unsafe{} blocks: the compiler already knows which operations are unsafe.


Not just lifetimes, but also the types used to handle more complex lifetimes that other languages normalize away, like RefCell, Arc, etc.


What I can't stand about Rust is that the language developers think they know better than language users developing software. They stack on syntactic salt to make it more unpleasant to write the equivalent of correct C++ programs with aliased mutation or manual freeing, in situations where the idiomatic Rust ways are also unpleasant to write (Cell, RefCell, raw pointers), have runtime overhead (RefCell), are busywork to implement in working programs (restructuring your entire program around an ownership tree with only ephemeral stack-allocated cross-linking &mut, which is only sometimes possible without reducing performance or increasing memory use), or are easy, tempting, and undefined behavior in Rust but not C++ (casting *mut to &mut in an unsafe block in a safe function).


This is not an accurate characterization of the Rust language developers. Neither of these features was designed as "syntactic salt"! They are compromises, made on a time budget to achieve goals which were higher priority for the project, but the door is still open to improve them. This is a far cry from "knowing better than language users," which implies that they could have simply left that syntax out while still achieving their goals. (Or worse, that their goal was specifically to annoy people...?)

For instance, they are not satisfied with the current raw pointer syntax either, as it interacts poorly with lvalue/place syntax in ways that make unsafe code harder to audit. There are regular proposals for how to improve things like `(*ptr).field` or the use of raw pointers as method receivers.

The situation with interior mutability is similar: compile-time memory safety inherently requires some limitations on programming style, but I regularly see proposals for how to improve "field projection" syntax.

> undefined behavior in Rust but not C++ (casting *mut to &mut in an unsafe block in a safe function).

The question of "syntactic salt" aside, this is simply false.


Casting *mut to &mut in a safe function is unsound, and UB if the result is used alternatingly with an earlier &mut to the same object (I've seen this in a library I tried using). In C++ casting a * to a & is sound, and casting a * into a __restrict & might be unsound, but restrict is so rarely used that it doesn't matter, whereas safe Rust nearly requires using &mut for mutating through a pointer.

As for "compile-time memory safety inherently requires some limitations on programming style", I find compile-time lifetime safety to be a tradeoff, and often a net negative in not only performance but ease of programming for low-level code maintained by the same individual preserving a "theory" of the code over time (whereas I don't find compile-time bounds checking or thread safety to be a net negative to programmer experience nearly as often). And when I see people on crusades to stop programmers from writing code in unsafe languages (taking away programmers' ability to opt out of this tradeoff), I will stop at nothing to oppose these people.


> Casting *mut to &mut in a safe function is unsound, and UB if the result is used alternatingly with an earlier &mut to the save object

This is a very different statement than your first comment, and still wrong. Casting *mut to &mut is not, itself, unsound or UB- it is, for example, how `Mutex` and `RefCell` are implemented in the standard library!

Alternating &muts is of course UB, but that has nothing to do with raw pointers; that's just the definition of &mut, however you obtain one. Even if it were not strictly UB, it would still need to be forbidden, because thread safety and library code (like `Mutex`) rely on it.

> I find compile-time lifetime safety to be a tradeoff

Yes, this is exactly what I said in the first place- my further point was that the other side of the trade-off is not fixed, and the ergonomics of raw pointers and interior mutability can, have, and will continue to improve over time.

Or in other words, the Rust language developers are not "on a crusade to stop programmers from writing code in unsafe languages." Rather, they looked (and continue to look) for ways to expand the space of programs that can be written in safe languages. This is an important distinction, and you appear to be causing yourself a lot of grief ("stop at nothing"? really?) by conflating one with the other.


I'll reconsider Rust as a low-level language once improved raw pointer syntax becomes a reality. And regarding the crusade, I wasn't targeting Rust's creators (I think safe programs are currently too limited and unsafe too difficult to call "replacing C++" a success, but I can pick an alternative low-level language when needed), but:

> My ire is directed at people like pcwalton who argue that it's wrong to design non-watertight-safe languages like Zig at all for application development, and mjg59 who argue unsafe language creators should be held legally liable for users harmed by programs written in those languages.


> Casting mut to &mut in a safe function is unsound, and UB if the result is used alternatingly with an earlier &mut to the save object (I've seen this in a library I tried using). In C++ casting a to a & is sound, and casting a * into a __restrict & might be unsound but restrict is so rarely used that it doesn't matter, whereas safe Rust nearly requires using &mut for mutating through a pointer.

I'm not 100% sure I know what you're trying to say here due to the autoformatting doing a number on your pointers, but if I'm correct, you're unhappy that being able to cast a raw pointer to a reference in Rust is unsafe when casting a pointer to a reference is sound in C++. What do you think should happen if the pointer is null? I don't think this is sound in C++ either; this simple example[1] segfaults for me, and I imagine that it's undefined behavior and that even a segfault isn't guaranteed. In Rust, you can call this method[2] to convert a raw mutable pointer to an Option of a mutable reference, which will contain None if the raw pointer is null. If you absolutely don't want to check that it's null, you can always unwrap that. Yes, `unsafe { foo.as_mut().unwrap() }` is more verbose than `foo as &mut _`, but it doesn't really strike me as controversial that the syntax is being optimized for other uses.

> As for "compile-time memory safety inherently requires some limitations on programming style", I find compile-time lifetime safety to be a tradeoff, and often a net negative in not only performance but ease of programming for low-level code maintained by the same individual preserving a "theory" of the code over time (whereas I don't find compile-time bounds checking or thread safety to be a net negative to programmer experience nearly as often).

I think that's a fair opinion to hold, even though I don't happen to agree with it personally.

> And when I see people on crusades to stop programmers from writing code in unsafe languages (taking away programmers') ability to opt out of this tradeoff, I will stop at nothing to oppose these people.

This is where you lose me. Developing a language with verbose syntax for casting a possibly null pointer to a type that assumes it's not null is worth "stopping at nothing to oppose"? This sounds like you're the one going on a crusade to stop people from using languages you don't like. I don't really see why you'd assume that the people who develop Rust are the same ones who are evangelizing it in a way you don't like, but I guess that it doesn't matter if that's accurate or not if you're going to stop at nothing.

[1]: https://godbolt.org/z/fvfc8cGda [2]: https://doc.rust-lang.org/std/primitive.pointer.html#method....


Rust is a language with advances and flaws compared to C++ (and presumably C too). I think Rust made some bad decisions to the point I would avoid using it for certain classes of software (generally low-level code, or emulators with complex shared mutability by design). My ire is directed at people like pcwalton who argue that it's wrong to design non-watertight-safe languages like Zig at all for application development, and mjg59 who argue unsafe language creators should be held legally liable for users harmed by programs written in those languages.


"Man tries to hammer in screw. Angry at hammer manufactures for not being screwdriver manufactures"

----

I'll just edit this comment to clarify: Rust does a lot of things really well. Non-RC manual memory management for a tree/graph structure is absolutely not it.

As for the verbose 'unsafe' pointer/memory manipulations, I really don't see the issue. I've written my share and I think it's fine to add a roadblock if you want to shoulder the ability to add segfaults and other issues into a codebase. Additionally, it usually encourages you to encapsulate it into the smallest number of unsafe functions, instead of 'doing it manually' all over a codebase.


“Left-handed person tries to put screw in with screwdriver shaped for right-handed holding. Right-handed people are surprised when left-handed person decides screwdriver isn’t for them and uses something easier instead.”

We saw this for years with C++ and the new-style casts. The principle of making casting behaviours more specific and clarifying the risk of using each of them was fine. In practice, if someone asks programmers to start writing verbose, syntactically awkward stuff like reinterpret_cast<X*>(p) instead of (X*)p and either will work, obviously in the real world many will choose the latter. Empirically, the in-your-face syntax turned out to be a deterrent to adopting the better tools the language offered and so devalued those tools for everyone.


This entire conversation is assuming that `unsafe` in Rust is designed to be syntactic salt to discourage unsafe code, but this is false. The `unsafe` keyword exists to make it easier to understand, review, and audit where memory-safety invariants might ultimately be violated in a zillion-line program, so that instead of needing to carefully consider a zillion lines you only need to consider a tiny fraction of those. It's not there to make programmers' lives harder, it's intended to make programmers' lives easier.


Rust is intended to be a systems language capable of replacing C++ in its niche, and interfacing with existing C++ at a fairly coarse-grained level like Firefox's oxidation (though cxx is trying to enable rich interop passing richer types than C-ABI ones), so it's trying to be a better screwdriver more so than a hammer. So difficulty expressing C++ concepts is arguably a flaw, and difficulty implementing all software with the same CPU/memory overhead as C++ (which I'd argue is the case, though some would disagree) is definitely a flaw.

It's like creating a new screwdriver bit or handle, trying to convert the world to it, then attracting a legion of followers arguing that manufacturing flat-head screwdrivers should expose you to legal liability for anyone who slips it out of the socket and injures themselves (ignoring that flat-head screws existed and will continue to exist).


> More important [...] the verbosity helps when reading code to know that something out of the ordinary is going on

This applies to JS as well with its strict equality check (triple equals). Bad practices within the NodeJS ecosystem, however, have led to circumstances where triple equals has been cargo culted as the "right" thing to do for any equality comparison. The consequences include code that is more verbose, is no more type safe (and often doesn't do the right thing for some inputs where double equals, in contrast, would), and in which the appearance of triple equals is no longer a strong signal that there's something happening that's worth paying attention to.


Can you give some examples of inputs where === doesn't do the right thing?


Suppose you're working on a compiler. You want to add a feature flag or otherwise be able to produce a build that, when run, attaches rich diagnostics information to the emitted tokens to assist in debugging—let's say you're really interested in the source offset where a given token appeared, for example. Unless you've already really gone off the rails in other ways, you can trivially do this just by having the special-mode compiler pass around String objects where your token processing routines expect a string—making sure you define a `position` property at the time of creation. This is possible because your input isn't overconstrained. It's an abstract data type.

Using triple equals, on the other hand, will screw this up. Your inputs are now overconstrained—they must be string primitives. In other words, you've invented an unnecessary requirement <https://www.teamten.com/lawrence/programming/dont-invent-unn...>.
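
To make it concrete, here's a rough sketch (the names are made up):

  // Debug builds hand out String objects (not primitives) carrying extra metadata;
  // downstream code that compares loosely keeps working, strict comparisons do not.
  function makeToken(text: string, position: number): string {
    const tok = new String(text) as String & { position: number }
    tok.position = position
    return tok as unknown as string   // callers still see "a string"
  }

  const tok = makeToken('if', 42)
  console.log(tok == 'if')    // true:  loose equality coerces the wrapper object
  console.log(tok === 'if')   // false: strict equality requires the same primitive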


Wow that's a well thought out and interesting example, thank you. I recently learned about `new String()` and wondered where people might use/get burned by that.


Glad you found it useful. I was worried it would come off as too esoteric/contrived. There are lots of other legitimate, substantive scenarios, but it's generally hard to give examples that have both the intended effect of really resonating with a wide audience and that can also be described succinctly. The tokenizer use case is something I struck upon a few months ago and remains the best thing that reasonably stands alone to work as an example that I've found so far.


I've heard it called "syntactic vinegar" instead.

I wonder if there's a term for syntactic useless stuff, like commas in Clojure quasiquoted lists.


I propose "syntactic konnyaku".

> Konnyaku is a rubbery, flavorless zero-calorie food made of yams that are high in fiber and added to various Japanese foods for its squishy texture.



Wow, amazing! This really is a comprehensive overview. Basically a syntax Tafelwerk (a reference book of tables).


I began designing a language that handled recursion and iteration as relations between variables which are topologically sorted to determine control flow.

Each function is a topological graph of stream functions, so it is similar to a data flow language or reactive programming language. The goal is that you express the critical insight of the algorithm to work out what to write, and the code is not nested, so there is very little tree structure.

Algebralang is rough notes on how it would appear.

https://github.com/samsquire/algebralang

Example programs in the repository are binary search, btrees, a* algorithm.

I wrote a multithreaded parallel actor interpreter in Java, and it uses an invented assembly language which doesn't have a bytecode; it's just text.

https://github.com/samsquire/multiversion-concurrency-contro...

I like the ideas behind ani language https://github.com/waves281/anic


If possible, I'd add indentation to your examples to make them much more readable. As it stands, it's like reading one of my math prof's C code (he was an old coder from the days when FORTRAN was written in all caps, and he never learned to indent):

  insert t node = recursive_deepest_first(items=t.children,item=t,
  lastRecursion=l)(
  if len(t.children) == 0 t.activate()
  location = reversed(t.children).find(item=i, item.value >= node.value)
  output = insert t node
  if len(t.children) > 3 {
  replace(t, Node(value=t.children÷(point=middle=m) 
  = m.value,children.sort(item=i,sortKey=i.value)=t) 
  output = l.t }
  else deepest t.children.append(node)
  )
Assuming this doesn't invalidate the program it reads more clearly and only takes one more line:

  insert t node = recursive_deepest_first(items=t.children, item=t, lastRecursion=l)(
    if len(t.children) == 0
      t.activate()
    location = reversed(t.children).find(item=i, item.value >= node.value)
    output = insert t node
    if len(t.children) > 3 {
      replace(t, Node(value=t.children÷(point=middle=m) = m.value, children.sort(item=i, sortKey=i.value)=t) 
      output = l.t
    }
    else
      deepest t.children.append(node)
    )
That original program was hard to decipher for lack of indentation, the odd line breaks, and the inconsistent choice of space or no space after commas. Another question: why do you use `if ... then ...` in some examples and not in others? Is that a user choice?


Wow thanks for reading my page and looking at the examples. I appreciate your time.

There are a few bugs in that code. Sorry for presenting something that obviously wasn't ready. I didn't use `location` when inserting into that position in the btree. And the node splitting code has an error.

And thanks for reformatting the code.

It's a very rough design. The critical insight of many algorithms is hidden in one character or line, such as a strategic -1 or +1, or a pattern of recursion, that makes it understandable.

I find when writing code that the structure of the code is more important than the calculation or addition or subtraction. Which is surprising, because computers are calculators. The structure of traversal, laying out data in memory, and the structure of jumping around instructions in memory are harder than the core insight of a division, subtraction, addition, append, +1, or an if statement here or there.

When I write recursive code I often want to refer to the outer context of an outer recursion. So that's the meaning of "deepest".


You may be interested in things like Strand and the work on parallel Prologs, which have a similar "let the computer system sort out the proper execution order" approach. This wouldn't satisfy your syntax desire (Algol family) but may help develop your understanding of the problem domain.

A discussion last year: https://news.ycombinator.com/item?id=26948351 (wow, 18 months since that discussion, seemed more recent in my memory)


I actually like the VB-style, but VB did it mostly wrong: if you start the block with X, you should always end it with "End X". Thus, you'd have While ... End While instead of crap like "Wend" and "Next" (in For...Next).

It's more legible to know what block is being ended. C-style continually frustrates me in that area. The End-X style just never found a nice way to wrap text for longer statements.

C-style also has a problem in that there is no way to define arbitrary blocks: it relies too much on key-words. I'm trying to remedy this with "Moth" syntax:

https://www.reddit.com/r/ProgrammingLanguages/comments/ky22d...

It's LINQ-esque but without the bloated Lambda conventions, and influenced by XML in that you have a simple syntax pattern that can "implement" many domains' needs. It started with an attempt to merge the best of Lisp and C-style. (Whether it succeeded or not is hotly contested. I welcome other attempts.)


> I actually like the VB-style, but VB did it mostly wrong: if you start the block with X, you should always end it with "End X". Thus, you'd have While ... End While instead of crap like "Wend" and "Next" (For...).

That's covered in their syntactic salt section, but Ada does, mostly, what you describe.

  procedure Proc(...) is
    -- vars, types, and subprograms defined here
  begin
    ...
  end Proc;

  for I in Some_Array'Range loop
    ...
  end loop;

  if condition then
    ...
  end if;


> C-style also has a problem in that there is no way to define arbitrary blocks

I might have misunderstood what you mean by "arbitrary blocks", but you can definitely do this in C:

  int main()
  {
    {
      /* arbitrary block */
      return 0;
    }
  }


They seem to mean a computational block (or a closure/lambda) that can be passed on to other functions. Try to do this in C (it's invalid as presented, but this is the concept):

  int main() {
    int* collection = ...
    int* filtered = filter(collection, int func(int item) { return item > 10; });
    ...
  }
You have to actually define a function at the top level in order to pass that in, and there is no notion of closures, so you can't do the more useful thing that you might have even in C++ these days:

  int main() {
    int* collection = ...
    int limit = ...
    auto pred = [&limit](auto item) { return item > limit; };
    int* filtered = filter(collection, pred); // assuming `filter` is defined
    ...
  }
You can come close, but you create a lot of extra bookkeeping in your C program to pull it off, and the functions are still only defined at the top-level.


GCC has an extension for nested functions: https://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html#Nes...

Not as nice as anonymous blocks but helps put the function closer to where it's used, which is nice.


Yeah, but the two dominant open source C compilers have chosen different approaches. This makes nested functions (gcc) and blocks (clang) both unusable if you want portable code even between the pair of them (and not generally).


Note that in the for case, you can have many "next" and having many "end" would be silly.


> Because C does not have real arrays

C does have real arrays; they just get implicitly converted to pointers to their first element in a lot of cases (for a multitude of reasons, in part having to do with simplifying the language). A[B] is defined as such so it works with normal pointers and arrays-converted-to-pointers in the same manner.

Try using an array with sizeof, unary &, or in the form of a string literal used to initialise an array. In those situations it suddenly stops behaving like a pointer to its first element and definitely behaves like something which is unlike anything else in C (hint: it's an array).


Fun read. I would love to read about whether it would be different if syntax weren't text-based. Visual programming seems to suffer in composability, and it's also still bound to human language; boxes with borders are hard to comprehend and can get messy easily.

I mean visual, but with a text-like theme. That seems to be the sweet spot: only fix some of the downsides/limits of text.


Shameless plug for my hybrid visual/text pl, Pickcode, which matches what you're talking about

Demo programs: https://app.pickcode.io/playground


Hey, yes! Nice. I'm thinking of something more serious and ambitious. You could go mainstream by adding "modules" and ways to compose. The keyboard navigation is OK-ish (honest opinion), because that's the hardest part when designing for everyday tasks in the long run; at minimum it should be as fast as text-based programming. And at least 3-4 devs should be comfortable working on the same codebase (non-realtime, just a normal version-branching flow).

I really, really want this decades-old idea to take off. We should have grammar files in... JSON is fine (have to start somewhere), and a spec for editor implementors to spread it across platforms. Ideally language creators would only have to customize the "view" to decide how their language would look. They could probably also configure keymaps if they think their language can be developed faster with certain keystrokes (akin to Emacs, but more friendly because the medium is not text anymore).


Wow this is awesome, love this style. I did something really similar with our game's scripting language called MBScript: https://docs.modboxgame.com/docs/mbscript Same kind of line setup, visual add button, etc.

What's Pickcode made for? Web programming?


Pickcode is meant for K-12 education as an alternative to block programming. The end goal is to have a WYSIWYG editor for web apps with behavior defined using the visual programming language.

MBScript looks great and I’d love to talk about your learnings from it. My contact is on the pickcode.io landing page if you want to chat!


You could check out Self. It’s image-based so the objects are “live”, and can be interacted with directly via the UI.


Is it https://www.youtube.com/watch?v=CCx6Nj_Hr1g ? I've seen quite many live programming languages. Though not a single one seems to want to go mainstream, like being able to have a team of at least 3-4 devs work on it.


> Is it https://www.youtube.com/watch?v=CCx6Nj_Hr1g ?

Yes.

> I've seen quite many live programming languages.

I have no idea what that means.


I enjoy this sort of "one example, multiple different lenses" style of discussion. It reminds me of this book... Exercises in Programming Style by Cristina Videira Lopes.


>Exercises in Programming Style by Cristina Videira Lopes

I didn't know of this book (have been looking for something like this for a while) and hence thanks for the pointer.


Another interesting weird syntax I ran across a few years ago is OGDL: https://ogdl.org/

It's sort of an alternative to S-expressions with much less punctuation, but the data model is slightly different — in S-expressions you label the leaves, and in OGDL you label the nodes. In other contexts these node-labeled trees are sometimes called "rose trees"; they are the basic data model of, for example, Prolog. Labeling nodes is almost equivalent to labeling arcs, but OGDL does support multiple references, so not quite.
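
In TypeScript-ish terms, the two data models look roughly like this (my own sketch, names invented):

  type SExpr = string | SExpr[]                       // S-expressions: labels live on the leaves
  type Rose  = { label: string; children: Rose[] }    // OGDL-style rose tree: every node is labeled

  // The "ip" line from the example below, as a rose tree:
  const ip: Rose = { label: 'ip', children: [{ label: '192.168.0.10', children: [] }] }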

The OGDL proposal was intended for data, like XML, not programs. They started out by trying to simplify YAML, which has arrays and dicts, and they simplified it by unifying them into a single structure.

Here's one of their examples:

    network
      eth0
        ip   192.168.0.10
        mask 255.255.255.0
        gw   192.168.0.1

    hostname crispin
This is not quite just an edge-labeled digraph because, as in S-expressions, the order of arcs within a node is significant; you can have multiple edges with the same label in the same node, and you can select edges by ordinal rather than, or in addition to, label.

This is of course amenable to use as a programming syntax.



I wonder what would be possible if we could move away from the text editor and code with an e-pen on a tablet, the way mathematicians use a blackboard, where the parsing is not from left to right, but follows specific paths in two dimensions (like the notations for fractions, square roots, sums, integrals, etc. we learned at high school). -- Has this been tried out recently?


I really like the term "Sugary Functional Style" -- sweetening pure functional programming with faux procedurality.

Looks like it's a (three word) Google Whack at the time of writing: https://www.google.com/search?hl=en&q=%22Sugary%20Functional...


It's awesome to see an article from Dr. Ray Toal on HN this morning! In my biased opinion, the excellent tenured professors at LMU's CS program make it a stand out for its size.


> [mark the end by] .. spelling the initial word backwards

This reminds me of the quirk in shell: if/fi, case/esac, but ... do/done?

It might be because `od` was taken for octal dump.


The `infix` syntax is missing from the major items. Without infix syntax, all languages are just variations of LISP -- I guess that's what the article is about.



