Generators are a fairly natural way to write recursive descent parsers. The alternatives are either to parse everything and return it in one big structure (which can be awkward for large documents) or to supply Visitors (which works, but often doesn't match your mental model).
It's nice to be able to write "a lexer just yields a stream of tokens" and "expr ::= term op expr" as:
function* expr() {
  const x = term();             // parse the left-hand term
  const operator = op();        // parse the operator
  const y = expr();             // recurse for the right-hand expr
  yield apply(operator, x, y);  // yield the combined node
}
Backtracking takes a little setup, but overall it's a very elegant way to write code.
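For example, one minimal way to set that up (just a sketch; the shared token array, the pos cursor, and the parse functions named below are hypothetical) is to snapshot the cursor before trying an alternative and restore it if that alternative fails:

// Hypothetical shared lexer state: a token array and a cursor into it.
let tokens = [];
let pos = 0;

// Run a parse function; if it throws (parse failure), rewind the cursor
// so the caller can try a different alternative from the same spot.
function attempt(parse) {
  const saved = pos;
  try {
    return parse();
  } catch (err) {
    pos = saved; // backtrack
    return null;
  }
}

// e.g. try one alternative, then fall back to another from the same position:
// const node = attempt(parseNumber) ?? attempt(parseIdent);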
Not many people really need to write parsers, but even if you're using a black box from somebody else, it's fairly elegant to consume when it's exposed as an AST generator or result generator.
+1. There is a lot of code that is easier to write in "push style", where the code runs blocking and "pushes" results (to a queue, a callback, or a result array returned at the end), but that is better for the consumer in "pull style", where they can process items as they receive them and do streaming/incremental parsing. Language-supported generator functions/coroutines make it very easy to write the code in push style while still letting callers consume it in pull style.
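A small JavaScript sketch of that (the records generator and the CSV-ish input are made up for illustration): the producer reads like blocking, push-style code where each yield "pushes" one result, while the consumer pulls results one at a time and processes them incrementally:

// Producer: written as straight-line "push style" code; each yield
// hands one record to the caller instead of appending to an array.
function* records(text) {
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue; // skip blank lines
    yield line.split(",");            // "push" one parsed record
  }
}

// Consumer: "pull style" - processes each record as it arrives,
// without waiting for the whole input to be parsed.
for (const record of records("a,1\n\nb,2")) {
  console.log(record);
}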
An example of what you’re describing in a different domain is how we use channels between tasks in FreeRTOS in our embedded firmware. Other tasks just push commands or data into a channel that the consumer can pull from at its leisure (modulo the size of the channel, of course; we have fun constraints!)
In my own lexer in JS, I considered using a generator but instead used a class with an index property. In your experience, what advantages does a generator have over that approach?
The big advantage (and possibly disadvantage) is that the class properties (i.e. the internal state of the iterator/lexer) are implicit, which means you don't need to worry about setting them and restoring them each time. For example, you could write something like this:
function* lex(input) {
  let idx = 0;
  while (idx < input.length) {
    switch (input[idx]) {
      case '"': {
        // Lex a string literal starting at idx, then skip past it.
        const [token, length] = lexString(input.slice(idx));
        yield token;
        idx += length;
        break;
      }
      // case '...': handle the other token kinds here
    }
  }
}
Here, we don't need to worry about saving the state of the index, because it's implicitly "saved" at the yield, and we know we'll end up back on the line after the yield once control is returned to the generator.
In this case, where the only relevant state is the index, that's not much of an advantage, but it matters more when a generator function has more internal state - for example, a parser using a state machine to determine which tokens are valid next.
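As a rough sketch of that kind of thing (the token shapes and the mode values here are invented, not from a real parser), the extra state is just ordinary local variables, and all of it is implicitly preserved across every yield:

// Toy lexer for "key=value" lines; `mode` tracks which token kind is
// valid next, and both `mode` and `start` survive every yield untouched.
function* lexPairs(input) {
  let mode = "key";   // expecting a key or a value?
  let start = 0;      // where the current token began
  for (let i = 0; i <= input.length; i++) {
    const ch = input[i];
    if (mode === "key" && ch === "=") {
      yield { type: "key", text: input.slice(start, i) };
      mode = "value";
      start = i + 1;
    } else if (mode === "value" && (ch === "\n" || ch === undefined)) {
      yield { type: "value", text: input.slice(start, i) };
      mode = "key";
      start = i + 1;
    }
  }
}

// [...lexPairs("a=1\nb=2")] yields key "a", value "1", key "b", value "2".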
That's really cool. Do you have any further, more detailed examples of writing such a parser using generators? It's quite a coincidence, because I've been learning about this topic recently.