I don't really think of yacc as having much more abstraction than a regex engine...

laxd · on July 2, 2023

You got good theory backing hand-written parsers, and the code is there right in your face. I don't know how a regex engine works.

remexre · on July 2, 2023

By "good theory," I mean being able to know properties of the parser as a result of the formalism.

For example, one can prove that any yacc-generated parser never infinite loops and runs in O(n) (at least for the actual parsing; you can write custom code in actions that runs when a construct has been parsed, and of course you can write whatever you want there).

If you hand-write a grammar like the following, you'll end up with an infinite loop without restructuring the grammar; such restructuring is of course much easier if you have written the grammar out, rather than having it implicitly exist in code.

    Expr ::= Expr '[' Expr ']'
    Expr ::= Name
    Expr ::= Int

If you try to naively parse the following, you'll never parse the marked production either. Something like yacc will warn you about this, while an ordinary compiler compiling a hand-written parser won't.

    Stmt ::= Expr ';'
    Stmt ::= Name '=' Expr ';'
    Stmt ::= Expr '[' Expr ']' '=' Expr ';'
    Expr ::= Expr '[' Expr ']'
    Expr ::= Name
    Expr ::= Int

[0] is a good introduction to regex matching, [1] is a good introduction to the algorithm yacc uses, if you want to learn either.

[0]: https://swtch.com/~rsc/regexp/regexp1.html [1]: https://web.archive.org/web/20210507215636/http://web.cs.dal...

laxd · on July 2, 2023

> For example, one can prove that any yacc-generated parser never infinite loops and runs in O(n)

Trivial when actually writing the code. And your example of infinite left recursion is basics in parser theory. No one in their right mind would write a hanging program like that, at least not for long.

remexre · on July 2, 2023

Left recursion is basic, sure, but it's not a problem for an (LA)LR parser generator like yacc. (Depending on your parse actions, you might even get slightly better memory consumption when using left recursion rather than right recursion.)