Two approaches are severely underused in the software world:
1) Domain-specific languages (DSLs)
2) Virtual machines (or just explicit state machines more generally)
What I mean is, a lot of problems could be solved more cleanly, elegantly, safely, and powerfully by using one (or both) of the above. The problem is that when people think DSL or VM, they think big (Scheme or the JVM) instead of thinking small (printf). A DSL or VM doesn't need to be complex; it can be incredibly simple and still be immensely more powerful than coding a solution directly in an existing language using its constructs and APIs.
Case in point: the BSD hexdump(1) utility. POSIX defines the od(1) utility for formatting binary data as text, and it takes a long list of complex command-line arguments. The hexdump utility, by contrast, uses a simple DSL to specify how to format output. hexdump can implement almost every conceivable output format of od and then some using its DSL. The DSL is basically printf format specifiers combined with looping declarations.
I got bored one day and decided to implement hexdump as a library (i.e. "one hexdump to rule them all"), with a thin command-line wrapper that emulates the BSD utility version. Unlike BSD hexdump(1) or POSIX od(1), which implement everything in C in the typical manner, I decided to translate the hexdump DSL into bytecode for a simple virtual machine.
The end result was that my implementation was about the same size as either of those, but 1) could be built as a shared library, command-line utility, or Lua module, 2) is more performant (formats almost 30% faster for the common outputs, thanks to a couple of obvious, easy, single-line optimizations the approach opened up) than either of the others, and 3) is arguably easier to read and hack on.
Granted, my little hexdump utility doesn't have much value. I still tend to rewrite a simple dumper in a couple dozen lines of code for different projects (I'm big on avoiding dependencies), and not many other people use it. But I really liked the experience and the end result. I've used simple DSLs, VMs, and especially explicit state machines many times before and after, but this one was one of the largest and most satisfying.
The only more complex VM I've written was for an asynchronous I/O SPF C library, but that one is more difficult to explain and justify, though I will if pressed.
1. Every function is a tiny VM already. Every macro is a layer on top of a "compiler" to let you redesign a language. LISP gives much more power, precisely because every program is its own DSL, and all power of the language within that DSL is available. http://www.paulgraham.com/avg.html
The most powerful construct I've seen is a combination of these two: Clojure's core.async is a macro which transforms any program into state machines. I think that kind of power gives you a 20-30x advantage in business-type applications. In fact I've seen C++ programmers spend a decade discovering similar things by themselves. I strongly believe everyone should know at least a little about LISP.
I whole-heartedly agree with your points. The ability to transform programs programmatically is extremely powerful. But, FWIW, I tend to see that as a form of generic programming.
When I think DSL I think regular expressions (especially PCREs) or SQL, where the syntax is tailor-made for the specific task at hand.
The problem with in-program transformations is that you're still largely bound to the syntax of the host language. That's particularly the case with s-expressions. That binding has amazing benefits (e.g. when auto-transforming programs into resumable state machines) but also puts constraints on the language syntax you can employ. That's not a problem when you're dealing with experienced engineers and devising technical solutions, but people tend to stay away from DSLs generally because they fear the burden imposed on others (including their future selves) of having to learn a new language, how to use it, and how it integrates within a larger framework. You minimize that burden and make DSLs more cost-effective by maximizing the expressiveness of the DSL in the context of its purpose, and you minimize switching costs by making the delineation between the host language and the DSL clear and obvious.
So, for example, in concrete terms you'd generally implement arithmetic expressions using infix notation. You can sorta implement infix notation in LISP, but you still have the s-expression baggage (such as it is; it's obviously not baggage from the expert's standpoint), which among other things makes it difficult to know where the host language ends and the DSL begins.
Lua had a head start on most popular languages with its LPeg PEG parser, which makes it trivial to parse custom DSLs into ASTs. For all the limitations of PEGs[1], they're just amazingly powerful. But to drive home my previous point: while expert LPeg users use the Snowball-inspired pattern where you instantiate PEG terms as Lua objects and build grammars using Lua's arithmetic operators (with operator overloading, "a * b" means concatenation instead of multiplication, and "a^1" means repeat 1 or more times), newbies tend to prefer the specialized syntax which uses a notation similar to the original PEG paper and to typical regular expression syntax. (LPeg includes a small module to parse and compile that notation.)
Converting the AST into usable code is another matter, and that's an area where LISP shines for sure. And now that I think about it, the best of both worlds might be a PEG parser in LISP. But I'm completely ashamed to say I haven't used LISP or any LISP derivatives.
[1] Yes, implementing left-recursion can be taxing using a PEG. But the syntax is so intuitive and powerful that the alternatives just can't replace it. Regular expressions are limited, too, but PEGs (or anything more sophisticated) will never replace them, because their expressiveness is just too cost-effective within certain domains. Instead we supplement regular expressions rather than replace them; and if PEGs continue catching on I expect PEGs to be supplemented rather than replaced.
(The Ragel stuff is for parsing the SPF records and is a rather small detail. I always intended to remove it--it's a dependency that turns a lot of people off--but never got around to it. But, FWIW, Ragel is awesome when you don't mind the build-time dependency.)
Basically what happened was that when I set out to write an async SPF library I found myself constantly refactoring the internal API and devising ever more clever hacks for implementing continuations in C. The biggest problems were:
1) It had to support asynchronous I/O, and specifically non-blocking I/O.
2) I explicitly DID NOT want to use callbacks, because one design goal of the library was that it be easy to integrate into other code and frameworks, regardless of event loop and even without an event loop. I had recently implemented a non-blocking DNS library, after years of using and hacking on ADNS and c-ares had taught me to hate callbacks in C.
Instead, you repeatedly call spf_check() until you get something other than EAGAIN. When you get EAGAIN, you use spf_pollfd(), spf_events(), and spf_timeout() for the polling parameters. (The file descriptor can change dynamically, e.g. if the DNS query had to switch to TCP from UDP. The events can be POLLIN or POLLOUT, though usually POLLIN, and the timeout is necessary because DNS send/recv timeouts can be much smaller than the timeout for the whole SPF check operation.)
3) SPF is recursive--policies can include other policies, which can include other policies. So you're already dealing with a logical stack situation, but you can't rely on the language's runtime stack. Also, for each rule there can be an iteration over DNS records with intermediate lookups, and it's especially ugly in C to pause and resume loops with inner function calls.
4) I had never implemented SPF before. I kept running into areas where my approach turned out to be deficient as I became more familiar with the spec. And because of constraints 1-3, this was very costly in time and effort. Were I to implement an SPF library again I probably wouldn't use a VM; instead I'd go back to using a basic state machine, function jump table, and a simpler state stack. But at the time it made sense because it was a more flexible and conceptually simpler approach given the unknowns. Plus, it was an excuse to try out a concept I had long had, which was to implement a VM for non-blocking I/O operations.
So basically what it does is decompose the terms in an SPF policy into logical opcodes. The "ip4:1.2.3.4" term, for instance, compiles into an opcode which pushes a typed address (1.2.3.4) onto the stack, followed by the opcode (OP_IP4) for matching the address at the top of the stack against the address of the connecting host (which is part of the context of an instantiated SPF processor object).
The term "mx:foo.bar", for example, requires querying the MX records for foo.bar, iterating over each host and recursively querying the A or AAAA records, and then matching the IP address as above. The "mx:" term is first emitted like the "ip4:" term--push the domain string, followed by OP_MX. But when the OP_MX opcode is executed it dynamically generates more bytecode to do the query. I don't remember why I did it this way--probably because it seemed simpler/easier at the time.
It works well, but objectively it's over-engineered. At the very least the VM is _too_ general. Nonetheless, AFAIK it's the most comprehensive async I/O SPF library in C that doesn't use callbacks. It only passes ~80% of the SPF test suite, but that's mostly because of the policy parser--IIRC the ABNF grammar for SPF is broken because the proof-of-concept was implemented using a naive regular expression to chop up the string (the ABNF translation of the pattern is ambiguous); and I never bothered or needed to handle some of the corner cases necessary to match the behavior of, e.g., libspf2. But my implementation handles many millions (if not billions) of queries every day, as it's used in several commercial and cloud products.
Thanks for responding. Great explanation! I agree that adhering to the SPF standard has diminishing returns as support is added for the more esoteric macros. Support for every single macro in the standard is almost certainly unnecessary.
I've seen a simple recursive descent parser to process SPF records, but never considered it could be done with a VM or a stack-based machine. Even with the DNS Lookup Limit, it appears you made the right choice to avoid the call stack for a fully async I/O interface.
Your coding style is simple, but since I've never studied a VM before I have trouble telling what's going on. You've inspired me to learn more!