I have a very well documented implementation of rete, that also produces nice looking SVGs of the state of the rete algorithm here: https://github.com/bollu/rete
It's a really interesting algorithm, and allows one to get O(1) incremental pattern matching (i.e., when one adds a pattern, one is told of a matching pattern in O(1) time) at the cost of O(npatterns * nitems) memory usage. I was trying to use it in the context of pattern matching within a compiler, but it never went anywhere since I had COVID, then a PhD to get to :)
I used Drools Planner for generating a timetable. The rules score the state of domain objects (teachers, periods, etc.), like "10 violations, 15000 points", and the algorithm (simulated annealing in my case) randomly swaps states, backtracking bad swaps, until it becomes "0 violations, 9000 points".
Early in my career, I wrote a LEAPS based rules engine for managing business rules when deciding how to pack orders into boxes across multiple warehouses. As part of it, I also wrote a heuristic for approximating solutions to the 3-dimensional bin packing problem, along with a visualizer for solutions as the warehouse staff did not believe that certain pack orders could "fit".
Among other things, the rules engine would ensure that bowling balls wouldn't be packed in with ceramic dolls, or that goods that could be dangerous if spilled and mixed wouldn't be packed together. Overall, a very fun project for a young programmer.
I've been studying Rete for the past few years, and I've been working exclusively in CLIPS for the past year, a programming language that uses the Rete algorithm at its core. It's a very clever algorithm, and provides a lot of "nice features" that developers find themselves re-implementing in mature code bases. Things like:
- pattern matching
- custom DSL
- caching
I find it difficult to summarize in an "elevator pitch," and I only started seeing things fall into place after trying to use it in earnest on my own. I highly recommend reading the CLIPS documentation posted on the main CLIPS website. The PDFs are long, but quite complete and well written.
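As a rough illustration of what "pattern matching" as a language feature buys you, here is a toy forward-chaining loop in Python. This is purely illustrative and my own construction, not CLIPS: a real Rete engine compiles rules into a network and caches partial matches instead of re-scanning facts like this.

```python
# Toy forward-chainer: rules are (condition, action) pairs over fact
# tuples; we fire rules until no new facts are produced (a fixpoint).
# A real Rete engine avoids the re-scanning below by caching matches
# in the compiled network.

def run(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, action in rules:
            for fact in list(facts):
                if condition(fact):
                    derived = action(fact)
                    if derived is not None and derived not in facts:
                        facts.add(derived)
                        changed = True
    return facts

rules = [
    # IF the temperature exceeds 100 THEN assert an overheat alert.
    (lambda f: f[0] == "temp" and f[1] > 100,
     lambda f: ("alert", "overheat")),
]
facts = run({("temp", 105), ("temp", 72)}, rules)
```

The point of the "nice features" list above is that this loop, its DSL for writing conditions, and the caching a naive version lacks are exactly what mature code bases end up reinventing.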
Just a heads up: this is not light reading. Rules-based programming is a confusing departure from traditional programming; there's more "magic" involved, similar to convention-over-configuration in other languages/frameworks. The benefits outweigh the upfront learning cost, though.
If anyone is working directly with CLIPS, let me know! I'm actively working on a low level networking library called CLIPSockets, and would like to work with the language full time some day.
Hey! I was using CLIPS in my network management PhD work, but life happened and I was absorbed by my profession. Maybe one day I'll dig up the half finished material and continue, who knows.
CLIPS is one of those definitely underrated gems, many many thanks for what you are doing!
I had mostly imagined legacy banking systems as the main use case for stuff like this, and was a little surprised to see Apache adopt Drools in 2023[1].
In case anyone might be able to answer..
What are some of the biggest rule systems deployed in the wild, and are any of them recent? Since it seems like this is mostly about first-order logic, do any of these rule engines actually outperform all the modern work on SMT solvers and other theorem provers? Same question for the more probabilistic Rete-OO flavor: isn't there already some faster, more generic engine available that handles this kind of thing as a special case?
Not recent but the cybersecurity company Rapid7 uses a Java implementation called JESS (Java Expert System Shell) within their vulnerability management product to help determine vulnerable / unpatched state of scanned assets.
The Rete algorithm is at the core of FICO's commercial rule engine (Blaze Advisor) and Blaze Advisor is used heavily in Insurance, Finance, Tax / Revenue Management etc.
More than a decade after I used it last, recruiters still randomly call out of the blue for short- to medium-term contracts using Blaze Advisor.
Rete is an elegant algorithm that is also quite flexible: one can make non-distributive aggregations, outer joins and recursive traversal work. However, it is quite memory-hungry due to the large state of certain nodes (e.g., join operators). I tried to work around this by using distributed execution, but that made the system very inefficient and complex to operate. I also looked into the TREAT algorithm but found that it is more limited in its functionality than Rete.
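To make the memory point concrete, here is a simplified sketch of a Rete-style join node in Python (my own illustrative model, not any particular engine): the left and right memories grow monotonically as facts arrive, and that per-node state is exactly what makes joins memory-hungry.

```python
# Simplified Rete join node: it remembers every partial match ("token")
# and every fact it has seen, so a new arrival on either side can be
# joined incrementally without re-scanning the whole working memory.

class JoinNode:
    def __init__(self, test):
        self.test = test          # test(token, fact) -> bool
        self.left_memory = []     # partial matches from upstream nodes
        self.right_memory = []    # facts matching this node's pattern
        self.output = []          # extended partial matches

    def left_activate(self, token):
        self.left_memory.append(token)   # state that never shrinks
        for fact in self.right_memory:
            if self.test(token, fact):
                self.output.append(token + (fact,))

    def right_activate(self, fact):
        self.right_memory.append(fact)   # state that never shrinks
        for token in self.left_memory:
            if self.test(token, fact):
                self.output.append(token + (fact,))

# Join parent facts with parent facts to find grandparent chains:
join = JoinNode(lambda tok, f: tok[-1][2] == f[1])
join.left_activate((("parent", "alice", "bob"),))
join.right_activate(("parent", "bob", "carol"))
```

The TREAT algorithm mentioned above trades away these intermediate memories to save space, at the cost of re-deriving joins on demand.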
My research went in similar directions although I was looking at RETE for rule evaluation as part of a general purpose workflow engine with rules as just one part.
Incremental view maintenance is different enough from rule composition and evaluation that the model diverges to be more optimal: collections of tuples, and, instead of DAGs, lots of cyclic loops to continue computing the diffs.
There are deep, intrinsic space/time trade-offs, so many of the modern approaches moved toward natural dataflow concurrency and streaming semantics, where those trade-offs can be chosen at runtime through batching and data context, as opposed to early Rete variations, which were very OOP and eagerly evaluated rather than lazy (everything instantiated and mutated in memory in the same place).
The coolest thing I ever did with the Rete algorithm was to hack OPS5 to support spawning multiple working memories (or data worlds).
I have a funny history with OPS5. When my company bought me a Xerox 1108 Lisp Machine in 1982, converting OPS5 to run on InterLisp-D and adding a nice UI was my first project. I also ported it to the Mac in 1984 (ExperOPS5), and one weekend I converted Charles Forgy’s original Common Lisp code to MIT Scheme.
I did some work with Rete for my undergraduate dissertation. One thing I remember about the original paper is that it defines the algorithm in terms of a kind of virtual machine, where specific instructions implement the matching process. I’ve never seen anyone implement it that way.
I cannot comment on this for "it = Rete", but virtual machine-like models for matching are not that un-common, from PROLOG (WAM = Warren Abstract Machine) to regular expressions. Stretching things slightly, SQLite's SQL engine also technically is a byte-code interpreter of a DSL VM.
One of the coolest applications of Rete out in the wild right now is the Rete engine built into Apache Jena, which operates on RDF. (I do not believe it's unique to Jena; inference seems to be part of the whole RDF ontology/schema thing.)
So, now your knowledge base is RDF nodes.
I have not yet had a chance to play with it myself, but when I was playing with RDF and turned down that corridor in surprise: "Oh my, what. have. we. here!"
Despite modern advancements in ML, rules engines are still preferred in some scenarios as their execution tends to be deterministic.
They have different heuristics for conflict resolution when more than one rule can be fired. Once you define the strategy, it will be consistent throughout the entire lifecycle.
Once the network is compiled, it ensures that previously matched patterns are not recomputed, which increases performance.
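A minimal sketch of one such conflict-resolution strategy, salience first and then recency, loosely in the spirit of CLIPS defaults (the rule names and this agenda API are made up for illustration):

```python
# Agenda with deterministic conflict resolution: among activations
# eligible to fire, the highest salience wins; ties go to the most
# recently added activation. Given the same inputs, the firing order
# is always the same.

import heapq

class Agenda:
    def __init__(self):
        self._heap = []
        self._counter = 0  # monotonically increasing recency stamp

    def add(self, salience, activation):
        self._counter += 1
        # Negate so heapq's min-heap pops highest salience / newest first.
        heapq.heappush(self._heap, (-salience, -self._counter, activation))

    def pop(self):
        return heapq.heappop(self._heap)[2]

agenda = Agenda()
agenda.add(0, "log-event")
agenda.add(10, "raise-alarm")
agenda.add(0, "update-dashboard")
firing_order = [agenda.pop() for _ in range(3)]
```

Whatever strategy you pick, the point above stands: once defined, it stays consistent for the whole lifecycle, which is what makes these engines deterministic.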
Yeah, modern ML is not really comparable at all; the two are complementary rather than modern approaches replacing rules. All these agent frameworks and platforms cropping up will use things like rules and workflow DAG models as the execution engine, with LLMs embedded as steps and/or used to construct the workflow.
Likewise, whether with knowledge graphs or by using LLMs to generate candidate predicates and constraints to run against a rule engine, or to backward-chain through facts, it's a way to minimize hallucinations of generative models.
The other big question regarding Rete is: today, at what point is the complexity of the algorithm worth the performance?
A primary goal of Rete is to narrow the set of rules to be evaluated based on changes in the working memory. That's quite important with a large rule set, compared to a more naive, perhaps brute-force, approach of applying every rule.
But modern machines are pretty stupid fast, and when operating on the very slow hardware of the day, Rete was very important. Today, I don't know. Or, simply, the use cases narrow, particularly for smaller rule sets.
"Here's a bunch of rules (say, anonymous JS functions). Run each one against this working set (which is little more than a fancy map). If any of the members of the map change, do it again."
Cheap hack 2B. Register with each rule what variables it's interested in:
    addRule(['weight', 'height'], function(map) {
      var w = map.get("weight");
      var h = map.get("height");
      var bmi = w / (h * h);
      map.put("bmi", bmi);
    });

    addRule(['bmi'], function(map) {
      var bmi = map.get("bmi");
      if (bmi > 30) { map.put("weight_status", "obese"); }
    });
(No, this isn't a rule development environment. Yes, rule engines can be very complicated, and rule interaction, precedence, etc. are a Thing. This is a trivial example, doesn't mean it's not potentially useful for some scenarios however.)
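For what it's worth, the whole "hack 2B" loop fits in a few lines. Here's an illustrative Python sketch of the same scheme (not production code; I've made each rule report which keys it changed so the loop knows when to stop):

```python
# Naive dirty-tracking rule loop: a rule only re-runs when one of the
# keys it watches changed on the previous pass. Rules return the set
# of keys they changed, so a quiescent pass terminates the loop.

def run_rules(store, rules):
    """rules: list of (watched_keys, fn); fn mutates store and
    returns the set of keys it actually changed."""
    dirty = set(store)                 # everything is "changed" at start
    while dirty:
        changed_now = set()
        for watched, fn in rules:
            if dirty & set(watched):
                changed_now |= fn(store)
        dirty = changed_now
    return store

def bmi_rule(store):
    bmi = store["weight"] / store["height"] ** 2
    if store.get("bmi") == bmi:
        return set()                   # no change: keeps the loop finite
    store["bmi"] = bmi
    return {"bmi"}

def status_rule(store):
    status = "obese" if store.get("bmi", 0) > 30 else "ok"
    if store.get("weight_status") == status:
        return set()
    store["weight_status"] = status
    return {"weight_status"}

store = run_rules({"weight": 100, "height": 1.8},
                  [(["weight", "height"], bmi_rule), (["bmi"], status_rule)])
```

Note the termination hinges on rules returning an empty set when nothing changed; non-idempotent rules can ping-pong forever, which is one of the "rule interaction is a Thing" caveats above.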
I've always meant to hack up a Rete implementation but never found the time. Your example here makes it look like a reactive program though, which I have done. In functional reactive programming, you topologically sort the update graph to ensure that nodes are updated in a correct order so no glitches occur, and the reactive graph only updates subgraphs that are affected by any change. Is that basically what Rete is doing?
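The glitch-free topological update described here can be sketched like this (illustrative Python using the stdlib graphlib; the node names and this particular update scheme are made up, not any specific FRP library):

```python
# Glitch-free reactive update: derived nodes are recomputed in
# topological order, and only when one of their inputs is dirty,
# so no node ever reads a stale upstream value.

from graphlib import TopologicalSorter

# node -> (input nodes, function of the input values)
graph = {
    "bmi":    (["weight", "height"], lambda w, h: w / h ** 2),
    "status": (["bmi"],              lambda b: "obese" if b > 30 else "ok"),
}

def update(values, changed):
    deps = {node: inputs for node, (inputs, _) in graph.items()}
    order = TopologicalSorter(deps).static_order()  # inputs before outputs
    dirty = set(changed)
    for node in order:
        if node in graph and dirty & set(graph[node][0]):
            inputs, fn = graph[node]
            values[node] = fn(*(values[i] for i in inputs))
            dirty.add(node)             # downstream nodes now re-run too
    return values

vals = update({"weight": 100, "height": 1.8}, {"weight"})
```

The analogy to Rete is real but partial: Rete's network is also a dependency DAG that propagates only deltas, but its join nodes additionally cache partial matches across many facts, which a per-node reactive value doesn't capture.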
This reminds me of the CUE evaluator, which has its roots in LinGo and the data structures/algorithms for efficiently representing and evaluating linguistic rules in old-style (pre-NN) NLP systems. I believe it is more like Typed Feature Structures now, though it is evolving into something different, since the target language is not a human language and doesn't involve probabilities.
Just for anyone who tracks down this thread later: almost all of the modern Blaze implementations (now Decision Modeler) use compiled sequential Java code rather than a Rete algorithm. This is because the performance impact of maintaining the working memory is pretty significant and surprisingly hard to predict, and there are a lot of technologies inside modern rules engines that can make dependencies much easier to reason about.
RETE is cool, and common-context applications with LLMs are going to be a very interesting space that may revive it as a more general-purpose technology.
I'm using CLIPS + Python for a kind of neuro-symbolic system: an old-school expert system for a specific (small) domain blended with an LLM. Funny coincidence seeing this here.
Worked with a Rete engine in TIBCO's CEP product. It was used for utility events: deciding what, if anything, to do with an event in the context of others, and then routing it. I've seen it similarly used in bank security and customer management. Another interesting use case was as a master data management platform for customer data across a company with a bunch of subsidiaries that shared the same customers.
What we liked about it was the ability to contextualize events by keeping a history of, and the relationships between, objects and events that we could reference as part of our rules. That was particularly useful during an event storm, for putting more signal into situational awareness when there is a lot of noise.
I really like nools, which is a Drools clone for JavaScript. It's fantastic for quick hacks and for getting to know how to write code for rule engines.
For me, the "it's just Clojure" part is a drawback, and not just personally: when I think about the kinds of audiences I have traditionally wanted authoring rules, given the choice between a real, lisp-y, dynamically typed programming language and asking folks to write in the Drools DSLs, I'll take my chances with the DSLs.
I'm another early/core contributor to the clara-rules project. It does use a DSL that is implemented in Clojure and allows embedded Clojure, so it does have some coupling on that front. The https://www.clara-rules.org webpage shows it has a Java-based API for integrating with the rest of a system (e.g., when you aren't using Clojure).
The rule structures themselves have a data-structure representation that is independent of the DSL, so you can implement other DSLs that target this structure. Again, though, it is certainly easiest to do with Clojure. This was touched on in this post: https://www.toomuchcode.org/blog/2015/11/14/insta-declarativ...
Actually... that would be a fascinating application. Rete works best when all of the facts and rules are presented up front (versus human thinking, where we are trained from kindergarten to follow flowcharts and decision trees).
It's been a while since I've thought about this algorithm, but I remember coming away with the impression that it is memory-bound, and thus not horizontally scalable across multiple nodes. Is that true?
Yeah from what I recall RETE can be quite memory intensive in that it keeps the results of any intermediate rule computations that result from changing inputs in memory, in the hope that it will speed up answering subsequent queries (similar to how you might want to incrementally update a materialized view). Subsequent algorithms explored trading off memory consumption for query latency (amongst other differences).
It's named after anatomical structures with the name "rete" (which is just Latin for net, but Forgy meant specifically that kind of structure), e.g. https://en.wikipedia.org/wiki/Rete_mirabile
This just seems like a very specific way of implementing tree matching? Top-down graph-matching instruction-selection implementations look a lot like this if you squint (apply labels to child nodes, then have rules that fire on simple trees of labels, which gives you structural sharing of rules), but without the multiple explicit node types or the spelled-out linked lists. Being so specific to the implementation, rather than a high-level overview of what the algorithm is trying to do, is kinda weird and makes it hard to understand, honestly.