Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Datalog is a logic programming language for relational data. It is much easier to express recursive relationships in Datalog compared to SQL. The most obvious application of Datalog is for working with graphs. The starter example I've usually seen is "graph reachability", which something like "given a list of edges between nodes, tell me if I can reach node X from node Y".

    // Given an edge x -> y, there is a path x -> y.
    path(x, y) :- edge(x, y).

    // Given an edge x -> z and a path z -> y, there is a path x -> y.
    path(x, y) :- edge(x, y: z), path(x: z, y).
Here is the equivalent recursive SQL (SQLite syntax):

    WITH RECURSIVE path(x, y) AS (
      SELECT
        edge.x AS x,
        edge.y AS y
      FROM
        edge
      UNION
      SELECT
        edge.x AS x,
        path.y AS y
      FROM
        edge,
        path
      WHERE
        edge.y = path.x
    )
    SELECT
      DISTINCT *
    FROM
      path

If you want to play with Datalog, https://percival.ink is an online notebook with a very gentle introduction. I also published a version of percival.ink that converts simple Datalog queries to recursive SQLite queries here: https://percival.jake.tl (the example above is from percival.jake.tl).

As jorkadeen said in their answer (I added emphasis to the last sentence):

> Datalog programs are values. You can store them in local variables, pass them as arguments, and return them. If you have two Datalog values you can combine them into one (effectively it's just the union of them). This allows you to write small "Datalog fragments" and stitch them together into a larger program. The type system ensures that this stays meaningful.

Why this is important: From my perspective is that a very large percent of industrial code interacts with data (in databases) by building query strings. In most languages, we throw away all the nice features of the language when it comes to data access. To regain some level of composability and safety, developers pour years into libraries like [Arel in Ruby](https://www.rubydoc.info/gems/arel) or [Calcite in Java](https://calcite.apache.org/docs/algebra.html). The most notable example of language native database access is [Language Integrated Query (LINQ)](https://docs.microsoft.com/en-us/dotnet/csharp/programming-g...) in C#. But all of these tools are hampered by the constraints of SQL. SQL is very good for flat, normalized data with one or two joins. Once you begin working with trees or graphs, tools based around SQL become cumbersome.

Using datalog instead of SQL means you have a logical langauge that is inherently more accepting of composition in a theoretical way. But, most systems that speak datalog today are extremely specialized. [Souffle](https://souffle-lang.github.io/simple) and [differential-datalog](https://github.com/vmware/differential-datalog) are useful ahead-of-time compilers for datalog, but you have to design your whole program around using them. The datalog variant [inside Datomic](https://docs.datomic.com/on-prem/query/query.html) is dynamic, but is only usable for accessing data in a data store. Good luck using it for analysis of data in memory.

Including datalog into the language as first-class values mean that you can access the power of datalog and actually compose it in practice.

[Note: I've never used Flix – just a datalog fan.]



in prolog the definition of path is something like: path(x,y) := edge(x,y). path(x,z) := edge(x,y),path(y,z).

So to me it seems a little strange the definition of path in datalog from a prolog user point of view.


At first sight, in the examples: inc1(7) pure, inc2(8) impure, what's the meaning of inc2? (some kind of increment?) Then in the definion of map there is an argument function f, and in the definition there is a new function g that comes from nowhere, so for me at first sight the examples are not insightful.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: