It's common among workflow languages to represent everything as a graph. What I think is more appropriate is an expression-like approach, where the relationships between the different actions are implicit in the syntax.
For example, if you have a graph like the following:
node A
node B
node C
node D
edge B -> A
edge C -> A
edge D -> B
edge D -> C
you could write that as an expression like this:
A (B D) (C D)
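To make the correspondence concrete, here is a rough Python sketch that parses an expression of that shape and recovers the same edge list. The grammar (a head name followed by arguments, each either a bare name or a parenthesised subexpression) and the function names `tokenize`, `parse` and `edges` are my own assumptions for illustration, not part of any existing tool.

```python
import re

def tokenize(src):
    # Split the source into parentheses and bare names.
    return re.findall(r"[()]|[^\s()]+", src)

def parse(tokens):
    """Parse `head arg arg ...` into a nested (head, children) tuple.

    Assumes well-formed input; no error reporting in this sketch.
    """
    pos = 0

    def expr():
        nonlocal pos
        head = tokens[pos]
        pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                pos += 1                      # consume "("
                children.append(expr())
                pos += 1                      # consume ")"
            else:
                children.append((tokens[pos], []))
                pos += 1
        return (head, children)

    return expr()

def edges(tree, acc=None):
    # Each argument is a dependency of its head: edge child -> head.
    # Using a set merges repeated names such as D into a single node.
    acc = set() if acc is None else acc
    head, children = tree
    for child in children:
        acc.add((child[0], head))
        edges(child, acc)
    return acc

result = sorted(edges(parse(tokenize("A (B D) (C D)"))))
```

Here `result` holds the four edges from the graph listing, with D written twice in the expression but appearing as one node in the graph.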
In the action example, the "needs" field is a dependency on the task, or in other words a subexpression. If there are multiple independent expressions that all depend on that, you could use something like let/letrec to assign the result of "Provision Database" to a variable, and then reference that from other expressions.
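As a sketch of the let-binding idea, here is how it could look if tasks were ordinary values in Python: binding a task to a variable and passing it to other tasks is all it takes to record the dependency. The `Task` class, `collect_edges`, and the step names other than "Provision Database" are hypothetical.

```python
class Task:
    """A named step whose dependencies are the Task values passed to it."""

    def __init__(self, name, *needs):
        self.name = name
        self.needs = needs

def collect_edges(*roots):
    # Walk the expression tree from the roots and record dependency -> task edges.
    found, seen, stack = set(), set(), list(roots)
    while stack:
        task = stack.pop()
        if id(task) in seen:
            continue
        seen.add(id(task))
        for dep in task.needs:
            found.add((dep.name, task.name))
            stack.append(dep)
    return found

# The "let" is just a variable binding; both uses refer to the same task.
db = Task("Provision Database")
migrate = Task("Run Migrations", db)   # hypothetical step name
seed = Task("Load Fixtures", db)       # hypothetical step name

shared = collect_edges(migrate, seed)
```

No step lists its "needs" by name here; the edges fall out of which variables each expression references.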
Basically, the action syntax they have is like writing an abstract syntax tree where every node explicitly lists its dependencies. An expression is a much more compact form and the subexpressions are implicit in the AST produced by the parser; this is the approach used by most programming languages. But for reasons I still don't fully understand, workflow languages (which I consider to be a specific class of programming language) don't seem to adopt this more compact representation.
We've recently been working on a workflow tool for people significantly less technical than GitHub's users and we started out with an approach somewhat similar to the one you outline. We developed a compact DSL that maps almost 1:1 to the UI we wanted to build and inferred the dependency graph from the preconditions set and the data used in each action. The thinking was that there was no sense in explicit ordering, since it could only slow down execution and introduce errors, and that allowing preconditions would create a declarative way to ensure "after" behavior when necessary.
But what we found in user testing is that our users kept wanting to manually re-order actions and were really uncomfortable with the system just "figuring out" execution order on its own. We had to introduce the concept of UI-only ordering (the backend still tries as hard as possible to execute in parallel without violating the dependency DAG) to give them the illusion of control.
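For anyone curious what that split looks like, here is a minimal Python sketch of the general idea (not our actual implementation, which is in Rust). The action names, the `reads`/`writes`/`after` fields and the helper functions are all made up for illustration, and `graphlib` needs Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical actions. "reads"/"writes" are the data each action uses and
# produces; "after" is an explicit precondition on another action.
actions = {
    "fetch_order":    {"reads": [], "writes": "order"},
    "fetch_customer": {"reads": [], "writes": "customer"},
    "build_invoice":  {"reads": ["order", "customer"], "writes": "invoice"},
    "notify_sales":   {"reads": ["customer"], "writes": "note",
                       "after": ["build_invoice"]},
}

# The order the user arranged in the UI. It is stored for display only and
# is never consulted by the scheduler below.
ui_order = ["fetch_order", "fetch_customer", "build_invoice", "notify_sales"]

def infer_dependencies(actions):
    # An action depends on whichever action produces each value it reads,
    # plus anything named in its "after" preconditions.
    producer = {spec["writes"]: name for name, spec in actions.items()}
    deps = {}
    for name, spec in actions.items():
        deps[name] = {producer[v] for v in spec["reads"]} | set(spec.get("after", []))
    return deps

def parallel_batches(deps):
    # Group actions into batches; everything in a batch can run concurrently.
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches

batches = parallel_batches(infer_dependencies(actions))
```

The two fetches land in the same batch regardless of how the user ordered them on screen, which is the "illusion of control" part.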
But our users aren't programmers, so GitHub might have a bit more leeway to push complex topics like these onto users.
For non-programmers, the visual approach is very attractive, and provides an intuitive way to display the workflow in a manner that is easy to make sense of.
For programmers (i.e. pretty much everyone that uses GitHub), I think a more DSL-like approach would be appropriate, though this doesn't preclude a visual editor as an alternative. As gbaygon mentioned, they do have a DSL, but I think the approach is suboptimal.
In a project I'm working on at the moment (which extends the work from my thesis), we're targeting two audiences - programmers write the workflows, but "business" people can see a visualisation. Essentially we take the Scheme code that comprises the workflow and render it as a graph, similar to BPMN. I think that's a nice approach when you have people on staff who have the necessary programming skills and are working alongside non-technical people. But that doesn't apply in every situation, so visual design can be useful when you don't have experienced programmers creating the workflows.
I'm an experienced programmer and I want effective visualizations. I strongly believe that domain-specific visualizations are the way forward to real progress, even if we haven't created a practical one yet.
Totally agree. I have integrated a live module dependency graph visualization[1] into my JS workflow; it really helps me when getting to grips with a new project or prototyping ideas.
More flexible representations (at the function scale for instance) would probably help even more.
Visualizing and manipulating should probably be seen as separate domains though, or at least as working on separate levels of abstraction: effective visualization implies hiding operations; you don’t want to view, or visually edit, your string manipulations. But re-ordering the functions called from main() might be more malleable.
But even then, how often does such a reordering not constitute detail changes as well?
I think visualizations are much more useful for “reading” code, than for writing it. Which is why visual editors are so appealing: they’re showing off the reading aspect.
I’m pretty sure something like the grandparent, or that visual-haskell project whose name I can’t remember (where code and visualization are directly equivalent), is the way to go: there’s no paradigm shift to be had here.
Sure, what kinds of questions do you have? We're still pretty early in the process and I think we made a few mistakes that we'll need to correct. But we made a number of choices that have worked out really well despite the fears of some of the team.
A few notables:
- Written in Rust. This caused some early struggles, but has been paying dividends ever since we got it working. The thing is rock solid and easily handles tens of thousands of automation runs per second. Also, Pest is awesome, and writing complex lexer/parser implementations is really easy these days.
- We're currently triggering automations with HTTP calls, but we want to move to primarily triggering with some sort of work queue.
- We didn't consider aggregates in the first iteration of the product and now we're feeling that pain and looking at solutions.
- Tracking the types of all data through every step of the automations was a lot of work to set up, but is hugely valuable. Being able to suggest what values/operations are available to users in the UI, as well as doing AOT type checks when saving automations, means a lot fewer errors at runtime.
- How have users taken to the tool? Have they needed ongoing support, or once trained they understand what to do?
- Were alternatives considered? Or was the complexity such that a workflow was the only way users could control this?
- Do people ever manage to design impossible flows?
- Anything about the use-case you're able to say, for where the tool is needed by users (and not just as an easier way for developers to adjust the system)?
Sorry I missed your reply... I stopped following the thread. But in case you're still reading, here are some responses:
> How have users taken to the tool? Have they needed ongoing support, or once trained they understand what to do?
It's been a bit of a struggle. Once people understand how to use the UI, they go to town and get a lot of value out of it. But we've found it's not approachable and basically requires us to teach them how to use it. We're continuing to experiment with it. The good part is that everything we're trying is supported by the underlying DSL and workflow engine and we really haven't had to make more than a couple of tweaks to that.
> Were alternatives considered? Or was the complexity such that a workflow was the only way users could control this?
We looked into off-the-shelf options, but we didn't think they'd give us the level of control we wanted to build a product around it. As mentioned above, the hardest part is the UI, and if we're building this as a product, we need to build that anyways.
> Do people ever manage to design impossible flows?
No, that's impossible through the UI. Since we're tracking the types of all data throughout the execution of the flow, we're able to analyze the flow statically before it's saved to the database and give users an error. But they basically can't even get that because our UI prevents them from choosing illegal values or setting up infinite dependency chains.
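Roughly, the save-time check amounts to something like the Python sketch below. This is an illustration under assumptions rather than our real code: the step names, the type names and the `check_flow` helper are invented, and it uses `graphlib` from Python 3.9+.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical flow: each step declares typed inputs and one typed output.
steps = {
    "load_contact": {"inputs": {}, "output": ("contact", "Contact")},
    "get_email":    {"inputs": {"contact": "Contact"}, "output": ("email", "Text")},
    "send":         {"inputs": {"email": "Text"}, "output": ("sent", "Bool")},
}

def check_flow(steps):
    """Return a list of error strings; an empty list means the flow can be saved."""
    errors = []
    # Map each value name to its type and the step that produces it.
    produced = {}
    for name, spec in steps.items():
        value, typ = spec["output"]
        produced[value] = (typ, name)
    # Check every input against its producer and build the dependency map.
    deps = {}
    for name, spec in steps.items():
        deps[name] = set()
        for value, wanted in spec["inputs"].items():
            if value not in produced:
                errors.append(f"{name}: no step produces '{value}'")
                continue
            typ, source = produced[value]
            if typ != wanted:
                errors.append(f"{name}: '{value}' is {typ}, expected {wanted}")
            deps[name].add(source)
    # A flow whose steps depend on each other in a loop can never run.
    try:
        tuple(TopologicalSorter(deps).static_order())
    except CycleError:
        errors.append("dependency cycle")
    return errors

problems = check_flow(steps)
```

A UI that only offers values of the right type and only from steps that do not depend on the current one makes these errors unreachable in practice; the check stays in place as a backstop.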
> Anything about the use-case you're able to say, for where the tool is needed by users
It's designed to be kinda like Zapier, but for a much more specific audience who are generally less technically adventurous. In talking with these users, many of whom use Zapier, we've identified that they find it difficult to use and not really suited to their use case, so we're hoping that something that's purpose-built for that use case will make their lives easier and convince them to switch.
That's exactly what I've explored in a Haskell project: DepTrack.
https://github.com/lucasdicioccio/deptrack-project Basically, it decorates expressions like the one in your example to collect actions. It's strongly typed. As a result, with a bit of Haskell type-system knowledge you can easily enforce invariants like "I won't run a command if you don't install it before" and "if you don't tunnel/proxy a given service then the config will not compile".
There are other niceties that the Haskell type system gives you by playing with the underlying effects (e.g., forbidding IO enforces that the same config always gives the same result, and using a List lets you cleanly handle heterogeneous-platform concerns), but these are advanced topics.
Most workflows are DAGs[1], not general graphs; this makes them representable as Tuple[List[Step], List[Tuple[Step, List[Step]]]]. In other words, (List[Step], Map[Step, Dependencies]), so your example could be written as the list of four steps plus a map from each step to the steps it depends on, which is clearer than the graph representation. Notably, your syntax also assumes a DAG; it can't represent a full graph, so the graph syntax is more "powerful". Though unnecessarily so.
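As a sketch of that (steps, dependency map) shape in Python, applied to the A/B/C/D example from upthread. The concrete layout and the `to_edges` helper are my guess at a reasonable encoding, not a fixed format.

```python
from typing import Dict, List, Tuple

Step = str
Workflow = Tuple[List[Step], Dict[Step, List[Step]]]

# The four-node example: A needs B and C, which both need D.
workflow: Workflow = (
    ["A", "B", "C", "D"],
    {"A": ["B", "C"], "B": ["D"], "C": ["D"]},
)

def to_edges(wf: Workflow) -> List[Tuple[Step, Step]]:
    # Flatten the map back into dependency -> step edges,
    # the same direction as the "edge B -> A" lines in the graph listing.
    steps, deps = wf
    return [(dep, step) for step in steps for dep in deps.get(step, [])]

edge_list = to_edges(workflow)
```

Each step is named once and D is mentioned as a dependency rather than repeated as a subexpression, which is where this form starts to pay off on longer chains.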
The expression-y representation doesn't scale well. If you consider that workflows are mostly linear, but have branches, the syntax you provided gets ugly fast.
This is one of those weird things that is very much not obvious without hindsight, but try describing a workflow with a critical path of length 10 or 15, and some subchains that are shared but not exactly the same. Formatting the expression-based form you suggest quickly becomes a bit of a nightmare. In the extreme, consider representing a git commit graph, which is also a DAG, in the various syntaxes proposed. Then consider trying to modify that structure. It's not very ergonomic.
[1]: Anything loopy or graph-requiring should be factored out into its own sub-flow implemented in a Turing-complete construct. A workflow should be a composition of such Turing-complete sub-pieces.