I've gone through a large number of these, and I think that Airflow is the best on Kubernetes for managed orchestration. The things I like are:
* Source control for workflows/DAGs (using git-sync)
* Tracking/retries with SLAs
* Jobs run in Kubernetes
* Web UI for management
* Fully open source
I also use Argo Workflows because I like its native handling of Kubernetes objects (e.g. the ability to manage and update a deployment as one of the steps), but it just doesn't handle the orchestration/tracking side of things very well yet.
Yep, this is it. They're developed by the same team. Temporal is still in beta, but I believe the production version is coming out in late June.
I can't find an easy way to explain everything it does, but it pretty much lets you write naive functions with no error handling, with month-long sleeps, auto-retries on unreliable function calls, etc.
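To make the "naive functions plus auto-retries" idea a bit more concrete, here's a toy sketch in plain Python. To be clear, this is not Temporal's or Cadence's API: `run_with_retries`, `FlakyService`, and `billing_workflow` are all hypothetical names made up for illustration, and it only mimics the retry part (nothing here survives a process restart or does durable sleeps). The point is just that the workflow function has no try/except, because the retry policy lives in whatever runs it.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(fn: Callable[[], T], max_attempts: int = 5, delay_s: float = 0.0) -> T:
    """Hypothetical stand-in for the engine: call fn until it succeeds or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts:
                raise
            time.sleep(delay_s)
    raise AssertionError("unreachable")


class FlakyService:
    """Hypothetical unreliable dependency: fails a fixed number of times, then succeeds."""

    def __init__(self, failures_before_success: int) -> None:
        self.remaining_failures = failures_before_success
        self.calls = 0

    def charge(self) -> str:
        self.calls += 1
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("service unavailable")
        return "charged"


def billing_workflow(service: FlakyService, call: Callable[[Callable[[], str]], str]) -> str:
    """Naive business logic: no error handling here, retries are the runner's job."""
    return call(service.charge)


if __name__ == "__main__":
    service = FlakyService(failures_before_success=2)
    print(billing_workflow(service, run_with_retries), "after", service.calls, "calls")
```

In a real engine the retry policy, timeouts, and the state of a sleeping workflow are persisted by the server, which is what makes month-long sleeps practical. This sketch keeps everything in memory.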
It also gives you a web interface where you can inspect the running functions, and lets external code (and other workflow functions) signal/query the running workflows.
Just found out about Temporal and it looks interesting. I'm eager to jump in, but our organization primarily uses Ruby. I know the big difference between Cadence and Temporal is that Temporal uses gRPC, which seems much easier to adopt.
Moving off Airflow to Cadence/Temporal was the single biggest relief in terms of maintainability, operational ease, and scalability. Also +1 on being free of any DSL.
I'm currently moving from a custom YAML DSL-based engine to Temporal and it's the best architectural decision I've made in a long time. I researched a lot and couldn't find anything that even came close to the freedom it provides.
Curious about this. Can you elaborate more? Also happy to hop on a call/zoom if you don't want to share publicly (email me at saguziel@gmail.com). I'm working on something similar.