I agree with other commenters that it would be really helpful to understand exactly what you mean by "pipeline", and how they can help/improve a project or workflow. What problem does this solve, and how does it solve that problem better than other solutions?
Also, very minor nitpick, but your programming language icons shouldn't be in <a> tags. They appear to be clickable, but do nothing, which gives the appearance that the site is not working as intended.
Looks like it runs a series of (small) self-contained applications, written in any language that can implement the gRPC interface, in a specific order and with consolidated state tracking and logs.
The individual steps are jobs and the whole thing is a pipeline. Since it executes arbitrary code, it has the flexibility to do anything from ETL to CI.
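To make that reading concrete, here is a minimal sketch of the idea in plain Python: jobs run in a fixed order, and the runner keeps one consolidated record of state and logs. This is not Gaia's API or SDK. `Job`, `run_pipeline` and the job names are hypothetical, and the sketch calls the jobs in-process, whereas the description above has them reached over gRPC.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Job:
    """One self-contained step (hypothetical type, not part of any Gaia SDK)."""
    title: str
    handler: Callable[[], None]


def run_pipeline(jobs: List[Job]) -> Dict[str, object]:
    """Run jobs in order, stop at the first failure, and collect state and logs."""
    state: Dict[str, str] = {}
    logs: List[str] = []
    for job in jobs:
        logs.append(f"start {job.title}")
        try:
            job.handler()
        except Exception as exc:  # any error marks the job as failed
            state[job.title] = "failed"
            logs.append(f"failed {job.title}: {exc}")
            break  # later jobs are never started
        state[job.title] = "success"
        logs.append(f"done {job.title}")
    return {"state": state, "logs": logs}


def build() -> None:
    pass  # placeholder for real work, e.g. compiling


def deploy() -> None:
    raise RuntimeError("registry unreachable")  # simulated failure


def notify() -> None:
    pass  # never reached in this example


result = run_pipeline([Job("build", build), Job("deploy", deploy), Job("notify", notify)])
```

Because each handler is arbitrary code, the same runner shape would cover ETL steps as well as CI steps; only the handlers change.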
Thanks for the feedback. You are absolutely right.
I think we have answered those questions on our GitHub README page (https://github.com/gaia-pipeline/gaia) but somehow missed that on our webpage. We will work on that!
Edit: Oh and thanks. I will fix the links of the language icons. :-)
Not sure your GitHub README covers it all that well, either:
>What is a pipeline? A pipeline is a real application with at least one function (we call it Job). Every programming language can be used as long as gRPC is supported. We offer SDKs to support the development.
What is this and how would I use it? More importantly, why do I want to use it? Every application is "real", but it's not clear what the function takes as input and outputs, or how it's dispatched, or... well, anything.
For what it's worth: I'm the target audience (devops engineer) and understood it within moments of opening your landing page...
And after looking at the gif, I resolved to take a look in a few weeks for a personal project.
I'm however missing an excerpt about the developer(s) of this project.
Who is building it and why? How likely is development to be abandoned? Is there paid support available or planned in the short term?
From what I could tell from the contributors, it's pretty much a solo endeavour from you, with some help from @skarlso in June/July. Are you planning to create a GmbH and sell this as a product, or is this just a personal project you're doing on the side?
This looks great. To echo other folks’ sentiment - I think you mean CI-like pipelines specifically, though it could be extended to do some other stuff. You probably want to list some concrete use cases on the main site (e.g. CI, cron jobs, ETL?).
I’m curious what differences/trade offs there are in Gaia vs something like Argo (https://github.com/argoproj/argo) or Buildkite (https://buildkite.com/). It looks like at least one difference is an actual API for steps rather than bash commands. Is there anything interesting in terms of cross job caching (e.g. saving npm install data) or persistent runners? And obviously - what else am I not thinking about?
Really good idea. The website definitely needs something like that. I will work on this!
Argo is pretty cool but I think it's different. Imagine developing an automation workflow with 20-30 different steps. In Argo you basically have to develop 20-30 different applications, compile them, build a separate docker image and push all these images to a registry.
In my opinion that's too much overhead and a configuration management nightmare as well. Additionally, what if one step needs to share information with another one? What if this information is not trivial, like a binary for example?
Raises the question "what is a pipeline"? I'm assuming you don't mean the "prog | sort" kind of pipeline, but I was no clearer after reading it what you do mean.
Also, make the page display something without 2 different Javascript sources having to be enabled.
To answer your question, a pipeline is a real application which consists of at least one function (we call it job).
You can compare it with a Jenkins Pipeline (https://jenkins.io/solutions/pipeline/). In that sense, a pipeline is one automation flow which consists of one or more steps.
"a pipeline is a real application which consists of at least one function"
So.... every application is a pipeline? This really clears very little up for someone not already familiar with the concept. The Jenkins link is about testing and deploying code. Is that the intended use case for this tool as well?
I think they meant it is a replacement for existing CI/CD pipelines. As someone who deals with them every day, I like the simplicity of Gaia pipelines. But it seems it only supports running on the host that Gaia is installed on, whereas Jenkins and GitLab CI have the ability to run on multiple agents/build slaves.
Yeah, you definitely need a "What are pipelines?" section. From looking over the examples, my impression is this is some kind of job scheduler, like sidekiq.
Some might opine that having to search through upwards of 75% of the length of a page before finding out what problem a tool addresses is a situation possessed of wonderful and bountiful opportunity to become even clearer.
Gaia is just a few months old so we currently only officially support those languages. Feel free to open an issue on our GitHub page to help us figure out which languages are missing/needed.
From looking through the front page of the site, I see a bunch of code examples, but no explanation of how this differs from using Unix with a bunch of programs that each read from stdin and write to stdout.
I'm sure there's actually something useful about this product, but the website really doesn't give any indication of what that useful thing is.
We have just added support for C++ and are currently looking at Rust, but it's a bit tricky. There is no official gRPC SDK for Rust but I think we will manage it soon. :-)
Providing a warning about the alpha edition in the README was a responsible thing to do. As Gaia is being actively promoted, I assume that the project has reached a certain level of maturity since the warning was written. Is this the case? In other words, does the warning still apply?
Echoing what others have said regarding 'what is this' ... is this Jenkins? Jenkins but instead of writing a Jenkinsfile (groovy) you write it in your own language.
YAML in devops is essentially being used as a declarative language. Declarative languages do have their place in domain-specific applications, and can often be quite powerful.
While being able to write in any language gives you flexibility, I do wonder about maintainability down the line. How would this be addressed? Or it doesn't matter as this sort of code tends to be thrown away?
The issue with YAML is that the abstraction ceiling is way too low for its use cases. How do you maintain a 500 line YAML file? You have no tools to simplify it apart from super basic replacements.
Meanwhile, if I give you a 500 line Python function, you have a lot of tools at your disposal to refactor it and improve maintainability (you could actually write basic tests for your CI pipeline config instead of having failures when the config gets pushed out).
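As a rough illustration of that point, here is a sketch of a CI config expressed as ordinary Python, with a basic check that can run before anything is pushed. The step names and the `make_service_steps`, `make_pipeline` and `check_pipeline` helpers are made up for the example and do not correspond to any particular CI system.

```python
from typing import Dict, List


def make_service_steps(service: str) -> List[Dict[str, object]]:
    """Generate the repeated lint/test/deploy steps for one service."""
    return [
        {"name": f"{service}-lint", "needs": []},
        {"name": f"{service}-test", "needs": [f"{service}-lint"]},
        {"name": f"{service}-deploy", "needs": [f"{service}-test"]},
    ]


def make_pipeline(services: List[str]) -> List[Dict[str, object]]:
    """Build the whole config with a loop instead of copy-pasted blocks."""
    steps: List[Dict[str, object]] = []
    for service in services:
        steps.extend(make_service_steps(service))
    return steps


def check_pipeline(steps: List[Dict[str, object]]) -> List[str]:
    """Return config errors: duplicate names, or dependencies on steps not defined earlier."""
    errors: List[str] = []
    seen = set()
    for step in steps:
        if step["name"] in seen:
            errors.append(f"duplicate step {step['name']}")
        for dep in step["needs"]:
            if dep not in seen:
                errors.append(f"{step['name']} needs unknown step {dep}")
        seen.add(step["name"])
    return errors


pipeline = make_pipeline(["api", "web"])
# A deliberately broken variant: "release" depends on a step that does not exist.
broken = pipeline + [{"name": "release", "needs": ["docs-build"]}]
```

Running `check_pipeline(broken)` locally reports the bad dependency, which is the kind of failure that would otherwise only show up after the config is pushed.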
I remember people doing this for seismic data analysis, piggy-backed on UNIX pipes, 40 years ago. They would send and transform JSON-like metadata down the pipe while the massive seismic data would remain in disk files or shared core memory. I think variants of these packages are the most widely used seismic freeware to this day.
So, a task/process queue? Is that what this is? How does this vary from the quite mature task/process queues used by media production companies? (Examples include Deadline, and dozens of tested-by-production proprietary queues at every production studio.)
Hey bsenftner.
Could you provide a link or an example of what you mean? :-)
I googled "Deadline" but what I found doesn't look right.
I usually compare Gaia with tools like Jenkins and Spinnaker. For example, many people use Jenkins Pipeline (https://jenkins.io/solutions/pipeline/) which allows you to write CI/CD tasks in Groovy.
In my opinion, Gaia fulfills this job way better because it doesn't force you to use a specific language. It's also super fast and provides features like the automatic (re)build of your pipelines.
> Gaia fulfills this job way better because it doesn't force you to use a specific language
CI typically seems much more dependent on bash/unixy sorts of tools to get things done. This seems to not really support that workflow, requiring code to define pipelines instead.
If it's intended to do CI, how do you deal with CI-style tasks, like shuffling around files between pipeline steps? Or the corollary, what does this do that makes it easier to do CI in practice than with typical unix command based workflows? Inherently, it seems like "Create a golang script that can start a subprocess that runs a test suite" is more overhead than "run a test suite".
At first glance, it looks like more of a competitor to, say, AWS step functions, but it doesn't sound like that's what you're targeting.
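For reference, the "script that starts a subprocess that runs a test suite" shape from the comment above looks roughly like this. The comment mentions Go; this sketch uses Python instead, and `run_test_suite` plus the stand-in commands are hypothetical rather than taken from Gaia or any CI tool.

```python
import subprocess
import sys


def run_test_suite(command: list) -> int:
    """Job body: start the test suite as a subprocess and return its exit code."""
    completed = subprocess.run(command, capture_output=True, text=True)
    # Forward the suite's output so it ends up in the job's log.
    sys.stdout.write(completed.stdout)
    sys.stderr.write(completed.stderr)
    return completed.returncode


# Stand-ins for a real suite: tiny Python one-liners that exit with 0 or 1.
passing = [sys.executable, "-c", "print('2 tests passed')"]
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]

passing_code = run_test_suite(passing)
failing_code = run_test_suite(failing)
```

The shell version of this is a single line, which is the overhead being pointed out. What the wrapper gives you in exchange is the exit code and output as ordinary values that later steps can inspect.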
In the past you simply had to compile your application, package it and push it to a remote server. Nowadays, it's not that simple anymore. You often find yourself writing scripts to create Kubernetes resources, manage remote APIs to create services which are needed by your application, talk to remote services (like HashiCorp Vault) to get credentials or secrets.
Gaia does a great job here because you can directly use client SDKs in your preferred language to communicate with those remote APIs.
Here's Deadline: https://deadline.thinkboxsoftware.com/
I used it in the past, along with the render queues at a few VFX studios. Very feature rich process scheduling and scaling managers are in heavy use by media productions.
Thanks. In the end it is similar to Gaia (a task scheduler) but also comparable with Jenkins/TravisCI/CircleCI and probably hundreds more schedulers. :-)
The basic idea is not new.
In my opinion Gaia is perfect for programmers who are "forced" to write automation tasks. It allows you to write automation tasks in your preferred programming language and makes it super easy for you to schedule them because Gaia comes with an automatic build feature (just provide the git url of your source code and Gaia does the rest).
Additionally, Gaia is super fast, lightweight and open-source.
I agree with the questions about what a pipeline is. Also, have you considered whether this competes with Airflow/Luigi which are used more in terms of etl pipelines?
Good question! :-)
Gaia is basically a scheduler. It automatically starts your pipeline, executes the functions defined in the pipeline and is also responsible for terminating the pipeline process.
The text in the video is unreadable and I can hardly understand what actually happens there: it just jumps through some CRUD screens without commentary and with no way to pause.
Dunno, maybe some "Use cases"/"Case study" section could show some killer real project usage.
Oh right, I saw that. The reason I forgot about it is because I couldn't find a connection between that and the example code written in several languages.
In the end, Gaia is a task scheduler. You can compare it with AWS Step Functions but also with Jenkins/CircleCI/TravisCI/Bamboo/Codeship and many more.
In my opinion Gaia is perfect for programmers who are "forced" to write automation tasks. It allows you to write automation tasks in your preferred programming language and makes it super easy for you to schedule them because Gaia comes with an automatic build feature (just provide the git url of your source code and Gaia does the rest). Additionally, Gaia is super fast, lightweight and open-source.
Gaia is only a few months old so we currently only support Go, Java, Python and C++. Feel free to open an issue on our GitHub page to raise awareness that C# support would be appreciated. :-)