In 2016, I wrote a pretty detailed answer to a "Spark vs. Redshift" question. This was in the very early days of what today I guess is called "the modern data stack".
The core of the answer was that cloud warehouses are not suitable for real-time use cases, because the batch processing and transformations take too long. If you want real-time, you need to pay up - hence Databricks / Spark. I did call out the fraud use case in that answer.
There were first-generation ETL tools like Alooma that tried to move in the direction of streaming, and they pushed the limits.
Back then (we had built a cloud warehouse monitoring tool), the closest to real-time I've ever seen any company get was IronSource. The time between an event occurring and that event being available in a dashboard was five minutes (they were using Redshift).
I'll stick my neck out and say that certain industries will be all over Artie, whereas others will shrug their shoulders.
The industries that I think will be all over Artie:
- FinTech
- Insurance
- AdTech
- Gaming
- Publishing
- Logistics / Delivery
These are industries where a couple of minutes of difference in data recency can make a difference of millions of dollars. And you'll probably cost less than existing streaming solutions, which is obviously nice. But I think the real advantage will be simplicity.
I know you can't support all destinations at once, and need to go where the demand is. But I would expect that the Materialize and ClickHouse crowds are good target users for you.
Good luck, this is an exciting product!