> - Don't use it for tasks that don't require idempotency (e.g. a job that uses a bookmark).
You can totally design your tasks to be idempotent, but it's up to you to make them that way. Neither the scheduler nor the executor has any context about what your job actually does.
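To make that concrete, here's a minimal sketch of a task body that is idempotent by design. It's plain Python with sqlite3, not anything Airflow-specific, and the `daily_totals` table and `load_daily_totals` function are names I made up for illustration. The idea is that the task is keyed on its run date and replaces that date's rows instead of appending, so a retry or backfill leaves the same result.

```python
import sqlite3


def load_daily_totals(conn, run_date, rows):
    """Hypothetical task body: replace run_date's rows rather than append to them."""
    with conn:  # single transaction: delete and insert commit together or roll back together
        conn.execute("DELETE FROM daily_totals WHERE run_date = ?", (run_date,))
        conn.executemany(
            "INSERT INTO daily_totals (run_date, customer, total) VALUES (?, ?, ?)",
            [(run_date, customer, total) for customer, total in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_totals (run_date TEXT, customer TEXT, total REAL)")

rows = [("acme", 10.0), ("globex", 4.5)]
load_daily_totals(conn, "2020-05-01", rows)
load_daily_totals(conn, "2020-05-01", rows)  # simulated retry of the same run

# Still one row per customer for that date, not two.
row_count = conn.execute("SELECT COUNT(*) FROM daily_totals").fetchone()[0]
```

A bookmark-style job that appends "everything since the last run" doesn't have this property unless you add it yourself, which is roughly the point: the orchestrator will happily rerun either one.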
> - Don't use it for latency-sensitive jobs (this one should be obvious).
IIRC this is being addressed in Airflow 2.0
> - Don't use sensors or cross-DAG dependencies.
This is a little extreme. I've never run into issues with cross-DAG dependencies or sensors. They make managing my DAGs way easier because we can separate computation DAGs from loading DAGs.
Context: I built and manage my company's Airflow platform. Everything is managed on k8s.
Using the KubernetesPodOperator for everything adds a huge amount of overhead. You still need Airflow worker nodes, but they're just babysitting the K8S pods doing the real work.
I know it's 2020 and memory is cheap or whatever, but Airflow is shockingly wasteful of system resources.
Since idempotency is on the task author, I encourage people to use a unified base operator and pass their own Docker containers to it, like https://medium.com/bluecore-engineering/were-all-using-airfl... outlines.