Lord no! I'm a data engineer too, and I feel the same. The part I find most maddening is that it seems pretty devoid of any sincere attempt to provide value.
Things Databricks offers that make people's lives easier:
- Out-of-the-box Kubernetes with no setup
- Preconfigured Spark
Those are genuinely really useful, but then there's all this extra stuff that makes people's lives worse or drives bad practice:
- Everything is a notebook
- Local development is discouraged
- Version pinning of libraries has very ugly/bad support
- Clusters take 5 minutes to load even if you just want to "print('hello world')"
Sigh! I worked at a company that was Databricks-heavy and am still suffering PTSD. Sorry for the rant.
A lot of this changed quite a long time ago: not everything is a notebook, local dev is fully supported, version pinning hasn't been a problem, cluster startup time depends heavily on the underlying cloud provider, and serverless notebooks/jobs are coming.