

alex-korr on Aug 15, 2023 | on: Launch HN: Serra (YC S23) – Open-core, Python-base...

Maybe I am missing something, but would there ever be a scenario where taking a single, albeit large, SQL statement and rewriting it as several PySpark scripts would result in a faster runtime for your data pipeline? In most cases, this will be much, much slower.

0cf8612b2e1e on Aug 16, 2023

Greatly depends on your environment. I am thankfully in an area where there are very modest timeliness requirements. Improving the speed of a job means little to me. However, improving debuggability or checkpointing when things go wrong is always valuable.
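
To make the checkpointing point concrete, here is a minimal sketch in plain Python (no Spark, and not Serra's API). The stage names, the `run_stage` helper, and the JSON-file checkpoint format are all hypothetical, chosen only for illustration. Each stage saves its output, so a rerun after a late failure reuses the earlier results instead of recomputing them.

```python
import json
import tempfile
from pathlib import Path


def run_stage(name, func, inputs, checkpoint_dir):
    """Run one stage, or reuse its saved output if a checkpoint file exists.

    Returns (result, reused), where reused says whether the checkpoint was hit.
    """
    path = Path(checkpoint_dir) / f"{name}.json"
    if path.exists():
        return json.loads(path.read_text()), True
    result = func(inputs)
    path.write_text(json.dumps(result))
    return result, False


def filter_paid(rows):
    # Stage 1: roughly the WHERE clause of the single big query.
    return [r for r in rows if r["status"] == "paid"]


def total_by_customer(rows):
    # Stage 2: roughly the GROUP BY / SUM of the single big query.
    totals = {}
    for r in rows:
        totals[r["customer"]] = totals.get(r["customer"], 0) + r["amount"]
    return totals


def run_pipeline(rows, checkpoint_dir):
    paid, reused_filter = run_stage("filter_paid", filter_paid, rows, checkpoint_dir)
    totals, reused_totals = run_stage(
        "total_by_customer", total_by_customer, paid, checkpoint_dir
    )
    return totals, [reused_filter, reused_totals]


# Hypothetical sample data.
orders = [
    {"customer": "a", "amount": 10, "status": "paid"},
    {"customer": "a", "amount": 20, "status": "paid"},
    {"customer": "b", "amount": 5, "status": "paid"},
    {"customer": "b", "amount": 7, "status": "refunded"},
]

checkpoint_dir = tempfile.mkdtemp()
first_totals, first_reused = run_pipeline(orders, checkpoint_dir)

# Simulate the last stage having failed: drop its checkpoint and rerun.
(Path(checkpoint_dir) / "total_by_customer.json").unlink()
second_totals, second_reused = run_pipeline(orders, checkpoint_dir)
```

The intermediate files are also something you can open and inspect when a stage misbehaves. The extra writes do cost runtime, which is the tradeoff the parent comment is pointing at.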
