
That often won’t scale to millions of records. Letting the database optimizer find the best execution path, instead of doing the work procedurally elsewhere, can be the difference between “finishes in 5 minutes” and “doesn’t fit in a night”.
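To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables (customers, orders), the index, and both function names are made up for illustration. The first function pulls rows out and joins them in application code; the second hands the same question to the database and lets its planner decide how to execute it.

```python
import sqlite3


def build_db():
    # Hypothetical schema and data, invented for this illustration only.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount INTEGER
        );
        CREATE INDEX idx_orders_customer ON orders(customer_id);
    """)
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, "EU" if i % 2 == 0 else "US") for i in range(1, 101)],
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(i, (i % 100) + 1, i) for i in range(1, 1001)],
    )
    return conn


def totals_procedural(conn):
    # Fetch both tables and join them by hand in the application.
    customers = conn.execute("SELECT id, region FROM customers").fetchall()
    orders = conn.execute("SELECT customer_id, amount FROM orders").fetchall()
    totals = {}
    for cust_id, region in customers:
        for order_cust_id, amount in orders:  # work grows as customers * orders
            if order_cust_id == cust_id:
                totals[region] = totals.get(region, 0) + amount
    return totals


def totals_sql(conn):
    # Same question, stated declaratively; the planner picks the join strategy.
    rows = conn.execute("""
        SELECT c.region, SUM(o.amount)
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        GROUP BY c.region
    """).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_db()
    print(totals_procedural(conn) == totals_sql(conn))
```

Both functions return the same totals. At this toy size the difference is invisible, but the nested loop's cost grows with the product of the two table sizes, which is the scaling problem described above.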


This isn’t the 90s. Most hardware is way over-specced for the data sizes most people are dealing with.

The number of use cases which are too heavy to finish in hours but small enough to fit in a single instance is pretty limited.


Cost is another reason to optimize queries: long-running, inefficient queries will be a lot more expensive on platforms like Snowflake than efficient ones.


SQL is popular because it can be run on a map/reduce backend. So once you have written your code it can run on any number of machines.


a) SQL is not that popular on map/reduce backends. Most people are doing it in code.

b) Only basic SQL works on any database, and even then there are major differences in how they treat things like NULLs, type coercion, etc.
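As a rough illustration of point b), this sketch uses Python's built-in sqlite3 module to show how one engine answers a few NULL and coercion questions. It only demonstrates SQLite's behavior; the function name is made up, and other engines can answer some of these differently (Oracle treating the empty string as NULL and PostgreSQL rejecting 'abc' + 1 are two well-known cases).

```python
import sqlite3


def null_and_coercion_demo():
    # Hypothetical helper: collects SQLite's answers to a few portable-looking queries.
    conn = sqlite3.connect(":memory:")

    def one(sql):
        return conn.execute(sql).fetchone()[0]

    return {
        # Three-valued logic: NULL = NULL is unknown, surfaced as None in Python.
        "null_equals_null": one("SELECT NULL = NULL"),
        # A NULL inside NOT IN makes the predicate unknown, so no row comes back.
        "not_in_with_null": conn.execute(
            "SELECT 1 WHERE 1 NOT IN (2, NULL)"
        ).fetchall(),
        # SQLite keeps '' distinct from NULL.
        "empty_string_is_null": one("SELECT '' IS NULL"),
        # SQLite silently coerces non-numeric text to 0 in arithmetic.
        "text_plus_int": one("SELECT 'abc' + 1"),
    }


if __name__ == "__main__":
    for name, value in null_and_coercion_demo().items():
        print(name, value)
```

Running the same four statements against another database is a quick way to see how far "standard" SQL carries over before relying on it.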


BigQuery? Athena/Redshift?



