
Effectively a compute engine on top of S3 storage that is smart enough to push down to S3 Select where possible. Well, the general idea of a compute engine on top of S3 storage, where data is partitioned efficiently with the appropriate key prefixes, is quite common and usually goes by the name external table. Hive is an example that comes to mind.
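
To make the prefixing idea concrete, here is a minimal sketch of Hive-style partition pruning. The table name, partition columns, and helper names are made up for illustration; a real engine would typically hand the prefix to a listing call (for example the `Prefix` parameter of S3's ListObjectsV2) rather than filter an in-memory list.

```python
from typing import Dict, Iterable, List


def partition_prefix(table_root: str, partitions: Dict[str, str]) -> str:
    """Build a Hive-style key prefix, e.g. 'events/dt=2024-01-01/region=eu/'.

    Partition columns are emitted in dict insertion order, which must match
    the order used when the data was written.
    """
    parts = [f"{col}={val}" for col, val in partitions.items()]
    return "/".join([table_root.rstrip("/")] + parts) + "/"


def prune_keys(keys: Iterable[str], prefix: str) -> List[str]:
    """Keep only object keys under the prefix (stand-in for a prefixed listing)."""
    return [k for k in keys if k.startswith(prefix)]


# Hypothetical object keys for an 'events' table partitioned by dt and region.
keys = [
    "events/dt=2024-01-01/region=eu/part-0.parquet",
    "events/dt=2024-01-01/region=us/part-0.parquet",
    "events/dt=2024-01-02/region=eu/part-0.parquet",
]

prefix = partition_prefix("events", {"dt": "2024-01-01", "region": "eu"})
selected = prune_keys(keys, prefix)
print(prefix)
print(selected)
```

The point is that a filter on the partition columns turns into a string prefix, so objects in other partitions are never even listed, let alone read.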

Couple that with a meta-index/catalogue and a modern file format like Parquet and you've got yourself a more modern flavor of an external table, for instance Iceberg or Delta Lake.
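
A rough sketch of what the meta-index buys you: if the catalogue records per-file min/max values for a column, the engine can skip whole files before touching S3. The `FileEntry` structure and `files_for_range` helper below are hypothetical and much simpler than the actual Iceberg or Delta Lake metadata layouts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileEntry:
    """One data file plus min/max statistics for a single column (assumed: ts)."""
    key: str
    min_ts: int
    max_ts: int


def files_for_range(manifest: List[FileEntry], lo: int, hi: int) -> List[str]:
    """Return keys of files whose [min_ts, max_ts] range overlaps [lo, hi]."""
    return [f.key for f in manifest if f.max_ts >= lo and f.min_ts <= hi]


# Hypothetical manifest for three Parquet files.
manifest = [
    FileEntry("events/part-0.parquet", 0, 99),
    FileEntry("events/part-1.parquet", 100, 199),
    FileEntry("events/part-2.parquet", 200, 299),
]

hits = files_for_range(manifest, 150, 220)
print(hits)
```

Files whose statistics cannot overlap the predicate are dropped from the scan plan using metadata alone.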

I'm unsure whether the technologies/libraries built around interacting with or managing these external table formats can actually push down to S3 Select, but I can't imagine why they couldn't.
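
For what a push-down could look like, here is a sketch that turns a projection and a filter into the arguments for boto3's `select_object_content`. The bucket, key, and column names are placeholders, and `build_select_request` is a hypothetical helper, not something Iceberg or Delta Lake tooling is known to do. The snippet only builds the request so it runs without AWS credentials.

```python
from typing import Any, Dict, List


def build_select_request(
    bucket: str, key: str, columns: List[str], where: str
) -> Dict[str, Any]:
    """Build keyword arguments for S3 Select over a single Parquet object.

    The WHERE clause is interpolated as-is, so this is for illustration only
    and must not be fed untrusted input.
    """
    projection = ", ".join(f"s.{c}" for c in columns)
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": f"SELECT {projection} FROM S3Object s WHERE {where}",
        "InputSerialization": {"Parquet": {}},
        "OutputSerialization": {"CSV": {}},
    }


# Placeholder bucket, key, and columns.
req = build_select_request(
    bucket="my-bucket",
    key="events/dt=2024-01-01/region=eu/part-0.parquet",
    columns=["user_id", "amount"],
    where="s.amount > 100",
)
print(req["Expression"])

# With credentials configured, the request would be sent roughly like this:
#   import boto3
#   resp = boto3.client("s3").select_object_content(**req)
#   for event in resp["Payload"]:
#       if "Records" in event:
#           print(event["Records"]["Payload"].decode())
```

The filter and projection run inside S3, so only the matching rows and columns cross the network to the compute engine.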


