
Very interesting. I wonder how this compares to writing events to a Parquet (http://parquet.io/) or Avro file (or both). That way all manner of distributed tools like Hive or Spark could already read it.



TrailDB is optimized for a very specific data model (http://traildb.io/docs/technical_overview/#data-model) which allows it to do compression that you couldn't do with other data layouts.

TrailDB is a C library; Parquet and Avro are not. Depending on your use case, this might be a pro or a con.


Parquet is a columnar format, so it's optimized for aggregations over columns filtered by a set of dimensions. TrailDB's data format is optimized for discrete event analysis without aggregation: data points are grouped by their source (for example, a user account and its actions on a website), and your queries operate on each of these groups independently. No aggregation happens in TrailDB.
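
To make the difference concrete, here is a rough sketch in plain Python. It uses neither Parquet nor the TrailDB API; the event tuples, the function names, and the "purchase after view" query are all made up for illustration. It only shows the two access patterns: one scan over a single column for an aggregate, versus walking each source's ordered events on its own.

```python
from collections import defaultdict

# Hypothetical event log: (user_id, timestamp, action). Illustrative data only.
events = [
    ("alice", 1, "view"),
    ("bob", 2, "view"),
    ("alice", 3, "add_to_cart"),
    ("alice", 7, "purchase"),
    ("bob", 9, "view"),
]


def columnar_layout(events):
    """Store each field as its own column, roughly how a columnar format lays data out."""
    users, times, actions = zip(*events)
    return {"user": list(users), "time": list(times), "action": list(actions)}


def count_by_action(columns):
    """Aggregate over a single column without touching the others."""
    counts = defaultdict(int)
    for action in columns["action"]:
        counts[action] += 1
    return dict(counts)


def trail_layout(events):
    """Group events by source and keep each group ordered by time."""
    trails = defaultdict(list)
    for user, time, action in sorted(events, key=lambda e: e[1]):
        trails[user].append((time, action))
    return dict(trails)


def purchased_after_view(trail):
    """Per-source query: did a purchase follow a view within this one trail?"""
    seen_view = False
    for _, action in trail:
        if action == "view":
            seen_view = True
        elif action == "purchase" and seen_view:
            return True
    return False


columns = columnar_layout(events)
trails = trail_layout(events)

# Column-wise aggregate across all sources.
action_counts = count_by_action(columns)

# Each trail is evaluated independently; nothing is combined across users.
per_user = {user: purchased_after_view(trail) for user, trail in trails.items()}
```

The second query needs the order of events within one user, which is awkward to express as a column aggregate but trivial once the events are already grouped and sorted per source.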



