Dynamically Adjustable Key-Value Store by Combining LSM and COW B+ Tree [pdf]

otoburb · on Nov 10, 2019

Looks like the team presented at HotStorage 2019 [1] in July, and the slides from their talk are available [2].

[1] https://www.usenix.org/conference/hotstorage19/workshop-prog...

[2] https://www.usenix.org/sites/default/files/conference/protec...

hinkley · on Nov 10, 2019

Did they benchmark against B+Trees and LSM? I don’t see that in the slides.

otoburb · on Nov 10, 2019

Slides 13, 14 and 15 show Jungle's performance using a range of compaction factors (C=[2, 3, 5, 10]) measured against an LSM-tree using leveled compaction and an LSM variant using tiered (aka size-tiered) compaction.

Having said this, the paper itself (in the original link) describes the differences in more detail under Section 4 "Evaluation", and Figure 6 probably does the best job of showing Jungle's benefits in a compact, human-readable row of charts.

If I'm reading the paper correctly, and as summarized on slide 16, using the combination of CoW B+ and LSM means that instead of a 3-way tradeoff between Read/Write/Space, Jungle can minimize the cost trade-off such that the only remaining material trade-off is Write/Space.

Pretty cool stuff.
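
For a rough feel of the read/write trade-off between the two LSM baselines mentioned above, here is a back-of-the-envelope sketch. It uses the usual textbook approximations (leveled: each entry is rewritten about C times per level; tiered: about once per level but with up to C runs to check per level), not the cost model from the paper, and the function names and sizes are made up for illustration.

```python
def num_levels(total_entries, memtable_entries, fanout):
    """Count levels needed when each level holds `fanout` times the previous one."""
    levels, capacity = 0, memtable_entries
    while capacity < total_entries:
        capacity *= fanout
        levels += 1
    return max(1, levels)


def rough_costs(total_entries, memtable_entries, fanout):
    """Worst-case, order-of-magnitude estimates only (assumed textbook model)."""
    levels = num_levels(total_entries, memtable_entries, fanout)
    return {
        # Leveled: one sorted run per level, but merging into a level
        # rewrites roughly `fanout` times the incoming data.
        "leveled": {"write_amp": levels * fanout, "runs_per_lookup": levels},
        # Tiered: each entry is written about once per level, but up to
        # `fanout` overlapping runs per level may need to be searched.
        "tiered": {"write_amp": levels, "runs_per_lookup": levels * fanout},
    }


if __name__ == "__main__":
    # Hypothetical data set: 1M entries, 1K-entry memtable,
    # and the compaction factors shown on the slides.
    for c in (2, 3, 5, 10):
        print(c, rough_costs(1_000_000, 1_000, c))
```

In this simplified model, leveled pays on writes and tiered pays on reads, which is the read/write axis of the trade-off described above.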

hinkley · on Nov 10, 2019

I looked at those slides twice and only saw them as comparing different settings of their algorithms. I dunno if that says more about me or how long PhD students (haven’t) spent with Tufte.

Maybe three colors for the bars.

otoburb · on Nov 10, 2019

>>[...] how long PhD students (haven’t) spent with Tufte.

I think this :) I also had to look more closely a few times.

willvarfar · on Nov 10, 2019

It's exciting that things are happening again in db-land. A few years ago it was Fractal Trees, then LevelDB arrived and got wider adoption, then Facebook put it in MySQL with MyRocks, etc.

Very recently TimescaleDB did some really interesting stuff with compressing tables, and I’m wondering if that is usable with non-time-series data too.

I’m a heavy TokuDB user because of the compression, and I’m looking forward to seeing if a B+/LSM hybrid with compression is going to turn up in MySQL or, even better, Postgres.

StreamBright · on Nov 10, 2019

Is there any reason Postgres or other projects could not adopt tokudb's solution?

willvarfar · on Nov 11, 2019

Architecturally, MySQL went with a “the storage engine is a plug-in” design and Postgres went with a more traditional “the storage is part of the core” design.

Postgres has slowly grown some ... tolerance ... for alternative storage engines, via federation and forks like Greenplum and TimescaleDB that put their own engines in, but they always feel like second-class citizens and keep hitting integration limits.

continuations · on Nov 10, 2019

What made you pick TokuDB instead of MyRocks?

I got the impression that TokuDB is pretty much dead. Hasn't Percona stopped updating TokuDB and is focusing on MyRocks instead?

willvarfar · on Nov 10, 2019

I picked tokudb before myrocks existed.

Tokudb runs rings around myrocks. Particularly, it has better read/write perf and much better compression.

But it will disappear by MySQL 9. TokuDB died because it wasn’t widely adopted or sponsored, not because it was worse tech.

I think the recent timescaledb stuff about compression is really promising. I hope Postgres moves towards built-in compression rather than telling people the fs should do it.

valyala · on Nov 12, 2019

I hope TimescaleDB compression level will eventually reach VictoriaMetrics compression level for typical time series data :) [1]

[1] https://medium.com/@valyala/measuring-vertical-scalability-f...

continuations · on Nov 10, 2019

Is source code available?

jules · on Nov 10, 2019

How does the performance compare to B-epsilon trees, which are B-trees with a write buffer in each node to make writes more efficient?
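
For anyone unfamiliar with the idea, here is a minimal sketch of the "write buffer in each node" part. It is a toy, not a real B-epsilon tree: the tree shape is fixed (no splits or rebalancing), a full buffer is flushed to all children rather than only the fullest one, and the `Node` class and `buffer_cap` parameter are made-up names for illustration.

```python
import bisect


class Node:
    def __init__(self, pivots=None, children=None, buffer_cap=4):
        self.pivots = pivots or []      # sorted separator keys (internal nodes)
        self.children = children or []  # len(pivots) + 1 children; empty for a leaf
        self.buffer = {}                # pending writes held at an internal node
        self.items = {}                 # key/value pairs stored at a leaf
        self.buffer_cap = buffer_cap

    def is_leaf(self):
        return not self.children

    def _child_for(self, key):
        # Keys equal to a pivot go to the right-hand child.
        return self.children[bisect.bisect_right(self.pivots, key)]

    def put(self, key, value):
        if self.is_leaf():
            self.items[key] = value
            return
        # Writes stop at the first internal node and are batched there.
        self.buffer[key] = value
        if len(self.buffer) > self.buffer_cap:
            self.flush()

    def flush(self):
        # Push the whole batch one level down in a single pass.
        pending, self.buffer = self.buffer, {}
        for key, value in pending.items():
            self._child_for(key).put(key, value)

    def get(self, key):
        if self.is_leaf():
            return self.items.get(key)
        # A buffered write is newer than anything below it.
        if key in self.buffer:
            return self.buffer[key]
        return self._child_for(key).get(key)
```

The point of the buffer is that many small writes get moved down the tree together, so each node is rewritten once per batch instead of once per key; the cost is that a lookup has to check the buffer at every node on its path.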

_tkzm · on Nov 10, 2019

Looks like a great candidate for bolt (B+) and badger (LSM).