I’ve taken a brief look at the website and the GitHub repo. I could not find any description of the architecture or the algorithms used, any references to research papers, or anything like that.
That does not inspire confidence, as distributed, highly available, scalable databases are inherently difficult and the solutions are full of trade-offs.
Appears to be another Yandex spinoff like ClickHouse, so it's probably reasonable to assume that it has the same properties: well-built, fast, and (initially, at least) slightly under-documented ;)
Same here. The only clue I found is the following two lines in the Q&A section: "What consistency model does YDB use? To read data, YDB uses a model of strict data consistency."
Well, it is an in-house tool that just got open-sourced, so it is natural that docs and other materials are lacking. There are videos from C++/HighLoad meetups that shed light on the internals, but mostly in Russian as of now. Give it time.
Interesting. With its separate compute and storage tiers, this is another system going in that direction, which I think is becoming almost the standard at this point, especially for "cloud-native" things designed to run on k8s. From what I can tell (it isn't very explicit on this point), they are avoiding distributed consensus at the storage layer and instead relying on a single-writer/multiple-reader model. The single writer is enforced by the assignment of tablets in the compute tier, and each tablet is responsible for writing to multiple storage nodes for durability? (But I might be wrong.)
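To make that guess concrete, here is a toy Python sketch of a single-writer model. It is not YDB's actual implementation or API: every name in it (`StorageNode`, `Tablet`, `Assigner`) is hypothetical, and the generation-number fencing is my own assumption about one way "single writer by assignment" could be enforced without consensus among the storage nodes.

```python
from dataclasses import dataclass, field


class StaleWriterError(Exception):
    """A storage node saw a write from a superseded tablet instance."""


@dataclass
class StorageNode:
    """Hypothetical storage node: one append-only log per tablet."""
    name: str
    fenced: dict = field(default_factory=dict)  # tablet_id -> newest generation seen
    logs: dict = field(default_factory=dict)    # tablet_id -> list of records

    def fence(self, tablet_id: str, generation: int) -> None:
        # Record the newest generation; older writers are rejected from now on.
        self.fenced[tablet_id] = max(self.fenced.get(tablet_id, 0), generation)

    def append(self, tablet_id: str, generation: int, record: str) -> None:
        if generation < self.fenced.get(tablet_id, 0):
            raise StaleWriterError(f"{self.name}: generation {generation} is stale")
        self.logs.setdefault(tablet_id, []).append(record)

    def read(self, tablet_id: str) -> list:
        # Any number of readers may read; no coordination is needed here.
        return list(self.logs.get(tablet_id, []))


@dataclass
class Tablet:
    """Hypothetical compute-tier tablet: the only writer for its data."""
    tablet_id: str
    generation: int
    replicas: list

    def write(self, record: str) -> int:
        # The tablet itself fans the write out to every replica for durability.
        for node in self.replicas:
            node.append(self.tablet_id, self.generation, record)
        return len(self.replicas)


class Assigner:
    """Hypothetical component that gives each tablet exactly one owner at a time."""

    def __init__(self, replicas: list):
        self.replicas = replicas
        self.generations = {}

    def assign(self, tablet_id: str) -> Tablet:
        generation = self.generations.get(tablet_id, 0) + 1
        self.generations[tablet_id] = generation
        for node in self.replicas:
            node.fence(tablet_id, generation)  # lock out the previous owner
        return Tablet(tablet_id, generation, self.replicas)
```

In this sketch, reassigning a tablet bumps its generation, so a previous owner that is still running gets `StaleWriterError` on its next write instead of corrupting the log. A real system would also need to handle partial replica failures and make the assigner itself highly available, which this toy ignores.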
Hmmm, 4D, around the '90s/early 2000s, also kept the data and code separated in two different files. Maybe they still do, but I haven't touched it in a while. Tech is more and more like fashion: if you wait around long enough, it will come back. ;-)