Why does nobody seem to have any introspection on why RethinkDB failed? Clearly there are some major problems that people are ignoring. If my favorite DB (I must mention Kx Systems once a month) folded, I could give you a laundry list of issues where things went sideways, but all I see is glowing praise and comments about the best tech not always winning (KDB knocks the socks off of everything, but I sure can give you a list of places it fails).
This isn't meant to be harsh, but these are times to learn, not simply pat each other on the back.
If I had to speculate, I'd say that they spent a long time in development before monetizing, longer than investors were willing to entertain. It's hard for a B2B company to raise a Series B without a thoroughly proven revenue engine.
I don't know how this could have been fixed though. Databases are hard to develop and it's a tough market to crack. Enterprises aren't going to buy an incomplete product, especially not a database, which is arguably the most critical component in the entire stack. Anyone investing would have to know that this would be a years-long effort to build a solid product before the first dollar.
Perhaps RethinkDB could have shortened their development time by not pivoting around (originally SSD optimizations, later realtime; Horizon was also a bit of a mini-pivot), but I don't know by how much.
Just looking at some of the RethinkDB and ReQL stuff, I certainly wouldn't have used it. Two things hit me immediately:
- bad performance
- a dearth of types: literally only a single numeric type that is a 64-bit float, which eliminates entire categories of uses that rely on integer or fixed-precision exact arithmetic. Time series are seriously hurt by that decision too; I've seen DBs have to move to 64-bit longs because of that issue alone. Having a pseudo-type layered on top just ties up CPU cycles in the encodes/decodes that have to happen, and milliseconds aren't enough in a nanosecond world now (see the sketch right below).
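To make the precision point concrete, here's a minimal Python sketch (Python purely as an illustration; the behavior is that of any IEEE-754 double, which is all a float-only numeric type gives you):

    # A double has 53 bits of exact integer range, so values above 2**53 get rounded.
    print(2**53)                      # 9007199254740992 (~9.0e15)
    ns = 1_481_241_600_123_456_789    # a nanosecond-precision Unix timestamp (Dec 2016)
    print(int(float(ns)))             # 1481241600123456768 -- the low digits are rounded away
    # Millisecond timestamps (~1.5e12 today) still fit comfortably; nanoseconds do not.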
I hope this whole "let's do everything with as few types as possible" approach is a fad that will just die quickly in the DB world.
This is with about 20 minutes of looking around and just listing a few things. It looks like RethinkDB was written by web devs for web devs and I don't think that can compete in the database world. You might get the web devs onboard, but the database and HPC people at the firm will probably look at it and go "how quaint, when can we get Oracle/Postgres/Informix/etc up and running?"
When you work with dynamically typed languages, the typeless nature of RethinkDB is awesome. MongoDB follows the same design.
Functionally, to me, RethinkDB was perfect. ReQL is one of the best query languages ever - It's so easy to learn and remember.
I haven't operated RethinkDB at scale in production (so I can't say much about performance) but I was pretty impressed with its scalability features during testing.
I'm really more leaning towards the idea that this is purely a monetization failure. They've been going at it for 7 years - There must be a good reason why investors kept it going for so long - I think it's because of the product.
I think they did identify a good monetization strategy in the end but maybe it was too late - They dug themselves into a niche that had great long term growth potential but they didn't have the resources to wait it out any longer.
> literally only a single numeric type that is a 64-bit float
After dealing with JavaScript and Lua, I am ready to call this a complete anti-pattern. To be a good language, it must support at least one size each of machine floats and ints. To be really good, it should be possible for me to choose any size of machine-supported floats and ints. To be great, it should also support rationals, fixed-point and complex numbers out of the box.
Giving me floats but not ints just doesn't cut it. It works, in a kind of shoddy way, but … it's tasteless.
If you don't provide me with bitwise operations (earlier versions of Lua, I'm looking at you), then you don't get to call yourself a real language.
For a database, though, I suppose one could always store integers as their string representation. But please, may no language ever do this again.
The only concrete problem anybody's mentioned with floats is that they only have 53 bits of precision, and some people need their integers to go up to 64 bits, or more.
Here are two actual concrete problems: efficiency and type safety. Indexing an array with "1.5" doesn't seem awesome, nor does using 8 bytes for a double where a 1-byte int would do fine.
A double with an integral value in -127..127 already takes 2 bytes to store in RethinkDB (the first being a tag distinguishing between array, object, string, etc.), compared to some random double's 9 bytes. The type safety advantages of distinguishing doubles from integers are pretty minimal in a database, because you'd need a schema for that anyway, and the benefits of a schema far outweigh the add-on benefits of having type-checked query logic.
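To illustrate the size argument, here is a hypothetical tag-byte scheme in Python (made up for illustration, not RethinkDB's actual serialization format):

    import struct

    TAG_SMALL_NUM = 0x01   # integral value that fits in one signed byte
    TAG_DOUBLE    = 0x02   # anything else: a full 8-byte IEEE-754 double

    def encode_number(x):
        if float(x).is_integer() and -127 <= int(x) <= 127:
            return bytes([TAG_SMALL_NUM]) + struct.pack("b", int(x))  # 2 bytes total
        return bytes([TAG_DOUBLE]) + struct.pack("<d", float(x))      # 9 bytes total

    print(len(encode_number(42)))       # 2
    print(len(encode_number(1234.5)))   # 9

The small-integer fast path claws back most of the space without a separate integer type, which is the point being made above.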
You can definitely get around the currency issue by scaling yourself, but things like the timestamp issues (fractional seconds since the epoch with millisecond precision) are a little more problematic. You basically have to roll your own datetime format and lose any db support.
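For what it's worth, the usual workaround looks something like this (a sketch, not any particular driver's API): split a high-resolution timestamp into fields that each stay inside a double's exact integer range and reassemble in application code.

    NS_PER_SEC = 1_000_000_000

    def encode_ns(ns):
        # both fields stay far below 2**53, so a float-only store keeps them exact
        return {"sec": ns // NS_PER_SEC, "nanos": ns % NS_PER_SEC}

    def decode_ns(doc):
        return int(doc["sec"]) * NS_PER_SEC + int(doc["nanos"])

    ts = 1_481_241_600_123_456_789
    assert decode_ns(encode_ns(ts)) == ts

It round-trips fine, but you've given up native date arithmetic, timezone handling, and single-field index ordering, which is exactly the "roll your own and lose any db support" problem.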
Databases are about data after all. Why not have a rich way to describe it?
I think so too. But I don't think missing, say, a 32-bit integer type, or 16-bit and 8-bit integers, is a big problem if you've got doubles. Maybe it's a nice-to-have. A 64-bit type or bigint would add real value.
I think as far as number types go, you NEED (in the sense that you should be considered seriously broken even if you can hobble along without them) to have:
- a 64-bit float
- a 64-bit int
Everything else can be emulated in code from there, and you can play all the encoding games you want to save space when you don't need the full width. This covers 99.999% of the use cases you'll reasonably see.
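(To be fair to the "emulate the rest" claim, the narrower widths really are trivial to fake once a real 64-bit int exists; a rough Python sketch, illustrative only:)

    # Emulating 32-bit integer semantics on top of a wider integer.
    def as_u32(x):
        return x & 0xFFFFFFFF                                  # wrap like an unsigned 32-bit int

    def as_i32(x):
        x &= 0xFFFFFFFF
        return x - 0x100000000 if x >= 0x80000000 else x       # two's-complement wrap

    print(as_u32(2**32 + 5))   # 5
    print(as_i32(2**31))       # -2147483648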
From there, I would argue, you should also have:
- a bit and byte
- 32-bit ints
- 32-bit float
- 128-bit int (once you add UUIDs you might as well make them full numeric citizens).
If I were doing a modern database, I'd have the kitchen sink as far as types go: full numerics from 8 to 128 bits, both signed and unsigned ints, and all supported hardware float sizes. I'd probably even have a 512-bit AVX type just to see what people would use it for.
Why do you NEED a 64 bit int? Why not 32 bit? You're not storing pointers in a database. (And then you can implement your 32 bit int in terms of the 53 bits of int-capacity in a double.)
Because I want to store something that is a 64-bit int? I mean, this is really a weird question. There are a lot of things that require it, starting with timestamps: datetimes, aggregations, etc. Pretty much anything non-toy in the modern world requires a 64-bit int.
Like I said, read NEED as in you can certainly hobble around without it (pick an appropriate offset and read your 53 bits relative to it), but like a fracture in a leg, it's still considered broken.
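That offset trick, sketched out in Python (the names and the base value are made up for illustration):

    # Store a wide value relative to an agreed base so the delta fits in the
    # 53 exact integer bits of a double.
    BASE_NS = 1_481_000_000_000_000_000   # arbitrary agreed-upon base, roughly Dec 2016 in ns

    def to_double_safe(ns):
        delta = ns - BASE_NS
        assert abs(delta) < 2**53, "delta falls outside the exact-double range"
        return float(delta)

    def from_double_safe(d):
        return BASE_NS + int(d)

    ts = 1_481_241_600_123_456_789
    assert from_double_safe(to_double_safe(ts)) == ts

It works, but every reader and writer has to agree on the base, and at nanosecond resolution the exact window is only about 104 days either side of it. Hobbling along, as described.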
A query optimizer might do a lot better with a native integer type than a user-defined one cobbled out of int32's. It knows all the mathematical properties of the type: it can add things in any order, it knows that x < y implies x + 1 <= y, etc.
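A tiny illustration of why that matters (an assumed example, not any particular optimizer): the rewrite between x < y and x + 1 <= y is sound for integers but not for doubles, so a planner that only has doubles can't use it.

    xi, yi = 3, 4
    assert xi < yi and xi + 1 <= yi   # holds for any pair of integers

    xf, yf = 3.0, 3.5
    print(xf < yf)        # True
    print(xf + 1 <= yf)   # False: the integer rewrite rule is invalid for doubles

With a declared integer column the planner gets facts like that (and exact, order-independent addition) for free; with a float-only type it has to be conservative.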
Probably easier to have this convo in email (check my profile); HN isn't quite set up for back and forth. I'm more than willing to see your point about a 53-bit int. I think dev experience can be made more difficult as long as you get compensation in other areas. Email me a "hi" and I'll respond, but it might have to wait until later in the day.
Edit - I actually meant for this to be in reply to jnordwick's comment further down thread (the one with all the rhetorical questions), but apparently I suck at clicking the right things.
You seem a bit worked up, and it's hard not to read your comments in this thread in a tone which suggests you feel like you've been robbed of something.
These people don't owe us anything. When a company decides to open up and publish a postmortem it's a great and wonderful thing, but it's in no way obligatory. Even if their technology doesn't hold a candle to your expectations or the market's needs, it's still not without merit. If you want to learn, they've already given plenty for you to learn from. Rather than demanding more from them, be thankful for what they have already contributed.
I think his cynicism is warranted. This is a forum, and one that is subject to trends and favorites. Anytime RethinkDB pops up on HN (or stripe for that matter) it seems to be met with unchecked praise.
And yet here we are with a failed product. I think it's OK to ask why and consider its shortcomings. Obviously there was something, and I don't think it's fair to completely gloss over the technology side of things and blame it on "marketing".
I think RethinkDB's marketing was excellent. The unchecked praise you mention is likely in no small part also a result of this.
However I don't think that means the technology was the problem. There are far more broken and misarchitected pieces of technology that are financially successful.
What exactly went wrong is speculative until we see the insights Slava promised, but it seems the failure was entirely financial: they had a great product, they brilliantly marketed it, but that didn't translate into sustainable revenue.
What scenarios were you finding you had bad performance?
I was finding I could get comparable performance to MySQL for straight by-key CRUD activity, which is all I'd use a document store for (particularly once you take into account MySQL replicating to multiple nodes and the related failover setup).
> This is with about 20 minutes of looking around and just listing a few things. It looks like RethinkDB was written by web devs for web devs and I don't think that can compete in the database world. You might get the web devs onboard, but the database and HPC people at the firm will probably look at it and go "how quaint, when can we get Oracle/Postgres/Informix/etc up and running?"
Have you ever tried to fully automate the failover/replication/etc for those? It is a substantial pain point that usually requires an experienced DB person with that database to do it well.
There are plenty of CRUD datasets where RethinkDB would be good enough and wouldn't require the dedicated ops people. That is the pain point I'd say RethinkDB was focused on solving.
----
I'd say the real reason for the failure is more on the business model end, where the intersection of orgs w/o dedicated ops people AND willing to pay for support was simply too small. RethinkDB was built (imo) for small organizations that simply don't have a dedicated ops team, where it's just 5-10 developers + marketing/business.
That could just be how I feel because it's the situation I'm in (fewer than 10 IT people supporting a 200+ person org, and our sysadmins deal almost entirely with end-user issues).
This is the exact kind of comment I was warning against.
The "it surely can't be the technology!" comment when yes, Shirley, it really can be. Even the best database systems I know (orders of magnitude quantitatively better than RethinkDB) have areas of technical weakness. And not just some areas, but usually gaping valleys. Nothing does everything well, and if it tries to it usually does everything below average.
Time to man up. Performance really was crappy? Having a float as your ONLY number type really was kind of dumb? Having no native datetime was bad? Saddling timezones on the pseudo datetime was a brainfart? Milliseconds only was shortsighted? Horizon was a bad idea too soon? Document databases are the new object databases - forever 5 years away? Aggregates were poorly done? Nobody really knew how to do stationary time series properly (I read through the GitHub issues regarding this and it was laughable that these people called themselves database experts)? And on and on.
I'm not saying all of this is true. I'm saying let's be straight and not always blame sales and marketing or dumb consumers or C-level execs or everybody but our own crappy code.
Having worked for a successful(ish) database vendor, I can say that user stupidity (ignorance) is a major factor in database software success.
I would bet (if my experience is not unique) that literally no users, and almost no employees at the vendors, have any idea of the qualities of the software they use/peddle, its semantics or guarantees, its performance characteristics, or how to use it correctly.
This makes the technology secondary; what is most important is how well you carve out mindshare. Whether that is even primarily down to "good" marketing is something I doubt. In the modern saturated database market, it is entirely unclear to me how you win sufficient mindshare and trust to obtain a wide enough (paying) userbase.
I have no idea if RethinkDB was "good" or not, as I did not take the time to investigate; I have little interest in "document stores". It actually seemed to me they had some pretty solid engineering foundations; the kind of thing that the industry should value highly, but due to poor information is unable to price into their purchasing decision, and as a result is missing in many modern (especially OSS) database offerings today.
> I dislike cassandra immensely, but I still use it every day.
I feel your pain. There are very few robust database systems that handle recovery well enough at scale, and for those without the deep pockets to keep a DBA on hand there are very few choices.
For small/mid-scale robust systems, CockroachDB is promising but still needs a couple years of development, if it can survive long enough without a monetization strategy.
Why do you assume performance would be bad? It's got indexes. Obviously if you never create an index or never profile your queries it might get sluggish but that's the same in SQL.
The dearth of types can be good or bad depending on your use case. I agree that for some use cases it's a bad thing.
I like many things about SQL and still prefer it to RethinkDB in some ways, but there are no SQL databases that offer replication or HA failover configurations that don't require incredible time investments to configure correctly. This includes commercial DBs as far as I can tell. If you're too small to have a DBA, HA SQL is out... unless you use a somewhat expensive and also restrictive cloud-managed solution. The latter would have been our second choice behind RethinkDB.
When Horizon was announced, that's when I stopped even considering RethinkDB for new projects (even if it was a good fit, feature-wise). Call it a "business smell", but I'm not surprised to hear they've wound the business down.
Good call, I missed that aspect of Horizon. Though if they could've transitioned to a managed db / app service provider it might've provided a way to keep going. Makes me think the mini-pivot was "a day late and a dollar short", and that perhaps managed PaaS could provide good revenue models for open source infrastructure.
Those of us who have had failed startups can tell you that pats on the back and community support help. The Hype Machine would have you believe that failure is opportunity, and that you get a chip on your shoulder when you fail; but failure means failing (grade F in the US) and it hurts. A lot.
They mentioned that they will be releasing more post-mortems later. For now, I'm comfortable with supporting them and hearing about their learning later.
The RethinkDB team built an amazing product that many people appreciated - as noted by the comments here - and for that they have my praise and admiration. The fact that the product will continue without the business is also noteworthy. Well done, all.
Slava mentioned in his blog post that he would be writing some lessons learned over the upcoming months. The fact that this is an open source project that should continue to live on helps with the reception, too. It's sad news to see the company shut down, but it's positive news to know that we can still keep the project alive.
I guess those other document-oriented dbms just had better early-stage shilling to get funding before RethinkDB did, and by the time Rethink shopped around there were already enough competitors that investments in document-oriented dbms had dried up.
The current state of the art is an in-memory dbms that takes advantage of soon-to-be-available (2017) non-volatile memory. CMU is developing such a dbms (http://pelotondb.org/), and there are also MemSQL and SAP HANA.
Many early stage startups can't afford even a single salary. By the time enough cash is on the table to consider those kinds of purchasing decisions, the initial round of architecture decisions has typically been made, and been cemented into the initial version of the software.
E.g. I've worked on more than one startup where the first angel investment wasn't on the table until 6-12 months into development, and where that first round in some cases was below $250k.
On top of that you also have the issue of finding someone that knows it, and the associated staffing risk that comes with that (yes, I'm sure you can always find someone, but at what price? There are places I can go where I couldn't throw a stone in any direction without hitting someone that "knows" Postgres or MySQL or both sufficiently well to be an acceptable tradeoff).
In many startups the tech choices end up being made not just based on what fits and what is affordable, but also based on what you can find affordable people to work with (including e.g. co-founders or other people willing to do initial work for equity) - sometimes that can lead to niche tech getting used. But far more often it means picking from a small set of the most common alternatives.
So then the startup needs to get by with what it's got and earn money until they can afford it. Just like everyone else does with literally everything else.
Yes, and that is exactly the point: this means databases like Postgres or MySQL get entrenched over options like KDB that cost a lot to even get started with. By the time they could afford the license, the cost of switching has risen dramatically.
After that comment I'd bet good money you've never started a company or worked full time at an early stage startup.
Recommend you take one of your ideas and sketch out a back-of-the-napkin first-year plan. I bet it doesn't include using $250k of investors' seed money for a database.
Please do introduce me to your generous VC. At current low interest rates, building a stack that is reliant on KDB will cause a large valuation hit, because you're effectively locked into a -$250k/year cashflow in perpetuity. At a 5% discount rate, for example, that's roughly $5M of inflexible negative NPV right up front. Look, there are fintech applications where this will be seen as fine (the right tool for the job might be the big difference between success and failure), but you'll admit it's a big ask for the less conventional "disruptor"-style businesses which may not have big-ticket upfront funding.
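The back-of-the-envelope perpetuity math behind that figure (assuming the $250k/year licensing number from the parent comments):

    annual_cost   = 250_000              # assumed recurring KDB licensing cost, per the thread
    discount_rate = 0.05
    npv = annual_cost / discount_rate    # present value of paying that forever (a perpetuity)
    print(npv)                           # 5000000.0, i.e. roughly $5M of committed negative NPV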
It isn't that you didn't think it completely through, just that there are a lot of variables that go into any hiring or tech decision. Take this for instance (but it generalizes to many others): does the DB reduce your need for developers? Does it open new market segments? Does it open new areas to target?
Startups are hard (I've failed a couple times and had it work a couple times - definitely not always through my own effort).
My current view on this is that RethinkDB didn't rethink enough. They solved a problem that didn't have much money involved and was too small. They might be great devs, but they just didn't solve a problem that needed to be solved.
I can only speculate, just as a user who played with RethinkDB (and who has been programming database SW since the Tandem NonStop days) - for me, it seems like RethinkDB did not solve a problem that companies using databases, and having money, needed solved.
I think they needed to develop just one feature that can't be done by Oracle, and that would have been that.
The other problem could be that NoSQL hype just died out :(