Hacker News new | past | comments | ask | show | jobs | submit login
RethinkDB joins the Linux Foundation: What Happens Next (rethinkdb.com)
733 points by mglukhovsky on Feb 6, 2017 | hide | past | favorite | 106 comments



Bryan Cantrill posted his thoughts on the CNCF's decision to donate RethinkDB to The Linux Foundation here: https://news.ycombinator.com/item?id=13579544

We wanted to share RethinkDB's next steps in our new home with The Linux Foundation.

We've also had a lot of folks ask if they can donate to support the project. Stripe has generously offered to match up to $25k in donations (which will help fund server costs and future development.) You can learn how to contribute to the project with OSS contributions or donations here: https://rethinkdb.com/contribute


I don't have much to add, but as someone who has a lot of projects dependent on RethinkDB and also loves using it:

Thank you to everyone involved!


Do you/would you use for anything other than something that needs live updates ASAP?


Yes, I use it in my upcoming project, which is just a regular, (no real-time sensitive) webapp and RDB is perfectly usable as a regular NoSQL database.

I really like the fact that you query the database by calling methods on an object rather than passing strings to it, feels more like actual programming and protects you from injection attacks.

The WebUI and easy scalability are great features as well.

It and CockroachDB, (for more traditional SQL, same scaleability benefits), are the databases I am most excited about, but I seem to prefer ReQL to SQL despite being vastly more familiar with SQL :-)


Absolutely. Rethink's core value props to me are:

Strong promises about consistency Nosql with joins Atomic changefeeds (real time) A great functional query language (reql) Great admin UI Simple auto sharding that actually works


We use changefeeds but just to update Elasicsearch. The key value offer is the very powerful query language which feels right at home in Node, strong robustness, and real proper join support with NoSQL.


Mike and Slava, you guys are so great! You've worked overtime to secure social confidence in RethinkDB for others when the engineering talent alone was good enough. You guys are so admirable and hard workers and such a good role model to others.


Thanks Mark, I appreciate the kind words.


I haven't used ReThinkDB yet, but I know that projects I have or will rely on do. Therefore, I donated and I encourage you to do the same! It helps that I met Slava at a bar in Mountain View once and he was really nice to me =)


I also donated... I haven't been able to use it in a workplace project (yet), but have to say following several of the issues/resolutions, it would be my first choice today for most scenarios.


I stayed away from Rethink in the past few years due to its uncertain future. Now I'm seriously interested. Looking forward to the next chapter of RethinkDB.


I wonder if they could have closed more deals if they'd promised to do something like this. Then again, maybe the board would have balked at that as a sign of no confidence.


You generally have to move carefully to get proprietary software to an open source model. One typical customer reaction to rumors you are open sourcing is "Great! We'll wait until you do that to try it out." The result is that [already not good] sales may tank completely.


Also known as the Obsborne effect (https://en.wikipedia.org/wiki/Osborne_effect).


RethinkDB wasn't proprietary. It was open source, licensed under the AGPL v2, the same license used by MongoDB, which is doing just fine. MongoDB is one of the top five in the DB-Engines rankings.

One of the key business differences between RethinkDB and MongoDB is that the latter keeps some enterprise-grade features/tools closed-source and available under a commercial license, i.e. the "open core" business model.


I don't know if "we'll relicense if we go out of business" is a particularly helpful stance to a potential customer that wants to use the product in the present. Presumably, if a deal is being closed, then the buyer has agreed to some licensing that is already reasonable enough (e.g. freely modifiable and distributable within the org).

This kind of stance can be useful to a community, though, that wants to ensure future proprietary development remains possible (kind of strange to want but it can make sense). I recall when Trolltech did this with Qt. If they went out of business or there were a change of control, Qt would switch from GPL to BSD licensed.


If the board balked, that would be a sign of not-smart investors. My investors don't have a problem with our MIT/Zlib/Apache2 Open Source database - however, then again, Tim Draper (our lead) was the original investor in Skype way back in the day, so he is way ahead the curve than most (dare I say on HN, even YC).

That is why it is really important to have only the best and smartest VCs on board. Edit: Don't give equity up in your company unless your vision aligns with your investors and they can do more than just fund you.


>I wonder if they could have closed more deals if they'd promised to do something like this.

I think continuation of development and community growth even after they closed down is what gives people confidence - and there's no practical way to show that upfront (assuming that having a big/visible community outside of your core company is not viable for such project in that ammount of time) - re-licencing just the first step to that. They proved what happens in a "worse case scenario" but I don't think you could have done it without going trough it.


I think as a consumer DB that may have helped. However, if the company (rethinkdb in this case) is still small enough, and the client big enough, than they can work out a contract that states that the source code gets licensed to them in the case of a shutdown.

This allows the company to get larger accounts with no real downside as they would be out of business.


The team behind RethinkDB has been a class act. Thank you for creating a fantastic product and doing what's right by your users even as you were winding down.


It's really true. I was looking at RethinkDB for a project a while back, but decided to go another way, but I've been following it from a distance ever since. At all steps of this process, it's been clear that the core developers have been doing their utmost to do right by their users, even in the face of unfortunate business developments. It's really heartening to see this news.


I've always been fond of Rethinkdb, but never actually used it. Perhaps if I came across pragmatic examples of how to do x with y, like you typically see with Redis, I could have convinced my team/s otherwise.

One of the aspects of Rethinkdb I admire most is the tooling. I find myself often trying out something new with React, Postgres, ASP.NET Core, Elm, Go, Kotlin or what not and biasing my experience getting started with preference to use.

I recond Rethink as Pied Piper in Silicon Valley; a great product ultimately being misunderstood. I'm relieved to hear Rethinkdb will live on under the Linux Foundation (and applaud them for doing so) and earnestly hope it re-establishes itself in a niche, such as that of Firebase/Parse, with partnerships and a legacy to rival that of Postgres one day.


The news in the article is that CNCF spent good money to wipe out a copyleft license. They think history has shown a more permissive license without copyleft or a patent grab (just a patent notice) - namely ASLv2 - is a far superior choice and paid to get that in place. Their very recent explanation is well worth reading: https://www.cncf.io/blog/2017/02/01/cncf-recommends-aslv2

I'll explain what I mean by patent notice since those are my words, quick version: If you don't declare that you have patent rights affecting a portion of code you have contributed, you've given those patent rights.



ironic that this will be the move that actually propels rethinkdb...

it's my feeling that software licensing day's are over for the little guys. If you are Oracle or Microsoft and have that brand recognition great.

Coupled with commoditization of developers, I think it'd be great if we had a kickstarter site where you could request a commercial project to be open-sourced, pitch in some money to support the developer.

For instance if somebody released an open source version of Hootsuite I think that would put a severe dent in Vancouver's tech scene-Hootsuite customers wouldn't even think twice about switching to a zero cost solution, as it's not a pain killer but a vitamin. Free vitamin is always better than an expensive one. Pain killers on the other hand are less flexible because it's an emotional buy.


I'd love to be able to pitch in some money to turn Datomic into an open-source project. As much as I'd love to use it for my own projects, commercial licenses tend to causes huge pains for deployment, and poses as an insurmountable philosophical barrier for use in my own open-source projects.


I worked with Datomic but it was such a turn off to use because of it's limiting license plans. I'm also worried that they might not be around in the future and what then?

An open source Datomic would be wonderful. I worked with Datomic last year and while I enjoyed clojure and datalog, it was also a big pain in the butt to be googling "how do I do X in clojure/datomic/datalog".

A really sharp developer might be able to master it (I'm not) given ample time but it's very expensive both to ramp new developer's up and pay clojure devs which aren't cheap due to their limited supply.

If there was a website like Kickstarter that had a way for people to vote (with money) on open-sourcifying an existing commercial project, would people come?


hey lewis9029,

just wanted to let you know we are working on exactly that: a way to crowdfund open source alternatives to commercial software like Datomic.

Do a search for "letsopensource" on the thread here:

https://news.ycombinator.com/item?id=13591321


Honestly what needs to happen next is a serious effort to explain why or when rethink is better than mongo, cassandra, arango, aerospike, memsql, mysql, riak, or postgres, ++, not to mention all the TSDBs. On the event pushes I am unconvinced that message queues/computation graphs arent superior and that's another crowded space. When I last looked at it the advantages struck me as mostly incremental on the query language and decremental on performance. There are many excellent competitors in this space, most of which are well funded, and moving targets. Rethink doesn't seem to have a USP, or none that has been effectively communicated at least, IMO.


TBH, given the option... RethinkDB is probably the best case for anything that needs distribution/HA and automatic failover. SQL is a decent option, but you either pay a lot for HA, or you need to have a lot of domain knowledge or hire dedicated DBA support. Not that RethinkDB doesn't need some knowledge, their admin interface is great.

The replication model is similar to Cassandra (ring + redundancy), while the master/slave model and failover has had a lot of work to make it bulletproof.

It will scale well from 3-15 nodes, then it starts to drop off as less than linear growth. But if you need more than that, then you're in a whole other league.

If you want search only, go for ElasticSearch. If you need much greater linear growth at the cost of application complexity, Cassandra. If you need fast memory access, then go for Redis. If you don't need bullet-proof automagic failover, or are willing to pay through the nose for it, go for SQL. If you are okay with a single system, go SQL. Otherwise, RethinkDB should probably be the first choice.

Don't get me wrong, I'll reach for SQL first in many cases... but RethinkDB if I have a choice and HA is a requirement. I also happen to prefer a document-centric model/approach.


Aerospike touches most of your points at much higher speed and scale. It is next gen redis, basically, with disk, with auto-sharding scale, with cross node queries. Cassandra is not difficult once you wrap your mind around column storage, and if you need that, no other storage style will do.

The 15-node thing is also a major achilles heel. Who wants to commit to a stack that incurs massive technical debt in the event of massive success? Imagine reengineering your db and your event pushes, at scale...


Aerospike doesn't offer many consistency guarantees. If you run it in a cluster on the cloud you are more than likely to see silent data loss [1].

It's not a fair comparison, RethinkDB is much safer. I'm sure, if you turn down the defaults on both read and write operations on RethinkDB you could scale it well past 15 nodes and with very high read and write throughput.

1: https://aphyr.com/posts/324-jepsen-aerospike


You can scale past 15 nodes... it's just you'll want to tweak things and/or you won't get linear growth as you add more nodes. That doesn't mean you can't. Also, if you need more than that, Cassandra and other options are there, and you'll likely have to feel that pain regardless.

There are other ways to separate your data depending on use cases. It's just a rough guideline... You'll see similar issues beyond 10-20 servers in a local cluster in many of the NoSQL options.

RethinkDB also has much better consistency guarantees over Aerospike, not to mention being FLOSS under a more permissive license.


RethinkDB pushes well past 15 nodes: teams have demonstrated north of 25-30 nodes with linear scale.


Thank you... iirc, the recommendation was 12-15 nodes at the top end. Though I haven't investigated deeply for a while now, as for the past 2 years I haven't had the option of what I've been using.


Comes down to jenkins for me. RethinkDB aced the jenkins tests.

Mongo has failed every jenkins test it's been put through, dunno about the status now though. Last I checked Mongo's default durability level was "data loss on power outage". Aerospike failed jenkins too, and not on small edge cases like Mongo, but with major dataloss.

Going by the problems Gitlab has recently with Postgres I wouldn't use that for a distributed database. Likely true for MySQL too.

* Yep. I meant to write Jepsen :)


I think you mean Jepsen? It's a slight exaggeration to say that RethinkDB aced it -- but they did very well[1] and (more importantly to me, honestly) Jepsen was used to find a subtle and nasty issue that was subsequently fixed.[2]

[1] https://aphyr.com/posts/329-jepsen-rethinkdb-2-1-5

[2] https://aphyr.com/posts/330-jepsen-rethinkdb-2-2-3-reconfigu...


I've been having a lot of trouble with Jenkins at work today. So when I though Jepsen, I wrote Jenkins.

Still feel they aced it. No one passes Jepsen on their first try. But RethinkDB is the first to immediately fix the issue.

I wouldn't go as far as to call it a nasty issue. It would only happen if you got node failures while reconfiguring your cluster. And reconfiguring the cluster must be initiated by the admin and it's something that happens very often.


Sorry, clarification: by "nasty" I just meant "subtle", not "debilitating" -- and agreed that they did really well on Jepsen.


Passt find replace Jenkins Jepsen call-me-maybe ;)


Jepsen


Linux foundation are collecting quite a lot "failed" projects and turn them to gold these days? I sometimes feel it is acting like a software goodwill store partially. Whatever that is, hope RethinkDB will do well in the future.


IMO the ASF is better at collecting abandoned failed software while LF has more openwashed corporate code dumps. CNCF is actually legit though because they generally accept technically sound software that is already successful.


RethinkDB is quite a bit more than an openwashed code dump though. It seems like a natural fit to some of the rest of the CNCF work, and probably why they had an interest... though not a direct fit, hense being under LF proper instead of CNCF.

I think it's an appropriate home though. ASF always seems like a minefield when I look into some of their projects.


What are other examples?


Do you have some examples to point out?


Node.js and Express? Although neither seemed failed, maybe troubled at one point.


I think the NodeJS Foundation (https://nodejs.org/en/) and project are really successful and putting them under the auspices of the LF helped resolve a lot of community issues (including the io.js fork): https://nodesource.com/node-by-numbers

I'm not familiar enough with Express to comment on it.


Node.js had "failed" (userbase mutinied and a strong fork appeared) until they moved to the LF.


Sorry if this should be obvious but what is/are the killer feature/s of RethinkDB, what differentiates it from something like Redis or even CockroachDB?


Only RethinkDB gets you a) working changefeeds, where you can receive real-time changefeed updates to your queries, b) a well-implemented and Jepsen-proven distributed database.

As far as I know, there is no other solution which gets both things right.

I use it in PartsBox (https://partsbox.io/), a solution for keeping track of electronic components.

I am surprised more people aren't interested in changefeeds — the way I see it, it's the only way to implement multi-user webapps which update in real-time (as in: a change is made in one session and all other open sessions get the update immediately).


Have to agree with pmalynin, there are other dbs that do this.

Couchbase Mobile has changes feed as well.

How you are defining/measuring "well-implemented".

Not sure why there is no Jepsen test of Couchbase. I see many requests, and a closed issue with no discussion of why it was closed.


Well a) is provided by Mongo and is literally the reason why Meteor can do exactly what you described: multi-user webapps which update in real-time.


Sigh. Yes, the Mongo oplog and RethinkDB changefeeds are superficially similar. They are both for feeding changes, just like a paper airplane and a passenger jet are both for flying. And yet there is a world of difference.

Leaving aside reliability and ease of use, let's focus on correctness. RethinkDB lets me query a database, get initial data, and then get all the subsequent changes to that data. Notice there is no race condition there.

You can use this to implement systems where when a user logs in, gets the initial data loaded, and then subsequent changes are sent as they happen. Even if the same data is modified by someone else during this time (e.g. during the initial load), things will be processed correctly.

Comparing this to attaching a processor to a feed of all operations in the database doesn't make a lot of sense, because the oplog doesn't provide the same functionality.


The oplog replication provided by Mongo is a very different mechanism than RethinkDB's change feeds. This is a pretty good overview of the differences: https://www.compose.com/articles/rethinking-changes-how-two-...


Unfortunately b) is critical to some, which Mongo did not fair well on:

https://aphyr.com/posts/284-jepsen-mongodb


I couldn't find any documentation on change feeds for Mongo after a quick google search.

Could you post a link please?



While the oplog does provide some semblance of RethinkDB's changefeed, it's not nearly as powerful. With Rethink, you say you want a query, and rethink will let you know about changes to that query. Mongo just says "hey, here are operations that were done", and leaves the reconstruction to you".

So I guess Mongo+Meteor match up with RethinkDB... sort of.


It's also worth noting that changefeeds are highly scalable: you can run tens of thousands of them on a single node, and scale them out linearly from there (even as they're scoped to specific queries.)

Obviously the performance characteristics will be impacted by the volume of changes that arrive to the database, but the architecture to support this is highly parallelized (all the way down to cores on the CPU.)


RethinkDB is just a great document-centric database. It has guarantees similar to SQL, including server-side joins, while having a great replication/redundancy pattern (similar to C). It's probably best in a use case where you are planning on 3-15 servers for a cluster. If you need more than that C may be better.

I like to think of it as MongoDB done right. Above and beyond better consistency models and a broader, more well thought out API, they have an admin interface that is second to none (well SQL Management Studio might be slightly better). It's definitely better than any other "NoSQL" database.

A couple years ago, I had been considering it for a project, at the time it was missing a required feature for the project (geolocation indexes), so I wasn't able to use it then... but I followed the development of the feature, and prerequisites for that and the automatic master failover and the engineering discipline and planning was far better than pretty much any project I'd been exposed to ... The team(s) and their energies were not wasted, and I really appreciate what they have done.

I was sad to see the company shutter, but very happy to see the project under LF, and hope that it really takes off from here. It would be a pretty natural fit as an RDS service under Amazon and there are a few hosted options. Horizon also looks interesting compared to firebase.

This is another feature over competitors is that streaming updates is in the box, and not bolted on to oplog processing like competitors.


What is that C you're talking about?


C* is short for Cassandra.


Why did people start doing that? I noticed all the Cassandra people at work started using C* at pretty much the same time, too, including signatures in e-mail. Was there a global "there's too many letters in Cassandra and C7a looks weird" memo to the entire Cassandra community? Drives me nuts for absolutely no reason I can think of.


Probably a few high profile devs started using it and the community started to emulate.


I'm honestly not sure, I thought it was a canonical abbreviation as a lot of the training docs I've seen use it.


The * characters in your original comment were interpreted as markdown emphasis markers, so they effectively got lost.


yeah, I noticed after I replied... thx.


This might be a useful read: https://rethinkdb.com/faq/

It explains RethinkDB's ideal use cases, explains how to compare it to other databases, and details some of the differentiating features.


RethinkDB has passed jepsen testing.

*Fixed typo


I think you might mean the Jepsen tests: https://aphyr.com/posts/329-jepsen-rethinkdb-2-1-5


This is fantastic news. My current project would not have been possible if it wasn't for RethinkDB. Very glad to see it moving forward!


I wish some company (apart from compose.io) would create a RethinkDB as a service (DBaaS).

I would totally love a dynamodb or firebase kind of payment structure!

I specifically excluded compose not because their service is not great (from what I've heard - it's excellent) but more so because they charge a premium for it.


There are a couple of community members seriously considering this with the license change.


If there is any way I can help (anything from building, or even just documentation or support), I'd love to get involved.

I am very passionate about XaaS (X as a service) think that we need a lot more DBaaS and we just don't seem to have that many options to choose from which is sad considering almost all software projects out there - especially the hobby ones would benefit so much from such a service.


I'm wondering what kind of legal shenanigans I would have to wrangle with in order to do this.

Exactly what is and is not allowed when making a service based on an Open Source tool?


My weird and unrelated question is: if I donate software to an open source group like the Linux Foundation, can I write it off my taxes? And if so, how do I assess the value of it? RethinkDB probably has some legitimate market value...can the founders reflect that on their taxes?


Not likely, since the Linux Foundation is a 501(c)6 organization, and donations are not tax-deductible.


This is correct.


Had never thought of this... for Linux Foundation, I don't believe so. However, it'd be interesting to see an Open Source Foundation set up as a 501(c)3, where contributors would get to write off a certain value given for their contribution. The hard part would be, what is the value of the product you have contributed, and how will it stand up under audit?

Or would it just be considered a service and, just like volunteering, you can't really write it off.

I can see a foundation that provides software to the community, especially communities that would other wise not have access to it, and or municipalities, etc.


It's not the founder's asset -- unless it has been released back to them in adjudicated liquidation proceedings or by contractual agreement. It is the company's asset, and would be reflected in the company's taxes.


I had a question about this passage:

"The company behind RethinkDB shut down last year after struggling to build a sustainable business around the product. Many former RethinkDB employees currently work for Stripe, where they help build infrastructure for developers around the world."

Is Stripe a big RethinkDB shop or is there another connection between the two?


Stripe hired most of the Rethink engineering team after the shutdown. They're no longer working on Rethink professionally, but many are still contributing in their own time.


Any news about the future of horizon?

I played with rethinkdb and horizon and it looked like the the way to go to me.

When you see that with a few easy lines of code, any change in state in the browser of your computer it's updated in the browser of your mobile, without practically doing anything in the server, it feels like the future.


Horizon will be joining under RethinkDB's aegis at The Linux Foundation: a new community-driven release is in the works.


Any chance this release will include custom auth support?

The lack of that was our main reason for staying away.


Incredible news. I was certain this kind of relicensing would be impossible. Congrats to the community.


I just donated. Thanks to the original RethinkDB team for your amazing efforts... I meetup with Michael Glukhovsky briefly over coffee here in SF and was immensely impressed then, and I am still immensely impressed now.

Fantastic to hear RethinkDB lives on. Long live RethinkDB. ;-)


The rethinkDB team are truly the best. I wouldn't hesitate to work with them in the future.


Just donated! We have an app in production using Rethinkdb. This needs to live on!


Is this one of the first companies that has made the transition from business to fully open-source? If so this could set an interesting president for others to follow suite under similar circumstances.


Blender would seem to be another example:

https://en.wikipedia.org/wiki/Blender_(software)#History


This is great news! Rethink is NoSql DB done properly, I know many others such as Cassandra often get mentioned as alternatives to Mongo but they aren't really.

I just hope the rigor and correctness that have characterized RethinkDB continue moving forward as a community project. Part of me feels sad it never caught on and will never be a commercial success like Mongo, but that's in the past.

Horizon is another exciting project I hope gets traction.


RethinkDB is in my mind firstly Mongo done right... Cassandra is a very different beast, but seems to be the better option when you need massive (100+ node clusters) scaling, but that comes with a lot of work in terms of development.

While RethinkDB is probably most comparable to Mongo, it's worth noting that the sharding/replication/fail-over support and model is much better than Mongo's. Beyond that, the update notifications (streams) are in the box, where with mongo it's bolt-on. Also, rethink supports joins at the server (though best to avoid a lot of the time).

The admin ux is pretty awesome, and the dev team has been very cool to follow.


>"Also, rethink supports joins at the server (though best to avoid a lot of the time)."

Does RethinkDB have distributed joins then? These sort of notoriously difficult to implement well no? I see you say its best to avoid. I would curious to hear you experience with them.


They are difficult, and costly, performance-wise. I wouldnt use them for a large expected result-set... maybe sub-items against a single parent.


I'm not too familiar with RethinkDB. Can someone explain why it's so popular on HN? The wiki page shows that it's not that popular overall and the comparison to other DBs makes it sound pretty bad.[1]

1. https://en.wikipedia.org/wiki/RethinkDB



This is awesome news! Good to see Cloud Native Foundation growing to address the needs in Cloud Computing space.


What kind of workload RethinkDB is suitable for ?

1. Transactional

2. Analytical

3. Operational


The tagline on the site describes the most obvious use case: "RethinkDB pushes JSON to your apps in realtime. When your app polls for data, it becomes slow, unscalable, and cumbersome to maintain."

In their FAQ, they call out these types of applications: Collaborative web and mobile apps, Streaming analytics apps, Multiplayer games, Realtime marketplaces, Connected devices


Not really transactional. RethinkDB only has document-level atomicity, and even then, not all writes are guaranteed to be atomic.


Is this the best possible outcome given the circumstances?


Yes.


I see Slava has no comment :-)


They should have stuck with AGPL.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: