Stonebraker: Send Relational DBMSs to the Home for Tired Software (marklogic.blogspot.com)
18 points by mblakele on July 2, 2009 | hide | past | favorite | 25 comments



You know, every couple of weeks an article like this comes up on HN or Slashdot predicting the end of the RDBMS. I don't deny that special purpose data stores can offer great performance improvements in some applications. But there is a reason that RDBMSs are celebrating their 40th birthday (as the article mentions). They are flexible, cheap (except for Oracle and DB2) and easy to manage.

My guess is that we will still be reading about the imminent death of the relational database another 40 years from now.


You could be right. But Stonebraker is not "just another guy" predicting the end of the era of the RDBMS. He's one of the guys who invented the RDBMS. If he thinks he's seen something better, well, I think it's worth at least taking a look.


He is selling something (he claims to be) better - Vertica.

I've no comment either way on whether it is, and I know who Stonebraker is and respect what he's done, but he has a vested interest here. Just like Netscape programmers needed to kill off their earlier work, Mosaic.


I agree. And he's been selling post-relational stuff for a while. This is not a new observation on his part.

I'm not going to form an opinion on his word alone, but he's got big enough chops in my book to warrant taking what he says seriously, even if he has a vested interest.


He's also selling VoltDB (a memory-resident OLTP DBMS) and SciDB (for scientific data).


Understood, this reads like anti-Oracle linkbait.


The reason you see these articles is that RDBMSes are ill-suited to take advantage of the opportunities created by cloud computing.


RDBMSs are perfectly consistent with cloud computing.


Why do you make this assertion? Other than with a proprietary solution (e.g. Oracle RAC), what options are there to easily add/remove capacity on demand from an RDBMS?

Now, you can just add additional "shards" of a MySQL database, but that breaks the whole "R" of "RDBMS" (you can't do joins across the shards). In addition, you're limited to a ring-style Master-Master replication scheme rather than a grid/mesh. There's also the MySQL NDB_Cluster engine, but I have never (to date) seen it in production (there are just too many limitations).

Proprietary solutions like Oracle RAC may scale on demand, but require a DBA team. The cost overhead offsets any savings gained through on-demand provisioning of resources.

Not to mention Oracle RAC (and even MySQL) aren't exactly friendly to commodity hardware (which is what both official cloud computing solutions and de-facto clouds used in high-tech companies' data centers are built on).
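To make the cross-shard join point above concrete, here is a minimal sketch. It uses in-memory SQLite databases as stand-ins for MySQL shards, and the `users`/`orders` tables, the data, and the helper names are all hypothetical. It only illustrates why a join run on each shard separately misses rows whose two halves live on different shards, and what a naive application-side join looks like instead.

```python
import sqlite3

def make_shard(users, orders):
    """Create one in-memory SQLite database standing in for a shard (assumption: not real MySQL)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)")
    db.executemany("INSERT INTO users VALUES (?, ?)", users)
    db.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    return db

# Hypothetical layout: a user's orders do not necessarily sit on the user's shard.
shard_a = make_shard(users=[(1, "alice")], orders=[(10, 2, 5.0)])
shard_b = make_shard(users=[(2, "bob")], orders=[(11, 1, 7.5), (12, 2, 2.5)])
shards = [shard_a, shard_b]

JOIN_SQL = """
SELECT u.name, o.total
FROM users u JOIN orders o ON o.user_id = u.id
"""

def local_join(shards):
    """Run the join on each shard in isolation and concatenate the results."""
    rows = []
    for db in shards:
        rows.extend(db.execute(JOIN_SQL).fetchall())
    return sorted(rows)

def app_side_join(shards):
    """Pull both tables from every shard and join them in application code."""
    users = {}
    orders = []
    for db in shards:
        users.update(dict(db.execute("SELECT id, name FROM users")))
        orders.extend(db.execute("SELECT user_id, total FROM orders").fetchall())
    return sorted((users[uid], total) for uid, total in orders if uid in users)

print(local_join(shards))     # only pairs that happen to share a shard
print(app_side_join(shards))  # every matching pair, at the cost of moving all rows
```

The application-side version gets the complete answer, but only by shipping every row to the client, which is the kind of work the database's join planner would normally do for you.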


Oracle RAC is stupidly complicated to administer. You are correct in your statement that a RAC environment generally does require a DBA team and the cost overhead is significant. Further, Oracle has very stringent hardware and OS certification and you will not receive any support from Oracle on uncertified systems.

However, I would not associate it with the "cloud" movement/paradigm/whatever. It is a cluster (the C in RAC stands for cluster) with 2 or more instances sharing memory structures across a network. To add another node to the cluster is excruciating; I would not classify it as dynamically scalable - the hallmark of cloud computing.


> Why do you make this assertion?

I make that assertion because no one actually uses arbitrary amounts of scaling, "in the cloud" or anywhere else.

For many users, the primary benefit of cloud computing is the ability to go quiescent and not pay. For everyone, there's a benefit to handling various levels of load with usage-based payment, but the active range is reasonably limited.

All this is well within the capability of a professionally administered RDBMS.


I note that AWS instances themselves don't scale at all. They are what they are.

You "scale" by buying more and lashing them together as you see fit.

Amazon happens to provide a very high capacity datastore, but it could provide RDBMS instances just as it provides "compute" instances.


They do. It's called SimpleDB.


No, they are not. They can be stuffed into a cloud architecture, but they were not built with the cloud in mind. If you were to build a cloud database, from the ground up, what you would come up with would not look like a traditional RDBMS.


As long as they don't throw out the underlying idea that data can be deconstructed and then reorganized into a form that adheres to mathematical principles (set theory and predicate logic, in the case of RDBMSs), and that you can then perform logical operations on it to derive new, additional information. I think that was the major breakthrough of relational theory; hopefully it is built on rather than thrown out.

Speaking of which, Chris Date, Hugh Darwen and some others were talking about a significant extension or evolution of the relational database, called the transrelational database, several years ago. Anyone heard of any progress on that front lately?



Ah, too bad. Thx.


Software Engineering would progress a lot farther as a discipline if we weren't so preoccupied with throwing out the "old" to make room for the "new".

My definition of "legacy" code: code that is field-tested, has a low defect rate, and generally works.


This is just a brief summary of Michael Stonebraker's piece "The End of a DBMS Era (Might be Upon Us)", which appeared on HN a few days ago. As far as I can tell it adds nothing new. Am I missing something?


I think this article makes some good points. In fact, there are times when I've been very frustrated by the "We can't do better than an RDBMS" attitude, particularly the time we were storing a tree in a table. Specialized data stores are very neat.

I do, however, think there will always be a good place for the traditional RDBMS. For instance, there are times when I'm creating an app and don't yet know what the layout of the data will need to be, or where optimizations will be most needed, so I back the app with Postgres. Should I need to expand later, it will be easier because I'll have actual data to work with.

I guess the point I'm trying to make is: unless you know for sure that your data will benefit from a different model (e.g. known performance needs, or a better conceptual mapping), not using an RDBMS seems like a premature optimization.
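For readers wondering what "storing a tree in a table" looks like in practice, here is a rough sketch of one common approach: an adjacency list (each row points at its parent) walked with a recursive query. The `node` table, its contents, and the `subtree` helper are made up for illustration, and SQLite stands in for Postgres. This is not necessarily what the commenter's team did.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Adjacency-list layout: every row stores a pointer to its parent row.
db.execute(
    "CREATE TABLE node ("
    "id INTEGER PRIMARY KEY, "
    "parent_id INTEGER REFERENCES node(id), "
    "name TEXT)"
)
db.executemany(
    "INSERT INTO node VALUES (?, ?, ?)",
    [
        (1, None, "root"),
        (2, 1, "a"),
        (3, 1, "b"),
        (4, 2, "a1"),
        (5, 4, "a1x"),
    ],
)

def subtree(db, root_id):
    """Return (name, depth) for root_id and all of its descendants."""
    sql = """
    WITH RECURSIVE sub(id, name, depth) AS (
        SELECT id, name, 0 FROM node WHERE id = ?
        UNION ALL
        SELECT n.id, n.name, sub.depth + 1
        FROM node n JOIN sub ON n.parent_id = sub.id
    )
    SELECT name, depth FROM sub ORDER BY depth, name
    """
    return db.execute(sql, (root_id,)).fetchall()

print(subtree(db, 2))
```

Without recursive queries, the same traversal takes one round trip per level of the tree (or a different encoding such as nested sets), which may be the kind of friction the comment is describing.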


It seems that, as a database engineer, I am going to have to bulk up my skills in non-RDBMS technologies, even if just to be able to say "yes, I am very familiar with <random non-SQL DBMS> and it's not an appropriate solution to this problem".


I don't understand, I thought relational DBMSs were left for the dogs many years ago... http://en.wikipedia.org/wiki/Dbms#End_1970s_SQL_DBMS


> In the online transaction processing (OLTP) market, a lightweight main memory DBMS beats a row store by a factor of 50.

I'm confused -- what does main memory vs. row store have to do with relational vs. non-relational?


Everything, when you're talking about real-world implementations and not some abstract theory in your head.


The actual article was already on here:

http://news.ycombinator.com/item?id=680881



