Lesson: Oracle's driving MySQL to open core; don't sign contributor agreement (computerworlduk.com)
43 points by chaostheory on Sept 24, 2011 | 47 comments



Good news! Postgresql scales really well, and will always be open source. Plus, it has more features and great compliance with the SQL standard.


One of the things that puts me off Postgres is the constant shouting of "Postgres is better" (and please, we already know in which areas Postgres is better, nobody is debating that) without ever addressing the actual needs of the vast majority of MySQL-users.

Most of them aren't DBAs, and they don't need to be for the purpose for which they use the DB. For them, MySQL is a tool that just works, without issues. And they certainly don't have any need to deal with the less than helpful Postgres community, touting "advantages" that are completely irrelevant to them. Hell, most MySQL users only use a fraction of the features in MySQL.

If they ever migrate to a different DB, it's more likely to be a MySQL-fork. Although Postgres could very well replace MySQL, the Postgres community chooses not to address that audience. Which is fine, but then please stop pointlessly pissing on MySQL at every available opportunity...


One of the things that puts me off the Mysql community is all the people shouting "Mysql is good enough".

I've had to professionally administer both mysql and postgresql installations. Mysql is an awful piece of software. Every second spent using it is painful.

You say 'it just works', and in that case it really doesn't matter what you use. If you haven't run into bugs/quirks/problems with whatever datastore you use, whether sql, or nosql, whether open or closed source, you haven't pushed it very hard. All software has rough edges and bugs.

I think you will have a very hard time finding people who have used both mysql and some other database, whether sql-server, postgresql, or oracle (or even firebird) that would have a very high opinion of it.

Now, what are the 'actual needs' of the vast majority of MySQL users that Postgresql doesn't satisfy? I hope that requirement isn't 'open source' :)


Indeed. It's like a badge of ignorance worn by people that have never used a real database. Sometimes the people telling you something is better are just trying to do you a favor. The Postgres folks are a model of advocacy, in my opinion, mostly leading by example.


The people telling you MySQL is good enough / better are also doing you a favour ... You can more easily tell they don't know very much about databases :-)


"constantly shouting 'Postgres is better'"

The person to whom you replied did not say that. I read it more like "Oracle can mess up MySQL if they want, because we still have this great alternative called 'postgres'".

"without ever addressing the actual needs of the vast majority of MySQL-users."

Your comment struck me because I still have a piece of paper next to me with a list of all of the complaints and feature requests I saw in various discussions around the time of the postgresql 9.1 release. That doesn't mean that every single issue you have is addressed yesterday, but a lot of development is happening in response to the needs of the MySQL community.

The most obvious example is replication (9.0, but improved in 9.1), but there's also per-column collation. And you might also consider unlogged tables to be targeted at typical MySQL use-cases. And 9.2 is likely to have index-only scans (covering indexes), for which a patch has already been posted by Robert Haas -- I happen to know mysql users who have been waiting for that feature alone, but it took some groundwork development starting in 9.0.

Can you please cite something specific that you feel is not met or being addressed, so that I can add that to my list, too?

"touting 'advantages' that are completely irrelevant to them"

Well, they might be advantages to somebody -- they were developed for a reason. Postgres folks who have done something interesting like synchronous replication or K-nearest-neighbor indexing aren't going to whisper quietly about them.

"If they ever migrate to a different DB, it's more likely to be a MySQL-fork"

It's hard to get exact numbers, but a common theme at postgresql user group meetings is people migrating from mysql to postgres or mysql people starting their next project in postgresql. There was a noticeable uptick after 9.0 was released.

"than please stop pointlessly pissing on MySQL at every available opportunity"

People come here to learn new things and see (and take part in) progress. Please stop declaring the discussion over ("...we already know...nobody is debating that...") and products like MySQL good enough ("...it just works...").


You're certainly not alone in hoping that MySQL loses market share to PostgreSQL under Oracle's stewardship. I've always assumed that MySQL has been more prevalent due to some combination of historically earlier user friendliness, an incompatible SQL implementation, and feedback effects related to the previous two factors.

That said, is there something that would be lost if everyone just switched from MySQL to PostgreSQL tomorrow? What benefits does MySQL have over PostgreSQL these days?


Momentum and familiarity.

I run a big website and would love to switch; the only thing stopping me is that I'd have to manually fix hundreds of carefully hand-crafted queries to support pg syntax.

If someone wrote a "MySQL emulator" layer for postgres, I'd switch tomorrow. (And I'd work quickly to progressively replace emulated calls with real pg SQL -- it's a lot easier to justify the effort after you make the switch than before!)


It really depends how 'carefully crafted' your queries are. If there are things like index hints, you can just strip them out. If you are using non-standard mysql specific syntax, that will be a bit of a pain, but it won't take more than a day or two to fix a few hundred queries.

Where you are going to experience the most pain is that MySQL lets you write idiotic queries like this:

SELECT id, last_name FROM some_table GROUP BY last_name;

That is not valid SQL, for good reason.

Other things that will bite you: http://andreas.scherbaum.la/blog/archives/657-PostgreSQL-9.0...

God, looking at that link makes me so sorry for anyone using MySQL. Life is too short for that.


> If you are using non-standard mysql specific syntax, that will be a bit of a pain, but it won't take more than a day or two to fix a few hundred queries.

One issue is with non-standard MySQL functions (such as inet_aton/inet_ntoa) and aggregate functions (such as group_concat). While most have great and superior pg implementations, it's a lot of work to cross-reference each one, grep the source code, and hope the pg implementation is sufficiently identical.

Another issue is that MySQL considers the 'as' keyword to be optional. I'm sure pg has good reason for requiring it, but there's probably about a thousand 'as'es to be added.

And that's about where I gave up last time, if I recall.


And what stops you from using SQL standard syntax? You can disable the MySQL extensions for that if you wish.


Why isn't that valid SQL?


Well, let's say your rows are:

  id|score|last_name
  1|11|smith
  2|22|jones
  3|33|smith
  4|44|jones

The query we are pretending is valid is:

  SELECT id, last_name FROM table GROUP BY last_name;

What rows are returned? I expect to see something like:

  ?|?|smith
  ?|?|jones

However, what ? is isn't clear. Could we get a row like 1|33|smith ?

In SQL, all selected columns must be part of the group by or inside an aggregate.
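
To see the ambiguity concretely, here is a minimal sketch in Python. It uses the standard library's sqlite3 module purely as a stand-in engine (an assumption for illustration; the thread is about MySQL and PostgreSQL), because SQLite also happens to accept the non-standard form. The table name some_table comes from the earlier comment; the variable names are made up for the example.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in engine for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table (id INTEGER PRIMARY KEY, score INTEGER, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO some_table (id, score, last_name) VALUES (?, ?, ?)",
    [(1, 11, "smith"), (2, 22, "jones"), (3, 33, "smith"), (4, 44, "jones")],
)

# Non-standard form: id is neither grouped nor aggregated,
# so the engine is free to pick any id from each group.
ambiguous = conn.execute(
    "SELECT id, last_name FROM some_table GROUP BY last_name ORDER BY last_name"
).fetchall()

# Standard form: every selected column is either in GROUP BY or inside an aggregate,
# so the result is fully determined by the data.
explicit = conn.execute(
    "SELECT MIN(id), MAX(score), last_name FROM some_table "
    "GROUP BY last_name ORDER BY last_name"
).fetchall()

print(ambiguous)  # the ids shown here depend on the engine's choice
print(explicit)
```

The second query says which id and score are wanted per group, which is the decision the first query silently leaves to the server.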


'id' doesn't appear in the group by list and it isn't being used in an aggregate function.

I'm actually curious: what id would a MySQL user expect to be displayed for this query?


> I'm actually curious: what id would a MySQL user expect to be displayed for this query?

MySQL discourages using this feature if columns not included in GROUP BY are not constant in the group:

  > Do not use this feature if the columns you omit from the GROUP BY part
  > are not constant in the group. The server is free to return any value
  > from the group, so the results are indeterminate unless all values are
  > the same.

Their example in the documentation:

  SELECT order.custid, customer.name, MAX(payments)
    FROM order,customer
    WHERE order.custid = customer.custid
    GROUP BY order.custid;


"MySQL discourages using this feature if columns not included in GROUP BY are not constant in the group"

That's what errors are for, not documentation. It is pretty easy to forget something in the group by, and documentation won't help with that.

The dangerous thing is that the result returned from such a nonsense query looks valid in many cases, while being wrong in subtle ways.

PostgreSQL detects when the query is valid, and executes it if so. So, if you do a GROUP BY customer_id (a key column), you can also see customer_name without adding it to the GROUP BY list. But if you group by customer_zipcode (not a key), and try to select the customer_name, it will throw an error.
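
The rule described above can be sketched as a tiny check in Python. This is a deliberately simplified model, not PostgreSQL's actual implementation: it assumes a single table, ignores aggregates, and the function name group_by_is_valid is hypothetical.

```python
def group_by_is_valid(selected, grouped, primary_key):
    """Return True if every plain (non-aggregated) selected column is
    determined by the GROUP BY list, under a simplified single-table model."""
    grouped = set(grouped)
    # If the whole primary key is grouped, each group is a single row,
    # so every other column of that table is constant within the group.
    key_is_grouped = bool(primary_key) and set(primary_key) <= grouped
    return key_is_grouped or all(col in grouped for col in selected)


# Grouping by the key column: customer_name is determined, so this is accepted.
by_key = group_by_is_valid(
    selected=["customer_id", "customer_name"],
    grouped=["customer_id"],
    primary_key=["customer_id"],
)

# Grouping by a non-key column: customer_name may differ within a group, so this is rejected.
by_zipcode = group_by_is_valid(
    selected=["customer_zipcode", "customer_name"],
    grouped=["customer_zipcode"],
    primary_key=["customer_id"],
)

print(by_key, by_zipcode)
```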


Ponder: Which id is returned when you have more than one last name that is the same?


There are a number of benefits:

1. InnoDB has certain optimizations that PG lacks which can make a big performance difference at the high end: index-only queries, insert buffer (or change buffer in MySQL 5.5+), clustered index

2. Lightweight connection creation: MySQL can handle many more concurrent connections and also can create new connections much faster due to threading vs. process model

3. More flexible replication: PG is catching up, but MySQL still has the edge here imo.


1. while true, PG has other features (e.g. partial indexes) which can make a big difference as well. Completely depends on the scenario. Sometimes mysql will 'win', sometimes PG. No real reason to pick one over the other, unless you know that in your situation a particular feature is a real must-have.

2. use a connection pooler (e.g. pgbouncer), problem solved, good practice anyway, even with mysql

3. what (practical) edge do you see given the replication features in 9.1?


I feel the lack of index-only queries.

As for lightweight connections, I see this as completely moot. While you might make tens of thousands of cheap connections to a mysql server, postgresql is much better at executing concurrent queries. Connection poolers like pgbouncer let you make as many cheap connections as you want if most of them are going to be idle anyway.


a) Clustered and covering indexes.

b) Non-transactional tables

Both allow you to optimize the memory usage of some particularly problematic cases, specifically very large, simple tables. Postgres cannot return data directly from indexes (covering indexes). It always has to go back to the table itself to fetch the actual data. If the table is large, that can be inefficient for some types of queries.

Non-transactional tables use a lot less memory as well. For instance, if you have a large table that represents a N:M relationship (id1 int, id2 int), the two ints use 8 bytes of memory. A postgres table adds about 24 bytes per record, three times the actual data, plus some overhead per page.
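
As a rough worked example of that arithmetic, here is a short Python sketch. The 8-byte payload and the roughly 24-byte per-record overhead are the figures quoted above; the row count is a made-up number for illustration, and per-page overhead and alignment padding are ignored.

```python
DATA_BYTES = 2 * 4   # two 4-byte ints (id1, id2), as described above
HEADER_BYTES = 24    # approximate per-record overhead quoted above


def table_bytes(rows, per_row_overhead):
    """Estimate raw table size, ignoring per-page overhead and alignment."""
    return rows * (DATA_BYTES + per_row_overhead)


rows = 100_000_000  # hypothetical row count for an N:M link table

bare = table_bytes(rows, 0)
with_header = table_bytes(rows, HEADER_BYTES)
ratio = with_header / bare

print(f"payload only: {bare / 1e9:.1f} GB")
print(f"with headers: {with_header / 1e9:.1f} GB")
print(f"ratio:        {ratio:.1f}x")
```

Under these assumptions the overhead is three times the payload, so the table ends up about four times the size of the data alone.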

Don't take this to mean that MySQL is faster than Postgres. That's not generally the case. The Postgres query optimizer is vastly better than MySQL's. So for complex queries and data models, Postgres is way superior. The big differences are always related to very specific data model and query combinations, so general benchmarks are utterly useless.


Unlogged tables are now in Postgres as of release 9.1: http://www.postgresql.org/docs/9.1/static/release-9-1


That's very good news! I have to look at the physical data structures and what it means for memory usage though. "Unlogged" as such only means they don't use WAL.


They are not available in Postgres at the moment, but I believe that covering indexes / index-only scans will be coming some time next year, as there's a patch that should hopefully go into the next release.

http://rhaas.blogspot.com/2011/08/index-only-scans-now-there...

http://archives.postgresql.org/pgsql-hackers/2011-08/msg0073...


> b) Non-transactional tables

9.1 added unlogged tables. They are completely unsafe, and quite fast.


I'd love to migrate to postgresql. Unfortunately none of the published tools work well.


I'm migrating an old MySQL based app to Postgres at the moment. Here are some of the things I'm looking at in addition to the base tools.

1) taps ( done by some people from Heroku ) : https://github.com/ricardochimal/taps and http://adam.heroku.com/past/2009/2/11/taps_for_easy_database...

2) mysql2psql : https://github.com/maxlapshin/mysql2postgres

3) Just slurping in the data via the FDW feature for 9.1 , http://www.pgxn.org/dist/mysql_fdw/1.0.0/


OK, I'll bite: what are the tools that MySQL is supporting that PostgreSQL isn't?

PHP blog generation software, or some lower level, more general purpose tool???


I should clarify. None of the tools for migrating my (rather simple) mysql database to pg work well or at all...

But on that note PgAdmin3 is pretty crap. At least on my mac.


Thanks for the clarification. I've noticed a lot of that in this thread: people started with MySQL, and now they have a serious code base dependent upon it. That is a tough problem to solve. Sorry.


My code does not depend on mysql. If I could reliably migrate the data from mysql to postgresql, I'd switch in a heartbeat.


The biggest one is PHPMyAdmin. LAMP's a pervasive thing. (Not that I like LAMP, mind you, but it's important to be aware of why Pg hasn't made large dents in the MySQL market share over the past decade.)


phppgadmin? I've used both (using phppgadmin right now), and they're similar enough. It's not as nice as phpmyadmin, but that shouldn't be a determining factor over whether to use mysql over postgresql.


Thanks for the feedback. That explains a lot then. I always preferred to use the command line, so that I could either make scripts, or at least screen shots, of changes made. Thus, my utter lack of concern about the admin GUIs. PostgreSQL has a kick-ass command line tool, with online help, GNU readline (including auto-complete), and good output formatting options. Contrast with Oracle's sql*plus tool (yee-UCK!). I seem to remember mysql having at least readline, but not sure about the rest (probably comparable, but who knows).


Good news! The Postgresql community is currently emulating the Linux community of the late 90s. That means in another 5-10 years, you too will have matured to the point where you don't have to go around screaming your perceived superiority and can allow your product to speak for itself.

Seriously, even overpriced, self-aggrandizing Oracle DBAs are less insufferable than you guys.


Wow, someone actually modded my parent post down. All I can say is, Haters gonna Hate, Mysql users gonna wait.

That is a reference to waiting for queries to finish.

Mysql is slow.


mysql being or not being slow, along with postgresql scaling well, have nothing to do with the story being posted. your comments are being downvoted for being off-topic.


The point of the article is that basic functionality you need to get MySQL to scale well is no longer available as 'open source', but only in the enterprise product. That seems to touch on 'open source' postgres scaling, as well as mysql being slow (well, unless you pay for enterprise it seems).


You're off topic, take this

>Haters gonna Hate

back to the rubbish bin you found it in and don't come back.

Mindless, off-topic propaganda isn't welcome here.

I, for one, would like to see more migration away from MySQL, but that doesn't mean you get to act like a total nob.

Cue PG telling me to shut up again while I'm taking out the trash.


Any MySQL hackers can always just leave this Oracle nightmare behind and join MariaDB, it's a promising fork of the MySQL core and they're working hard on a new storage engine for it.


...or Drizzle.


Oracle has typically been a licensing centric company. The "named user plus" licensing thing can be unwieldy for most small firms. The past few years have been bad for it as a result. SAAS based innovation, unconventional databases, database scaling bottlenecks, parallelization, in-memory computing, mobility and a lot of the low cost innovations around these have emerged as a threat to Oracle. The options it has had are acquiring OpenSource and aggressively defending patents. All the while its marketing teams continue to whitewash offerings like exalogic as "cloud" offerings.

I feel that Oracle tends to explore options to corral innovation. Its OpenSource portfolio is the classic trojan horse. Expect all sorts of lock-in.


I'm sorry, you're sorely mistaken. I've worked at Oracle (been acquired) and i've seen numbers. Licensing is not the main source of profits, support is; they might sell it differently, but that's where they make the real dough. The margins are astonishing. And look at the numbers they just posted about Europe: +50%. At these levels, that's huge.


I agree. All I wanted to point out was the difficulty in using their stack if you are a small firm or an innovator.


I guess Oracle just don't see MySQL as enough of a threat, or enough of a profit opportunity, to shackle to the mothership with contributor agreements.

Indeed it might even speed up MySQL development, potentially undercutting Oracle's serious open source rivals.


Oracle has a problem in that PostgreSQL is somewhat of a threat, in terms of features and similarity. OK, the default stored procedure language is not exactly the same, but it's fairly close to PL/SQL.

Nobody is going to mistake MySQL for Oracle, and I suspect Oracle wants to keep it that way, while dragging MySQL along just enough to prevent an exodus of FOSS developers to PostgreSQL.

I should probably re-evaluate MySQL again, but they scared the Hell out of me back in 2001 when I found it did not support rollback, nor foreign key constraints, nor transaction isolation at the time. I KNOW THEY HAVE FIXED THIS STUFF SINCE THEN, but the mentality that thought it was OK to leave that stuff out??? I did enough xBASE stuff in the 80s to know I did not want to go back to that confusion. I would rather use an ISAM interface than debug query planning in SQL, but having to use SQL, and getting none of the data integrity benefits?!? Screw that!

Y'all enjoy your MySQL, and I hope the whole source code license issue works out well for you :-)


I use Oracle at work, Postgres at home and MySQL only when I have to.



