PostgreSQL needs a new load balancer

pilif · on June 30, 2013

The article is from 2012, but its contents is still very true. The only general purpose solution out there is pgpool, but from my own experience, unfortunately, it's less than stable at times and it has its own interesting quirks you have to live with (like having a SQL parser that lags behind PostgreSQL 1-2 releases).

Then there are bugs. Sure: As they are discovered, they get solved quite quickly, but if you are depending on certain functionality, it can still be very annoying.

It's also touted as a HA solution, but from my own experience last year, I greatly increased the overall stability of an installation by getting rid of pgpool - there's too much magic going on in there.

Finally, I was quite unhappy about them silently fixing a case where an unauthenicated client could request all the servers memory to be consumed. No security announcement - not even a release was made.

Unless you are very willing to spend a lot of time working around quirks and deeply familiarizing yourself with the source code, I would recommend against pgpool at this time, which leaves us, again, without any general-purpose load balancing solution. Other solutions have restrictions that make them quite situational (no prepared statements, non-deterministic cleanup after the client disconnects, etc).

Smerity · on June 30, 2013

I love PostgreSQL but many of the tools for it are nowhere near the same level of quality as PostgreSQL itself. Many of the past start-ups have used MySQL and whilst the base may not be as good the tooling on top makes their replication setups far simpler.

I've given up on using an explicit load balancer (pgpool scales badly and is temperamental as you write). The application itself now handles load balancing, swapping out database connectors depending on whether it requires writes / no-lag reads.

The current replicated PostgreSQL setup I have uses WAL-E for continuous archiving / backup (it recently got support for non-US-East S3 buckets[1]) and repmgr for replication management (hasn't seen a release in a year and the documentation is somewhat sparse). Don't get me wrong, I like and respect both projects, but I have less confidence in and more complications from them than PostgreSQL.

[1]: https://github.com/wal-e/wal-e/pull/3

jwr · on July 1, 2013

The article is definitely still true. When using Postgres in larger systems it's always a problem when the client asks "well what about redundancy?" — there is no good answer to that, for reasons outlined in the article and the post above.

pilif · on July 1, 2013

That's entirely not true. Postgres has built-in replication and you can even do synchronous replication if you want to pay for the latency - you can't have more redundancy than this.

Doing the failover is trivially done using something like keepalived or heartbeat as Postgres has built-in support for quickly promoting a slave to the master.

What we don't have, however is good load balancing. Also, because load balancing is f'ing hard to get right when you are in a master/slave replication configuration because it's really hard to know what queries you can safely send to the slave. After thinking quickly, here are some cases: use of nextval/currval, use of select after insert/update in a transaction, querying unlogged tables, and probably many more.

A good load balancer knows about all of this.

willvarfar · on July 1, 2013

I think in spirit the post you were replying to meant load balancing?

roncohen · on June 30, 2013

We ditched PgPool for PgBouncer for connection pooling from our webservers. PgPool tries to do way to much, and the documentation is horrible. PGBouncer been rock solid compared to PgPool.

I'd love for PgBouncer to poll the servers it's connected to and automatically talk to the master, so in case of master failure I would only have to promote a new master (and STONITH) and PgBouncer would auto-failover to the new master.

jaytaylor · on June 30, 2013

Yes, PgBouncer is a little better but could still be improved upon in significant ways. E.g. Getting PgBouncer working with SSL connections- You have to do it through a specific version of stunnel. Then stunnel always has to be running in addition to PgBouncer. Not friendly or fun to setup.

theatrus2 · on July 1, 2013

Out of curiosity, are you using Postgres with SSL for an offsite or multi-DC replica? Would switching to a site-to-site VPN be more efficient here?

stingraycharles · on July 1, 2013

Wouldn't using site-to-site VPN be adding a SPOF just as well, namely the VPN server (and, in addition to this, make the VPN server a bottleneck for transfer speed) ?

vidarh · on July 1, 2013

You can "easily" enough set up two VPN connections on separate pairs of machines and route to a virtual IP on each end and use ucarp or keepalived to have one or the other take over. It's not pretty, but it works.

And the VPN server may very well become a bottleneck for transfer speed at some point, but most of us won't ever need to deal with a level of bandwidth where that's an issue.

jaytaylor · on July 1, 2013

This was part of an attempt to improve our Django app, by using PgBouncer to do connection pooling with a Heroku database. Heroku requires SSL for all db connections.

Site-to-site VPN would not have been an option in this instance, and in general I have had poor experience with them - Flaky SPOF, poor throughput, and so on.

obviouslygreen · on June 30, 2013

Also using PGPool in production on one of our more complex projects here; it's been reliable over the last couple years with varying amounts of traffic. We haven't gotten what I'd call a lot of it, but what we do get generates fairly high DB load at times, and what problems we've had have not been due to PGBouncer.

mkhpalm · on June 30, 2013

I think the elephant in the room for postgres is its lack of bidirectional log replication. Once that finally happens, I'm sure dead-simple balancing tools will start popping up everywhere.

zanny · on July 1, 2013

"elephant in the room"

Heh.

Context: https://wiki.postgresql.org/images/thumb/a/a4/PostgreSQL_log...

wahnfrieden · on June 30, 2013

Is there progress on this front?

Someone · on June 30, 2013

Is that what http://wiki.postgresql.org/wiki/BDR_Project is about? They aim to "implement main BDR features into core Postgres" in versions 9.4. Looking at past history, that could be the third quarter of 2014 (http://en.wikipedia.org/wiki/PostgreSQL#Major_releases), but of course, that is just a guess.

Qantourisc · on July 1, 2013

Clustering is very hard to do correct. Clustering while maintaining unique auto_increments, solving split brain, rolling back transactions, ... while maintaining speed: even harder. So the cluster is never going to be "simple", might be simple to install, but never simple.

Side node: if you want load balancing, no replication, yes then it's easy to do :)

rpedela · on July 1, 2013

What about Postgres-XC for load balancing? I know it does several things, but isn't that one of the use cases?