Postgres can have write-amplification issues for update-heavy workloads where at least one of the updated fields is covered by an index. Famously, this is what triggered Uber's switch to MySQL. The impact is noticeable in one of my projects as well, and that project isn't even that large (a few dozen million rows).
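To illustrate the mechanism with a made-up schema (table and column names are purely for illustration): an UPDATE that changes any indexed column has to create a new entry in every index on the table for the new row version, while an UPDATE that only touches unindexed columns can often be done as a HOT (heap-only tuple) update and skip the index writes entirely.

    -- Hypothetical table, for illustration only.
    CREATE TABLE events (
        id         bigserial PRIMARY KEY,
        payload    text,
        updated_at timestamptz
    );
    CREATE INDEX events_updated_at_idx ON events (updated_at);

    -- Changes an indexed column: both indexes get a new entry for the new
    -- row version, on top of the new heap tuple and the WAL for all of it.
    -- This is the write amplification.
    UPDATE events SET updated_at = now() WHERE id = 1;

    -- Changes only an unindexed column: eligible for a HOT update, which
    -- skips the index writes (provided there is room on the same page).
    UPDATE events SET payload = 'updated' WHERE id = 1;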
Not trying to doubt your experience, just collecting datapoints for improvements. I'd appreciate it if you could describe what your problems around write amplification are. Is it the on-disk bloat, the volume of writes, or the WAL volume?
This is a great question. I have to say I haven't profiled Postgres to that level of granularity, so I am making an assumption about why my updates are slow. What I notice is relatively slow UPDATE performance, with high total disk writes and a write-ahead log that seems large. The bloat on disk isn't an issue for me (or, rather, I have started using lower fillfactors to reduce the variability in DB size as the rows are updated).
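For reference, this is roughly what the fillfactor change looks like (table name and value here are illustrative, not my actual schema): a lower fillfactor leaves free space in each heap page, so updated row versions can stay on the same page and qualify as HOT updates.

    -- Leave roughly 30% of each heap page free for updated row versions.
    ALTER TABLE my_table SET (fillfactor = 70);

    -- Only newly written pages honor the new setting; existing pages are
    -- repacked only by a table rewrite, e.g.:
    VACUUM FULL my_table;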
Do you have a favored resource that you can direct me to that would help me identify the factors that you cited and ultimately allow me to make more nuanced criticisms (or to find a fixable error in my approach, which would of course be the ideal result)?
> I have to say I haven't profiled Postgres to that level of granularity, so I am making an assumption about why my updates are slow. What I notice is relatively slow UPDATE performance, with high total disk writes and a write-ahead log that seems large.
Hm, those could be quite independent things. Could you check what your checkpoint_segments (pre-9.5) / max_wal_size (9.5+) and checkpoint_timeout settings are? Having very frequent checkpoints will trigger a lot of superfluous writes: both the checkpointing itself and an increased percentage of full-page writes (to prevent torn pages).
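For example, something like this shows the relevant settings in one go (checkpoint_segments only exists before 9.5 and max_wal_size from 9.5 on, so one of those two rows will simply be missing):

    -- Current checkpoint-related settings.
    SELECT name, setting, unit
    FROM pg_settings
    WHERE name IN ('checkpoint_segments',            -- < 9.5
                   'max_wal_size',                   -- >= 9.5
                   'checkpoint_timeout',
                   'checkpoint_completion_target',
                   'full_page_writes');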
When you say your write-ahead log is large, do you mean the pg_xlog directory, or the amount of writes to it?
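If it helps to distinguish the two, something along these lines works on a pre-10 server (where the directory is still called pg_xlog; pg_ls_dir needs superuser, and the 16 MB figure assumes the default WAL segment size):

    -- Rough on-disk size of pg_xlog: count the WAL segment files.
    SELECT count(*) AS segments,
           pg_size_pretty(count(*) * 16 * 1024 * 1024) AS approx_size
    FROM pg_ls_dir('pg_xlog') AS f
    WHERE f ~ '^[0-9A-F]{24}$';

    -- Amount of WAL being generated: sample this twice and diff it
    -- (pg_xlog_location_diff() gives the byte difference between samples).
    SELECT pg_current_xlog_insert_location();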
Another common reason for slowness is foreign keys where you haven't indexed both sides. E.g. if there's a foreign key referencing the table you're updating, and the referencing columns aren't indexed, you'll end up with sequential scans...
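A made-up example of that pattern (names are illustrative):

    -- child.parent_id references parent(id). When a parent row is deleted,
    -- or its referenced key is updated, the foreign key check has to find
    -- the referencing rows in child; without an index on parent_id that is
    -- a sequential scan of child for every such parent row.
    CREATE TABLE parent (id bigint PRIMARY KEY, val text);
    CREATE TABLE child  (id        bigint PRIMARY KEY,
                         parent_id bigint REFERENCES parent (id));

    -- The fix: index the referencing side as well.
    CREATE INDEX child_parent_id_idx ON child (parent_id);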
> Do you have a favored resource that you can direct me to that would help me identify the factors that you cited and ultimately allow me to make more nuanced criticisms (or to find a fixable error in my approach, which would of course be the ideal result)?
I'd strongly suggest enabling log_checkpoints - that'll log information about every checkpoint, allowing at least some diagnosis.
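For instance (ALTER SYSTEM exists from 9.4 on; on older versions set it in postgresql.conf instead):

    -- log_checkpoints only needs a configuration reload, not a restart.
    ALTER SYSTEM SET log_checkpoints = on;
    SELECT pg_reload_conf();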
Another useful thing would be to run pg_xlogdump --stats; that'd give you information about what takes up space in your WAL.
I'd also look at EXPLAIN (ANALYZE, BUFFERS) one_of_your_updates; - that'll tell you a bit more about the timing, including trigger runtime.
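Something along these lines, with one of your real statements substituted in (table, column and WHERE clause here are placeholders):

    -- BUFFERS adds how many pages were hit/read/dirtied/written; the plan
    -- also shows per-trigger timing, e.g. for foreign key checks.
    BEGIN;
    EXPLAIN (ANALYZE, BUFFERS)
    UPDATE my_table SET some_column = some_column + 1 WHERE id = 42;
    ROLLBACK;  -- EXPLAIN ANALYZE really executes the UPDATE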