Not trying to doubt your experience, just collecting datapoints for improvements. I'd appreciate if you could describe what your problems around write amplification are. Is it
This is a great question. I have to say I haven't profiled Postgres to that level of granularity, so I am making an assumption about why my updates are slow. What I notice is relatively slow UPDATE performance, with high total disk writes and a write-ahead-log that seems to be large. The bloat on disk isn't an issue for me (or, rather, I have started using lower fillfactors to reduce the variability in DB size as the rows are updated).
Do you have a favored resource that you can direct me to that would help me identify the factors that you cited and ultimately allow me to make more nuanced criticisms (or to find a fixable error in my approach, which would of course be the ideal result)?
> I have to say I haven't profiled Postgres to that level of granularity, so I am making an assumption about why my updates are slow. What I notice is relatively slow UPDATE performance, with high total disk writes and a write-ahead-log that seems to be large.
Hm, that could be quite independent things. Could you check what your checkpoint_segments (< 9.5)/max_wal_size & checkpoint_timeout settings are? Having very frequent checkpoints will trigger a lot of superfluous writes; both the checkpointing itself and increased percentage of full page writes (to prevent torn pages).
When you say your write-ahead-log is large, do you mean the pg_xlog directory, or the amount of writes to it?
Another common reason for slowness can be foreign keys, if you've not indexed both sides of it. E.g. if there's a foreign key referencing the table you're updating, and the referencing columns aren't indexed, you'll end up with seqscans...
> Do you have a favored resource that you can direct me to that would help me identify the factors that you cited and ultimately allow me to make more nuanced criticisms (or to find a fixable error in my approach, which would of course be the ideal result)?
I'd strongly suggest enabling log_checkpoints - that'll log information about every checkpoint, allowing at least some diagnosis.
Another useful thing would be to run pg_xlogdump --stats, that'd give you information about what takes up space in your WAL.
I'd also look at EXPLAIN (ANALYZE, BUFFERS) one_of_your_updates; - that'll tell you a bit more about the timing, including trigger runtime.
- heap bloat
- index bloat
- CPU overhead of index manipulations
- Write-ahead-log overhead of updates
- increased overhead / frequency of VACUUM
- total amount of disk writes
Thanks!
Edit: formatting.