
That gp3 volume is extremely slow compared to a $100 NVMe drive. If each txn does a heap update, index update, WAL write, and heap read, that's 4 IOs per txn right there (well, not for sequential IDs, because you don't need to flush the heap/index pages on every update). The volume gets 16k IOPS max, so the 2600-3400 txn/s result is somewhat close to its capabilities assuming multiple IOs per txn. It's a little hard to find hard numbers, but gp3 latency is approximately 1 ms? That's going to limit you on WAL writes since they're synchronous. An NVMe drive that does, say, 20k read and 50k write IOPS at qd1 has 50 us read / 20 us write latency, and a database should be more of a qd32 workload, so hundreds of thousands to millions of IOPS.
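Back-of-envelope version of those ceilings, in Python (using the numbers assumed above, not measurements):

  iops_limit = 16_000      # gp3 max provisioned IOPS
  ios_per_txn = 4          # heap update + index update + WAL write + heap read
  print(iops_limit / ios_per_txn)    # ~4000 txn/s ceiling from IOPS alone

  gp3_write_latency = 1e-3           # ~1 ms, rough guess
  print(1 / gp3_write_latency)       # ~1000 synchronous WAL flushes/s per connection (ignoring group commit)

  nvme_write_latency = 20e-6         # ~20 us at qd1
  print(1 / nvme_write_latency)      # ~50,000 flushes/s per connection on the NVMe drive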

It's a single core, so no parallelism in the db itself, and it has a fraction of the RAM my phone has, so the slow IO is even more pronounced.

The basic implications of different keys and the detailed look at the cache internals are valid and interesting, but the hardware is nothing like a server you'd want to run a database on, so the benchmark numbers themselves aren't very interesting. An iPhone is probably beefier in every way.




agree - network storage is slower than local NVMe - the choice was intentional for two reasons:

1) the percentages would be different but the basic implications should hold true even with NVMe and 96 cores, as long as we scaled up the data size and workload

2) in addition to making it a bit easier to demonstrate what we'd expect to see (there's not really anything surprising here), i chose this setup because it's cheap, so anyone else could replicate the exact results or play around with the scripts and try variations without having to spend much money. for example, someone on twitter was curious about uuidv7 in a text field - it would be easy & cheap to try that out and see what happens - you could also go to bigger hardware and local NVMe, or change the client and row counts
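a rough sketch of what that text-column variation might look like (just an illustration, not one of the post's scripts - assumes psycopg2 and the third-party uuid6 package for uuid7() generation):

  import psycopg2
  from uuid6 import uuid7                  # pip install psycopg2-binary uuid6

  conn = psycopg2.connect("dbname=test")   # placeholder connection string
  cur = conn.cursor()
  cur.execute("create table if not exists users_text (id text primary key, payload text)")
  for _ in range(10_000):
      # store the uuidv7 as text instead of the native uuid type
      cur.execute("insert into users_text values (%s, %s)", (str(uuid7()), "x" * 100))
  conn.commit()
  cur.execute("select pg_size_pretty(pg_total_relation_size('users_text'))")
  print(cur.fetchone()[0])                 # compare against the uuid-typed table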

20 years ago when i wanted to benchmark oracle RAC, i had to go out and buy dual-attach firewire drives, and that was a hack because who wants to spend their personal vacation money on an old EMC CLARiiON storage array from eBay [i might have personally bought an old sun server or two though!]

size results should be independent of hardware setup, but the perf results are specific to this setup, which is why the post includes detailed specs and scripts for transparency

also, FWIW, most production databases these days run with some kind of high availability, which puts the network in the persistence path anyway - so even when the database is on local NVMe, it's not uncommon to have a hot standby or patroni or something with sync replication
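e.g. with a plain postgres sync standby, the relevant settings look roughly like this (illustrative values, not the benchmark config):

  # postgresql.conf on the primary
  synchronous_standby_names = 'ANY 1 (standby1)'   # commit waits for at least one listed standby
  synchronous_commit = on                          # each commit's WAL must be acked over the network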


Yeah, the high-level information is good, and the buffer cache analysis is super neat. I haven't seen that kind of thing elsewhere. It's a great article for explaining why performance differences exist. My list of gripes is probably more about Amazon marketing suggesting that something is big or high-performance or scalable when it's... not.

If you're something like a bank, you need synchronous replication, but a lot of use cases would probably be fine with async and a couple ms of RPO. Then again, most people probably don't need more than a few thousand writes/second anyway. As for banks, I worked on storage arrays at IBM ~10 years ago, and I think our synchronous replication was sub-100 us, but I can't remember anymore.



