Well, yeah. Disk cache can take hundreds of ms to retrieve, even on modern SSDs. I had a handful of oddly heated discussions with an architect about this exact thing at my previous job. Showing him the network tab did not help, because he had read articles and was well informed about these things.
At a previous job I worked on serving data from SSDs. I wasn't really involved in configuring the hardware but I believe they were good quality enterprise-grade SSDs. My experience was that a random read (which could be a small number of underlying reads) from mmap()'ed files from those SSDs took between 100 and 200 microseconds. That's far from your figure of hundreds of milliseconds.
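For anyone who wants to check a figure like this on their own hardware, here is a rough sketch of how random reads from an mmap()'ed file could be timed. The function name `time_random_reads` and its defaults are made up for illustration. The big caveat is that once a page is in the OS page cache you are timing RAM, not the SSD, so the file has to be much larger than memory (or the cache dropped first) for the numbers to mean anything.

```python
import mmap
import os
import random
import time


def time_random_reads(path, n_reads=1000, read_size=4096):
    """Return per-read latencies in microseconds for random reads from an mmap'ed file.

    Hypothetical helper for illustration. It cannot tell a page-cache hit
    from a real device read, so interpret the results with care.
    """
    size = os.path.getsize(path)
    latencies = []
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for _ in range(n_reads):
            # Pick a random offset that leaves room for a full read where possible.
            offset = random.randrange(0, max(1, size - read_size))
            start = time.perf_counter()
            _ = mm[offset:offset + read_size]  # slicing copies the bytes, touching the pages
            latencies.append((time.perf_counter() - start) * 1e6)
    return latencies
```

Sorting the returned list and looking at the median and the 99th percentile is usually more informative than the mean, since a few slow reads dominate the average.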
Of course, 200 microseconds still isn't fast. With reads issued one at a time, that translates to serving 5000 requests per second, leaving the CPU almost completely idle.
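The arithmetic behind that number, as a small sketch. The `in_flight` parameter is an assumption added here to show what overlapping reads would buy in the ideal case. Real devices stop scaling linearly well before high queue depths, so treat anything beyond `in_flight=1` as an upper bound.

```python
def reads_per_second(latency_us: float, in_flight: int = 1) -> float:
    """Throughput when each read takes latency_us and in_flight reads overlap.

    Assumes latency stays constant as concurrency grows, which is an
    idealisation rather than a property of any real SSD.
    """
    return in_flight * 1_000_000 / latency_us


print(reads_per_second(200))      # 5000.0
print(reads_per_second(100))      # 10000.0
print(reads_per_second(200, 32))  # 160000.0
```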
Another odd fact was that we did in fact have to implement our own semaphores and throttling to limit concurrent reads from the SSDs.
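This is not the code we had, just a minimal sketch of the general idea: a semaphore caps how many reads are in flight at once, however many worker threads are asking. The limit of 8 and the names `throttled_read` and `read_many` are invented for the example. The right limit would have to come from measurement on the actual device.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_READS = 8  # hypothetical limit; tune by measuring the device

# Each read must hold one slot, so at most MAX_CONCURRENT_READS run at once.
_read_slots = threading.BoundedSemaphore(MAX_CONCURRENT_READS)


def throttled_read(path: str, offset: int, length: int) -> bytes:
    """Read up to `length` bytes at `offset`, waiting for a free read slot first."""
    with _read_slots:
        with open(path, "rb") as f:
            f.seek(offset)
            return f.read(length)


def read_many(path, requests, workers=64):
    """Serve (offset, length) requests from many threads, throttled by the semaphore."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: throttled_read(path, *r), requests))
```

With 64 workers and 8 slots, 56 threads sit blocked on the semaphore at any moment instead of piling more requests onto the device queue.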
(Anyone who is not running their OS and temp space on NVMe should not expect good performance. Such a configuration has been very cheap for several years now.)
> Such a configuration has been very cheap for several years now.
This is a very weird comment, considering that a) NVMe is cheaper than it used to be, but SATA SSDs (and even modern magnetic HDDs) are still sold and in active use, and b) it ignores phones completely. A large number of sites have mobile-dominated visitors, who can't just switch to NVMe-like performance even with a large disposable income, because at the end of the day even phones with UFS storage are still slower than NVMe latency-wise.
The issue has nothing to do with disk speed. If you had read the article you'd see a very nice chart that shows the vast majority of cache hits returning in under 2 or 3ms.
I wish I had a clearer memory or record of this, but I think I’ve also seen ~100ms for browser cache retrieval on an SSD. Has anyone else observed this, and is there an explanation? A sibling comment points out that SSD read latency should be ~10ms at most, so the limitation must be in the software?
OP mentioned specifically that “there have been bugs in Chromium with request prioritisation, where cached resources were delayed while the browser fetched higher priority requests over the network” and that “Chrome actively throttles requests, including those to cached resources, to reduce I/O contention”. I wonder if there are also other limitations with how browsers retrieve from cache.