
What is the rationale behind this? My guess is that when a buffer reaches its limit, it's expected that it will need to grow by another factor of its current size, whereas a fixed-size increment doesn't account for previous or current growth needs.



This is a problem that occurs in lots of places, not just memory allocation. To pull an example I'm a little bit familiar with, check out CVE-2012-3444 from Django a couple years ago.

The tl;dr there is that the dimensions of an uploaded image file were determined by reading 1024 bytes at a time until enough information had been read to figure out the dimensions. That works on a lot of common stuff, but falls over hard on other files, since there are plenty of cases where you have to read the whole file to get the dimensions. At 1024 bytes per read operation, that can be a lot of operations, enough that it was a denial-of-service vector.

The solution was exactly what the linked blog post advocates: we switched from reading a fixed number of bytes on each operation to doubling the read size each time. The exponential growth of successive reads means the pathological case (having to read the whole file) becomes far less pathological.
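If it helps to see the shape of it, here's a rough Python sketch of that pattern (it uses PIL's ImageFile.Parser; the helper name and details are illustrative, not the exact Django code):

    # Sketch: start with a small read and double the chunk size each time the
    # parser still needs more data, so even a worst-case file only takes
    # O(log n) read calls instead of n/1024.
    from PIL import ImageFile

    def get_image_dimensions(fileobj, initial_chunk=1024):
        parser = ImageFile.Parser()
        chunk_size = initial_chunk
        while True:
            data = fileobj.read(chunk_size)
            if not data:
                return None  # ran out of data without finding the dimensions
            parser.feed(data)
            if parser.image:
                return parser.image.size
            chunk_size *= 2  # exponential growth keeps the call count logarithmic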

IIRC Python's dictionary implementation does a similar trick, doubling in size every time it needs to grow. Turning lots of small read/write/realloc (or whatever) operations into fewer, larger-each-time operations is almost always the right call, in my experience. When it's not done from the start, you'll usually notice sooner or later, when you start wondering why performance has degraded.


You're absolutely right, and too few realise this.

In particular, people doing IO with small buffers drive me crazy. People unfortunately don't seem to realise how expensive context switches are, and how brutal they are for throughput.

I've seen this in so many places. MySQL's client library used to be full of 4-byte reads (reading a length field, then doing a usually-larger-but-still-small read of the following data). I believe it's fixed, but I don't know when. I also remember with horror how t1lib, a reference library Adobe released in the '90s for reading Type 1 fonts, spent 90%+ of its time on the combination of malloc(4) and read(..., 4), for tables of a size known at the outset (basically a small table with one entry per glyph that stored a pointer to a 4-byte struct instead of storing the struct inline).

Currently I'm hacking on SDL_vnc now and again, and it's full of 1-16 byte reads. That seems to make sense at first glance, since the size of each VNC protocol packet depends on the values of various fields, but on high-throughput networks or local connections the small reads/writes totally dominate overall throughput, even when the protocol overhead is a tiny percentage of the bitmap data being pushed.

Basically, pretty much anywhere you want to read less than 4K-16K (possibly more these days), it's better to do the buffering in your app and do non-blocking reads, so you can read blocks that are as large as possible at the time...
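For illustration, here's a rough Python sketch of what I mean by app-level buffering (a simplified blocking version; the names are made up, not from SDL_vnc): pull data off the socket in big blocks and serve the tiny protocol fields from memory.

    # Sketch: issue large recv() calls and hand out small protocol fields from
    # an in-memory buffer, so parsing a 2-byte length plus its payload doesn't
    # cost two syscalls.
    class BufferedReader:
        def __init__(self, sock, block_size=64 * 1024):
            self.sock = sock
            self.block_size = block_size
            self.buf = b""

        def read_exact(self, n):
            # Refill the buffer with large reads; small reads never hit the socket.
            while len(self.buf) < n:
                chunk = self.sock.recv(max(self.block_size, n - len(self.buf)))
                if not chunk:
                    raise EOFError("connection closed")
                self.buf += chunk
            out, self.buf = self.buf[:n], self.buf[n:]
            return out

    # Usage: length = int.from_bytes(r.read_exact(2), "big"); payload = r.read_exact(length)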

But the general problem is not paying attention to the number of system calls. Not paying attention to stat()/fstat()/lstat() etc. is another common one (common culprit: Apache. If you use the typical default options for a directory, Apache is forced to stat its way up the directory tree; it's easy to fix, but most people don't seem to be aware how much it affects performance).


Let's say you start with a small array, then add n elements, one by one. When you run out of space in the underlying array, you allocate a new array, copy all the elements over, then free the old one.

If your new array is always a constant amount bigger than the old one, then the total amount of work you do is O(n^2). If your new array is a constant factor bigger than the old one, the total amount of work is O(n), because the successive copies form a geometric series that sums to a constant multiple of n.

This is ignoring memory allocator costs, which can change things a bit.
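For what it's worth, here's a quick sketch that makes the difference concrete (counting copied elements, starting from capacity 1, with allocator costs ignored as noted above):

    # Count how many elements get copied while appending n items under each
    # growth policy.
    def copies(n, grow):
        cap, size, copied = 1, 0, 0
        for _ in range(n):
            if size == cap:
                copied += size       # reallocation copies every existing element
                cap = grow(cap)
            size += 1
        return copied

    n = 100_000
    print(copies(n, lambda c: c + 1024))  # fixed increment: ~n^2 / (2 * 1024) copies
    print(copies(n, lambda c: c * 2))     # doubling: fewer than 2n copies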


The article states that growing by a fixed size every time creates a lot more memory churn and moving of bits around: the operation is to allocate a new block of memory and move the existing collection into it. Doing that a few times isn't too expensive; doing it for every small increment adds up quickly.



