
Summary of the Stroustrup video: inserting or removing an item at a point in a 100k-item sequence needs a linear scan from the start of the data structure to get to that point. An array does that very quickly; a linked list is much slower, with lots of random memory indirection and a pointer for every list node blowing out the cache. In practice this linear scan to the change point dominates the runtime, and arrays come out much faster.

While the array change does need, on average, 50k items to be shuffled up or down to make room or close the gap, modern caches are very good at that.

If the array is sorted it can be binary searched to get to the change point, which improves its performance even more; linked lists can’t do that.

Interesting.
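
A minimal sketch of the experiment as summarized above (not Stroustrup's actual code: the 100k element count and the linear scan to the change point come from the summary, everything else, including the timing harness, is an assumption). Expect the list run to take dramatically longer:

    // Insert N random values into a sorted sequence, finding the
    // insertion point by linear scan in both containers, then erase
    // elements at random positions until the sequence is empty.
    #include <chrono>
    #include <cstdio>
    #include <iterator>
    #include <list>
    #include <random>
    #include <vector>

    template <typename Seq>
    long long run(int n, unsigned seed) {
        std::mt19937 gen(seed);
        auto start = std::chrono::steady_clock::now();

        Seq seq;
        for (int i = 0; i < n; ++i) {
            int v = static_cast<int>(gen());
            auto it = seq.begin();
            while (it != seq.end() && *it < v) ++it;  // linear scan to the point
            seq.insert(it, v);
        }
        for (int i = n; i > 0; --i) {
            std::uniform_int_distribution<int> pos(0, i - 1);
            auto it = seq.begin();
            std::advance(it, pos(gen));  // O(1) jump for vector, O(n) walk for list
            seq.erase(it);
        }

        auto end = std::chrono::steady_clock::now();
        return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
    }

    int main() {
        const int n = 100000;
        std::printf("vector: %lld ms\n", run<std::vector<int>>(n, 42));
        std::printf("list:   %lld ms\n", run<std::list<int>>(n, 42));
    }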




I can definitely see array scanning being faster than pointer chasing on a modern CPU, but I wouldn't have expected that to survive an insertion that moves 50k items. Wow!

And if you're not doing ordered insertion you wouldn't have to move the data in the array anyway; you would keep track of the size and jump to the end, so I'm not sure I understand the binary search comment.

The next question is at what level of growth the waste of empty space in the array becomes too much. Some kind of data structure (tree/linked list) with largish chunks (whatever size suits a modern CPU), as probably mentioned in other comments, does seem like the most versatile approach while keeping the performance. Or perhaps the handling of that data structure might overwhelm the array advantage?
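
One way to realize the chunked structure speculated about above is an "unrolled" linked list, whose nodes each hold a small array of elements, so scans stay cache-friendly and inserts only shuffle within one chunk (std::deque is the standard library's closest relative). A hypothetical sketch, with the chunk size B picked arbitrarily:

    #include <array>
    #include <cstddef>
    #include <memory>

    template <typename T, std::size_t B = 64>  // B would be tuned to the cache
    struct UnrolledList {
        struct Chunk {
            std::array<T, B> items;
            std::size_t count = 0;
            std::unique_ptr<Chunk> next;
        };
        std::unique_ptr<Chunk> head;

        // Naive append: walk to the tail chunk and allocate a new one
        // only when it is full, so there is one pointer per B elements
        // rather than one per element.
        void push_back(const T& value) {
            if (!head) head = std::make_unique<Chunk>();
            Chunk* c = head.get();
            while (c->next) c = c->next.get();
            if (c->count == B) {
                c->next = std::make_unique<Chunk>();
                c = c->next.get();
            }
            c->items[c->count++] = value;
        }
    };

    int main() {
        UnrolledList<int> ul;
        for (int i = 0; i < 200; ++i) ul.push_back(i);  // spans four chunks
    }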


> > If the array is sorted it can be binary searched to get to the change point, which improves its performance even more; linked lists can’t do that.

> And if you're not doing ordered insertion you wouldn't have to move the data in the array anyway; you would keep track of the size and jump to the end, so I'm not sure I understand the binary search comment.

It just means that if I want to view the nth element of an array, that's a constant-time operation: I just take the pointer and add n times the size of the elements.

But for a linked list if I want to view the nth element of the list, I have to view the (n-1)th element first, all the way back until the first element I have a reference to.
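
In code, the contrast looks something like this (a sketch, using std::vector and std::list as stand-ins):

    #include <cstddef>
    #include <list>
    #include <vector>

    int nth_of_array(const std::vector<int>& v, std::size_t n) {
        return v[n];  // one multiply-add and one load: O(1)
    }

    int nth_of_list(const std::list<int>& l, std::size_t n) {
        auto it = l.begin();
        for (std::size_t i = 0; i < n; ++i)
            ++it;     // follow n next-pointers, each a potential cache miss: O(n)
        return *it;
    }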

> The next question is at what level of growth the waste of empty space in the array becomes too much.

Everything I've tried, and everything I've seen from people testing on real hardware, shows that the gap in performance widens with larger values of n.

You might expect different performance as the system runs out of memory. But arrays have a size of n * element size, and linked lists have a size of n * (pointer(s) + element size), so the linked list would hit memory limitations more quickly regardless.
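
The per-element overhead is easy to see with sizeof; a doubly linked list node carries two pointers plus padding on top of each element (exact numbers depend on the ABI and allocator; this assumes a typical 64-bit platform):

    #include <cstdio>

    struct Node {      // the shape of a typical doubly linked list node
        Node* prev;
        Node* next;
        int   value;   // 4-byte payload...
    };                 // ...padded to 24 bytes: 6x the vector's 4 bytes per element

    int main() {
        std::printf("vector element: %zu bytes\n", sizeof(int));   // 4
        std::printf("list node:      %zu bytes\n", sizeof(Node));  // 24
    }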


But that memory usage isn't considering growth. If I grow by doubling the size (seems common), then at element n+1, where n is the last allocation, the array momentarily allocates 2n + n (keeping the original until the copy finishes) vs. the list's n * (pointer(s) + element size).
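
A small sketch of that transient cost: while a doubling vector grows past capacity, the old buffer (n) and the new one (2n) coexist until the copy finishes. Growth factors vary by implementation (libstdc++ doubles, MSVC grows by 1.5x), so the exact output is an assumption:

    #include <cstddef>
    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<int> v;
        std::size_t cap = v.capacity();
        for (int i = 0; i < 1000; ++i) {
            v.push_back(i);  // a push_back past capacity briefly holds both buffers
            if (v.capacity() != cap) {
                std::printf("size %zu: capacity %zu -> %zu\n",
                            v.size(), cap, v.capacity());
                cap = v.capacity();
            }
        }
    }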

But yes, the size advantage is much reduced for most use cases. What would you imagine are the cases where a linked list is still a valid data structure?


> If I grow by doubling the size (seems common)

True. Then it comes down to the size of your data vs the size of your pointers.

> What would you imagine are the cases where a linked list is still a valid data structure?

Compared to an array? When the array doesn't have the behavior that you need.

For instance, if you need to keep a stable reference to a particular element in a list, even while that list is being inserted into, then an array is not going to cut it. The linked list handles this case with ease.

That's not to say you can't write a wrapper for the array that does the right thing, probably for cheaper. But out of the box the linked list can do things that the array cannot and if you rely on them, then use the right tool for the job.
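
A minimal illustration of that stability guarantee: list iterators survive insertions elsewhere, while a vector insertion that reallocates invalidates every pointer and iterator into it:

    #include <cassert>
    #include <iterator>
    #include <list>
    #include <vector>

    int main() {
        std::list<int> l = {1, 2, 3};
        auto stable = std::next(l.begin());  // points at the 2
        l.insert(l.begin(), 0);              // insert somewhere else
        assert(*stable == 2);                // still valid, still the 2

        std::vector<int> v = {1, 2, 3};
        int* p = &v[1];
        v.insert(v.begin(), 0);              // may reallocate the buffer...
        (void)p;  // ...after which dereferencing p would be undefined behavior
    }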





