Man, hold your horses. Assume you have 64GB of memory devoted to binary tree (of...

codebje · on Sept 3, 2019

L1 cache lines are typically 64 bytes long, and you're only reading 16 bytes of it per level of the tree. And a random pointer tree traversal is the worst case for CPU branch prediction, too.

There are methods for optimising the use of various tiers of cache, trying to pack multiple nodes into one 64 byte line, arranging nodes such that left and right children are usually adjacent in memory so prediction losses are minimised, and other tricks.

wahern · on Sept 3, 2019

> When doing recursion you just need to keep pointer to the level, so this is 32 times pointer or just about 256 bytes of your cache.

What language do you use where a recursive function invocation only requires a call frame large enough to hold a single parameter? By what magic does the function know where to return? And what are you doing in that function that doesn't require any temporaries whatsoever?

At best you're looking at a couple of kilobytes. For all but the most constrained environments (toaster, a 4Kb kernel thread stack, etc), this isn't really worth worrying about. And compared to the pointer chasing, all those pushes and pops aren't much of a problem, either.

However, in a language like C without function closures, non-recursive traversal can keep code much more concise and clear. This is the real win, IMO.

rrobukef · on Sept 3, 2019

You could simulate the recursion with an iteration, a depth byte and a stack-array of 32 pointers (64 for overkill). This removes the frame pointer and makes the temporaries clear too.

wahern · on Sept 3, 2019

But that's not recursion, which pretty much by definition implies the use of repeated invocations of the same function(s). Explicitly building a stack is exactly what the article was discussing, though it's far more clever than the obvious approach. I've seen AVL implementations that use an explicit array for maintaining a stack in lieu of recursion or parent pointers.

In any event, you always need a stack of some sort. Recursion is one way to accomplish that, but saying that you're simulating recursion by explicitly building a stack reverses the categories.

signa11 · on Sept 3, 2019

well, these are dynamic structures which are constructed over a period of time with pointers pointing all over the place...imagine creating a singly-linked-list of 20/30m integers, and then running a trivial sum/accumulate over the whole thing vs doing the same thing with a vector/array.