You could simulate the recursion with a loop, a one-byte depth counter, and a stack array of 32 pointers (64 if you want overkill). That eliminates the per-call frame pointer and makes the temporaries explicit, too.
But that's not recursion, which pretty much by definition implies repeated invocations of the same function(s). Explicitly building a stack is exactly what the article was discussing, though the article's technique is far more clever than the obvious approach. I've seen AVL implementations that use an explicit array to maintain a stack in lieu of recursion or parent pointers.
In any event, you always need a stack of some sort. Recursion is one way to accomplish that, but saying that you're simulating recursion by explicitly building a stack reverses the categories.
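For anyone who wants to see the "obvious approach" spelled out, here is a minimal sketch in C of an in-order walk that uses a depth byte and a fixed array of 32 pointers instead of recursion or parent pointers. The `struct node` layout, the `inorder` name, and the `MAX_DEPTH` constant are my own placeholders, not anything from the article, and the code assumes the tree is balanced well enough that its height never exceeds the array size (it bails out with -1 if that assumption fails).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 32  /* assumed bound on tree height; 64 would be overkill */

/* Hypothetical node layout: no parent pointer. */
struct node {
    int key;
    struct node *left, *right;
};

/* In-order traversal with an explicit stack instead of recursion.
 * Stores at most cap keys in out and returns the number of nodes visited,
 * or -1 if the tree is deeper than MAX_DEPTH. */
long inorder(const struct node *root, int *out, size_t cap)
{
    const struct node *stack[MAX_DEPTH];  /* ancestors still to be visited */
    uint8_t depth = 0;                    /* the "depth byte" */
    long count = 0;
    const struct node *cur = root;

    assert(out != NULL || cap == 0);

    while (cur != NULL || depth > 0) {
        /* Descend left, remembering each ancestor on the explicit stack. */
        while (cur != NULL) {
            if (depth == MAX_DEPTH)
                return -1;                /* height assumption violated */
            stack[depth++] = cur;
            cur = cur->left;
        }
        /* Pop the nearest unvisited ancestor, visit it, then go right. */
        cur = stack[--depth];
        if ((size_t)count < cap)
            out[count] = cur->key;
        count++;
        cur = cur->right;
    }
    return count;
}
```

Calling `inorder(root, buf, n)` on a three-node tree with keys 2, 1, 3 (root, left, right) should fill `buf` with 1, 2, 3. The same array-plus-counter pattern is what an insert or delete would use to remember the path back up to the root for rebalancing.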