It definitely doesn't work if you don't change your in-memory data representation, and it requires you to change it in a way that forecloses the possibility of FP-persistent tree structures or efficient DAGs. Compared to that, representing your ordered-tree nodes as cons chains is a barely-significant change.
It does generalize, though, to ordered-tree nodes that aren't represented as cons chains. You just have to walk through the n child pointers when you're moving back up the tree to search for the one you're returning from, adding an extra runtime cost factor proportional to n but no extra memory. Moreover, if you devote three or four registers to an amnesic stack, you can avoid that cost for the bottom two or three levels of the tree, which will necessarily contain the vast majority of tree nodes.
It does generalize, though, to ordered-tree nodes that aren't represented as cons chains. You just have to walk through the n child pointers when you're moving back up the tree to search for the one you're returning from, adding an extra runtime cost factor proportional to n but no extra memory. Moreover, if you devote three or four registers to an amnesic stack, you can avoid that cost for the bottom two or three levels of the tree, which will necessarily contain the vast majority of tree nodes.