1. Storing nodes in a resizable array means as the input program grows the compiler will need a larger and larger block of contiguous memory (which may or may not be available). You could work around this by allocating page-sized blocks to pool from.
2. Care needs to be taken with how AST nodes are represented in code. For example, using a union type to store nodes is self-defeating as a union type is as large as its largest member and since not all AST nodes are equal size, this means small AST nodes will be pointlessly padded to account for the size of the largest AST node.
Sometimes I wish we didn't end up with flat address spaces. The "virtual memory" technique allows us to stitch fragmented regions of physical memory into a contiguous region of virtual memory and if virtual address space was segmented, it simply would not ever become fragmented; any memory region could always be grown in-place, without intersecting with any other region, so realloc() implementations could just drop their memcopy() paths... oh well.
Can be mitigated by a rope(?) structure (i.e. the vector is split into chunks of fixed size, no re-allocation required) with real pointers. It will be unsafe inside, but safe (read-only) interface seems to be possible.
Deallocation will be O(n), but still much faster than a tree.
1. Storing nodes in a resizable array means as the input program grows the compiler will need a larger and larger block of contiguous memory (which may or may not be available). You could work around this by allocating page-sized blocks to pool from.
2. Care needs to be taken with how AST nodes are represented in code. For example, using a union type to store nodes is self-defeating as a union type is as large as its largest member and since not all AST nodes are equal size, this means small AST nodes will be pointlessly padded to account for the size of the largest AST node.