You can implement those as part of a linking phase during bytecode generation: emit a placeholder value (e.g. 0) for the jump address, and when you've finished compiling the block, go back and fill in the placeholder with the correct address/offset.
That's relatively easy when implementing an assembler for your opcodes. Just keep track of symbolic labels and their associated jump points (as a simple array or linked list) and process (i.e. finalize or "link") the jump points when the label address becomes known. My "assemblers" often have constructs like:
L0
...
J1
...
J0
...
L1
where L? registers a symbolic jump destination (i.e. x.label[0].offset = label_offset) and J? emits an unconditional jump and registers a link request (i.e. push(x.label[1].from, jump_opcode_offset)). When a block is finished all the offsets are known; you just process things like
for (i = 0; i < x.nlabel; i++) {
    for (j = 0; j < x.label[i].nfrom; j++) {
        patch_in_offset(x.label[i].from[j], x.label[i].offset)
    }
}
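The whole scheme can be sketched as a small self-contained program. This is only an illustration of the technique described above; the names (Label, emit_jump, define_label, link_label) and the opcode layout are made up, not from any real VM:

```c
#include <stdint.h>

#define MAXCODE 256
#define MAXFROM 8

typedef struct {
    int offset;          /* label's position in code; -1 until defined */
    int from[MAXFROM];   /* offsets of jump operands waiting to be patched */
    int nfrom;
} Label;

static int32_t code[MAXCODE];
static int ncode = 0;

enum { OP_JUMP = 1, OP_NOP = 2, OP_HALT = 3 };

static void emit(int32_t op) { code[ncode++] = op; }

/* J?: emit a jump with a placeholder operand and register a link request */
static void emit_jump(Label *l) {
    emit(OP_JUMP);
    l->from[l->nfrom++] = ncode;  /* remember where the operand lives */
    emit(0);                      /* placeholder, patched later */
}

/* L?: the label's address is now known */
static void define_label(Label *l) { l->offset = ncode; }

/* the linking pass: fill in every recorded placeholder */
static void link_label(Label *l) {
    for (int j = 0; j < l->nfrom; j++)
        code[l->from[j]] = l->offset;
}
```

Compiling `J0 ... L0 ...` is then just emit_jump, emit the body, define_label, and one link pass at the end of the block.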
Knowing when to emit symbolic label and jump instructions from the AST is a little more involved, but no more than analyzing the AST for anything else.
Supporting computed gotos would require much more bookkeeping, I'd imagine, and I'm not surprised few languages support that construct. Or maybe not... I haven't really thought it through.
One cool thing about this whole exercise is that it helps to demonstrate why generating some intermediate representation can be easier (conceptually and mechanically) than directly generating runnable code in a single pass. It seems more complex but it really makes things easier.
Computed gotos defer the search for the label from compile time to run time. So you need to keep the targets in a table (e.g. a hash from label value to offset), look the destination up at run time, and then jump to it.
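A minimal sketch of that run-time lookup, assuming the compiler has recorded every possible target's id and offset in a table (a real VM would hash; a linear scan keeps the illustration short, and the names here are hypothetical):

```c
/* one entry per label the computed goto could land on */
typedef struct { int id; int offset; } LabelEntry;

/* resolve a computed-goto target at run time */
static int resolve_target(const LabelEntry *tab, int n, int id) {
    for (int i = 0; i < n; i++)   /* a real VM would use a hash table */
        if (tab[i].id == id)
            return tab[i].offset;
    return -1;                    /* unknown label: a run-time error */
}
```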
It all depends how your interpreter PC (program counter) works. Some have just opcode offsets (like a real PC such as %eip; e.g. lua, ruby, ...), some have opcode addresses (perl, lisp, ...).
With offsets you can encode jumps relative, which is a huge advantage. With addresses you have to stay absolute, which means each operand needs a full pointer (bad for the cache), and you cannot move the code around for optimizations afterwards.
With offsets you have a cache-friendly op-array; with addresses you have a full-blown tree (AST) or linked list, with its cache-unfriendly pointer-chasing overhead.
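The offset-based style can be sketched as a flat array plus an integer PC, with jumps encoded relative so the whole program is relocatable. The opcode names here are invented for the example:

```c
enum { OP_PUSH, OP_JMP, OP_END };

/* a tiny offset-based interpreter: pc is just an index into code[] */
static long run(const long *code) {
    long acc = 0;
    int pc = 0;
    for (;;) {
        switch (code[pc]) {
        case OP_PUSH: acc += code[pc + 1]; pc += 2; break;
        case OP_JMP:  pc += (int)code[pc + 1]; break;  /* relative jump */
        case OP_END:  return acc;
        }
    }
}
```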
But with full addresses you can avoid the big switch-loop overhead in an interpreter: you just pass along the next pointer instead of incrementing the PC. But then it gets interesting how to keep track of state like the stack depth.
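The address-based style without a central switch loop can be sketched with function pointers, where every handler dispatches directly to the next instruction (a hedged illustration; the struct and handler names are made up, and production VMs often use computed goto or tail calls instead):

```c
#include <stddef.h>

typedef struct Insn Insn;
typedef void (*Handler)(const Insn *pc, long *acc);

struct Insn {
    Handler fn;
    long arg;
    const Insn *next;   /* absolute pointer: no PC increment needed */
};

static void op_add(const Insn *pc, long *acc) {
    *acc += pc->arg;
    pc->next->fn(pc->next, acc);  /* dispatch straight to the next op */
}

static void op_halt(const Insn *pc, long *acc) {
    (void)pc; (void)acc;          /* nothing to do: execution ends here */
}
```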