
You possibly need some background concepts.

Compilers work by taking a (computer-)linguistic analysis of the source code text, the syntax tree, and converting it into an intermediate representation (IR), which is then progressively lowered to raw machine code by a series of transformations.

Such IRs are generally structured as a connected graph of basic blocks (BBs). A basic block has a formal structure: control enters only at its top and leaves only at its bottom, with no branches into or out of its interior. A perhaps simplistic way to visualize basic blocks is as the polygons on a flowchart.

So the article, which assumes that you know this background, is showing what kinds of basic blocks result in Clang's intermediate representation, before any lowering or optimization, from the syntax tree. It skips over the part where the syntax tree is constructed from the source code shown.

In many compilers, you don't get to see the intermediate representation. In compilers like OpenWatcom C/C++ and GCC you can always go and look at the source code to see what the IR data structures are, but the IR itself lives in memory (or internal temporary files) and is not accessible outwith the compiler programs themselves. Clang's design involves a publicly specified intermediate representation with documented serialization formats, so it can be written to a file and then read by other, potentially third-party, tools.

* https://llvm.org/



It's not as clean as clang/llvm, but you can see the intermediate steps of GCC with the -fdump-tree-all and -fdump-rtl-all options. The GIMPLE IR is not so hard to follow (not as hardcore as the gcc docs...). It can be interesting to see the effect of front- or back-end compiler options on your code and to track down how an optimization works.

You can also write compiler plugins in GCC. Not as fun as clang/llvm, but manageable.


> A connected graph of basic blocks (BBs) is generally how such IRs work

I was wondering: when is this graph actually built? Is it before creating the IR or after? The way I see it, you can generate IR without worrying about the control-flow graph, then build the CFG from the IR in order to perform common optimizations or register allocation. Is that correct?


In LLVM, a function is a linked list of BBs, each of which is a linked list of instructions. The final instruction in each basic block is a special kind of instruction (a terminator) that includes a list of all possible successor BBs.

In effect, the IR is exactly a control-flow graph represented in adjacency list form, so it's not possible to construct the IR without constructing the control flow graph. You could theoretically write the IR in textual form, but that's definitely not the common case, and you generally need to construct the CFG anyways to properly emit the IR (particularly since adding the phi nodes for SSA requires knowing the predecessors of every basic block).
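As a sketch (hand-written, roughly what Clang emits for a simple absolute-value function once mem2reg has run; the names are mine), the textual IR makes the adjacency-list structure visible: every block ends in a terminator naming its successors, and the phi node in the join block names its predecessors:

```llvm
define i32 @abs_val(i32 %x) {
entry:
  %isneg = icmp slt i32 %x, 0
  br i1 %isneg, label %neg, label %done   ; terminator lists successor BBs

neg:
  %negx = sub i32 0, %x
  br label %done                          ; unconditional terminator

done:                                     ; preds = %entry, %neg
  %r = phi i32 [ %x, %entry ], [ %negx, %neg ]
  ret i32 %r
}
```

The phi node is exactly where the predecessor information becomes load-bearing: it must name every incoming edge, so the CFG has to exist before SSA form can be finalized.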



