>the entire lookup table fits in L1 cache you can treat KIND_TABLE[id] as only t...

pmalynin · on Dec 31, 2016

Actually, on Intel CPUs, ALU operations take 1/3 of a cycle, and the processor can schedule them back to back.

Coding_Cat · on Dec 31, 2016

AFAIK this is not quite right. They take only 1/4 of a cycle on average, but that is a throughput figure that comes from pipelining. If an instruction depends on the result of an ALU operation, you still have to wait the full latency (1 cycle) before you can continue.

This makes sense, as the clock is also somewhat of the 'driving force' for pushing signals through the chip from one part to another. (Some architectures have 'zero-cost' operations, I believe, but these are usually baked into the pipeline and have to be turned on or off depending on need.)
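
To make the latency-versus-throughput distinction above concrete, here is a minimal sketch in C. The function names (`sum_dependent`, `sum_independent`, `self_check`) are hypothetical and not from the thread. Both functions compute the same sum; the first forms a single dependency chain, so each add has to wait for the previous one, while the second keeps four independent chains that a superscalar core may overlap. Whether you see a measurable difference depends on the compiler (which may unroll or vectorize either loop) and on whether memory loads dominate, so treat this as an illustration rather than a benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One accumulator: every add needs the previous add's result, so the
   loop is bounded by ALU latency (roughly one cycle per add). */
uint64_t sum_dependent(const uint64_t *data, size_t n) {
    uint64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += data[i];
    }
    return acc;
}

/* Four accumulators: the four adds in each iteration do not depend on
   one another, so the core may issue several of them in the same cycle. */
uint64_t sum_independent(const uint64_t *data, size_t n) {
    uint64_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += data[i];
        a1 += data[i + 1];
        a2 += data[i + 2];
        a3 += data[i + 3];
    }
    /* Handle the leftover elements when n is not a multiple of 4. */
    for (; i < n; i++) {
        a0 += data[i];
    }
    /* Unsigned addition wraps, so regrouping does not change the result. */
    return (a0 + a1) + (a2 + a3);
}

/* Small sanity check: both versions must agree on the result. */
void self_check(void) {
    const uint64_t data[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    assert(sum_dependent(data, 10) == 55);
    assert(sum_independent(data, 10) == 55);
}
```

Timing the two functions on a large array with optimizations that do not auto-vectorize is one way to observe the effect, though the exact ratio will vary by microarchitecture.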