Because compilers have to work for any program, they tend toward the lowest common denominator, while you, the programmer, can design for a particular use case. Take hot loops: at best, a compiler could learn how hot a loop actually is from a runtime profile and then decide that, yes, in this case it really is best to inline something. But a compile -> link -> run (profiled) -> compile -> link workflow is too bothersome and slow, so most people don't do it, and it's much more manageable to sit down and turn on your brain while programming. This stance of "no premature optimisation" has gone overboard IMO.
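To make that concrete, here's a rough sketch of the "just tell the compiler" route, as opposed to the profile-guided route (with GCC that would be roughly a build with `-fprofile-generate`, a profiled run, then a rebuild with `-fprofile-use`). The names `FORCE_INLINE`, `scale_and_offset` and `sum_transformed` are made up for illustration, and the `always_inline` attribute is a GCC/Clang extension, so other compilers only get a plain `inline` hint here. An optimizing compiler may well inline a helper this small on its own; the point is only that you can state what you already know about the hot path.

```c
#include <stddef.h>

/* Assumption: GCC or Clang, which accept the always_inline attribute.
 * Other compilers fall back to an ordinary inline hint. */
#if defined(__GNUC__) || defined(__clang__)
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
#define FORCE_INLINE static inline
#endif

/* Hypothetical helper that is called once per element in a hot loop. */
FORCE_INLINE float scale_and_offset(float x, float scale, float offset)
{
    return x * scale + offset;
}

/* Hypothetical hot loop: the programmer knows it runs over large arrays,
 * so the helper is marked for inlining instead of waiting for a profile. */
float sum_transformed(const float *values, size_t count,
                      float scale, float offset)
{
    float total = 0.0f;
    for (size_t i = 0; i < count; ++i) {
        total += scale_and_offset(values[i], scale, offset);
    }
    return total;
}
```

Whether forcing the inline actually helps still depends on the compiler and target, so it's worth measuring before and after.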