I don't think your characterization of the GIL is accurate. Show me ANY real-world program that achieves linear speedups on multicore or multiprocessor systems. We humans have not mastered multithreading well enough to make such a claim. I am not aware of any "CPU-bound" use case where someone would actually use Python like this instead of, say, C or Fortran. And anyway, I submit that such a program would benefit (from both a design and an execution standpoint) from being multi-process, in other words, from using explicitly coded communication between processes.
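
To make the multi-process point concrete, here is a minimal sketch of what I mean by explicitly coded communication, using the standard library's `multiprocessing` module. The names `worker` and `parallel_sum_of_squares` and the sum-of-squares workload are made up for illustration, and I am making no claim about the speedup you would actually see. Each process has its own interpreter (and its own GIL), and the only thing the processes share is the queue.

```python
import multiprocessing as mp


def worker(chunk, results):
    """Sum the squares of one chunk and send the result back explicitly."""
    results.put(sum(x * x for x in chunk))


def parallel_sum_of_squares(numbers, n_workers=4):
    """Split the work across processes; the queue is the only shared channel."""
    results = mp.Queue()
    # Strided slices give each process its own private share of the input.
    chunks = [numbers[i::n_workers] for i in range(n_workers)]
    procs = [mp.Process(target=worker, args=(chunk, results)) for chunk in chunks]
    for p in procs:
        p.start()
    # Collect one message per worker before joining, so no child blocks
    # on the queue while the parent is waiting for it to exit.
    total = sum(results.get() for _ in procs)
    for p in procs:
        p.join()
    return total


if __name__ == "__main__":
    # Hypothetical workload, chosen only to keep the example CPU-bound.
    data = list(range(1_000_000))
    print(parallel_sum_of_squares(data))
```

The design point is that every piece of data crossing a process boundary does so through a visible `put` or `get`, so there is no hidden shared state to reason about.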