
It's worth watching the video in its entirety. That's not true: Python 3.6 beats 2.7 in most benchmarks. It's just that there are a few places where 3 is slower (like startup time) that make a lot of things look bad, but that don't actually affect "speed" as most people would consider it.



The startup-time fiasco was one of the reasons Java never took over for GUI apps. Even though it's unscientific, time to first usable interaction goes a long way toward establishing "speed" in the user's mind. You should see the hoops that Chrome jumps through to excel on this metric.


True, but:

  $ echo exit | time python2
gives an average time of 0.015s on my laptop, and

  $ echo exit | time python3
works out to about 0.039s. If you're spawning hundreds of processes from a looping shell script, the Python 3 overhead would probably start to grate. For anything launched manually, though, 40 milliseconds is still essentially instant.


And I would point out that on my machine (GCC 6.3-based x86-64 Linux) the times are much closer:

    ~ time python2 -c exit
    real    0m0.014s
    user    0m0.011s
    sys     0m0.004s
vs.

    ~ time python3 -c exit
    real    0m0.022s
    user    0m0.018s
    sys     0m0.004s
So for me it's more like an 8 ms difference in real time, and four milliseconds are spent in the system either way.

python2 seems to do about 700 system calls (!) to start up and exit immediately, including enumerating things like gtk-2 and wxwidgets, which boggles the mind, but it is what it is.

python3 seems to do about 470 system calls, far fewer but still bonkers to my mind. Also weird that they take the same ~4ms in the kernel given that python3 calls into the kernel so much less.


My times on x86-64 Linux are similar to yours, but on the Windows Subsystem for Linux (which emulates the Linux kernel) it's a different story:

  $ time python2 -c exit
  real    0m0.111s
  user    0m0.016s
  sys     0m0.078s

  $ time python3 -c exit
  real    0m0.063s
  user    0m0.016s
  sys     0m0.047s
Python 3 is consistently about twice as fast. Both seem slower than on native Linux, but this is on a small laptop, so the absolute measurements may not be comparable. This is on the Creators Update.


That's better yet - thanks for sharing! Most of my Python processes run for weeks at a go, so startup time isn't that important to me as long as it's not glacial. But 40ms (on my system) or 22ms (on yours) is totally acceptable for interactive shell usage.

Thanks for including the syscall counts. I know they'd been working to reduce that, but I hadn't seen how much progress they'd made yet. Are most of those to malloc() and open()?


It absolutely is. Consider that this means Python takes 2-3 frames (at a 60 Hz refresh rate) to start up. I'm not sure one could reasonably consider that "slow" for interactive use.
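A back-of-the-envelope check of the "2-3 frames" claim, assuming a 60 Hz display (the frame budget is an assumption, not from the thread):

```python
# At 60 Hz, one frame lasts 1000/60 ms.
FRAME_MS = 1000 / 60                 # ~16.7 ms per frame

print(40 / FRAME_MS)                 # a 40 ms startup spans ~2.4 frames
print(22 / FRAME_MS)                 # a 22 ms startup spans ~1.3 frames
```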


Mostly open, read, and fstat for python2:

     ~ strace python2 -c exit 2>&1 | sort | grep "^[a-z_]*[(]" | sed -e "s/^\([a-z_]*\)[(].*/\1/" | uniq -c | sort -rn | head -n 8
        199 open
         98 read
         94 fstat
         92 stat
         68 rt_sigaction
         63 close
         27 mmap
         16 mprotect
Mostly stat, rt_sigaction, and read for python3:

     ~ strace python3 -c exit 2>&1 | sort | grep "^[a-z_]*[(]" | sed -e "s/^\([a-z_]*\)[(].*/\1/" | uniq -c | sort -rn | head -n 8
         92 stat
         68 rt_sigaction
         57 read
         54 fstat
         35 open
         35 close
         28 mmap
         24 lseek
Some of this might be noise, though; it may have something to do with the installed packages. Not sure exactly.


> If you're spamming hundreds of processes from a looping shell script,

You should absorb the loop into a single python process that instead calls the original script as a library...
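A minimal sketch of that idea, with hypothetical names (`process.py` and a per-file `process` function are stand-ins for whatever the shell loop was running):

```python
import glob

# Instead of a shell loop like:
#   for f in *.txt; do python process.py "$f"; done
# pay the interpreter startup cost once and loop in-process.

def process(path):
    # Hypothetical stand-in for the work process.py used to do.
    with open(path) as f:
        return len(f.read())

def main(pattern='*.txt'):
    # One Python startup amortized over every file.
    return {path: process(path) for path in glob.glob(pattern)}

if __name__ == '__main__':
    print(main())
```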


Mercurial has another approach. It has something called a command server: launched once, communicating over a socket, for when you need to invoke it thousands of times. If startup time is a problem for you, your biggest problem is Python's startup time itself, not the difference between 2 and 3.
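This is not Mercurial's actual protocol, just a toy sketch of the idea: one long-lived process absorbs the startup cost, and each invocation becomes a cheap socket round-trip.

```python
import socket
import socketserver
import threading

class CommandHandler(socketserver.StreamRequestHandler):
    # Each connection carries one newline-terminated command.
    def handle(self):
        cmd = self.rfile.readline().decode().strip()
        # A real command server would dispatch to actual commands here.
        self.wfile.write(('ran:' + cmd + '\n').encode())

# Bind to an ephemeral port and serve in the background.
server = socketserver.TCPServer(('127.0.0.1', 0), CommandHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

def invoke(cmd):
    # What would have been a fresh `python script.py <cmd>` process
    # is now a single round-trip to the running server.
    with socket.create_connection(server.server_address) as conn:
        conn.sendall(cmd.encode() + b'\n')
        return conn.makefile().readline().strip()
```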


This may not make much difference to these benchmarks, but these commands and the nearly-equivalent:

  $ time python2 -c exit
cause Python to execute the following script:

  exit
This looks up the name exit and then does nothing with it before the script ends by reaching the end of the file. Interestingly, the repr of this object is set to generate the message that reminds you how to get out of the interactive interpreter if you type "exit" without calling it.

You can get the results you want with:

  $ time python -c ''
meaning: Load Python, run a completely empty script, and clean up.
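You can see what `-c exit` compiles to with the stdlib dis module; the exact opcodes vary by Python version, but the pattern is a lookup followed by a discard:

```python
import dis

# Compile the one-statement script that `python -c exit` runs.
code = compile('exit', '<command>', 'exec')

for ins in dis.get_instructions(code):
    print(ins.opname)
# The name `exit` is looked up (LOAD_NAME) and immediately
# thrown away (POP_TOP); nothing is ever called.
```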


I think the problem is pep8. For some reason, pep8 says you load your modules at the top of the file rather than when needed. Giant libraries often follow pep8, meaning something like "import pandas" loads hundreds of python files, which gives the absurd startup times.

I've seen some large code bases take 4 seconds to import, > 80% being unused code.
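The cost is easy to measure directly; `json` below is a lightweight stand-in for a heavy dependency like pandas (swap it for whatever your codebase imports):

```python
import time

start = time.perf_counter()
import json          # stand-in for a heavy library; try pandas if installed
elapsed = time.perf_counter() - start

print('import took %.1f ms' % (elapsed * 1000))
```

Newer interpreters (CPython 3.7+) can also break this down per module with `python -X importtime`.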


Huh, I never thought about that before. Importing inside a function is very frowned-upon AFAIK, but now I wonder what kind of performance improvements you might get from importing things only when you need them.

I wonder if it would be possible to automatically rewrite a script to do that.


> For some reason, pep8 says you load your modules at the top of the file

Clarity of dependencies is the “some reason”.

Like most style rules, there are times when breaking it is justified by other considerations, but it's not an arbitrary rule.


But couldn't that inefficiency be optimized away in the implementation? (Load imports lazily).


It can't be done automatically because imports can execute code. Since that's part of the semantics, you have to be lazy by hand.
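A minimal demonstration of why: module-level statements run at import time, so `import foo` can have observable side effects (the `noisy` module here is written on the fly for illustration).

```python
import os
import sys
import tempfile

# Write a module whose top level has a side effect, then import it.
moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, 'noisy.py'), 'w') as f:
    f.write("REGISTRY = []\n"
            "REGISTRY.append('ran at import time')\n")

sys.path.insert(0, moddir)
import noisy   # executes both statements above, right now

print(noisy.REGISTRY)
```

Deferring that import would also defer the registration, which could change program behavior — hence the need to opt in by hand.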


Only on the first load ever. When modules are compiled to their byte code, they could get a "clean" flag if they don't execute code, which means they get loaded lazily next time.


OK, but figuring out what executes code might require executing code: consider non-trivial use of metaclasses, where an inconspicuous subclass actually invokes, e.g., a registration process, like a Django model class.


Sure, but, just spitballing here, could a substantial fraction of imported code benefit even with a very conservative heuristic?

In any case, given your points, it seems like a future Python version ought to introduce a new version of the import statement with lazy semantics (which, besides eliminating dead code, is also IMO the more correct/explicit behavior when importing symbols).


It'd be pretty easy to do the experiment, since you can intercept attribute access and defer the import, e.g.

    import importlib

    class LazyModule:
        def __init__(self, name):
            self.name = name
        def __getattr__(self, key):
            # The real import happens on first attribute access;
            # importlib caches the module in sys.modules after that.
            return getattr(importlib.import_module(self.name), key)

    # e.g. instead of `import json` at the top of the file:
    json = LazyModule('json')
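The stdlib actually ships a supported version of this since Python 3.5: importlib.util.LazyLoader defers executing a module until its first attribute access. A sketch following the pattern from the importlib documentation:

```python
import importlib.util
import sys

def lazy_import(name):
    # Resolve the module spec now, but wrap the loader so the
    # module body only executes on first attribute access.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

json = lazy_import('json')     # cheap: module body not yet executed
print(json.dumps([1, 2]))      # first access triggers the real import
```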


That's kind of what I was hoping. I suspect the vast majority of normal Python code doesn't use metaprogramming, code that runs on import, and so on.


If a module imports anything, it cannot be assumed clean, since any of its imports may execute code, and they will not be available at compile time.


This is the right answer.

> One thing to keep in mind is that python 3 is still considerably slower than python 2.

IIRC this statement started to shift around 3.4. There are still areas where 2.7 is faster, but it's more of a gray area than the black-and-white situation it used to be.


I have a codebase at work (which arguably was optimized for 2.7) that is about 2-5% slower when running on Py3*. It's not a huge slowdown, but it is consistent.

I hope 3.7 finally puts it on par with (or faster than) 2.7, but we've decided to migrate to 3.6 anyway. Don't get me wrong, I'm glad to finally put 2.x behind us, but it still bothers me a bit. Maybe we just have to get used to optimizing for the 3.x series.

* This was measured without startup time, just function calls, with crude datetime.datetime.now() timings and the %timeit magic from IPython -- always consistent.
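For comparing interpreters on pure call overhead without startup noise, the stdlib timeit module is more reliable than datetime deltas; a minimal sketch (the function `f` is a hypothetical stand-in for the hot code being compared):

```python
import timeit

def f(x):
    # Hypothetical hot function from the codebase under test.
    return x * 2

# Time many calls under each interpreter and compare per-call cost.
N = 100_000
total = timeit.timeit('f(1)', globals={'f': f}, number=N)
print('%.0f ns per call' % (total / N * 1e9))
```

Running the same script under python2 and python3 gives directly comparable per-call numbers.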




