Realtime image processing in Python (morepypy.blogspot.com)
159 points by mattyb on July 7, 2011 | 18 comments



To run on OSX:

Grab the archive: http://wyvern.cs.uni-duesseldorf.de/~antocuni/pypy-image-dem...

  brew install pypy
  brew install mplayer

  pypy demo/magnify.py
  pypy demo/sobel.py

For me, 24 average fps on magnify.py, 34 average fps on sobel.py.


What's the issue with using OpenCV?


Try using it sometime. I work for a computer vision startup, and managing our OpenCV-dependent code is the least favorite part of my job.


Sure, the C interface is clunky and the error messages can be a little hard to trace to their source, but it sure beats having to write all that code from scratch. Most projects I have seen that make heavy use of OpenCV use a few C++ wrapper classes to make the usage a little smoother.


I concur. It isn't feasible to use the Python OpenCV bindings for nontrivial tasks.

We use a C++ wrapper around OpenCV as well.

OpenCV isn't valuable for its algorithms or its API. Its value proposition lies in the painstaking optimization of the inner loops of several high-level operations using SIMD intrinsics.

Advances in compiler technology seem to be pointing towards generated code with similar levels of optimization, especially in JIT-generated code.


What's your opinion of this remedy? I was about to start implementing Viola-Jones detection through CV for a personal project.


Seriously, the benchmark says it's 590 times faster than regular Python. So it's probably 59 times slower than optimized SIMD code?


Time to feed the troll:

First off, PyPy uses a JIT, so there's no obvious reason why it would have to be slower than 'optimized SIMD code' (whatever that is). The actual performance all depends on the quality of the JIT and the quality of the input into the JIT.

Second, they clearly state in the blog post that the PyPy version of the algorithm is easier to write than the equivalent C++, because the JIT can transform their polymorphic Python into efficient native code - doing the same with C++ would require the use of template expansion or a code generator.
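
To make the point about polymorphic code concrete, here is a minimal sketch, not the actual demo source. The `Image` class and `sobel_magnitude` function are hypothetical names chosen for illustration, and the flat `array` storage is an assumption. The filter only relies on indexing and arithmetic, so the same function body runs unchanged on byte pixels and float pixels; this is the kind of code a tracing JIT can specialize per pixel type, where C++ would typically reach for templates.

```python
from array import array
from math import sqrt


class Image:
    """Minimal 2D image on a flat array (hypothetical stand-in, not the demo's class)."""

    def __init__(self, width, height, typecode="d", data=None):
        self.width = width
        self.height = height
        self.typecode = typecode
        if data is None:
            # Zero-filled buffer of width * height pixels.
            self.data = array(typecode, [0]) * (width * height)
        else:
            self.data = array(typecode, data)

    def __getitem__(self, pos):
        x, y = pos
        return self.data[y * self.width + x]

    def __setitem__(self, pos, value):
        x, y = pos
        self.data[y * self.width + x] = value


def sobel_magnitude(img):
    """Return a float image with the Sobel gradient magnitude of img.

    Border pixels are left at zero. Nothing here depends on the pixel
    type beyond indexing and arithmetic.
    """
    out = Image(img.width, img.height, "d")
    for y in range(1, img.height - 1):
        for x in range(1, img.width - 1):
            # Horizontal gradient (responds to vertical edges).
            dx = (-img[x - 1, y - 1] + img[x + 1, y - 1]
                  - 2 * img[x - 1, y] + 2 * img[x + 1, y]
                  - img[x - 1, y + 1] + img[x + 1, y + 1])
            # Vertical gradient (responds to horizontal edges).
            dy = (-img[x - 1, y - 1] - 2 * img[x, y - 1] - img[x + 1, y - 1]
                  + img[x - 1, y + 1] + 2 * img[x, y + 1] + img[x + 1, y + 1])
            out[x, y] = sqrt(dx * dx + dy * dy)
    return out
```

Calling `sobel_magnitude(Image(3, 3, "B", pixels))` and `sobel_magnitude(Image(3, 3, "d", pixels))` exercises the same code path with two different pixel types.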

Third, if the idea of sacrificing a tiny amount of performance in order to reduce the cost of development and maintenance is that abhorrent to you, Python is almost certainly Not For You.


> First off, PyPy uses a JIT, so there's no obvious reason why it would have to be slower than 'optimized SIMD code' (whatever that is).

If you don't know what SIMD is, you probably shouldn't even be talking about high-performance image processing in the first place. The parent is correct that this is probably still at least a full order of magnitude slower than proper SIMD code.

But comparing this to good SIMD is not quite fair, as the intent of this sort of JIT is to be competitive with naive compiled C, not to be competitive with optimized assembly routines. Neither is attempting to -- or has a chance of -- replacing proper hand-written SIMD for performance-critical code.

If you want to automatically generate SIMD code, you'd want something more along the lines of a special-purpose vector language, like orc.


My point was that 'SIMD' is far too generic. There are dozens of instruction sets for writing SIMD code, and numerous compilers targeting them. Lots of compilers generate subpar SIMD code, so were PyPy to do the same, that would be far from unusual. If the claim is that PyPy generates worse SIMD code than compiler X + instruction set Y, then we're getting somewhere. If the claim was merely "I can write faster asm by hand", then my response is "well, duh".


It's completely reasonable to generate SIMD code based on idiomatic uses of arrays, without requiring the programmer to use a special purpose vector notation. It's also reasonable to do this in a JIT.


It sounds "reasonable", but no compiler in existence seems to be able to do it efficiently. This suggests to me that it isn't in fact reasonable, as that sounds like a more reasonable conclusion than "everyone writing compilers is incompetent".


I see it differently. Ordinary C loops are hard to optimize with SIMD because the surrounding context, and the freedom the code has within it, can be the limiting factor.

But in Python, vector operations are normally written using simple array semantics. For example, in NumPy:

  >>> from numpy import array
  >>> a = array([2, 3, 4])
  >>> b = array([2, 3, 4])
  >>> a + b
  array([4, 6, 8])

These can be easily converted into SIMD operations.

It's a matter of time before PyPy's new NumPy implementation gains traction and makes simple, elegant optimizing JITs possible on top of it.
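
The claim that array-style addition maps naturally onto SIMD can be sketched in plain Python. This is only an illustration of the idea, not how PyPy or NumPy actually implement it: `add_scalar`, `add_in_lanes`, and the `LANES` width of 4 are all assumptions made for the example. Each fixed-size chunk stands in for one vector instruction, and the leftover elements go through a scalar tail loop.

```python
from array import array

LANES = 4  # assumed vector width, e.g. four 32-bit values in a 128-bit register


def add_scalar(a, b):
    """Elementwise add, one element per step: what a naive loop does."""
    return array(a.typecode, [x + y for x, y in zip(a, b)])


def add_in_lanes(a, b, lanes=LANES):
    """Elementwise add processed in fixed-size chunks.

    Each chunk stands in for a single SIMD add. Elements that do not
    fill a whole chunk are handled by a scalar tail loop.
    """
    assert len(a) == len(b)
    out = array(a.typecode)
    n = len(a)
    main = n - n % lanes  # largest multiple of `lanes` not exceeding n
    for i in range(0, main, lanes):
        # One "vector add": every lane is independent of the others.
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
    for i in range(main, n):
        # Scalar tail for the remaining elements.
        out.append(a[i] + b[i])
    return out
```

Because `a + b` on arrays states no ordering or dependency between elements, the chunked version gives the same result as the scalar one, which is what makes the transformation safe.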


I think it's more that the application of it is niche enough that it's not worth the engineering effort for most general purpose compilers - the people who need it are often experts anyway, and are willing to do the optimization by hand.

I see this as being similar to why we're only now taking abstractions and compilers geared towards parallelism seriously for mainstream programming: not enough people needed it to justify the effort required.


Sounds about right. In GPU drivers, we keep specialized shaders as hand-tuned GPU assembly, not as GLSL/HLSL. This is just a very cool example of Python running at C speed, not Python running at --omg-optimized x264 assembly routine speed. :3


This demo (like all demos) is only showing the best-case performance. There are plans to add a JIT to CPython ( http://www.python.org/dev/peps/pep-3146/ ), but the numbers listed there don't look very impressive.

Mathematically intense code may run faster when it is JIT-compiled, but Python users have relied on specially made libraries (NumPy, SciPy, gmpy, etc.) to get serious speedups.


Unladen Swallow is dead and won't be merged into CPython.

http://qinsb.blogspot.com/2011/03/unladen-swallow-retrospect...


Soon PyPy will be compatible with NumPy and SciPy. It is also interesting what else you can now do with PyPy. For example, pyglet runs on it (think OpenGL 3D and 2D games), and it should be fast enough for many games. You can run the Pyramid web framework on it with great success, including database drivers, so web development is already possible. I benchmarked the index page of my application and got a 4x increase in requests per second, without caching. So while it's not ready for everyone yet, it may be ready for you; don't think of it as some exotic thing no one will use.



