Making Pillow-SIMD, optimizing image processing in Python (uploadcare.com)
74 points by igordebatur on Nov 3, 2017 | 24 comments



Interesting, I wonder if it beats VIPS yet.

VIPS page says they beat Pillow-SIMD 4.3.0 by a tad: https://github.com/jcupitt/libvips/wiki/Speed-and-memory-use

Pillow-SIMD's page says it beats VIPS: https://python-pillow.org/pillow-perf/ (you have to select "Full operations cycle"; it doesn't let you look at VIPS tests separately for some reason).


Hi there. Note the 'see 1' remark in the VIPS testing, which says the following: "Pillow is single-threaded, so the fairest comparison for raw processing speed would be against vips-1thread."


Can you please comment if pillow-SIMD works with Windows at this point? I was trying to make it work with py2.7 as I needed very fast image processing tools (for something that works real-time) and my experience was that pillow-SIMD does not work with Windows. It appeared at first that an Anaconda package would make it work but even that doesn't work because of various dependency issues.


It looks like it should, there's an installer here: https://www.lfd.uci.edu/~gohlke/pythonlibs/

But there's also a bug here: https://github.com/uploadcare/pillow-simd/issues/9

If you're using open source software like Python I'd really recommend either leaving Windows, using the bash subsystem or running a Linux VM though. Most of the open source ecosystem lives on *nix and you'll rarely find people who port their projects to Windows.


I would love to switch, but unfortunately the field of medical research is Windows everywhere (high-speed Andor cameras, spectrometers, microscope stages, virtually everything needs Windows).


Unless the Python is actually interacting with that hardware you should be able to get along much better with Windows' Linux subsystem or a VM still.


That's true. Are the pillow-perf benchmarks parallelised at all?


For image resizing, wouldn’t you use the GPU itself (assuming the server has one)? Does Python have a library that works with the GPU for image processing?

I have a Skylake Xeon server with an integrated GPU. Any way to use that with Python?


There's a fairly common use case for resizing where you have a large image already in CPU memory, and want to create a thumbnail or otherwise downsize. Copying the large source image to GPU memory can be expensive enough that it's faster to simply do the resize on the CPU, since the data is already there.
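
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. The bandwidth figure and the image sizes are illustrative assumptions, not measurements from this thread, and the function names are made up for the example.

```python
def image_bytes(width, height, channels=3, bytes_per_channel=1):
    """Size of an uncompressed image in bytes."""
    return width * height * channels * bytes_per_channel


def transfer_seconds(num_bytes, bandwidth_bytes_per_s):
    """Time to move num_bytes at a given sustained bandwidth."""
    return num_bytes / bandwidth_bytes_per_s


# Assumed value for illustration only; measure on your own hardware.
ASSUMED_BANDWIDTH = 6e9  # bytes/s, hypothetical host-to-GPU rate

src = image_bytes(8000, 6000)  # large RGB source: 144,000,000 bytes
dst = image_bytes(200, 150)    # small RGB thumbnail: 90,000 bytes

upload = transfer_seconds(src, ASSUMED_BANDWIDTH)
download = transfer_seconds(dst, ASSUMED_BANDWIDTH)
print(f"source: {src} bytes, upload ~{upload * 1000:.1f} ms")
print(f"thumbnail: {dst} bytes, download ~{download * 1000:.3f} ms")
```

The upload of the source dominates. If a CPU resize of the same image finishes in less time than that upload alone, the GPU cannot win no matter how fast its kernel is.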


There was a discussion about GPUs under the first article of the series: https://news.ycombinator.com/item?id=14712146. In terms of building a SaaS, CPUs seem more reliable.


> Does Python have a library that works with the GPU for image processing?

TensorFlow. It's not just a deep learning library; you can implement any tensor operation in it. 99% of image processing is just tensor operations.


Intriguing option, but I'm dealing with the integrated GPU here. It looks like TensorFlow is for NVIDIA discrete GPUs.

PyOpenCL is probably the best option: https://mathema.tician.de/software/pyopencl/

One thing about integrated GPUs in servers is that there's no transfer time, since the GPU shares memory with the CPU.


There is transfer time. Your source image is in cacheable CPU memory. Integrated GPUs normally work with uncacheable memory allocated in a special region of system memory. Some GPUs can access cacheable memory too, but it is much slower (because it has to maintain coherency with CPU caches), and requires that you allocate such cacheable memory using special OpenCL driver calls, not your normal malloc. So, in practice, you would do a copy to a GPU-optimized buffer (in memory that is shared with the CPU, but uncacheable).


Except just loading TensorFlow takes half an hour on some machines. PyTorch is better in that regard.

Both are probably too hard to install, though. CUDA and cuDNN are really unpleasant to install unless you are using conda.


You don't need to install CUDA or cuDNN to install PyTorch. We package and ship all required dependencies with our binaries.


He's talking about GPUs. You do need to install CUDA to get PyTorch to work on GPUs, yes.


I am the lead developer of PyTorch. No, you don't need to install CUDA to get PyTorch to work on GPUs; we ship the relevant bits with our binaries. All you need is a working NVIDIA driver.


Really? Very cool, thanks a lot, I didn't know.


sm for soumith?


For low-quality resizing algorithms I suspect a good SIMD algorithm will be faster than transferring the image to the GPU and getting the result back.
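
As a minimal sketch of what a "low-quality" resize looks like, here is nearest-neighbour resampling in plain Python. This is not Pillow-SIMD's implementation, and `resize_nearest` is a made-up name; it only illustrates that each output pixel costs one read and one write, so moving the data tends to matter more than the arithmetic.

```python
def resize_nearest(pixels, new_w, new_h):
    """Nearest-neighbour resize of a 2D list of pixel values (list of rows)."""
    old_h = len(pixels)
    old_w = len(pixels[0])
    out = []
    for y in range(new_h):
        # Map the output row back to a source row with integer arithmetic.
        row = pixels[y * old_h // new_h]
        # Pick one source pixel per output pixel; no filtering at all.
        out.append([row[x * old_w // new_w] for x in range(new_w)])
    return out


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
small = resize_nearest(image, 2, 2)
print(small)  # [[0, 2], [8, 10]]
```

Higher-quality filters read several source pixels per output pixel, which is where the amount of computation starts to grow.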


[flagged]


I don't mind people criticizing the design or layout of content meant to be read, but... come on. You "cannot make yourself endure reading" it? Because the text is slightly too light and contrast slightly too low as a result? It's "unsettling and stressful" that there are code snippets and text interspersed?

Sure, it's annoying that there's an ever-present overlay at the bottom. And the contrast could be better. But I'm hoping your comment is over-the-top hyperbole, because an insurmountable refusal to read content because the text is two shades too light is just silly. Even if it is hyperbole, do we really need someone to make this style of comment every single time a content author strays away from absolutely optimal legibility?


To be fair to the OP, it's much worse on a smaller (12" retina MacBook) screen. That's with maximised Firefox with the dock visible. However, this is an issue with Medium, not the blogger.

https://imgur.com/a/qGYc3


I actually did stop reading it because it is an unpleasant experience. It is not refusal, it is simply skipping piece #15423 in every day's continuous stream of media.

Look at my screenshot. There is so much distracting clutter and cramping in the layout, with the focus on Medium's stupid aggressive buttons, that the content suffers.


Hi, thanks for your comment. It's a good one, actually. We're using Medium as our blogging platform and are thinking of changing it. Especially when it comes to snippet-related stuff and inline code. It seems Medium is more about digital storytelling.



