Inference times for Inception v3 on a Raspberry Pi have been benchmarked recently [1]; they weigh in at around 2s. Removing the Python layer gets it down to about 500ms, though it's not yet clear whether the Python overhead is the only reason for the difference.
Nvidia's Tegra X1 is supposedly capable of <10ms for ImageNet-grade models [2]. It's fair to assume, though, that this is for trimmed-down and/or 16-bit models rather than full Inception models.
And finally, Sam, who also facilitated building TF on the Pi, is about to host a six-week, half-theory, half-practice course on TF and deep learning [3] (methinks he deserves the plug).
The training/testing data for ImageNet are not all tiny thumbnails. It's just that the models work better/are more cleanly defined with uniform inputs, so images are scaled/cropped down to 224x224 or 299x299 or whatever dimensions the model expects. Inception-ResNet works fine on huge images; it just doesn't use all of the pixel data.
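For the curious, the usual preprocessing is just a center crop to a square followed by a resize to the model's input size (299x299 for Inception v3). A minimal sketch with Pillow, assuming nothing beyond that input size:

    from PIL import Image

    def load_for_inception(path, size=299):
        # Center-crop to a square, then resize to the network's input size.
        img = Image.open(path).convert('RGB')
        w, h = img.size
        side = min(w, h)
        left, top = (w - side) // 2, (h - side) // 2
        img = img.crop((left, top, left + side, top + side))
        return img.resize((size, size), Image.BILINEAR)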
Very cool! It's good to see an example showcasing the importance of keeping a Session alive when using TensorFlow with Python on the RPi. Glad that the tensorflow-on-raspberry-pi repo was useful; let me know if you (or anyone) run into any hitches or have any suggestions for improvement.
Hey Sam, this is Matt. Thanks for your comment and your help a few months back! And for anyone else reading this, Sam is great at responding to issues filed about installing TensorFlow on a Pi: https://github.com/samjabrahams/tensorflow-on-raspberry-pi/i...
Yeah, I'm using Sam's TF wheel on RPi3 and it works great.
> it was not feasible to analyze every image captured from the PiCamera using TensorFlow, due to overheating of the Raspberry Pi when 100% of the CPU was being utilized
Just put a heatsink on the CPU. It's like $1.50 ... $1.95 on Adafruit. I glue a heatsink to every RPi3 unit I build.
> it was taking too long to load the 85 MB model into memory, therefore I needed to load the classifier graph to memory
Yeah, one of the first things you learn with TF on the RPi is to daemonize it, load everything you can initially, and then just process everything in a loop. That initialization is super-slow, but after that it's fast enough. YMMV
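For anyone who hasn't set this up before, the pattern looks roughly like this. The graph path is a made-up example, and the tensor names are the ones used by the stock Inception v3 classify_image graph; a retrained graph will likely use different names:

    import tensorflow as tf

    GRAPH_PATH = '/home/pi/model/classify_image_graph_def.pb'  # hypothetical path

    # Load the frozen graph and create the Session exactly once (the slow part).
    with tf.gfile.GFile(GRAPH_PATH, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

    sess = tf.Session()
    softmax = sess.graph.get_tensor_by_name('softmax:0')

    def classify(jpeg_path):
        # Feed raw JPEG bytes; this graph decodes and resizes internally.
        with open(jpeg_path, 'rb') as f:
            image_data = f.read()
        return sess.run(softmax, {'DecodeJpeg/contents:0': image_data})[0]

    # ...then just call classify() from the capture loop; every call after the
    # first reuses the already-loaded graph and the open Session.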
Even with the heatsink (which we install on all of the Pis), we were still having overheating issues. We tried a few other things to mitigate the problem, too:
1. Reducing the sampling rate for image recognition (but if we sampled less often than every few seconds we could miss the express trains)
2. Using a cooling fan (https://www.amazon.com/gp/product/B013E1OW4G/ref=oh_aui_sear...), which still didn't prevent overheating when the CPU was continuously loaded at 100%.
3. Only sampling images where we detected motion (https://svds.com/streaming-video-analysis-python/)
We decided to use the third option: our motion detection algorithm, while sensitive to false positives, narrows down which frames need to be analyzed, and the deep learning image recognition then eliminates those false positives.
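In case it's useful to anyone, the gating loop is roughly this shape. This is a simplified sketch using OpenCV frame differencing (not necessarily our exact algorithm); the pixel-count threshold is arbitrary, and the print is a stand-in for handing the frame to the TensorFlow classifier:

    import cv2

    # On a Pi camera you may need the V4L2 driver for VideoCapture(0) to work.
    cap = cv2.VideoCapture(0)
    _, prev = cap.read()
    prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(gray, prev)
        _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        prev = gray
        # Only wake the (expensive) classifier for frames with enough changed pixels.
        if cv2.countNonZero(mask) > 5000:
            print('motion detected; this frame would go to the classifier')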
Happy to chat more about your experiences daemonizing TF applications!
When you say "overheating issues", what do you mean exactly? IME, at 100% CPU usage with the heatsink on, either it doesn't throttle the clock down at all anymore, or it does so only after a much longer time and with a much smaller clock reduction.
Are you seeing anything happen, other than some slight throttling?
The chip cannot fry itself. It's designed to slow down so as to stay below the dangerous temperature range.
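If you want to see what the firmware is actually doing, vcgencmd measure_temp and vcgencmd get_throttled will tell you, or you can poll sysfs while the load runs. Rough sketch:

    import time

    TEMP = '/sys/class/thermal/thermal_zone0/temp'                   # millidegrees C
    FREQ = '/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq'   # kHz

    # Print the SoC temperature and current CPU clock every couple of seconds;
    # a dropping clock at high temperature is the throttling in action.
    while True:
        with open(TEMP) as f:
            temp_c = int(f.read()) / 1000.0
        with open(FREQ) as f:
            mhz = int(f.read()) / 1000.0
        print('%.1f C  %.0f MHz' % (temp_c, mhz))
        time.sleep(2)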
> Happy to chat more about your experiences daemonizing TF applications!
Eh, that was just a fancy way of saying I do what you do. Launch the program once, and let it run forever. It performs initialization (which takes a long time), then it drops into a processing loop: wait for input / read / process / do something / repeat. Pretty basic stuff really.
Right, that chip could never run at 100% CPU load for more than a fraction of a minute; after that it starts slowing the clock. Seems to me like it was meant to run with a heatsink on.
Either that or it was meant for outdoors operation in arctic regions.
Hi annnd! We tried a few times to train a scaled down model on the Pi3, but got nowhere. We've found that the best strategy is to train on the beefiest hardware you have, then transfer the model and run it on the Pi3 for streaming applications.
I'm under the impression that when the Pi 4 is released it will have a GPU powerful enough to run neural nets. Now that TF is getting optimised for low-end devices and models are getting open-sourced, it should be possible to run live offline image recognition and speech recognition on the Pi.
I imagine a proliferation of robots, security cameras, and smart open-source Siri/Alexa alternatives.
A bit off topic, but related: can anyone point me to a recipe for "low-latency" video with the RPi? I don't even really mean "low latency": I tried a couple of different setups/tools a few months ago, and the best I could get was a half- to full-second delay on the video.
[1] https://github.com/samjabrahams/tensorflow-on-raspberry-pi/t...
[2] https://youtu.be/_4tzlXPQWb8?t=53m35s
[3] https://www.thisismetis.com/deep-learning-with-tensorflow