
TensorFlow + TensorFlow Serving + Google ReCeption model plus optionally an SVM on ReCeption features for your custom detection. All that code and the pretrained model is Open Source. There's some engineering to glue it together and some extra work for the easier, non-image classification parts.
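For anyone wondering what "a support vector machine (SVM) on extracted features" amounts to, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: `fake_cnn_features` is a hypothetical stand-in that produces synthetic vectors instead of real network activations, and the trainer is a simple linear SVM fitted with stochastic gradient descent on the hinge loss, not whatever a production library would use. In practice you would feed in features from the pretrained network and likely use an existing SVM implementation.

```python
import random


def fake_cnn_features(rng, center, n, noise=0.5):
    # Hypothetical stand-in for features taken from a pretrained network:
    # n noisy vectors scattered around a per-class center.
    return [[c + rng.gauss(0.0, noise) for c in center] for _ in range(n)]


def decision(w, b, x):
    # Signed score of x relative to the separating hyperplane.
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=20, lr=0.01, lam=0.01, seed=0):
    # SGD on the L2-regularized hinge loss; labels must be +1 or -1.
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            violated = y[i] * decision(w, b, X[i]) < 1.0
            # Shrink the weights slightly (the regularization term).
            w = [wj * (1.0 - lr * lam) for wj in w]
            if violated:
                # Move the hyperplane toward classifying this sample with margin.
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def accuracy(w, b, X, y):
    hits = 0
    for xi, yi in zip(X, y):
        predicted = 1 if decision(w, b, xi) >= 0 else -1
        hits += predicted == yi
    return hits / len(X)


rng = random.Random(42)
dim = 8
pos_center, neg_center = [1.0] * dim, [-1.0] * dim

# Two synthetic classes, e.g. "my logo" vs. "everything else".
X_train = fake_cnn_features(rng, pos_center, 50) + fake_cnn_features(rng, neg_center, 50)
y_train = [1] * 50 + [-1] * 50
X_test = fake_cnn_features(rng, pos_center, 20) + fake_cnn_features(rng, neg_center, 20)
y_test = [1] * 20 + [-1] * 20

w, b = train_linear_svm(X_train, y_train)
test_acc = accuracy(w, b, X_test, y_test)
print("held-out accuracy:", test_acc)
```

The appeal of this setup is that the expensive part (the deep network) is reused as-is, and only the small linear classifier on top is trained on your own labels.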

There is also http://www.deepdetect.com



+this. I've implemented a subset of this kind of pipeline before on AWS (image tagging + face identification) using the building blocks that existed last year (it was AlexNet at the time, with a pre-release version of MXNet, because Google hadn't released the trained Inception model). Implementing this basic functionality at a basic working level, given the tools Google has released, isn't impossible.

Now, making it production-quality, efficient, scalable, and the rest -- well, y'know. That's why people use cloud-based services in the first place.

But I think there's less fundamental lock-in than you think. Cloudinary, for example, will let you upload an image and get a tag out. ABBYY and OmniPage/Nuance and others offer cloud-based OCR.

I'm biased - I'm at Google this year - so take this with a grain of salt, but while I have the feeling that Google can do it better and more affordably than a small team could do it on their own, I don't think that Google pulling the API would leave people up a creek without a paddle.


> the pretrained model is Open Source.

Google's face/landmark/label/text/logo detection models are open source? Or there exist open source pretrained models?

The quality and size of the training set are (at least) as important as the machine learning tools. I imagine Google has access to a pretty big data set, along with the computing resources to process it.


> Google's face/landmark/label/text/logo detection models are open source? Or there exist open source pretrained models?

Google's Inception v3 pre-trained image recognition model is open source: https://www.tensorflow.org/versions/r0.7/tutorials/image_rec...

That's the hard part because, as you note, training it is computationally intensive (the training data is actually open source too, as the ImageNet dataset).

There is existing code for the other parts that performs pretty adequately (with the possible exception of landmark detection).

Eg:

Face detection: http://docs.opencv.org/master/d7/d8b/tutorial_py_face_detect...

Logo Detection: http://www.pyimagesearch.com/2015/01/26/multi-scale-template...


Can you provide some links?


TensorFlow: https://www.tensorflow.org/

TensorFlow Serving: https://github.com/tensorflow/serving

ReCeption (actually they call it Inception v3. Not sure where I got the ReCeption name - though I'm sure I read it somewhere?): https://www.tensorflow.org/versions/r0.7/tutorials/image_rec...

Using an SVM on neural network extracted features: http://blog.christianperone.com/2015/08/convolutional-neural...

If you want a quick and dirty version here's some Python to create a web service that calls a Caffe based Image recognizer: https://gist.github.com/nlothian/c3519adb81b3452c1938
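To give a rough idea of the shape of such a web service without opening the gist, here is a minimal sketch using only the Python standard library. It is not the code from the gist and does not call Caffe: `classify_image` is a hypothetical placeholder that returns a fixed label, and the names `build_response`, `TagHandler`, and `serve` are made up for this example. You would swap the placeholder for a call into whatever recognizer you actually use.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def classify_image(image_bytes):
    # Hypothetical placeholder: a real service would run the image
    # through a trained model here and return its top labels.
    return [{"label": "placeholder", "score": 1.0}]


def build_response(image_bytes):
    # Pure function so the request logic can be tested without a socket.
    if not image_bytes:
        return 400, {"error": "empty request body"}
    return 200, {
        "bytes_received": len(image_bytes),
        "tags": classify_image(image_bytes),
    }


class TagHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the raw image bytes from the request body.
        length = int(self.headers.get("Content-Length", 0))
        status, payload = build_response(self.rfile.read(length))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    # Blocks until interrupted; bound to localhost only.
    HTTPServer(("127.0.0.1", port), TagHandler).serve_forever()
```

Calling `serve()` starts the server, after which POSTing image bytes to `http://127.0.0.1:8000/` returns a JSON list of tags. This is the quick and dirty end of the spectrum: there is no batching, authentication, or input validation.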


thanks!




