Been there, done that. Put that car at an angle and see how well that works. Or maybe add rain. Or bright sunshine. Or make it dark.
Spoiler alert: these kinds of image processing Rube Goldberg machines fall apart real quick with real world conditions.
> The intent of this article is to sensibilize the reader that a machine learning approach in not always the best or first solution to solve some object detection problems.
The danger of this article is people end up wasting time & budget with brittle approaches.
(Though to be fair the danger of the deep learning/ML route is people can end up wasting just as much time when they don’t have an adequate amount of training data.)
There’s a pretty vast gulf between 70s era threshold, edge detect, morphology type methods and deep learning.
Robust feature detectors, voting methods, etc. can fill that gap for object detection and pose estimation. Also simpler machine learning models for classification like SVMs.
It increasingly feels like the slice of computer vision techniques I learned in the early 2000s is receding in relevance. But at the same time, there ought to be a more modern grab bag of robust tools to build engineered solutions.
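To make that middle ground concrete, here is a hedged sketch of one such engineered approach: a HOG descriptor feeding a linear SVM that classifies candidate crops as plate vs. not-plate. It assumes scikit-image and scikit-learn are available; `plate_crops` and `background_crops` are hypothetical lists of labelled grayscale crops you would have to supply yourself.

```python
# Middle ground between thresholding tricks and deep nets:
# hand-crafted features (HOG) + a simple classifier (linear SVM).
# `plate_crops` / `background_crops` are hypothetical labelled crops.
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.svm import LinearSVC

def hog_features(img, size=(64, 128)):
    """Resize a grayscale crop to a fixed size and describe it with HOG."""
    img = resize(img, size, anti_aliasing=True)
    return hog(img, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

X = np.array([hog_features(c) for c in plate_crops + background_crops])
y = np.array([1] * len(plate_crops) + [0] * len(background_crops))

clf = LinearSVC(C=1.0).fit(X, y)

# At detection time, score sliding-window or heuristic candidate crops:
# is_plate = clf.predict([hog_features(candidate)])[0]
```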
In the end any algorithm is only as good as the dataset it’s developed/validated against. Ad-hoc methods may fail in bright sunshine, but so too would a neural network if that condition never arises in its training set.
After tiring of convoluted web 3.0 frameworks and Babel-esque dependency structures, programmers have finally fashioned a black box they cannot understand which relieves the anxiety of knowing they should.
Do you know how your brain recognizes objects in images? Not superficially, but truly understand how it works at a deep fundamental level?
I'll wait.
Until then, when it comes to the state of the art in things like reading license plates, it's the old algorithmic approaches (like those described in the article) that result in convoluted, brittle, easily broken pipelines - far closer to the "convoluted web 3.0 frameworks and Babel-esque dependency structures" you seem to despise.
DL may not be fundamentally understandable (yet), but it works a hell of a lot better than stringing together image processing algorithms.
While your examples are accurate for hand-held or casual ANPR, they're easy to mitigate against when you're planning a fixed installation.
Just look at existing ANPR installations. You get tight lanes. Barriers to arrest cars. High and low angle cameras working together. Heavy use of IR illumination (reads through all but the thickest mud IME).
That's not to say some people don't get it wrong. I've worked on systems where they've cheaped-out and put a single camera at bumper-level. It's completely ineffective 2-4pm in winter because of the glare.
But yeah, if you can avoid being a cheap idiot, you can easily take a system to 90+% confidence.
The first thing I thought about was - what if a tiny part of the plate is cut off by e.g. the side of another car? It would get the ratio entirely wrong and think it was anything but a plate!
I've never been involved in ML, so when you talk about 'training data' does that require feeding known license plates in various conditions into the software and having it extrapolate between them to cover unseen cases?
And doesn't it fail just as hard if it encounters a real-world example that it can't fit into its model?
Probably a very naive question but this is entirely outside my domain.
Not really. The algorithms described in the article are pretty worthless (edge, blob detector) until the very last step, where they use the width/height ratio to filter out all of the noise and get the license plate out. As the parent comment said, this doesn't work in the real world because of translation/rotation. You can spend ages tweaking parameters manually, but the CNN will beat you. We have actually found that two DL models work best: one to locate the license plate, then another to OCR it (the former is faster computationally).
The methods described in the article are dated, and worthless
Disclaimer: I build number plate recognition systems
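For what it's worth, the two-stage structure described above (locate first, then OCR only the crop) can be sketched with off-the-shelf stand-ins. Real systems use trained detection and OCR models rather than the Haar cascade and Tesseract used here, so treat this purely as an illustration of the pipeline shape.

```python
# Stage 1: cheap detector proposes plate regions.
# Stage 2: OCR only the cropped region.
# Stand-ins: OpenCV's bundled plate cascade + Tesseract (needs the binary installed).
import cv2
import pytesseract

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_russian_plate_number.xml")

def read_plates(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)
    results = []
    for (x, y, w, h) in boxes:
        crop = gray[y:y + h, x:x + w]
        # --psm 7: treat the crop as a single line of text
        text = pytesseract.image_to_string(crop, config="--psm 7").strip()
        results.append(((x, y, w, h), text))
    return results
```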
Maybe not exactly those techniques but I've heard of non-ML techniques being used in real life situations incl. translation/rotation to identify the plate (not read the numbers, just identify the plates)
Edit: They are dated and might have made sense some 8 years ago which is when I heard of them being used
While not ideal for the described scenario, if you are say scanning from a pool of limited form templates, it could be very useful and faster than ML methods, at least for an earlier pass. But this is a 2D scenario even then.
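As a rough sketch of that limited-template case (file names and the score threshold below are made up): normalized cross-correlation against each known layout is cheap, needs no training data, and can serve as an early pass before anything heavier.

```python
# Identify which known form layout a scan matches, via template matching.
# Template file names and the 0.6 threshold are hypothetical.
import cv2

templates = {
    "form_a": cv2.imread("form_a.png", cv2.IMREAD_GRAYSCALE),
    "form_b": cv2.imread("form_b.png", cv2.IMREAD_GRAYSCALE),
}

def identify_form(gray_scan):
    scores = {}
    for name, templ in templates.items():
        # Resize the scan to the template's size so the whole page is compared.
        resized = cv2.resize(gray_scan, (templ.shape[1], templ.shape[0]))
        result = cv2.matchTemplate(resized, templ, cv2.TM_CCOEFF_NORMED)
        scores[name] = float(result.max())
    best = max(scores, key=scores.get)
    return best if scores[best] > 0.6 else None
```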
Marginally faster... maybe. A forward pass on our two neural nets takes less than 1.5 seconds, is embarrassingly parallel, and has much higher accuracy.
In this task accuracy is much much more important than computational efficiency, and given a forward pass is already quick (less than 2 seconds) why wouldn’t you use the superior method that always works?
Couldn't you use this approach to generate better inputs for the ML so that it's less complicated? The convolutional NN OCR linked below needs 450 million parameters!
I successfully designed a commercial LPR system using non-ML methods that handles all those kind of weird real-world cases you get that are never mentioned in papers about LPR and OCR.
I couldn't use neural networks at the time, because 1) it had to run on a single-core 200 MHz ARM at 20 fps, and 2) it took too long to "debug" the NNs (often you want to improve performance and the failure cases seem fine to you).
So the resulting performance of this system was and is state of the art in that in practice it captures everything correctly, so there is no point wasting cycles on an NN solution.
On the other hand, starting from scratch, an NN solution can allow you to advance quickly with little domain knowledge.
I would like to repeat a word of caution other posters mentioned as well - it takes years to collect the training data required for successful commercial LPR, both for NN and non-NN methods (as you need regression tests even if you don't need it for training), to get plates from all seasons and weathers and vehicle conditions.
Interestingly, now almost 10 years later, when I look at more modern deep NNs and analysing their initial stages with modern tools I see a lot of similarity to my original "standard" algorithms. In essence, the code I wrote did the same thing a modern convolutional deep NN would learn to do.
In particular, see this recent analysis of how (some) convnets learn:
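Independent of that analysis, a quick and informal way to see the resemblance is to plot the first-layer filters of any pretrained convnet: they tend to look like the oriented edge and colour filters of classical pipelines. A minimal sketch, assuming PyTorch/torchvision and matplotlib are available:

```python
# Visualize the first-layer filters of a pretrained ResNet-18.
import torch
import torchvision
import matplotlib.pyplot as plt

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
filters = model.conv1.weight.detach()          # shape: (64, 3, 7, 7)
filters = (filters - filters.min()) / (filters.max() - filters.min())

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, f in zip(axes.flat, filters):
    ax.imshow(f.permute(1, 2, 0))              # 7x7 RGB view of one filter
    ax.axis("off")
plt.show()
```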
> On the other hand, starting from scratch, an NN solution can allow you to advance quickly with little domain knowledge.
[...] In essence, the code I wrote did the same thing a modern convolutional deep NN would learn to do.
I noticed bent license plates aren't in the list of example "hard cases".
Are those not an issue or does the tech just fail so hard on those it's not even worth trying?
Every other older vehicle where I live has a front plate that's got a horizontal bend in it from when you park with the front end in a snow bank so while not a majority case it's not exactly an edge case either.
I never noticed this actually :) But we mainly tested and ran this in Europe and not in the US, maybe there is a difference in how the front plates are mounted. I don't think I can recall any European plate I've seen that is mounted so it could get bent in that way (the bumper usually covers the entire plate height).
Some bending and distortions are tolerated, so maybe it would work. One of the design goals I had was that if the system fails, it should fail in a way a human would understand (like dirt in the wrong place turns an F into an E, or something like that). And if a human can't immediately see the bent chars (and had to "reason" about them), probably my system would have problems as well.
When you mentioned pickups I immediately pictured bent and mangled plates from hitching up trailers. Not being from any sort of region that would have snow banks, let alone ones tall enough to get to the bumper of a truck... why is that bending license plates?
The plastic bracket doesn't usually support the bottom half and the result is a license plate that's formed to perfectly follow the contour of the bumper (for stuff that still uses a steel bumper) or just bent into the empty space of the lower grill opening (on many cars)[1]. Then there's all the work vans that just have the license plate affixed wherever on the front bumper the owner felt like drilling two holes.
Wouldn’t that harm the bumper the plate is on at some point? I mean if it’s hard enough to bend the plate it seems the same as me just bumping into the concrete wall in front of my parking space every day when I get home and I would never do that and think it’s ok...
Eventually, maybe, depends how hard you hit stuff and the bumper in question. Depending on the vehicle age and owner there might just not be any fucks around to give (with work vehicles being driven by people making less than $20/hr this is almost guaranteed). Snow doesn't tend to be very damaging because you can't really build a straight up wall with it and the outer layers generally conform to the bumper so it's more like a soft push on all the available surface area. It's just that if the license plate doesn't have anything substantial backing it up it will bend.
There's also a significant materials difference between a bumper (either plastic or metal) and the thin piece of metal a license plate is made out of.
I would imagine there's probably some mental disconnect about how damaging ice could be because "hey, it's just frozen water, it's not like it's concrete or something"
I wish the work you did could be open sourced, including the data set. Work like that is lost to humanity because it's done for only a few entities, and once they are replaced, the initial work disappears with them.
> ... it takes years to collect the training data required for successful commercial LPR, both for NN and non-NN methods (as you need regression tests even if you don't need it for training), to get plates from all seasons and weathers and vehicle conditions.
I suspect this won't be the case for much longer (or perhaps already isn't). It seems successful generalization from simulated environments is within reach.
Yeah that would certainly be interesting. At least it would work to expand a dataset. I guess the tricky part is to realize at all that some problem cases can appear though, and code them into the simulation.. for example, the bumper-bent license plates someone wrote about further down :)
Since license plates are "simple" (unlike photos of faces for example), training data can be generated. Similarly how Dropbox developed their OCR system based on NN.
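A minimal sketch of what "generated" training data could look like for plates: render random plate-like strings and add simple nuisances (rotation, blur, noise). Real systems would use actual plate fonts, per-country layouts, and far richer augmentation; all the sizes and noise levels below are guesses.

```python
# Generate synthetic plate-like training images with PIL + numpy.
import random
import string
import numpy as np
from PIL import Image, ImageDraw, ImageFilter, ImageFont

def synth_plate(width=200, height=50):
    text = "".join(random.choices(string.ascii_uppercase + string.digits, k=7))
    img = Image.new("L", (width, height), color=230)              # light plate
    draw = ImageDraw.Draw(img)
    draw.text((10, 10), text, fill=20, font=ImageFont.load_default())
    img = img.rotate(random.uniform(-8, 8), expand=False, fillcolor=230)
    img = img.filter(ImageFilter.GaussianBlur(random.uniform(0, 1.5)))
    arr = np.array(img, dtype=np.float32)
    arr += np.random.normal(0, 8, arr.shape)                      # sensor noise
    return np.clip(arr, 0, 255).astype(np.uint8), text

samples = [synth_plate() for _ in range(10000)]   # (image, label) pairs
```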
> The intent of this article is to sensibilize [sic] the reader that a machine learning approach in not always the best or first solution to solve some object detection problems
This is a good lesson to teach people! But I don't think that this page does so successfully. By far and away the biggest issue is that they are only demonstrating on a single image. You can get almost any method to work on a single image, especially when you're interactively designing the method to work on that specific image. If you want a method that works in unforeseen conditions, you need to think carefully about your assumptions...
* Threshold on 0.5–will that work at night? What if you're driving into the sun and the image is overexposed?
* Dilating canny edges and finding connected components–what if the car is white with black trim, and the dilation merges the license plate CC with another one? What if the car is further away, and the dilation factor merges the entire car into one big blob?
* Filtering the blobs by ratio–what if the car is at an angle (turning, or starting up a hill), so the ratio is off? What if they have bumper stickers that are the same shape, and fool the filter?
While you can make a reasonable license plate detector with traditional methods (see, for example, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.455...), the advantage of using ML methods is that, if done correctly, they actually have a chance of working in most of the situations you encounter (including the assumption-violating ones above), rather than just the ones you can think of while making the system. That lesson is probably just as important as the "don't jump straight to ML" one.
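To make those failure modes concrete, here is roughly the kind of pipeline the article builds, written out with OpenCV (the constants are guesses, not the article's exact values); every hard-coded number is an assumption that one of the bullet points above can violate.

```python
# Rough sketch of the fragile classical pipeline being critiqued above.
import cv2
import numpy as np

def find_plate_candidates(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    # Fixed global threshold at ~0.5 of the range: fails at night or when
    # the image is overexposed.
    _, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    edges = cv2.Canny(binary, 100, 200)
    # Dilation may merge the plate with white trim or, for distant cars,
    # merge the whole car into one big blob.
    dilated = cv2.dilate(edges, np.ones((5, 5), np.uint8), iterations=2)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(dilated)
    candidates = []
    for i in range(1, n):                      # label 0 is the background
        x, y, w, h, area = stats[i]
        ratio = w / float(h)
        # Aspect-ratio filter: wrong for angled plates, fooled by stickers.
        if 2.0 < ratio < 6.0 and area > 500:
            candidates.append((x, y, w, h))
    return candidates
```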
These sorts of posts are just so ridiculously off base.
People are absolutely desperate to show that ML isn’t the answer to everything, which nobody in ML claims it is. But it is the very clear answer to some things, and problems like license plate detection fall so unambiguously into that category that, quite frankly, if you think otherwise you’ve really disqualified yourself from the set of people who ought to be asked about these things.
In my team, we don’t use deep learning for every problem. We don’t even use machine learning for every problem. I’m a Mathematician, and frankly I’m just not that immature. The fact is, ML is a great solution to a lot of problems, and Deep Learning is a particularly good solution to an ever-increasing category of problems.
If you can’t tell the difference between such cases, then you should probably not conclude that everyone is doing it wrong, and instead conclude that you have more to learn.
On the contrary, people are absolutely desperate to show that ML isn't the ONLY answer to some things.
This article shows how by following a number of logical steps (and explaining how the algorithms work), you can achieve numberplate recognition.
These techniques have been used for DECADES. They work on machines your ML model likely won't fit on. In some cases they are faster. In some cases they are slower. In some cases they are more accurate, and in some cases they are not.
If you think ML is the magic bullet to any task you've really disqualified yourself from the set of people who ought to be asked about these things.
You were claiming that number plate recognition is an ML task and that if we don't agree with you we don't know what we are talking about. Seems like you are the one commenting in bad faith. If you reread, you will see I used your own words, which may have confused you into thinking mine was the bad-faith comment. Good day.
A particularly bad solution that still requires huge amounts of data and still has terrible failure modes. It is not ready yet, not until we can reason about those failure modes with something better than high-end statistics.
You’re free to have this opinion. I don’t see how it could possibly be justified.
> that still requires huge amounts of data
This is true for some problem spaces, but not true in general. If your exposure to Deep Learning is relatively casual, then I can see why you would think this. So while it's not a totally unfair criticism, if you're in a problem space with lots of data, and you have a method that performs well under those circumstances, then you'll have to do more work to convince me it's a bad idea to use it.
> and still has terrible failure modes. It is not ready yet, not until we can reason about those failure modes with something better than high-end statistics.
This just feels like parroting others' criticisms. Yes, our primary methods for understanding the failure modes of stochastic function approximators are statistical, the same as they are for stochastic processes. Statistics is precisely what is used to rigorously describe behaviour in the aggregate that cannot be well explained in the particular.
It also ignores the fact that there is a huge amount of theory currently being developed around deep learning. You won't see it linked on HN, and you likely won't find many in the typical software engineering crowd who know about it (which is fine! - software engineers are highly skilled specialists, who should not be expected to closely follow the mathematical literature), but it does exist, and several of my friends who remained in academia are building careers on developing it.
As a general aside, I have to say that the glibness of the responses objecting to my comment really does speak volumes.
I used OpenCV about 15 years back to read football pools tickets. It eventually worked but my (admittedly atrocious) code was slow and clunky: a soup of mixed C and C++ whose sources I luckily lost:)
What made the process even slower with then available hardware was the necessary deskewing if one put the ticket not perfectly aligned with the camera, which is almost always the case.
Not having worked with OpenCV since then, my question is whether today's newer versions and available low-cost hardware (namely *Pi-like boards) could allow a similar but more complex process in almost real time, reading whole digits instead of bullets. That is, a user puts a piece of paper under the camera, the software recognizes what it is by looking at common patterns it shares with other similar documents, orienting it properly when needed, then it reads text and numeric information by applying OCR, and then, according to that information, fills/updates one or more fields in this or that database.
Doable? Caveats?
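On the deskewing step in particular, modern OpenCV on a Pi-class board handles it comfortably in near real time. A minimal sketch, assuming the paper shows up as the dominant bright region in the frame; OCR on the straightened image would then be a separate step (e.g. with Tesseract).

```python
# Estimate the page's rotation from its bounding rectangle and undo it.
import cv2

def deskew_document(bgr_frame):
    gray = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2GRAY)
    # Otsu threshold: assumes a bright page on a darker background.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    page = max(contours, key=cv2.contourArea)      # biggest bright region
    rect = cv2.minAreaRect(page)                   # ((cx, cy), (w, h), angle)
    angle = rect[2]
    if angle < -45:                                # normalize older OpenCV angles
        angle += 90
    (h, w) = bgr_frame.shape[:2]
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(bgr_frame, M, (w, h), flags=cv2.INTER_LINEAR)
```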
>my question is whether today's newer versions and available low-cost hardware (namely *Pi-like boards) could allow a similar but more complex process in almost real time, reading whole digits instead of bullets. That is, a user puts a piece of paper under the camera, the software recognizes what it is by looking at common patterns it shares with other similar documents, orienting it properly when needed, then it reads text and numeric information by applying OCR, and then, according to that information, fills/updates one or more fields in this or that database. Doable? Caveats?
It is certainly do-able and has already been implemented by banks for image based clearance/processing of cheques[1] and fraud detection. The caveats in this case would include a large set of training data and a supervised model for handwritten examples.
I’ve tried many of the same techniques myself for doing OCR. It’s a very brittle pipeline. You’ll need ML to do the job well. The state-of-the-art is Jaderberg’s Text Spotting in the Wild.
My work used to involve a bit of 'conventional' computer vision back in the day. Brittle is the right word for it. In a controlled environment with well designed lighting it can be fantastic but "in the wild" it's incredibly hard to make anything reliable.
I worked on a project on optical inspection of PCBs on assembly lines in early 2004, and it was all image processing. Not a peep about ML in those days.
A signpost with the same aspect ratio will also make this fail, never mind the failure rates from skewed angles and lighting issues.
An ML approach that takes context into account is the way to do this, and it will only get better as performance improves.
It's essentially an approach based on heuristics. It can be brittle and cannot deal with ambiguity well, and that's where ML shines. It's not uncommon for people to combine heuristics with ML in the real world, since rule-based models are fast and cheap.
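A sketch of that hybrid shape: cheap rules propose and pre-filter candidates, and a learned model makes the final call. `plate_scorer` here is hypothetical and stands in for any trained classifier (SVM, small CNN, ...).

```python
# Hybrid detection: rule-based pre-filter + learned scorer for the final call.
def detect_plates(image, candidate_boxes, plate_scorer, threshold=0.9):
    hits = []
    for (x, y, w, h) in candidate_boxes:          # from a fast heuristic pass
        if not (2.0 < w / float(h) < 6.0):        # cheap aspect-ratio rule
            continue
        crop = image[y:y + h, x:x + w]
        if plate_scorer(crop) > threshold:        # ML resolves the ambiguity
            hits.append((x, y, w, h))
    return hits
```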
This approach fails in practice: e.g. the morphological operations you need for an infrared night-time image with washed-out whites are different from what you need to detect the plate in the daytime, etc.