I'm just going to call this out as bullshit. This isn't YOLOv5. I doubt they even did a proper comparison between their model and YOLOv4.
Someone asked for it not to be called YOLOv5 and their response was just awful [1]. They also blew off a request to publish a blog/paper detailing the network [2].
Hey all - OP here. We're not affiliated with Ultralytics or the other researchers. We're a startup that enables developers to use computer vision without being machine learning experts, and we support a wide array of open source model architectures for teams to try on their data: https://models.roboflow.ai
Beyond that, we're just fans. We're amazed by how quickly the field is moving and we did some benchmarks that we thought other people might find as exciting as we did. I don't want to take a side in the naming controversy. Our core focus is helping developers get data into any model, regardless of its name!
YOLOv5 seems to have one important advantage over v4, which your post helped highlight:
Fourth, YOLOv5 is small. Specifically, a weights file for YOLOv5 is 27 megabytes. Our weights file for YOLOv4 (with Darknet architecture) is 244 megabytes. YOLOv5 is nearly 90 percent smaller than YOLOv4. This means YOLOv5 can be deployed to embedded devices much more easily.
Naming controversy aside, it's nice to have some model that can get close to the same accuracy at 10% of the size.
Naming it v5 was certainly ... bold ... though. If it can't outperform v4 in any scenario, is it really worthy of the name? (On the other hand, if v5 can beat v4 in inference time or accuracy, that should be highlighted somewhere.)
FWIW I doubt anyone who looks into this will think roboflow had anything to do with the current controversies. You just showed off what someone else made, which is both legit and helpful. It's not like you were the ones that named it v5.
On the other hand... visiting https://models.roboflow.ai/ does show YOLOv5 as "current SOTA", with some impressive-sounding results:
SIZE: YOLOv5 is about 88% smaller than YOLOv4 (27 MB vs 244 MB)
SPEED: YOLOv5 is about 180% faster than YOLOv4 (140 FPS vs 50 FPS)
ACCURACY: YOLOv5 is roughly as accurate as YOLOv4 on the same task (0.895 mAP vs 0.892 mAP)
Then it links to https://blog.roboflow.ai/yolov5-is-here/ but there doesn't seem to be any clear chart showing "here's v5 performance vs v4 performance under these conditions: x, y, z"
Out of curiosity, where did the "180% faster" and 0.895 mAP vs 0.892 mAP numbers come from? Is there some way to reproduce those measurements?
Crucially, we're tracking "out of the box" performance, i.e., if a developer grabbed X model and used it on a sample task, how could they expect it to perform? Further research and evaluation is recommended!
For size, we measured the sizes of our saved weights files for Darknet YOLOv4 versus the PyTorch YOLOv5 implementation.
For inference speed, we checked "out of the box" speed using a Colab Notebook equipped with a Tesla P100. We used the same task[1] for both - e.g. see the YOLOv5 Colab notebook[2]. For Darknet YOLOv4 inference speed, we translated the Darknet weights using the Ultralytics YOLOv3 repo (as we've seen many do for deployments)[3]. (To achieve top YOLOv4 inference speed, one should reconfigure Darknet carefully with OpenCV, CUDA, cuDNN, and carefully monitor batch size.)
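In case it's useful, here's roughly the shape of that kind of timing loop (a rough sketch, not the exact notebook code; load_model is just a placeholder for however you load the detector under test):

    import time
    import torch

    # Rough sketch of an "out of the box" FPS check on a GPU (e.g. a Colab P100).
    model = load_model().to('cuda').eval()            # load_model is a placeholder, not a real API
    img = torch.rand(1, 3, 640, 640, device='cuda')   # dummy input at the eval resolution

    with torch.no_grad():
        for _ in range(10):            # warm-up so CUDA init / cuDNN autotuning isn't timed
            model(img)
        torch.cuda.synchronize()       # flush queued kernels before starting the clock
        start = time.time()
        n = 100
        for _ in range(n):
            model(img)
        torch.cuda.synchronize()
        elapsed = time.time() - start

    print(f"{n / elapsed:.1f} FPS at batch size 1")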
For accuracy, we evaluated the task above with mAP after quick training (100 epochs) of the smallest YOLOv5s model against the full YOLOv4 model (using the recommended 2000*n training iterations, where n is the number of classes). Our example is a small custom dataset; these results should also be investigated on e.g. the 90-class COCO dataset.
This is why I have so much doubt. To claim it's better in any meaningful way you need to show it on the same framework, across varied datasets and input sizes, and you should be able to use it on your own detection problem and see some benefit over the previous version.
> SIZE: YOLOv5 is about 88% smaller than YOLOv4 (27 MB vs 244 MB)
Is that a benefit of Darknet vs PyTorch, of YOLOv4 vs YOLOv5, or did you win the NN lottery [1]?
> SPEED: YOLOv5 is about 180% faster than YOLOv4 (140 FPS vs 50 FPS)
Again, where does this improvement come from?
> ACCURACY: YOLOv5 is roughly as accurate as YOLOv4 on the same task (0.895 mAP vs 0.892 mAP)
A difference of 0.1% in accuracy can be huge; for example, the difference between 99.9% and 100% could require an insanely larger neural network. Even well below 99% accuracy, it seems clear to me that network size can still limit the accuracy you can reach.
For example, if you really don't care so much for accuracy, you can really squeeze the network down [2].
It's about time for Roboflow to pull this article. It seems highly unlikely that a 90% smaller model would provide similar accuracy, and the result seems to come from a single small custom dataset. Please make a real COCO comparison instead.
> It's about time for Roboflow to pull this article.
The article still adds value by suggesting how one would run the network and in general the site seems to be about collating different networks.
Perhaps a disclaimer could be good, reading something like: "the speed improvements mentioned in this article are currently being tested". As a publisher, when you print somebody else's words, unless quoted, they are said with your authority. The claims are very big and it doesn't feel like enough testing has been done yet to even verify that they hold true.
Very cool business model! How long have you been at it? I've been pushing for a while (unsuccessfully, so far) for the NIH to cultivate a team providing such a service to our many biomedical imaging labs. It seems pretty clear to me that this sort of AI hub model is going to win out in at least the medium term versus spending money on lots of small redundant AI teams each dedicated to a single project. What sort of application sectors have you found success with?
Nice, I really respect research coming out of NIH. (Happen to know Travis Hoppe?) Coincidentally, our notebook demo for YOLOv5 is on the blood cell count and detection dataset: https://public.roboflow.ai/object-detection/bccd
We've seen 1000+ different use cases. Some of the most popular are in agriculture (weeds vs crops), industrials / production (quality assurance), and OCR.
Do you know of any battery-powered drones that can pick out invasive plants? I've been looking for this to use on trails, but since the plant's sap is highly poisonous, drones seem to be the logical solution.
I somewhat agree on the naming issue. I don't think yolov5 is semantically very informative. But by the way, if you read the issues from a while back you'll see that AlexeyAB's fork basically scooped them, hence the version bump. Ultralytics probably would have called this Yolov4 otherwise. This repo has been in the works for a while.
For history, Ultralytics originally forked the core code from some other Pytorch implementation which was inference-only. Their claim to fame is that they were the first to get training to work in Pytorch. This took a while, probably because there is actually very little documentation for Yolov3 and there was confusion over what the loss function actually ought to be. The darknet repo is totally uncommented C with lots of single letter variable names. AlexeyAB is a Saint.
That said, should it be a totally new name? The changes are indeed relatively minor in terms of architecture, it's still yolo underneath (in fact I think the classification/regression head is pretty much unchanged). The v4 release was also quite contentious. Actually their previous models used to be called yolov3-spp-ultralytics.
Probably I would have gone with efficient-yolo or something similar. That's no worse than fast/faster rcnn.
I disagree on your second point though. Demanding a paper when the author says "we will later" is hardly a blow off. Publishing and writing takes time. The code is open source, the implementation is there. How many times does it happen the other way around? And before we knock Glenn for this, as far as I know, he's running a business, not a research group.
Disclosure: I've contributed (in minor ways) to both this repository and Alexey's darknet fork. I use both regularly for work and I would say I'm familiar enough with both codebases. I mostly ignore the benchmarks because performance on coco is meaningless for performance on custom data. I'm not affiliated with either group, in case it's not clear.
> But by the way, if you read the issues from a while back
> you'll see that AlexeyAB's fork basically scooped them,
> hence the version bump.
Yeah that sucks, but it does mean they should have done some proper comparison with YOLOv4.
> This took a while, probably because there is actually very
> little documentation for Yolov3 and there was confusion
> over what the loss function actually ought to be. The
> darknet repo is totally uncommented C with lots of single
> letter variable names. AlexeyAB is a Saint.
Maybe I'm alone, but I found it quite readable. You can quite reasonably understand the source in a day.
> The v4 release was also quite contentious.
Kind of, I am personally still evaluating this network fully.
> I disagree on your second point though. Demanding a paper
> when the author says "we will later" is hardly a blow off.
Check out the translation of "you can you up,no can no bb" (see other comments).
> And before we knock Glenn for this, as far as I know, he's
> running a business, not a research group.
I understand, but it seems very unethical to take the name of an open source framework and network that publishes its improvements in some form, bump the version number, and then claim it's faster without actually doing an apples to apples test. It would have seemed appropriate to contact the person who carried the torch after pjreddie stepped down from the project.
On the whole I agree about darknet being readable, it seemed well written and I've found it useful to grok how training libraries are written. I think they've moved to other backends now for the main computation though.
But.. it was still very much undocumented (and there were details missing from the paper). I think this almost certainly led to some slowdown in porting to other frameworks. And the fact it's written in C has probably limited how much people are willing to contribute to the project.
> Check out the translation of "you can you up,no can no bb" (see other comments).
That's from an 11 day old github account with no history, not Ultralytics as far as I know.
> Kind of, I am personally still evaluating this network fully.
Contention referring to the community response rather than the performance of the model itself.
Ah, I misspoke. I meant pjreddie. pjreddie kind of endorsed YOLOv4. Did he endorse YOLOv5?
Although YOLOv4 isn't anything new architecture-wise, it tried all the tricks in the book on the existing YOLO architecture to increase its speed and performance, and its method and experimental results were published as a paper; it provided value to humanity.
YOLOv5 seems to have taken the YOLO name mainly to boost the startup's name recognition without giving much back (they did provide a YOLOv3 PyTorch implementation, but that was before taking the YOLOv5 name). I wonder what pjreddie would think of YOLOv5.
> Someone asked it to not be called YOLOv5 and their response was just awful [1]
I don't see any response by them at all. Do you mean the comment by WDNMD0-0? I can't see any reason to believe they're connected to the company, have I missed something?
I've not heard that one before either. Is it a reference to the Dark Tower? ("[he] has forgotten the face of his father") or did Stephen King borrow it from somewhere else?
This has been a punchline in China for many years, and I doubt it comes from English literature. I guess the meaning is similar (last name ~= name of the father).
Edit: obviously I should have googled Dark Tower first lol.
Also a slight edit, I wrote name initially. Of course in the books it's "face of his father", but it still sounds similar [1]. To admit to forgetting the face of one's father is to be deeply shameful, to accuse someone of it is insinuating they should be ashamed of themselves.
> Edit: Although as yeldarb explains in a comment here[3],
> it's probably a bit more complicated than that.
Legally speaking I'm not sure anything wrong was really done here.
Morally speaking, it seems quite unethical. AlexeyAB has really been carrying the torch of the Darknet framework and the YOLO neural network for quite some time (with pjreddie effectively handing it over to him).
AlexeyAB has been providing support on pjreddie's abandoned repository (e.g. [1]) and actively working on improvements in a fork [2]. If you look at the contributors graphs, he really has been keeping the project alive [3] (vs Darknet by pjreddie [4]).
Probably the worst part in my opinion is that they have also seemingly bypassed the open source nature of the project. This is quite damning.
So, the question I have is whether AlexeyAB got some sort of endorsement from pjreddie, or if they just took over the name by nature of being the most active fork? If it's the latter, ultralytics' actions don't seem quite as bad (although they still feel kind of off-putting, especially with how some of the responses to calls for a name change were formulated).
I guess given the info I have now, to me it boils down to whether there's precedent for the next version of the name to be taken by whoever is doing the work on it? If the original author never endorsed AlexeyAB (I don't know one way or another), then perhaps AlexeyAB should have changed the name but referenced or paid homage to YOLO in some way?
Eh, this is all starting to feel a bit too close to youtube drama for my liking.
I welcome forward progress in the field, but something about this doesn't sit right with me. The authors have an unpublished/unreviewed set of results and they're already co-opting the YOLO name (without the original author) for it and all of this to promote a company? I guess this was inevitable when there's so much money in ML but it definitely feels against the spirit of the academic research community that they're building upon.
It exports the data in yolo format (e.g. it has coordinates in yolo's [0..1] range), so it's straightforward to spit it out to disk and start a yolo training run on it.
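For anyone who hasn't worked with that format before: each image gets a .txt file with one line per box, "class x_center y_center width height", all normalized to [0..1]. A quick sketch of the conversion from pixel boxes (the function and field names here are just illustrative):

    # Convert a pixel-space box to a YOLO/darknet label line:
    #   <class_id> <x_center> <y_center> <width> <height>, all normalized to [0..1].
    def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
        x_c = (x_min + x_max) / 2 / img_w
        y_c = (y_min + y_max) / 2 / img_h
        w = (x_max - x_min) / img_w
        h = (y_max - y_min) / img_h
        return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

    # e.g. a 200x100 px box starting at (50, 80) in a 640x480 image, class 0:
    print(to_yolo_line(0, 50, 80, 250, 180, 640, 480))
    # -> "0 0.234375 0.270833 0.312500 0.208333"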
I don't think we could've paid human labelers to create tags that thorough or accurate.
All the tags for all experiments can be grabbed via https://www.tagpls.com/tags.json, so over time we hope the site will become more and more valuable to the ML community.
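If you're scripting against it, something like this (plain Python, nothing tagpls-specific beyond the URL) fetches the dump once and caches it locally rather than re-downloading it on every run:

    import json
    import os
    import urllib.request

    # Download the full tag dump once and cache it; it's only a few MB, but egress is what costs money.
    CACHE = "tags.json"
    if not os.path.exists(CACHE):
        urllib.request.urlretrieve("https://www.tagpls.com/tags.json", CACHE)

    with open(CACHE) as f:
        tags = json.load(f)

    print(f"{len(tags)} top-level entries")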
tagpls went from 50 users to 2,096 in the past three weeks. The database size also went from 200KB a few weeks ago to 1MB a week ago and 2MB today. I don't know why it's becoming popular, but it seems to be.
I'm a bit worried about the bill. It's up to $50 and rising: https://imgur.com/ZgmXsWU almost entirely egress bandwidth. Be gentle with those `curl` statements. :)
(I think that's due to a poor architectural decision on my part, which is solvable, and not due to egress bandwidth via the API endpoint. But it's always fun to see a J curve in your bill... It's about $1 a day right now. https://imgur.com/4gUTLO7)
Can you set it up so that it's only available via cloud? I'm sure that would bother people, but is a better alternative to losing access or you going broke :)
We're motivated to keep this as open as possible. I really like the idea of an open dataset that continues to grow with time. If it keeps growing, then within a couple years it should have a vast quantity of tags on a variety of diverse datasets, which we hope might prove helpful.
Thanks! We've decided to license the data as CC-0. We'll add that to the footer.
We don't host any images directly – we merely serve a list of URLs (e.g. https://battle.shawwn.com/tfdne.txt). But any data served via the API endpoints is CC-0.
I need a dataset and tags for hair, face, neck, arms, left breast, right breast, nipple, torso. Any tips? I'm training a GAN, but I need to specifically segment the parts, as I don't want nipples in the middle of a face. I don't want to have to manually annotate 1,000 images
Those are also drawings/anime, not photos. We have an /r/pics experiment (SFW, 99 tags https://www.tagpls.com/exp?n=r-pics) and /r/gonewild (NSFW, 57 tags https://www.tagpls.com/exp?n=r-gonewild) but currently I haven't gathered enough urls to be very useful -- it only scrapes about 100 or so images every half hour. So there is a lack of tags right now on human photos. We also have a pps experiment (NSFW, exactly what you think it is, 306 tags https://www.tagpls.com/exp?n=pps) but I assume that's not quite what you were looking for.
I love that it's porn (and specifically furry/hentai) which pushes the limits of image recognition and creativity within computer vision. Between this and the de-censoring tool "DeepCreamPy", I can't look most data scientists in the face anymore.
that's a great name, turning jagged edges back to smooth and applying reverse Gaussian blur /s
on a serious note, it's kind of interesting to think about the authenticity/accuracy if it's just filled in... e.g. turning black and white pictures back to color: was it actually green or blue?
Yeah, I mean, the tagging is awesome, but I'm thinking I'll need more image segmentation than object recognition. With a segmentation map, I can make a great image->image translator.
It looks like an HN user on an EC2 server decided to fetch data from our firebase as quickly as possible, running up a $3,700 bill. Once (or if) that's sorted out, and once we verify tagpls can handle HN's load without charging thousands of dollars, we'll add an "about" page to tagpls and submit it.
The idea with the site is that you can tag your own datasets, and then get the data suitable for yolo training. We've done that ourselves to train an anime hand detector, and other users have reported similar successes. I could've been a bit clearer about that.
Has anyone (beyond maybe self-driving software) tried using object tagging as a way to start introducing physics into a scene? E.g. human and bicycle have same motion vector, increases likelihood that human is riding bicycle. Bicycle and human have size and weight ranges that could be used to plot trajectory. Bicycles riding in a straight line and trees both provide some cues as to the gravity vector in the scene. Etc. etc.
Seems like the camera motion is probably already solved with optical flow/photogrammetry stuff, but you might be able to use that to help scale the scene and start filtering your tagging based on geometric likelihood.
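Toy sketch of the kind of check I mean (the track format, thresholds, and function name are all made up for illustration): if a person track and a bicycle track share roughly the same motion vector, bump the prior on a "riding" relation.

    import numpy as np

    # Each track carries a label and a recent per-frame displacement (dx, dy) in pixels.
    # All names and thresholds here are illustrative, not from any real tracker.
    def likely_riding(person_track, bike_track, cos_thresh=0.9, speed_ratio=0.75):
        v_p = np.array(person_track["velocity"], dtype=float)
        v_b = np.array(bike_track["velocity"], dtype=float)
        n_p, n_b = np.linalg.norm(v_p), np.linalg.norm(v_b)
        if n_p < 1e-6 or n_b < 1e-6:
            return False  # one of them is stationary; no motion evidence either way
        cos_sim = v_p @ v_b / (n_p * n_b)         # moving in the same direction?
        ratio = min(n_p, n_b) / max(n_p, n_b)     # at a similar speed?
        return cos_sim > cos_thresh and ratio > speed_ratio

    person = {"label": "person", "velocity": (12.0, -1.0)}
    bike = {"label": "bicycle", "velocity": (11.5, -0.8)}
    print(likely_riding(person, bike))  # True: same direction, similar speed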
The idea of hierarchical reference frames (outlined a bit by Jeff Hawkins here https://www.youtube.com/watch?v=-EVqrDlAqYo&t=3025 ) seems pretty compelling to me for contextualizing scenes to gain comprehension. Particularly if you build a graph from those reference frames and situate models tuned to the type of object at the root of each frame (vertex). You could use that to help each model learn, too. So if a bike model projects a 'riding' edge towards the 'person' model, there wouldn't likely be much learning. e.g. [Person]-(rides)->[Bike] would have likely been encountered already.
However if the [Bike] projects the (rides) edge towards the [Capuchin] sitting in the seat, the [Capuchin] model might learn that capuchins can (ride) and furthermore they can (ride) a [Bike].
I've been turning over these same thoughts for years. I don't do much work in the neural network subfield, but I have done a lot with computer vision, and I always found myself wanting more robust physical estimation techniques that didn't require external data.
Yeah, I wish the flagship phone manufacturers would put the hardware back into the phone to take 3D photos... even better if you can get point cloud data to go with it. The applications right now are kind of cheesy, but they will get better, and if the majority of photos taken pivot to including depth information I think it could really drive better capabilities from our phones.
Eyes are very hard to make and coordinate, yet there are almost no cyclops in nature.
In theory you could also do this with visual-inertial odometry, e.g. monocular SLAM. But this is definitely something we're looking at in my group (I do CV for ecology), especially for object detection where geometry (absolute size) is a good way to distinguish between two confusing classes. A good candidate here is aerial imagery. If you've calibrated the camera and you know your altitude, then you know your ground sample distance (m/px).
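The arithmetic is just the pinhole model; as a rough worked example (the sensor/lens numbers below are typical 1"-sensor drone values, not any particular camera):

    # Ground sample distance for a nadir-pointing camera, simple pinhole model.
    sensor_width_mm = 13.2      # illustrative 1" sensor
    focal_length_mm = 8.8
    image_width_px = 5472
    altitude_m = 100.0

    gsd_m_per_px = (sensor_width_mm / 1000) * altitude_m / ((focal_length_mm / 1000) * image_width_px)
    print(f"{gsd_m_per_px * 100:.2f} cm/px")   # ~2.74 cm/px

    # So an object spanning 40 px is roughly 40 * gsd, i.e. about 1.1 m across --
    # often enough to separate two visually similar classes of very different size.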
Most flagships can do this though; any multicamera phone can get some kind of stereo. Google do it with the PDAF pixels for smart bokeh (they have some nice blog posts about it). I don't know if there is a way to do that in an API though (or to obtain the depth map).
I work mostly with RGB/Thermal, if that counts. My PhD was in stereo/lidar fusion, so I've always been into mixing sensors :)
I've also done some work on satellite imaging which is 13-band (Sentinel 2). Lots of people in ecology use the Parrot Sequoia which is four-band multispectral. There really isn't much published work in ML beyond RGB, which I find interesting - yes there's RGB-D and LIDAR but it's mostly for driving applications. Part of the reason I'm so familiar with the yolo codebases is that I've had to modify them a lot to work with non-standard data. There's nothing that stops you from using n-channel images, but you will almost certainly have to hack every off the shelf solution to make it work. RGB and 8-bit is almost always hard coded, augmentation also often fails with non RGB data (albumentations is good though). A bigger issue is there's a massive lack of good labelled datasets for non rgb imagery.
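For what it's worth, the model-side change is usually tiny; it's the hard-coded 3-channel/8-bit assumptions in loading and augmentation that fight you. A minimal PyTorch sketch of widening a 3-channel first conv to n channels (generic nn.Conv2d, not any particular yolo repo's layer names):

    import torch
    import torch.nn as nn

    def widen_first_conv(conv: nn.Conv2d, in_channels: int) -> nn.Conv2d:
        """Replace a 3-channel first conv with an n-channel one, copying the RGB
        filters and initializing the extra channels with their mean."""
        new_conv = nn.Conv2d(in_channels, conv.out_channels, conv.kernel_size,
                             stride=conv.stride, padding=conv.padding,
                             bias=conv.bias is not None)
        with torch.no_grad():
            new_conv.weight[:, :3] = conv.weight
            if in_channels > 3:
                extra = conv.weight.mean(dim=1, keepdim=True)
                new_conv.weight[:, 3:] = extra.repeat(1, in_channels - 3, 1, 1)
            if conv.bias is not None:
                new_conv.bias.copy_(conv.bias)
        return new_conv

    old = nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1)
    new = widen_first_conv(old, in_channels=5)        # e.g. RGB + thermal + NIR
    print(new(torch.rand(1, 5, 640, 640)).shape)      # torch.Size([1, 32, 640, 640])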
On the plus side, in a landscape where everyone is fighting over COCO, there is still a lot of low hanging fruit to pick I think.
I've not done any hyperspectral, very hard to (a) get labelled data (there's AVIRIS and EO-1/Hyperion maybe) (b) it's very hard to label, the images are enormous and (c) the cameras are stupid expensive.
By the way, even satellite imaging ML applications tend to overwhelmingly use just the RGB channels and not the full extent of the data.
Whoa that's awesome! Love hearing contemporary technology used to detect/diagnose/monitor the environment and our ecological impact. Boots on ground will always be important but the horizontal scaling you can get out of imaging I would imagine really helps prioritize where you turn your attention. Thanks for the info and best of luck!
There seems to be an unfair comparison between the various network architectures. The reported speed and accuracy improvements should be taken with a bit of scepticism for two reasons.
* This is the first YOLO released natively in PyTorch. PyTorch is among the fastest ML frameworks around, so some of YOLOv5's speed improvements may be attributable to the platform it was implemented on rather than to actual scientific advances. Previous YOLOs were implemented in Darknet, and EfficientDet is implemented in TensorFlow. It would be necessary to train them all on the same platform for a fair speed comparison.
* EfficientDet was trained on the 90-class COCO challenge (1), while YOLOv5 was trained on 80 classes (2).
Great points, and we're hoping Glenn releases a paper to complement the performance claims. We are also planning more rigorous benchmarking regardless.
re: PyTorch being a confounding factor for speed - we converted YOLOv4 to PyTorch to achieve 50 FPS. Darknet would likely top out around 10 FPS on the same hardware.
In February 2020, PJ Reddie noted he would discontinue research in computer vision.
He actually stopped working on it because of ethical concerns. I'm inspired that he made this principled choice despite being quite successful in this field.
Yeah, they made the most popular PyTorch implementation of YOLOv3 as well so they're not entering out of the blue, though. https://github.com/ultralytics/yolov3
The author of YOLOv3 quit working on Computer Vision due to ethical concerns. YOLOv4, which built on his work in v3, was released by different authors last month. I'd expect more YOLOvX's from different authors in the future. https://twitter.com/pjreddie/status/1230524770350817280
Latency is measured at batch=32 and then divided by 32? That means a full batch actually takes 500 milliseconds to process, which is not the latency you'd actually see for a single frame.
I have never seen a more fake comparison.
Why benchmark using 32-bit FP on a V100? That means it’s not using tensor cores, which is a shame since they were built for this purpose.
There’s no reason not to benchmark using FP16 here.
If you click around enough you'll see they benchmarked in 32-bit FP. Glad they have a mixed precision training option, but I really think it's a mistake in 2020 to do work related to efficient inference using 32-bit FP.
The problem is that your conclusions aren't independent of this choice. A different network might be far better in terms of accuracy/speed tradeoffs when evaluated at a lower precision. But there is no reason to use 32-bit precision for inference, so this is just a big mistake.
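For anyone wondering what the switch even looks like on the inference side in PyTorch, it's roughly this (load_detector is a placeholder for however you load the model; torch.cuda.amp is the other common route if you want to keep FP32 weights):

    import torch

    # Half-precision inference sketch: on a V100 this routes convs/matmuls through the
    # tensor cores rather than the FP32 CUDA cores.
    model = load_detector().to('cuda').eval().half()          # load_detector is a placeholder
    img = torch.rand(1, 3, 640, 640, device='cuda').half()

    with torch.no_grad():
        out = model(img)   # same forward pass, roughly half the memory, usually much higher throughput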
Yeah, it's pretty unethical. Looks like they just stole the name without any care. There doesn't seem to be any relationship between these guys and the original YOLO group.
If it's not trademarked, perhaps not much? I think it's pretty misleading, but the fight for attention is on! Using an established brand in your title will get more clicks.
I really like the work done by AlexeyAB on darknet YOLOv4 and by the original author Joseph Redmon on YOLOv3. These guys deserve a lot more respect than any other version of YOLO.
This is not the first time something has been fishy. Back in the early stages of the repo, they were advertising on the front page that they were achieving similar mAP to the original Darknet version, only for it to come out that they hadn't trained and tested it on the COCO dataset.
YOLO is the neural network; Darknet is the framework. Without both YOLOv4 and "YOLOv5" running on the same framework, it's nearly impossible to make any kind of meaningful comparison.
I am very interested in loading YOLO onto a Raspberry Pi + Coral.ai - does anyone know a good tutorial on how to get started? I tried before, and with Darknet it was not easy at all, but now with PyTorch there seem to be ways of loading that onto Coral. I am familiar with Raspberry Pi dev, but not so much with ML or TPUs, so I think it'd mostly be a tutorial on bridging the different technologies.
(might need to wait a couple of months since this was just released)
Hm, on this page it has something written in an eastern language under YOLO; https://github.com/ultralytics says Madrid, Spain, but then they say "Ultralytics is a U.S.-based particle physics and AI startup".
Just recently IBM announced with a loud PR move that the company is getting out of the face recognition business. Guess what? Wall Street doesn't want to keep subsidizing IBM's subpar face recognition technology when open source and Google solutions are pushing the state of the art.
Not something to brag about. Facial recognition has very few applications outside of total surveillance. We should not respect those who lend it their time and effort.
It's not exclusive. Bad actors are working on whatever they are paid to build, by other bad actors with less technical acumen and more money.
Edit: I should add that most of the actual progress is being made by smart people who think it's an interesting problem and are unaware or uncaring of the clear outcome of such tech.
Being able to distinguish between people is pretty foundational to being able to personalize AI applications. If you wanted to make a smart home actually smart and not just full of inconvenient remote controlled appliances, this is pretty necessary.
There are obviously privacy concerns with this example, it’d ideally be fully on-prem.
>Facial recognition has very few applications outside of total surveillance.
That's not really for you to decide, is it? You're absolutely free to have that opinion of course.
>We should not respect those who lend it their time and effort.
Also your choice of course. Facial recognition is essentially a light integration of powerful underlying technologies. Should 'we' ostracize those working on machine learning, computer vision, network and distributed computing, etc?
The question is always the same: is all technical/scientific progress desirable? But it seems that this question isn't asked anymore - "move fast and break things", am I right?
I'm much more worried about people using your arguments to try and shut down the discussion than about people trying to open the debate, because once the Pandora's box of mass surveillance and mass adoption of face recognition is open, there won't be any way to go back.
When I see Predator drones and FBI Stingray planes above every major US city during protests, I already know we're not going in the "let's talk about this before reaching the point of no return" direction.
Once the tech is out there it's simply a question of "when" it will be used for borderline illegal activities, especially in the US where you have these different entities (FBI, CIA, NSA, DEA, &c.) basically acting in their own bubble and doing whatever they want until it's leaked and/or gets outrageous enough to draw the public's attention.
I mean, there were unidentified armed forces marching in US streets last week; if people don't see this as the biggest red flag in recent US history, I don't know what they need.
You didn't really address the author's point, which was that there don't appear to be compelling uses of facial recognition technology beyond mass automated surveillance.
I can't think of other uses and I'd be interested if you can come up with some.
> 2) ensure candidate X is actually candidate X and not a paid person to take the exam in name of candidate X
Can you imagine the bureaucratic nightmare that would be unleashed upon yourself if "the system" decides you aren't who you say you are because of the way you aged, an injury, surgery or a few new freckles?
This already happens sometimes with birth certificates and identity theft, and it's awful for those who have to experience it. I'd hate to have a black box AI inflicting that upon others for inexplicable reasons.
Biometric authentication is one that comes to mind. Facial recognition running locally on my own photo library would also be useful for organizing photos. A cloud-free local-only home automation system that can tell the difference between owners/housemates/guests and customize behavior accordingly would also be nice.
I'm looking into YOLO for this, but it's more to verify that your selfie matches the image on the document, and we want to avoid sending highly sensitive information to third party providers.
The current service we use, while accurate, costs 50 cents per verification...
Edit: reading through this thread, if the model isn't super massive, we could offer on-browser verification! 27MB is still a hefty download though.
> Facial recognition is essentially a light integration of powerful underlying technologies. Should 'we' ostracize those working on machine learning, computer vision, network and distributed computing, etc?
Couldn't you argue the same way against just about any kind of IED or booby trap? Yet people tend to ostracize those who make them more than they do people who make ball bearings and nails.
> Someone asked for it not to be called YOLOv5 and their response was just awful [1]. They also blew off a request to publish a blog/paper detailing the network [2].
I filed a ticket to get to the bottom of this with the creators of YOLOv4: https://github.com/AlexeyAB/darknet/issues/5920
[1] https://github.com/ultralytics/yolov5/issues/2
[2] https://github.com/ultralytics/yolov5/issues/4