I'm just going to call this out as bullshit. This isn't YOLOv5. I doubt they even did a proper comparison between their model and YOLOv4.
Someone asked it to not be called YOLOv5 and their response was just awful [1]. They also blew off a request to publish a blog/paper detailing the network [2].
Hey all - OP here. We're not affiliated with Ultralytics or the other researchers. We're a startup that enables developers to use computer vision without being machine learning experts, and we support a wide array of open source model architectures for teams to try on their data: https://models.roboflow.ai
Beyond that, we're just fans. We're amazed by how quickly the field is moving and we did some benchmarks that we thought other people might find as exciting as we did. I don't want to take a side in the naming controversy. Our core focus is helping developers get data into any model, regardless of its name!
YOLOv5 seems to have one important advantage over v4, which your post helped highlight:
> Fourth, YOLOv5 is small. Specifically, a weights file for YOLOv5 is 27 megabytes. Our weights file for YOLOv4 (with Darknet architecture) is 244 megabytes. YOLOv5 is nearly 90 percent smaller than YOLOv4. This means YOLOv5 can be deployed to embedded devices much more easily.
Naming controversy aside, it's nice to have some model that can get close to the same accuracy at 10% of the size.
Naming it v5 was certainly ... bold ... though. If it can't outperform v4 in any scenario, is it really worthy of the name? (On the other hand, if v5 can beat v4 in inference time or accuracy, that should be highlighted somewhere.)
FWIW I doubt anyone who looks into this will think roboflow had anything to do with the current controversies. You just showed off what someone else made, which is both legit and helpful. It's not like you were the ones that named it v5.
On the other hand... visiting https://models.roboflow.ai/ does show YOLOv5 as "current SOTA", with some impressive-sounding results:
SIZE: YOLOv5 is about 88% smaller than YOLOv4 (27 MB vs 244 MB)
SPEED: YOLOv5 is about 180% faster than YOLOv4 (140 FPS vs 50 FPS)
ACCURACY: YOLOv5 is roughly as accurate as YOLOv4 on the same task (0.895 mAP vs 0.892 mAP)
Then it links to https://blog.roboflow.ai/yolov5-is-here/ but there doesn't seem to be any clear chart showing "here's v5 performance vs v4 performance under these conditions: x, y, z"
Out of curiosity, where did the "180% faster" and 0.895 mAP vs 0.892 mAP numbers come from? Is there some way to reproduce those measurements?
Crucially, we're tracking "out of the box" performance, i.e., if a developer grabbed model X and used it on a sample task, how could they expect it to perform? Further research and evaluation are recommended!
For size, we measured the sizes of our saved weights files for Darknet YOLOv4 versus the PyTorch YOLOv5 implementation.
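To make that concrete, here's roughly what the size check amounts to, as a sketch (the paths are placeholders; note that on-disk size also reflects serialization choices, e.g. FP16 vs FP32 and whether optimizer state is saved, so parameter count is worth checking alongside it):

    import os
    import torch

    def file_size_mb(path):
        return os.path.getsize(path) / 1e6

    print(f"YOLOv5s weights: {file_size_mb('yolov5s.pt'):.0f} MB")
    print(f"YOLOv4 weights:  {file_size_mb('yolov4.weights'):.0f} MB")

    # Parameter count is a fairer capacity comparison than file size alone.
    # Run from inside the yolov5 repo so the pickled model classes resolve;
    # the checkpoint is assumed to be a dict with a 'model' entry.
    ckpt = torch.load('yolov5s.pt', map_location='cpu')
    model = ckpt['model'] if isinstance(ckpt, dict) else ckpt
    print(f"YOLOv5s params: {sum(p.numel() for p in model.parameters()) / 1e6:.1f}M")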
For inference speed, we checked "out of the box" speed using a Colab notebook equipped with a Tesla P100. We used the same task[1] for both; see the YOLOv5 Colab notebook[2]. For Darknet YOLOv4 inference speed, we translated the Darknet weights using the Ultralytics YOLOv3 repo (as we've seen many do for deployments)[3]. (To achieve top YOLOv4 inference speed, one should carefully reconfigure Darknet with OpenCV, CUDA, and cuDNN, and monitor batch size.)
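If you want to reproduce an FPS number like this, the usual pitfalls are skipping GPU warm-up and forgetting that CUDA execution is asynchronous. A minimal timing sketch (the loader is a placeholder, not our exact notebook code):

    import time
    import torch

    device = torch.device('cuda')
    model = load_detector().to(device).eval()  # placeholder: YOLOv5s, or v4 weights via the Ultralytics repo
    x = torch.zeros(1, 3, 640, 640, device=device)  # one 640x640 frame, batch size 1

    with torch.no_grad():
        for _ in range(10):  # warm-up: the first calls pay CUDA init and allocation costs
            model(x)
        torch.cuda.synchronize()

        n = 100
        t0 = time.time()
        for _ in range(n):
            model(x)
        torch.cuda.synchronize()  # wait for queued GPU work before stopping the clock

    print(f"{n / (time.time() - t0):.1f} FPS")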
For accuracy, we evaluated mAP on the task above after quick training: 100 epochs for the smallest YOLOv5s model versus the full YOLOv4 model (trained for the recommended 2000*n iterations, where n is the number of classes). Our example is a small custom dataset; these results should also be investigated on a larger benchmark, e.g. the 80-class COCO.
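For reference, the two training runs follow each repo's standard README recipes, roughly like this (dataset files and configs are placeholders):

    # YOLOv5s, quick 100-epoch run (Ultralytics repo):
    python train.py --img 640 --batch 16 --epochs 100 \
        --data our_dataset.yaml --cfg models/yolov5s.yaml --weights ''

    # YOLOv4 (AlexeyAB's Darknet), after setting max_batches = 2000 * num_classes in the .cfg:
    ./darknet detector train data/obj.data cfg/yolov4-obj.cfg yolov4.conv.137 -map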
This is why I have so much doubt. To claim it's better in any meaningful way, you need to show it on the same framework, across varied datasets and input sizes, and users should be able to apply it to their own detection problems and see some benefit over the previous version.
> SIZE: YOLOv5 is about 88% smaller than YOLOv4 (27 MB vs 244 MB)
Is that a benefit of Darknet vs PyTorch, of YOLOv4 vs YOLOv5, or did you win the NN lottery [1]?
> SPEED: YOLOv5 is about 180% faster than YOLOv4 (140 FPS vs 50 FPS)
Again, where does this improvement come from?
> ACCURACY: YOLOv5 is roughly as accurate as YOLOv4 on the same task (0.895 mAP vs 0.892 mAP)
A difference of 0.1% in accuracy can be huge; for example, the difference between 99.9% and 100% could require an insanely larger neural network. And even well below 99% accuracy, it seems clear to me that network size can still impose real limits on accuracy.
For example, if you really don't care so much for accuracy, you can really squeeze the network down [2].
It's about time for Roboflow to pull this article. It seems highly unlikely that a 90% smaller model would provide similar accuracy, and the result seems to come from a single small custom dataset. Please do a real COCO comparison instead.
> It's about time for Roboflow to pull this article.
The article still adds value by suggesting how one would run the network and in general the site seems to be about collating different networks.
Perhaps a disclaimer could be good, reading something like: "the speed improvements mentioned in this article are currently being tested". As a publisher, when you print somebody else's words, unless quoted, they are said with your authority. The claims are very big and it doesn't feel like enough testing has been done yet to even verify that they hold true.
Very cool business model! How long have you been at it? I've been pushing for a while (unsuccessfully, so far) for the NIH to cultivate a team providing such a service to our many biomedical imaging labs. It seems pretty clear to me that this sort of AI hub model is going to win out in at least the medium term versus spending money on lots of small redundant AI teams each dedicated to a single project. What sort of application sectors have you found success with?
Nice, I really respect research coming out of NIH. (Happen to know Travis Hoppe?) Coincidentally, our notebook demo for YOLOv5 is on the blood cell count and detection dataset: https://public.roboflow.ai/object-detection/bccd
We've seen 1000+ different use cases. Some of the most popular are in agriculture (weeds vs crops), industrials / production (quality assurance), and OCR.
Do you know of any battery-powered drones that can pick out invasive plants? I've been looking for something like this to use on trails, but since the plant's sap is highly poisonous, drones seem to be the logical solution.
I somewhat agree on the naming issue. I don't think yolov5 is semantically very informative. But by the way, if you read the issues from a while back you'll see that AlexeyAB's fork basically scooped them, hence the version bump. Ultralytics probably would have called this Yolov4 otherwise. This repo has been in the works for a while.
For history, Ultralytics originally forked the core code from some other Pytorch implementation which was inference-only. Their claim to fame is that they were the first to get training to work in Pytorch. This took a while, probably because there is actually very little documentation for Yolov3 and there was confusion over what the loss function actually ought to be. The darknet repo is totally uncommented C with lots of single letter variable names. AlexeyAB is a Saint.
That said, should it be a totally new name? The changes are indeed relatively minor in terms of architecture; it's still yolo underneath (in fact I think the classification/regression head is pretty much unchanged). The v4 release was also quite contentious. Actually, their previous models used to be called yolov3-spp-ultralytics.
Probably I would have gone with efficient-yolo or something similar. That's no worse than fast/faster rcnn.
I disagree on your second point though. Demanding a paper when the author says "we will later" is hardly a blow off. Publishing and writing takes time. The code is open source, the implementation is there. How many times does it happen the other way around? And before we knock Glenn for this, as far as I know, he's running a business, not a research group.
Disclosure: I've contributed (in minor ways) to both this repository and Alexey's darknet fork. I use both regularly for work and I would say I'm familiar enough with both codebases. I mostly ignore the benchmarks because performance on coco is meaningless for performance on custom data. I'm not affiliated with either group, in case it's not clear.
> But by the way, if you read the issues from a while back
> you'll see that AlexeyAB's fork basically scooped them,
> hence the version bump.
Yeah that sucks, but it does mean they should have done some proper comparison with YOLOv4.
> This took a while, probably because there is actually very
> little documentation for Yolov3 and there was confusion
> over what the loss function actually ought to be. The
> darknet repo is totally uncommented C with lots of single
> letter variable names. AlexeyAB is a Saint.
Maybe I'm alone, but I found it quite readable. You can quite reasonably understand the source in a day.
> The v4 release was also quite contentious.
Kind of, I am personally still evaluating this network fully.
> I disagree on your second point though. Demanding a paper
> when the author says "we will later" is hardly a blow off.
Check out the translation of "you can you up, no can no bb" (see other comments).
> And before we knock Glenn for this, as far as I know, he's
> running a business, not a research group.
I understand, but it seems very unethical to take the name of an open source framework and network that publishes its improvements in some form, bump the version number, and then claim it's faster without actually doing an apples-to-apples test. It would have seemed appropriate to contact the person who carried the torch after pjreddie stepped down from the project.
On the whole I agree about darknet being readable; it seemed well written, and I've found it useful for grokking how training libraries are written. I think they've moved to other backends for the main computation now, though.
But... it was still very much undocumented (and there were details missing from the paper). I think this almost certainly slowed down porting it to other frameworks. And the fact that it's written in C has probably limited how willing people are to contribute to the project.
> Check out the translation of "you can you up, no can no bb" (see other comments).
That's from an 11-day-old GitHub account with no history, not Ultralytics as far as I know.
> Kind of, I am personally still evaluating this network fully.
Contention referring to the community response rather than the performance of the model itself.
Ah, I misspoke. I meant pjreddie. pjreddie kind of endorsed YOLOv4. Did he endorse YOLOv5?
Although YOLOv4 isn't anything new architecture-wise, it tried all the tricks in the book to squeeze more speed out of the existing YOLO architecture, and its method and experimental results were published as a paper; it provided value to humanity.
YOLOv5 seems to have taken the YOLO name only to boost the startup's name recognition, without giving much back (it did provide a YOLOv3 PyTorch implementation, but that was before taking the YOLOv5 name). I wonder what pjreddie would think of YOLOv5.
> Someone asked it to not be called YOLOv5 and their response was just awful [1]
I don't see any response by them at all. Do you mean the comment by WDNMD0-0? I can't see any reason to believe they're connected to the company, have I missed something?
I've not heard that one before either. Is it a reference to the Dark Tower ("[he] has forgotten the face of his father"), or did Stephen King borrow it from somewhere else?
This has been a punchline in China for many years, and I doubt it comes from English literature. I guess the meaning is similar (one's family name ~= the name of the father).
Edit: obviously I should have googled Dark Tower first lol.
Also a slight edit: I initially wrote "name". Of course in the books it's "face of his father", but it still sounds similar [1]. To admit to forgetting the face of one's father is deeply shameful; to accuse someone of it is to insinuate they should be ashamed of themselves.
> Edit: Although as yeldarb explains in a comment here[3],
> it's probably a bit more complicated than that.
Legally speaking I'm not sure anything wrong was really done here.
Morally speaking, it seems quite unethical. AlexeyAB has really been carrying the torch of the Darknet framework and the YOLO neural network for quite some time (with pjreddie effectively handing it over to him).
AlexeyAB has been providing support on pjreddie's abandoned repository (e.g. [1]) and actively working on improvements in a fork [2]. If you look at the contributors graphs, he really has been keeping the project alive [3] (vs Darknet by pjreddie [4]).
Probably the worst part, in my opinion, is that they have also seemingly bypassed the open source nature of the project. This is quite damning.
So, the question I have is whether AlexeyAB got some sort of endorsement from pjreddie, or whether he just took over the name by virtue of being the most active fork. If it's the latter, ultralytics' actions don't seem quite as bad (although they still feel kind of off-putting, especially given how some of the responses to calls for a name change were formulated).
I guess, given the info I have now, it boils down to whether there's precedent for the next version of the name being taken by whoever is doing the work. If the original author never endorsed AlexeyAB (I don't know one way or another), then perhaps AlexeyAB should have changed the name but referenced or paid homage to YOLO in some way?
Eh, this is all starting to feel a bit too close to youtube drama for my liking.
> Someone asked it to not be called YOLOv5 and their response was just awful [1]. They also blew off a request to publish a blog/paper detailing the network [2].
I filed a ticket to get to the bottom of this with the creators of YOLOv4: https://github.com/AlexeyAB/darknet/issues/5920
[1] https://github.com/ultralytics/yolov5/issues/2
[2] https://github.com/ultralytics/yolov5/issues/4