Tiny quadrotor learns to fly in 18 seconds (ieee.org)
172 points by Brajeshwar on Feb 9, 2024 | 136 comments


This is interesting, because it seems to be a start at solving the Fulmar Problem.

A fulmar is a cliff nesting sea bird (with a defensive habit of noxious projectile vomiting). They spend their early life on a ledge, but one day they have to start flying. And if you are a fulmar you have a very limited time to learn to fly.

So the question is: how does a baby fulmar learn to fly in 10 seconds?

It would be interesting to know how much computing power is required for training (compared to the power required to run the controlling NN).

My own view is that the network architecture is important, so fulmar brains have evolved with a neural architecture that enables extremely quick learning to stable flight.

I played around with a few ideas on using GAs to evolve NN architectures for rapid learning during my PhD 25 years ago, but ended up going in another direction.


What's the problem? It's inherent to the hardware. We don't have to learn how to breathe. Animals don't have to learn to walk. It's the same thing. Are you suggesting that behavior can't be hardwired? I don't think anyone believes that.

What am I missing? Genuinely confused.

Edit: After googling, I can't find any "Fulmar problem". This is just basic evolution 101.


> Animals don't have to learn to walk.

Have you seen baby deer?


A baby deer is standing within minutes of being born and walking later that day. I don’t think humans are in any position to insult a baby deer’s ability to walk. A human takes about a year and is also terrible at it when starting out.


Shouldn't humans have a far longer gestation period (we're born 'early' because otherwise the head wouldn't fit through the birth canal), and maybe that's the real issue here?


It seems there are two different classifications for species when it comes to how mobile they are near birth and how much care the newborns need.

Deer are precocial, while humans are altricial.

https://en.m.wikipedia.org/wiki/Precociality_and_altricialit...


Yet without any external inputs, the deer will eventually learn to walk by itself, suggesting some kind of internal hardwiring.


I guess I don’t understand what you mean “with no external inputs”. It’s in the physical world, with gravity. Isn’t that an external input? Driving the feedback loop of “I fall on my face if I do it wrong”?


That falling on the face is irrelevant, as it's just a byproduct of some pre-determined programming that is not learned. Surely if this were a learned experience, the deer, after falling on its face, would quickly learn it's best not to try to stand up. Why am I standing when I just keep hitting my face on the ground? The reality is that the drive programmed into the brain is to stand up at all costs, even if that means landing flat on your face once in a while.


I guess in my mental model, the deer continues to try and stand because it wants things like food and water, and it sees other deer standing and walking around. Psychologists do talk about a state of "learned helplessness" if a living creature fails too often.

I don’t know anything about the state of research here, this is just what I always assumed. I’m sure there are built in drives and reflexes, etc. It seems perfectly reasonable that they could be "higher level" than the ones I assumed (food, water)


I think it's more a "if I fall I'm gonna get eaten/injured and not be able to reproduce" kind of thing


Yeah, they "learn" to walk pretty quickly. Far too fast for it to be purely learning.


Animals that are ready to go shortly after birth are called precocial. This includes chickens and most of the equines.

If you've ever seen a newborn foal stand up for the first time, it's clear that's a built-in behavior. Standing up on those long spindly legs usually works the first time. So does walking, and within hours, running.

Lying down, however, is not built-in. I've seen a foal try to get back down on the ground, which is clearly trial and error, often ending in a fall. There's no evolutionary pressure to have that work right the first time.


Yeah but deer are legitimately bad at walking.


"Are you suggesting that behavior can't be hardwired?"

Like others mentioned, there is a difference between species that learn it more easily and those that learn more slowly.

https://en.m.wikipedia.org/wiki/Precociality_and_altricialit...

But some sort of learning (or calibration) is always required, because no two individuals are the same, so you cannot have everything pre-hardwired. The basic movement of the muscles to fly, yes, but the exact movement and coordination needed to fly require information about the specific mass and wing lengths etc., which are going to differ.

In short, "the hardware" is organic and not standardized.


A paper airplane 'learns' to fly on a breeze simply by being folded. I posit that a fulmar is closer to a paper airplane than to a drone.


If that were true lots of young fulmars would drown in the sea at the bottom of a cliff when they fledge. That doesn't happen. They fly, with control, and enough understanding to land safely on water.


Human babies are born knowing how to swim, I suppose they learn that in 3 seconds as well? Both of these are far more likely to just be genetic memory. If you can encode how to fold proteins to make wings, then packaging a control algorithm along is completely trivial in comparison.


It's a false belief. Throw a baby in water and it won't survive.


Are you not aware of the throwing babies in pools fad that comes and goes in popularity? They do better than you'd think.


The swimming isn’t what you think. It’s also called the diving reflex for good reason: https://en.wikipedia.org/wiki/Infant_swimming#/media/File:Ba...

As a baby I fell into a pool and needed to be rescued. I didn't inhale water, but, as generally happens, I very much did just sink to the bottom. Babies can be taught to float at around 5 months, but it's not reflexive.


Infants do have a "diving reflex" but that is very different from knowing how to swim. If you throw a baby in a pool, the baby will drown and die.

There are undoubtedly similar reflexes that assist fulmars in learning to fly quickly, but that doesn't mean that rapid learning isn't also required.


On a very special episode of MythBusters…


I mean, they kind of have to, when they're part of a species curious enough that it'll hear 'babies can swim' and throw babies into a pool en masse to see if it's true.


I'll test that out and get back to you /s

I can't say what the extent of those swimming reflexes really is, but water births are not unheard of. Besides, there are countless examples of this sort of thing: most prey animals can walk or even run mere moments after being born; cetaceans and ram-ventilating sharks all know how to swim immediately or they would literally drown; insects can fly as soon as they hatch; etc.


All good examples, but humans are relatively underdeveloped at birth compared to most mammals, due to the necessity of limiting skull size to allow an easier birth. An infant cannot lift its own head or locomote for several months, for instance. Just like marsupials finish up outside the womb in a pouch, we have our own external phase.


How does a baby deer learn to walk within a few hours of birth? It needs to control four limbs to keep its balance. Animals tend to have some forms of behavior that they learn in a few tries or even on the very first try. But at the same time they may be very bad at learning other kinds of behavior that seem no more difficult to us.

Homo sapiens is special in this regard, with almost all of its firmware broken. Homo sapiens needs to learn, with effort, how to focus its eyes, hold up its head, walk, and even crawl. It seems our ability to learn is linked to the brokenness of our innate software; probably it is a double-headed causal arrow: our software is broken because we could overcome it with our general ability to learn, and our general ability to learn evolved because without it, and without working genetic programs for specific forms of behavior, we would be doomed.

I can't help but think that autism is the next step in breaking innate software and developing more robust general learning mechanisms.


Practically speaking humans are born relatively underdeveloped and spend much of their first year catching up to where other animals start. If we spent more time baking we would have trouble getting out of the birth canal because we have such large heads (we already have enough trouble as is). It's easy to conjecture how that may have come to be. Marsupials are an interesting comparison with similar but different traits.

For what it's worth there are also countless animals which start out very helpless, including many birds which are wholly incapable of flight for some time, and for which learning is not always an easy process. The bird in question is closer to an exception than a rule.


> The bird in question is closer to an exception than a rule.

It doesn't flop out of the nest immediately after leaving the egg, though, right? This is more like a 3 year old child learning to ride a bike on the first go (still impressive) than a newborn infant doing so.


> if we spent more time baking we would have trouble getting out of the birth canal because we have such large heads

Human birth seems strangely difficult. If you’ve never witnessed a live human birth it’s very traumatic with lots and lots of things that can/do go wrong. I can’t think of another mammal that struggles as hard as humans do during birth. It makes me wonder how humans have survived as long as we have.


Well, historically, we didn't. Infant mortality is continuously at record lows. Modern medicine means that we can handle many things that can/do go wrong.

We're effectively propagating that difficulty and it compounds generationally (imagine your bloodline has unusually large heads...) but we're smart enough to circumvent fate and the net result would appear to be positive.

Much of what humans do approaches the limit of cutting off the nose to spite the face. Is life better than it has ever been, practically speaking, everywhere? With some exception granted to the last couple of years, yes, and even without, probably yes.

I find it rather disturbing to think about. Diversification is really our only long term hope (hedging so to speak) and in many ways we're constantly moving away from that. If we were suddenly space faring colonizers, or if there were another dark age, then that would cease to be true, for better or for worse.


Our heads are big to fit our brains. Our pelvis is shaped and structured for an upright gait, and so is not big. The struggle is an indirect result of the traits that drive our success.


Are our heads actually unusually big, for a mammal scaled to our size? I've often wondered about this.

Especially for a mammal with front-facing eyes.


https://en.wikipedia.org/wiki/Brain–body_mass_ratio

It's interesting to note our peers in this regard. Mice have a two-piece pelvis, and dolphins have no pelvis at all.


OMG of course there's a wikipedia page for it!

Thank you :-)


"If you’ve never witnessed a live human birth it’s very traumatic with lots and lots of things that can/do go wrong."

I think that is a specific cultural view, to assume it is traumatic. I certainly did not think so.

And human birth being hard has its evolutionary roots in going bipedal and walking upright, as far as I know. Being upright constricts the pelvis; being on all fours opens it. Apes that walk on four legs do not struggle (as hard).


> I certainly did not think so.

Were you the one giving birth?


No, but it was implied that witnessing alone is traumatic. (And I did witness other fathers-to-be who couldn't handle it and required medical attention themselves.)

And certainly quite a few women experience it as traumatic, but not all of them.


I don't think it was necessarily implied, actually. Depends how you parse that sentence. For example it goes on to say "and lots of things can go wrong". But that is very much about the birth, not the observation.

Either way, I am not sure that watching a couple of births qualifies anyone to say whether birth being "traumatic" is a question of cultural assumption :-)


"Either way, I am not sure that watching a couple of births qualifies anyone to say whether birth being "traumatic" is a question of cultural assumption"

No, but talking with women who say, convincingly, that they did not perceive it as traumatic is enough for me to conclude that birth is not traumatic in itself. It can be, but I suspect that is culturally reinforced, and that seems to be changing slowly.


I don't think the firmware is broken. A baby deer doesn't have to learn anything either; it is just starting up its muscles, which is a hardware problem. The same goes for baby humans: you have to get your hardware, i.e. your muscles, prepared to hold the rest of the body up.

Of course there is the bit where control of the muscles also has to align, so there is a building up of neural pathways, which can also be seen as getting the hardware wired up. Getting it wired up takes longer in humans.


I remember something about humans being born underdeveloped compared to other mammals because our pelvis is smaller to allow standing upright. As a side effect, some brain development happens after birth.

This allows for more social/environmental impact on development than pure genetics, and we have developed better language and understanding.


There is no "because", just speculation.

There is no physical limit preventing us from having both... and we don't have data points for the time it takes to evolve each trait.

I hate it when biologists talk as if they know it all.


The paper plane has a lot less to learn: no extending the wings and then flapping them to avoid death.


I would imagine it activates some motor neuron ganglia that are somehow connected to a reflexive flapping action, and that sometime afterwards fine motor control is gained through practice, akin to the plantarflexor muscles in a walking stride.


Why can't evolution build in the ability to fly, so that what the bird needs to learn in those 10 seconds is not how to fly but which way to go?


Instinct and coordinated muscle memory are two very different things. Animals can evolve to develop that muscle memory faster (i.e. see how long it takes a calf to walk versus a human infant) but it still needs time to develop and that development requires active practice.


A spider does not learn to make a web either. This is arguably more complicated than flying. And its brain is much smaller than a bird's.


A spider spinning a web is an emergent behavior built up from a bunch of simpler ones, like excretion, simple movement, and electrostatic hopping from point to point. A bird flying requires the coordination of a much larger number of muscles at the same time, which is much more complex from a nervous-system perspective.

Spiders reproduce much faster and have a much smaller survival rate so they're well selected for that kind of instinct. Birds less so.


Excretion and simple movement from point to point, accurately describes most of my life.


I mean. I believe spiders eat lots of much smaller, flying creatures?


It can, obviously.


Well, but then the Fulmar "problem" doesn't seem like much of a problem to me. It doesn't learn. It's born with the knowledge, which was in turn "learned" (indirectly, through natural selection) by the fulmar's ancestors.


Now you've got a different problem! How are things born with knowledge? Where does that knowledge come from? Are humans?


I actually wonder how animals like sea turtles or fish know what to eat... They are not raised or taught by their parents, so is it just trial and error? What stops them from eating poisonous stuff?


I can offer an example with birds. It seems toxic butterflies just make birds feel terrible, so the birds actually learn by trying different butterflies.


Heh - that sounds like a really bad deal for the butterfly ?


Warning colors work well. Birds don't want to eat a distasteful butterfly again, so even just looking distasteful can enhance a butterfly's fitness.

At least it's a good deal as long as all the butterflies have the same coloring...


Your epigenome (the sum of the chemical modifications layered onto your DNA) affects how your genes are expressed over time, and some of these changes are even carried over to your offspring.

Behaviours can absolutely be learned and then hardwired, given thousands of years.

As someone mentioned above, a deer is not born with random weights in its NN.


How do babies know how to breathe, or drink milk? We're obviously born with some instinctual "knowledge".


I suspect birds don't start with random weights in their NN


Why does it have to be learning? Because we're currently putting all our eggs in the basket of machine learning, so we must convince ourselves that everything that intelligent animals can do is some sort of learning from data?

There is no reason to assume any of that.


Don't most animals innately know how to locomote in their environment? Fish in water, baby deer etc?


No one wants to admit that DNA holds programming because it'll mean there's a Creator.

I have raised animals apart from their kind from birth. They know instinctively how to do things that cannot be taught or learned. How can two spiders, for example, know how to spin a web without ever learning from another being?


I don't see how genetics allowing for instinctual/"inherent" knowledge requires a creator.

Once upon a time there was a bug that had a little sticky bit on its butt. It had a billion descendants, when one of them randomly had a gene that affected its behavior such that it, I don't know, rested on high surfaces using the sticky bit and was safer from predators or something, and then had slightly more kids than the rest. And one of those kids randomly had a gene that gave it more sticky stuff, and one of its kids used it a little more effectively, and so on and so forth. And that probably takes somewhere from tens of thousands to a few million years.

That's not meant to be literal, but that's the story in my head, and it doesn't feel like a huge leap to get to complex behaviors like webs.


That's the story. Genetic lottery propagates and amplifies (life-)winning traits forward, but who knows.


Name an example of an advantage spontaneously manifesting outside of directed breeding by humans.


I really don't follow. Define programming in your context? Is it distinct from encoding physical structures like wings? Is it meaningful to separate physical structures and structures invoking physical behavior?

A column of rock erodes, and a piece falls. The piece did not learn to fall. It just _did_, subject to the rules that governed the process of its creation (different meaning) and its environment.


Specifically, birds know how to fly or swim without being taught. Having raised birds that had never seen or known another bird, they just "do". They don't practice; it's like they are just waiting. Same with language: they can understand their own kind. When introduced to their own kind they aren't afraid; it's like a reunion.


I'd expect aerodynamic stability from wings to have a huge effect. Drones are inherently unstable. A glider is easier to figure out than a helicopter.

On the other hand, I could see a drone being easier for a computer because it behaves the same in any orientation (projectile motion + directional thrust). Unlike birds with their ears and their eyes, computers don't really have an inherent sense of direction.


Based on a lot of bird nesting videos, they most likely practice in the nest ahead of time. One doesn't have to outright fly to practice generating lift.


They don’t all appear to need this practice in order to be able to fly.

https://royalsocietypublishing.org/doi/10.1098/rspb.2020.066... is a recentish overview of existing literature and touches upon this in the section Wing flapping before flight.


It can fly in the same way a newborn foal knows how to walk and even run if it has to.


I find your analogy mentally stimulating, I bet the bird wins by a country mile though.


Any publications in the area? GAs are a super interesting case for this.


I think the title is a bit misleading: it's thousands or millions of robots that try to learn to fly (in simulation), and only one of them gets really good at it and is selected, then flashed onto the real quadcopter.

It is kind of impressive that it works that well. But it also shows how early-stage machine learning still is. Theoretically, a robot could learn to fly the same way a human does, without crashing millions of times first. And the robot could learn much faster than a human does.
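As a toy illustration of that train-in-simulation-then-select loop (everything here is made up for illustration: a 1-D hover "simulator" and random search standing in for the real physics engine and learning algorithm):

```python
import random

def simulate(policy, steps=200, dt=0.02):
    """Toy 1-D hover 'simulator': altitude error under a linear policy
    (kp, kd). Gravity is assumed cancelled by a hover feed-forward term,
    so acceleration is just the commanded differential thrust. Returns
    the total squared altitude error (lower is better)."""
    kp, kd = policy
    z, vz = 1.0, 0.0            # start 1 m away from the setpoint z = 0
    cost = 0.0
    for _ in range(steps):
        az = -kp * z - kd * vz  # the "policy" is just two gains here
        vz += az * dt
        z += vz * dt
        cost += z * z
    return cost

def train_in_sim(candidates=1000, seed=0):
    """Evaluate many random policies in simulation and keep the single
    best one: the 'thousands try, one is selected and flashed' pattern
    described above, with random search standing in for the learner."""
    rng = random.Random(seed)
    best_policy, best_cost = None, float("inf")
    for _ in range(candidates):
        policy = (rng.uniform(0.0, 20.0), rng.uniform(0.0, 20.0))
        cost = simulate(policy)
        if cost < best_cost:
            best_policy, best_cost = policy, cost
    return best_policy, best_cost
```

The thousands of "crashes" are free because they happen in the simulator; only the winning parameters ever touch hardware.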


No human has ever, not even once, learned to perform the kind of flying that this neural network learned. Because the kind of flying involved here involves subtly - and simultaneously - tweaking the speed of four independent motors by infinitesimal amounts hundreds or more times per second in order to maintain stability. Get the speed of just one motor wrong for even a fraction of a second and you might end up overcorrecting, losing control, and suddenly flipping upside down. No human pilot in history has ever had to perform such a feat. There are certainly flying machines that humans do learn to pilot fairly manually, but a quadrotor is not one of them.


Absolutely true, I guess we would need an experiment like that with a regular model air plane that also a human can learn to fly. I think the learning process for the AI would still be rather poor compared to a human.

But also in other machine learning experiments the AIs need way more practice than humans. They just can do repetitions much faster, and clone themselves.


I'm not sure why you believe that humans are better at anything.

https://spectrum.ieee.org/ai-drone-racing


I've wondered why we haven't seen a lightweight model on quads for PID tuning. The Betaflight firmware is great, but PID tuning is such a pain in the ass if you want it tuned well. How about an AI that adjusts the PID gains on the fly?


Why would you use PID controllers if you can use NN directly?

Insisting on PID rigidifies part of your control setup. It forces the NN to partially not learn the optimal control.

It's a bit like transformers for LLMs. If we could train raw NN in reasonable time it would outperform transformers. We are currently using transformers just because it makes learning of large models feasible not because it's somehow the best architecture for predicting language. The same way PID is something that sort of works if you lack computing power to create better control schemes.
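For reference, the vanilla PID loop being compared against is only a few lines. A minimal discrete-time sketch (gains illustrative, not tuned for any real quad):

```python
class PID:
    """Minimal discrete-time PID controller: the baseline under
    discussion. Gains are illustrative, not tuned for any real quad."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        # Avoid a derivative spike on the very first sample
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In a flight controller one of these typically runs per axis at a fixed loop rate, which is exactly the structure an NN controller would replace.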


Arducopter has auto-tune (no ML, though). BF flies great out of the box, and there are presets for the most popular types/sizes. Oh, and I don't think they have devs on their team who are experienced in ML to the point of writing that code. I think the way to go would be to first record the response to pre-defined control inputs in level mode, then tune offline from blackbox data. But with BF quads mostly lacking GPS, you would have to pick a large empty field to be able to run this without hitting anyone/anything.


Technically, PID only makes sense for linear time-invariant systems. A quadrotor drone is inherently a nonlinear system but people use linear controllers on it anyways.

I mean, if we're going to use "AI" in controls, it had better be tuning something else; better still, we should only utilize the "optimization algorithms" part of "AI", hence optimal control.


Incorrect: most serious "PID controllers" for quadrotors are geometric PID controllers that take nonlinearity into account, e.g. [1].

[1] Taeyoung Lee, Melvin Leok, N. Harris McClamroch. Geometric tracking control of a quadrotor UAV on SE(3). https://ieeexplore.ieee.org/abstract/document/5717652


Well, geometric PID controllers are not the vanilla PID controllers for LTI systems that I was talking about...

Also, and correct me if I'm wrong, I think in industry these vanilla PD/PID controllers are used for attitude and position control of quad-copters. Like, I don't think PX4 or BetaFlight implement any geometric controllers in their code bases.



Apart from this project, does anyone know any other DIY quadrotor building instructions with cheap off-the-shelf materials?


The quad these guys are using is a Crazyflie. Small, mostly prebuilt, targeted at researchers.

It has a few "addons" like optical flow sensors and external position trackers that let you extract ground truth attitude/position, which is helpful for research like this.

Not really "off-the-shelf," but it's a good thing to get if you care about the controls/learning/new-algorithms stuff more than sourcing parts, getting an RC receiver/TX, sizing motors, that kind of thing.

https://www.bitcraze.io/products/crazyflie-2-1/


Interesting how they're using a PCB as the frame.


I got a tinywhoop kit for my nephew and my sister hated it, so it's probably good.

There is an entire ecosystem around those, so you can go piecemeal if you want.


+1 TinyWhoop

It's a company, but also the name of that whole style of drones. Anything in the 65, 75, or 85 mm range can whoop, maybe more, idk.

I'd start with a 75mm brushless if I were to do things over again.


You have a link for a model you’d recommend?


Sorry, not really. I love my Beta75X2S, but it's almost a decade old at this point, and it's a fast moving field.

I do have a source I trust, and here's his guide for 2023 (apparently it's someone he endorses):

https://www.youtube.com/watch?v=lgeeR8TiuP0

https://www.fpvknowitall.com/ultimate-fpv-shopping-list/

Looks like Mobula6 or Mobula7 is still a solid choice.

Also, you'll need a controller and FPV goggles. I would suggest picking up something nice-ish. They're very general and can be used with many different vehicles/projects. I got cheap goggles and upgraded immediately.


I’d be interested in a kit. Containing everything.

Any reason DJI products don’t make your list? Would have thought they are the market leaders, no?

I’m just looking to use it for fun. Occasional use. No vlogging or racing or anything like that. Thing is, with the many AliExpress drones on the market, I can’t tell what is trash and what will actually be fun to use.

Thanks.


Should probably have asked you to clarify your intent...

"I want to take aerial photos/videos": Cinematic drone. Go with DJI stuff, probably a Mavic.

"I want to pilot a spaceship": FPV drone. Get a TinyWhoop or similar.

I've been describing the latter. If you go that route, unfortunately I'd recommend skipping the kit. I don't like the box goggles that come with it because their FOV is awful. The EACHINE EV200D is what I'd consider entry level, but it'll set you back about $300. Fatshark has some good stuff too.

https://www.fpvknowitall.com/fpv-shopping-list-goggles-video...

DJI does have a full FPV kit too and it's very nice and very expensive. Only reason I didn't mention it is it isn't what I'd consider a Whoop. It's in the "can give you stitches" class of drones and is not usable indoors. Still, really nice drone.

DJI's FPV goggles and Air Unit digital VTX are worth a mention on their own because they're awesome and among the few DJI products you can occasionally use with non-DJI systems. It's heavy though, so not likely to fit in a Whoop.


As a parent, your deduction resonates with my experience. How old is your nephew, BTW?


I work for a drone company.

PX4 is ubiquitous in the industry and in prosumer devices.

PX4 provides the autopilot stack. There's all kinds of developer drone releases with all of the parts working and assembled.


What is your take on the market share between PX4 and Ardupilot, and in which domains? Are any operations rolling their own firmware?


> What is your take on the market share between PX4 and Ardupilot

Both of them use mavlink and mavlink sucks. Mavlink is probably the worst protocol I've seen used outside of a hobby environment.

In my experience, Ardupilot is more of a hobby autopilot. That doesn't make it bad, but it does make it less useful to industry and prosumers.


What problems do you have with MavLink? I will say I went on a (thankfully successful) goose chase this morning to figure out how to calculate the "CRC extra byte". I found the answer 2 ways: (1) by asking ChatGPT how to do it using the Python lib, and (2) by compiling ArduPilot and diving 10 folders deep into its build folder.

The main issue other than this I've found is it requires 12 bytes of overhead; could be shortened to half this. What problems have you found?

DroneCAN... now that's a hot mess. I think you will be pleasantly surprised with MavLINK after trying to implement that. I can go into details, but it's a more exciting experience if the jump scares aren't ruined!
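For anyone on the same CRC_EXTRA goose chase: MAVLink's frame checksum is the X.25 CRC (CRC-16/MCRF4XX), and to the best of my understanding CRC_EXTRA is that same CRC run over the message name plus each field's type and name (in wire order), folded to one byte. A sketch from memory, worth verifying against the pymavlink generator before relying on it:

```python
def x25_crc(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16/MCRF4XX (a.k.a. X.25), as used for MAVLink frame checksums."""
    for byte in data:
        tmp = byte ^ (crc & 0xFF)
        tmp = (tmp ^ (tmp << 4)) & 0xFF
        crc = ((crc >> 8) ^ (tmp << 8) ^ (tmp << 3) ^ (tmp >> 4)) & 0xFFFF
    return crc

def crc_extra(msg_name: str, fields) -> int:
    """CRC_EXTRA seed byte: the X.25 CRC of the message name plus each
    field's type and name (fields must already be in MAVLink wire order,
    i.e. sorted by type size), folded to 8 bits. `fields` is a list of
    (type, name) or (type, name, array_len) tuples. This mirrors what I
    understand the generator does; verify before trusting it."""
    crc = x25_crc((msg_name + " ").encode())
    for field in fields:
        ftype, fname = field[0], field[1]
        crc = x25_crc((ftype + " ").encode(), crc)
        crc = x25_crc((fname + " ").encode(), crc)
        if len(field) > 2:  # array fields also mix in the array length
            crc = x25_crc(bytes([field[2]]), crc)
    return (crc & 0xFF) ^ (crc >> 8)
```

The fold at the end is why it's a single byte: the 16-bit CRC's two halves are XORed together.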


> What problems do you have with MavLink?

Where do I start?

The C library has many problems. First, many of the constants are provided as macros (completely valid for C) but this can become a problem in C++ which is generally migrating towards constexpr functions or objects instead of macros. Second, many of the macros specify de/serialization to bitfields and the bitfields raise compiler warnings about safety when compiled in C++ with `-Wall -Wextra`. That's on top of the library being far too complex. I understand the need for an XML generator, but as far as I understand, the C library and C++ library do not provide an easy way to dynamically specify message types at runtime instead of at compile time (contrast with other message protocols). The library's headers are sensitive to the order they're included and they provide configuration via declarations. One of the configuration items is "how many" global "channels" to allocate. These global channels are not thread safe and are used by, eg, QGroundControl.

The mavsdk library for C++ is a wrapper around the C library (for the most part) and it brings additional problems. Using the C library means that things aren't very type-safe under the hood. Mavsdk's use of C++ wants to be modern but makes several design choices that I disagree with, particularly around threading (instantiating the C++ library creates a thread to handle its own event queue, and creating sockets via the mavsdk C++ library creates an additional thread per socket) and serialization (the C++ code does not usefully provide type-safe serialization). The thread-per-socket pattern is common but definitely not helpful on power-limited devices: power consumption and latency are higher compared to asynchronous socket programming, and both matter for flight duration and control feedback. These design choices also make it difficult to unit test with Googletest's EXPECT_DEATH tests, which use `fork()` and can be sensitive to thread problems.

Auterion is writing a Mavlink library, libmav. I haven't yet looked at its internals, but I understand they want to address a lot of the shortcomings of the official mavsdk C++ library. So I have high hopes for that ... but alas already have things written to mavsdk's API.

Speaking of threading problems, I've encountered problems with QGroundControl's use of threads. A pattern common among these libraries is poor lifetime management, and poor use of shared_ptr and/or mutexes to guard against races. It's the typical kind of mistake that even experienced engineers make when they don't use tooling such as ThreadSanitizer or AddressSanitizer (with the warnings raised to errors) and/or have insufficient test coverage.

Past the libraries, the Mavlink protocol itself also leaves a lot to be desired. It's a protocol that wants to be at nearly every layer of the OSI model, and it smells of reinventing the wheel at all of those layers. At layer 2, there's the fact that mavlink is designed to be transmitted directly over a radio, a telephone modem, a serial bus such as RS232, or even a multi-component bus such as I2C. At layer 3 we have device and component addressing, and forwarding. At layer 4 we have message sequences, checksums, and retransmissions. Then layer 5 is a little fuzzy. Even at layer 6 there is a custom file transfer protocol. It even has a custom implementation of a terminal session!

It stinks to the core of a hobby protocol that "matured" by reinventing every wheel it could. That's just my observation as a 20-year software engineer who entered the hardware/embedded field a few years ago.

Some components or implementations/firmwares interpret the data fields differently. If a coordinate is XYZ, what is the frame of reference for the XYZ? If you've got an orientation as yaw/pitch/roll, what is its frame of reference? Is altitude AMSL, AGL, or derived from pressure instrumentation? Are we using WGS84 or something else? Some messages are outright deprecated, and there are custom messages that vendors use (remember: using custom messages requires recompiling the library with the message definitions added).
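To make the altitude ambiguity concrete, here's a toy sketch (all numbers invented): the same reported "120 m" lands in three different places depending on whether the sender meant AMSL, AGL, or WGS84 ellipsoid height:

```python
# Illustration only: how the same "altitude" field diverges by convention.
# Terrain elevation and geoid undulation values below are made up.

def agl_from_amsl(alt_amsl_m: float, terrain_amsl_m: float) -> float:
    """Height above ground level, given height above mean sea level."""
    return alt_amsl_m - terrain_amsl_m

def amsl_from_ellipsoid(alt_wgs84_m: float, geoid_undulation_m: float) -> float:
    """AMSL from WGS84 ellipsoid height, given the local geoid undulation."""
    return alt_wgs84_m - geoid_undulation_m

terrain = 95.0      # hypothetical terrain elevation (AMSL) at the vehicle
undulation = 47.0   # hypothetical geoid undulation at this location

print(agl_from_amsl(120.0, terrain))           # 25.0 m AGL if the field was AMSL
print(amsl_from_ellipsoid(120.0, undulation))  # 73.0 m AMSL if it was WGS84 height
```

A receiver that guesses the wrong convention here is off by tens of meters, which is exactly the kind of silent disagreement the message definitions leave underspecified.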

That's just what I have time to write off the top of my head. There's a ton more problems with mavlink, from an experienced software engineer's perspective.


Hey - Thank you for the detailed insights! I appreciate all the info. For some context, I am writing bare-metal in Rust. I've been using MavLINK for some basic stuff that is self-contained; trying to use a MAVLink gimbal this weekend. The naive approach didn't work, but we'll see. I will hint that DroneoCAN: #1: Uses bit-alignment, even if that means throwing off the alignment of an 80-byte message to save 3 bits. #2: Uses something similar to, and as poorly documented as, the CRC_EXTRA byte based on message format. #3: Uses inscrutable ID assignment and Get/Set APIs. I have posted on various GitHubs and Discords, and the verdict is that I'm the only one out of touch. Ok. Fine. I will comply (and have complied) with the standards, but I am not obliged to like them, and I think they have severe engineering problems. Thank you for your insight on MavLINK. I am tolerating it for now for gimbal operations, but see the writing on the wall for the custom messages I'm using it for. Like, I need 1 byte for payload size, 1 byte for message type, maybe 1-2 for CRC... the rest of the header is a liability for OTA transmission time.
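To show what I mean by that minimal framing, here's a throwaway sketch (Python for brevity; my real code is Rust, and the CRC here is plain CRC-16/CCITT, not the CRC_EXTRA scheme either protocol actually uses):

```python
import struct

def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bit-by-bit CRC-16/CCITT (poly 0x1021); fine for short frames."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def encode(msg_type: int, payload: bytes) -> bytes:
    """Frame layout: [len:1][type:1][payload][crc:2]."""
    body = struct.pack("BB", len(payload), msg_type) + payload
    return body + struct.pack(">H", crc16_ccitt(body))

def decode(frame: bytes) -> tuple:
    size, msg_type = struct.unpack_from("BB", frame)
    payload = frame[2:2 + size]
    (crc,) = struct.unpack_from(">H", frame, 2 + size)
    if crc != crc16_ccitt(frame[:2 + size]):
        raise ValueError("CRC mismatch")
    return msg_type, payload

frame = encode(0x01, b"\x10\x27")  # e.g. a 2-byte telemetry value
assert decode(frame) == (0x01, b"\x10\x27")
print(len(frame))  # 6 bytes: 4 bytes of overhead total
```

For comparison, mavlink v2 carries roughly 12 bytes of header-plus-checksum overhead per message, which is the OTA-time liability I'm talking about.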

I propose we don't even talk about MSP; Holy fuck.


Don't misunderstand me: mavlink gets the job done which is better than nothing. But its use requires careful consideration about scope and error handling -- more careful than should be necessary if COTS were used instead. I seriously wish I could have a few years to rewrite the whole stack.

> I am tolerating it for now for gimbal operations

Make sure you're using the latest version (there was a recent fix for a crash in the gimbal manager and reconnect code). The mavlink folks are usually pretty responsive to support questions on GitHub issues and Discord. Many of the vendors are also on Discord, which is useful if you're integrating with components.

> I need 1 byte for payload size, 1 byte for message type, maybe 1-2 for CRC... the rest of the header is a liability for OTA transmission time.

Yup, exactly. If the component and you are both already connected to an IP network, then there are more industry-standard protocols to use that perform better (battery, bandwidth, latency, the whole shebang), behave more predictably (a single message to a single IP destination has well-defined routing rules, but a single mavlink message is likely broadcast, possibly multiple times, even when deployed to a single system), and can be secured better, because security tools talk IP.

You mention DroneoCAN - I assume that's a typo for DroneCAN [0]. Can you use ROS2 [1]?

Contrast ROS2 with mavlink. The communication layer between different ROS2 components (nodes) can be swapped for different transport implementations; I've seen things like shared-memory transport (same-host, saw it at a drone conference) and replacement with another COTS middleware (I don't remember which one, but I think it was ZeroMQ or based on it).

If you can provide a ROS service, then that's a step in the right direction as far as I'm concerned. ROS already has existing services to interface with CAN [2]. Even if you have some specific reason to need DroneCAN, there appears to be an example of using ROS2 to talk to DroneCAN [3].

[0]: https://dronecan.github.io/

[1]: https://github.com/ros2

[2]: https://kagi.com/html/search?q=ros2+canbus

[3]: https://github.com/ARK-Electronics/ros2_dronecan


Interesting, regarding ROS2; I'd not heard of it. It sounds like a type of RTOS or library? I'm not sure if I can use it; my devices are bare-metal currently. I am using DroneCAN because it is #1: A bus. #2: Differential signaling. #3: Plug-and-play with an ecosystem. It's a great way to set up decoupled systems, where the flight controller isn't doing all the IO.

I should note that my perspective subtly clashes with the designers of both MavLINK and DroneCAN: I think these protocols should work the way a hardware datasheet does. They should be easy to implement in any language by referencing a byte-aligned table, with explicit instructions. Because it's just bytes down the wire. The DroneCAN creators, and it sounds like MavLINK's as well, consider them something where you should rely on an official library. The Get/Set API is a hot mess? It doesn't matter if you expect no one to implement it, because the expectation is to use the official library. You will also note that DroneCAN (and maybe mavlink? Not sure) devices have poor or no protocol documentation, which is astounding to me.


ROS2 is not an RTOS. It's a set of libraries and tools. Though it can be installed on a real-time OS, it can also be installed in, e.g., a Dockerfile.

> You will also note that DroneCAN (and maybe mavlink? Not sure) devices have poor or no protocol documentation, which is astounding to me.

Astounding, yes. Surprising, no.



Honestly anything with `ardu` in the name screams "I just learnt to code and I don't know what I'm doing". I know nothing about this space, but I'd definitely check out the PX4 link that the other guy posted before this.


Don't dismiss Ardupilot so easily. Yes, it grew out of Arduino Megas running the code with some stapled-on sensors, but that was many years ago - it's an extremely powerful and arguably more "hobbyist"-friendly platform that's very comparable to PX4. In my experience PX4 lends itself very well to more scientific or industrial use, especially when integrating with other on-board compute units that are part of the payload, whereas Ardupilot is much easier to get working and capable off-the-shelf. Both software stacks run on essentially the same hardware nowadays.


I disagree about ease: ArduPilot is not easy to get working. It's a pain in the ass to get a flyable config for, say, a basic quadcopter. I now have a flyable config and notes, but it took a lot of trial and error to get there. It has... surprising default settings, like going into a tumble if you zero the throttle. Betaflight and PX4 are easier.

I have a lengthy notes file with exact settings required to make ArduPilot work well with a small quad. The result is a great experience, but getting there was not!

Also of note for ease: The ArduPilot code base is a mess compared to PX4's.


> I have a lengthy notes file with exact settings required to make ArduPilot work well with a small quad.

There are easier ways, like using an existing config file someone posted on the AP forum or rcgroups.

It’s not made specifically for small quads, and it will never fly as well as BF or even iNav. I wouldn’t expect an agricultural drone to support turtle mode, so no air mode by default shouldn’t be surprising :)

Re code base: AP supports Lua, so for small mods you won’t have to change the code base at all, compared to PX4.


Ardupilot is very, very mature software. It stopped being runnable on an Arduino about a decade ago. There's a guy who has been doing aerial waypoint missions for miles and miles, as well as terrestrial boat missions lasting days. It's been adapted for sailboats as well.


You referencing rctestflight?


Yes, but I've also used it to do my own RC sailboat missions


For freestyle, betaflight is the go-to firmware.


At least click the link before submitting your premature damnation


The only advantage of PX4 is its license - it’s BSD vs Ardupilot’s GPL, so companies can use the code without giving back. The mindshare just isn’t there - PX4 is orders of magnitude more obscure, so instead of YouTube tutorials you’ll be trying to get help on PX4 specific forums or digging through the code base.


It depends what quad you are after: FPV, normal, something else? But they all have some essential components and other extra ones. I am planning to make a full course soon, including the software that will enable it to fly over the internet (cellular, like LTE/5G).


Does anyone know of an affordable COTS drone one could try this out with? I’m very interested in the intersection of machine learning and control theory.


There's a link in the article to the open source drone they were using: https://www.bitcraze.io/products/crazyflie-2-1/


For anyone who might want to buy that - it’s a horrible craft to use for anything but a lab environment set up with a motion-tracking system (you won’t be able to do indoor navigation using just the IMU and barometer). Using the PCB as a frame is not good for flight performance or durability. Any whoop would fly much better.


Author of the referenced work here. I have to disagree with this, at least on the motion-tracking-system part. The optical flow deck [1] works like magic and can maintain low-drift positioning even during relatively fast and agile maneuvers.

For an example you can have a look at our video [2] (e.g. the disturbance rejection tests at 3:00 and after are solely using the optical flow deck)

[1] https://www.bitcraze.io/products/flow-deck-v2/

[2] https://www.youtube.com/watch?v=NRD43ZA1D-4&t=198s


Awesome work! Can you share the titles of some of your favorite books or courses in controls and RL (for controls) that helped you in doing this project?


To everyone going on about animal learning... Just a reminder that this has pretty much nothing to do with animal learning.

The really interesting parts are A) rapid simulation that just works when loaded onto a real quadrotor, and B) that the NN is generating the PWM pulses! Collapsing all those separate control systems into a NN is an interesting leap.


The people developing this tech could end up having to face off against it one day. This is where modern warfare is now. A swarm of these could probably clear out an entire trench of soldiers. "Run, Forrest, run" doesn't really work against drones; they'll just hunt you down, Terminator style. Brazen Bull.


This. The Ukraine war is a testing ground for these next-gen war devices, and very soon they will be flying and locating targets completely autonomously... Combine that with genai models that anyone can train...


Scary, but not completely true: thankfully, at the moment, they can only fly for about half an hour. And if you're going to have the thing carry weapons too, then it's going to be even heavier.

One day though, it'll be true. And sunlight will be blocked by swarms of them. And obviously, the AI will win.


Put up a net.

Actually drone interception is also a research topic.


This shit should be banned from the battle field. I've only seen one guy surviving a drone attack by throwing a stick at it. AI already mockingly reduced us to being helpless apes throwing sticks at it. Foreshadowing.


On the contrary. It should dominate battlefields, to the point of making humans there completely obsolete. Any human losses in war should be treated the same way as civilian losses. It's dumb that we create a special class of people whom we're fine with getting killed.


Ah yes, one step closer to Slaughterbots reality: https://www.youtube.com/watch?v=O-2tpwW0kmU


I thought exactly the same thing, this video is probably among the most impactful I've seen in the past decade.


I had the exact same thought but you beat me to the share.


I know this is nit-picking, but... it didn't "learn" anything.

It progressively improved its algorithm using a series of feedback sessions.

It _improved_ but it didn't _learn_. Someone pre-programmed basic control and feedback parameters.

Don't get me wrong—this is still amazing and has useful applications. But can we please stop calling this improvement/refining/tuning process "learning"?


Author here. We did not pre-program basic control or feedback parameters. The trained policy (a fully connected neural network) takes the state estimate (position, orientation, linear and angular velocity) and a history of the previous actions as input, and directly outputs the PWM/RPM setpoint (a uint16 that sets the PWM interval).
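To illustrate just the shape of that computation (made-up layer sizes and random weights here; the real architecture, training, and scaling are in the repo), the inference path is a plain MLP forward pass, state vector in, four uint16 PWM setpoints out:

```python
import math
import random

random.seed(0)

def mlp(x, layers):
    """Fully connected net: tanh hidden activations, linear output."""
    for i, (W, b) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

def to_pwm(outputs):
    """Squash each motor command into a uint16 PWM setpoint."""
    return [int((math.tanh(v) * 0.5 + 0.5) * 65535) for v in outputs]

def random_layer(n_in, n_out):
    W = [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

# Illustrative dimensions only: a 13-D state (position 3 + orientation 4 +
# linear velocity 3 + angular velocity 3) plus a 4-D previous-action
# history, two hidden layers, four motor outputs.
dims = [17, 32, 32, 4]
layers = [random_layer(a, b) for a, b in zip(dims, dims[1:])]

state = [0.0] * 13 + [0.5] * 4  # state estimate + action history
pwm = to_pwm(mlp(state, layers))
print(pwm)  # four uint16 motor setpoints
```

There's no hand-written attitude or rate controller in that path; the network output goes (after scaling) straight to the motors.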

The code is open source, you can verify it yourself ;)

https://github.com/arplaboratory/learning-to-fly


Do you have any idea what level of effort would be involved to modify the quadcopter? Or, say, use a different one? Would it require a lot of experience developing physical models for simulation?


Yeah my thoughts exactly, but you know how it goes in robotics, the buzzier the word the better!



