How Statisticians Found Air France Flight 447 Two Years After It Crashed

raverbashing · on May 27, 2014

It was not really "statisticians" who found it.

The initial searches based their search area on back-drift estimates from the debris found, which is very unreliable.

Instead, if they had focused their search area on the original route and the estimated flight time before the crash, they would have found it sooner.

THEN if you don't find it you try different things.

Their Bayesian analysis basically took into account the area that had already been scanned (by listening for ULBs, which is very unreliable as well) and where they didn't find it.

I'm not trying to rain on their parade, but I think the main mistake (not theirs, the search team's) was relying more on backdrift than on other information (basically figure 14 here http://isif.org/fusion/proceedings/Fusion_2011/data/papers/1... - the highest probability area is, obviously, near the last known position)

mschuster91 · on May 27, 2014

The biggest problem with MH370 is that there is not a single "higher quality" location datapoint available. In AF447 you had dead bodies and iirc debris, which at least helped to roughly identify the area of the crash site.

With MH370, the possibilities are next to infinite. Nothing is known except radio data with an accuracy of dozens of kilometers.

raverbashing · on May 27, 2014

Correct

With AF447 you had the exact position at approximately 10min before the crash (and debris).

With MH370 and the Inmarsat data you have something that is much more uncertain.

anigbrowl · on May 28, 2014

Agreed, though I was happy to see that the dataset for that has finally been made public, which at least means there will be more people working on it: http://www.dca.gov.my/mainpage/MH370%20Data%20Communication%...

ACow_Adonis · on May 27, 2014

This was actually sent round a couple of weeks ago where I work by one of our executives (I work at a place that has Statistics in its name).

I didn't mention it then, because I didn't want to be a debbie downer and hurt everyone's "rah rah statistics yay!" feeling. Or send something round that contradicts an executive :P But the paper seemed dodgy as hell to me.

This wasn't solved with Statistics. Nor was it solved by Bayesian statistics. It was solved when people did lots of searching, failed to find anything for a while despite searching where the plane was, and then they eventually found the plane near its last known location despite having already searched there.

If anything, the paper was just an application of the Texas sharpshooter fallacy. I believe the authors made several models, and then included the one in the paper that showed the result they wanted.

See paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.370...

Indeed, look at the graphical representation of the probability distribution in the paper: you'll notice that the plane was found in a section of the distribution that had already been searched earlier, an area that should now have a lower probability than the opposite half of the distribution if we're using semi-reasonable Bayesian techniques. This is what we see in figure 7.

In figure 8, we see their new probability distribution under the assumption that the beacons did not work at all, which is where they say their method pinpointed the plane. But since this was not known at the time, and figure 7 is the far more reasonable Bayesian model given previous searches in the area (because figure 8 assumes a 100% probability of both beacons failing, something practically no Bayesian would do), I posit that they either made that model after the fact, or had several models fitting several different scenarios and, after the plane was found, chose the one that best fit the data post facto. Odds are, if the plane was found, it would fit one of their scenarios, and they would then write a paper saying how their model was such a success. Figure 7 is the far more reasonable Bayesian distribution, and it actually tells you to search in the wrong area next.

If they had followed Bayesian methods, there is in fact a bottom half of their distribution that should have been searched next (where the plane wasn't), and they in fact found the plane in an area that had already been passively searched: an area that should have been downweighted by Bayesian updating for future searches, because it is unlikely that the area would be searched and still turn up nothing if the plane were there.

I actually like Bayes, A LOT, but this is not a good example, except perhaps of the precept: if you want to find a lost plane, it's probably a good idea to start looking near where it was last located.
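To make the updating argument concrete, here is a minimal sketch of a Bayesian search update after an unsuccessful search. It is not the paper's model: the three-cell grid, the probabilities, and the names `update_after_failed_search` and `p_beacon_fail` are all made up for illustration. The only assumption about the beacons is that a dead beacon cannot be detected, so the chance of a detection is scaled by the chance that the beacons were working.

```python
def update_after_failed_search(prior, p_detect, p_beacon_fail=0.0):
    """Posterior over cells after a search that found nothing.

    prior         -- P(wreck in cell i); should sum to 1
    p_detect      -- P(detect | wreck in cell i, beacons working)
    p_beacon_fail -- assumed P(beacons were dead); hypothetical parameter
    """
    # A dead beacon cannot be heard, so it scales the detection chance down.
    effective = [(1.0 - p_beacon_fail) * d for d in p_detect]
    # Bayes: P(cell | nothing found) is proportional to prior * P(miss | cell).
    unnormalized = [p * (1.0 - e) for p, e in zip(prior, effective)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("a miss is impossible under these assumptions")
    return [u / total for u in unnormalized]


# Made-up numbers: cell 0 is near the last known position and was searched.
prior = [0.5, 0.3, 0.2]
p_detect = [0.9, 0.0, 0.0]

working = update_after_failed_search(prior, p_detect)
dead = update_after_failed_search(prior, p_detect, p_beacon_fail=1.0)
```

With beacons assumed to work, the searched cell drops from 0.5 to roughly 0.09 and the unsearched cells gain weight. With `p_beacon_fail=1.0` the failed search carries no information and the posterior equals the prior, so the searched cell stays the most likely one. Intermediate values of `p_beacon_fail` land in between.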

lotsofmangos · on May 28, 2014

More or Less covered this much better.

The company had already been called in once and had completely failed to find anything because their assumption was that the plane would be unlikely to be anywhere that had been searched. This was their second go and they changed their assumptions.

"It still was a minor miracle that we found it," says Keller.

http://www.bbc.co.uk/news/magazine-26680633

Someone · on May 27, 2014

Also, even if this paper were written before the plane was found, this N=1 example would not be strong proof that the method at hand works.

truncate · on May 27, 2014

For those interested in more technical details, here is a paper I found -

http://isif.org/fusion/proceedings/Fusion_2011/data/papers/1...

https://www.informs.org/ORMS-Today/Public-Articles/August-Vo...

curtis · on May 27, 2014

The thing that I find most interesting about this part of the AF447 story is this:

> But Stone and co chose to include the possibility that the acoustic beacons may have failed, a crucial decision that led directly to the discovery of the wreckage. [Emphasis mine]

spitfire · on May 27, 2014

The actual paper on the topic.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.370...

dba7dba · on May 28, 2014

Another similar case of using stats and science to find something (in this case German U-boats) in the middle of the ocean. http://www.amazon.co.uk/Blacketts-War-Defeated-U-Boats-Broug...

UK/America/other allies were suffering horrendous shipping losses caused by German U-boats. What really helped turn the tide was scientists using 'science' to predict where to find the German U-boats.

ajb · on May 27, 2014

Yeah, Bayesian search theory is an interesting topic. A while back I wrote a program to search for intermittent bugs using it (a sort of Bayesian version of 'git bisect'). It hasn't seen real use (intermittent bugs tend to involve real hardware, so you can't just try it out on any bug database). But if anyone wants to try it out, it is at https://github.com/Ealdwulf/bbchop
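For readers wondering what a Bayesian 'git bisect' might look like, here is a rough sketch of the general idea. It is not taken from bbchop and makes no claim about how that program works; the function name `update_culprit_posterior` and the fixed failure rate `p_fail` are assumptions for illustration. The model: exactly one commit introduced the bug, a build at or after that commit fails a test run with probability `p_fail`, and a build before it never fails.

```python
def update_culprit_posterior(posterior, tested, failed, p_fail):
    """Update P(bug was introduced at commit k) after one test run.

    posterior -- list where posterior[k] = P(culprit is commit k)
    tested    -- index of the commit that was built and tested
    failed    -- True if the intermittent bug showed up in this run
    p_fail    -- assumed P(run fails | bug present); hypothetical constant
    """
    updated = []
    for k, p in enumerate(posterior):
        bug_present = tested >= k  # culprit is at or before the tested commit
        if failed:
            likelihood = p_fail if bug_present else 0.0
        else:
            likelihood = (1.0 - p_fail) if bug_present else 1.0
        updated.append(p * likelihood)
    total = sum(updated)
    if total == 0:
        raise ValueError("observation is impossible under the model")
    return [u / total for u in updated]


# Four commits, uniform prior, bug assumed to fire on half the runs.
belief = [0.25, 0.25, 0.25, 0.25]
after_pass = update_culprit_posterior(belief, tested=2, failed=False, p_fail=0.5)
after_fail = update_culprit_posterior(after_pass, tested=2, failed=True, p_fail=0.5)
```

Unlike plain bisection, a passing run does not rule anything out; it only shifts weight toward later commits. A failing run does rule out every commit after the tested one. A full tool would also need a rule for choosing which commit to test next, which this sketch leaves out.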