How Statisticians Could Help Find That Missing Plane

What happened to Malaysia Airlines Flight 370, and where is it now?

Statistical tools can’t answer those questions any more definitively than Malaysian officials have. Yet they can help refine and focus the hunt for the plane and for a solution to the deepening mystery of its March 8 disappearance.

Bayesian statisticians are particularly helpful in a search operation. Their methods allow hunters to update their estimates of the probability of finding their target in any latitude-longitude combination — or even in three dimensions, accounting for depth in the water. Bayesians helped hunt U-boats in World War II, a U.S. submarine in the 1960s and an Air France jet in 2011.

There’s a fourth dimension to the current search: the cause of the disappearance. New developments, such as information about how the plane’s communication systems were shut off, have lowered the probability that the plane disappeared because of an accident and increased the likelihood of deliberate diversion. Whichever explanation currently leads affects, in turn, the probability of finding the plane at any given location: The growing likelihood of a deliberate act has made spots farther from the takeoff point of Kuala Lumpur more likely.

Bayesian inference formalizes what will seem, to many unfamiliar with it, like common sense. Its founding principle is that most new situations can be assessed and assigned probabilities: How likely is this restaurant to be good? How likely is this cough to be a cold? How likely is Duke to win the NCAA title?

Our first estimate of these probabilities may be no better than an educated guess. For example, we know that 60 percent of our restaurant meals in town have been good, or that Duke has won titles in four of the last 25 seasons (16 percent).

Then we start layering new information. The restaurant is full. Now we can feel more confident in our choice: All of our good meals in town have been in full restaurants, but just half of our bad meals have been. What is the chance of a good meal, given that a restaurant is full? It’s 75 percent, based on this new information, since 75 percent of meals in full restaurants have been good.¹ Before ordering, we check our favorite food-review website and see that the place has four and a half stars out of five. Every meal we’ve eaten at restaurants rated that highly has been good, but just half of our meals at restaurants with lower ratings have been. So we update our probability again, accounting for any overlap between full restaurants and highly rated ones — until we eat, when probability is no longer a relevant concept because our mouths are full.
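The jump from 60 percent to 75 percent is a direct application of Bayes' rule. Here is a minimal sketch in Python that reproduces it, using the counts from the restaurant example (10 meals, six good, all six in full restaurants, and half of the four bad meals in full restaurants). The function name `posterior` is illustrative, not part of any library.

```python
from fractions import Fraction

# Prior: 6 of our 10 meals in town have been good.
prior_good = Fraction(6, 10)

# Likelihoods: how often was the restaurant full, given the meal's quality?
p_full_given_good = Fraction(1)      # all good meals were in full restaurants
p_full_given_bad = Fraction(1, 2)    # half of the bad meals were

def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes' rule for a yes/no hypothesis (illustrative helper)."""
    numerator = prior * likelihood_if_true
    evidence = numerator + (1 - prior) * likelihood_if_false
    return numerator / evidence

p_good_given_full = posterior(prior_good, p_full_given_good, p_full_given_bad)
print(p_good_given_full)  # 3/4
```

The same function could be applied again for the star rating, but only if the rating and the crowd were independent pieces of evidence. The article's caveat about overlap between full and highly rated restaurants is exactly the case where a naive second call would overstate our confidence.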

Apply the same ideas to Duke, and you might examine the Blue Devils’ current ranking, their recent games, the probability FiveThirtyEight’s model and others assign to the team’s title hopes, and other tools to update that coarse, 16 percent probability.

These examples require calculation of a single probability. Targeting a search in an area requires a probability estimate for every point in that area, really a probability distribution. Initially, we might guess that the probability is uniform: The object of interest is equally likely to be at any point. Then we update that distribution based on new information, such as — in the case of a missing plane — flight path, wind, ocean flow and which areas have been searched already.
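To make the idea of updating a whole distribution concrete, here is a rough sketch in Python. It is not Metron's software or method. The 3-by-3 grid, the cells searched and the 90 percent detection chance are all hypothetical values chosen for illustration, and the function names are made up for this example.

```python
def normalize(grid):
    """Rescale a grid of weights so the cells sum to 1."""
    total = sum(sum(row) for row in grid)
    return [[cell / total for cell in row] for row in grid]

def update_after_failed_search(prior, searched, p_detect):
    """Update location probabilities after searching some cells and finding nothing.

    A searched cell keeps only the chance that the object was there but
    was missed (1 - p_detect); unsearched cells are left alone. The grid
    is then renormalized so the probabilities again sum to 1.
    """
    unnormalized = [
        [p * (1 - p_detect) if (r, c) in searched else p
         for c, p in enumerate(row)]
        for r, row in enumerate(prior)
    ]
    return normalize(unnormalized)

# Hypothetical 3x3 search area with a uniform starting guess.
prior = normalize([[1.0] * 3 for _ in range(3)])

# Hypothetical search of two cells, assuming a 90 percent chance of
# spotting the object if it is in a searched cell.
searched = {(0, 0), (0, 1)}
updated = update_after_failed_search(prior, searched, p_detect=0.9)

for row in updated:
    print([round(p, 3) for p in row])
```

An unsuccessful search does not rule the searched cells out; it only shrinks their share of the probability, and every unsearched cell becomes somewhat more likely as a result. Flight path, wind and ocean flow would enter the same way, as further adjustments to the weights before renormalizing.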

This approach has informed water searches for sunken treasure, for men overboard and for plane-crash debris, said Lawrence D. Stone, chief scientist at Metron Scientific Solutions, who has worked on many of these searches. Among them: the hunt for the remnants of Air France Flight 447, which crashed in 2009 en route from Rio de Janeiro to Paris, killing all 228 people on board.

Stone and his team’s methods helped inform a fifth Air France search two years after the crash, after four other efforts failed. Within six days, they found the wreck, and helped to show that the crash likely occurred because of pilot error in response to autopilot mode disengaging. After waiting nearly two years to understand their relatives’ disappearance, passengers’ families finally had answers. “We were very pleasantly surprised,” Stone said by email. “It doesn’t always happen this way.”

Among the challenges is finding basic information about which areas have already been searched. “Search managers typically assume the search will be resolved quickly; only when it drags on do they realize they should have employed better record-keeping from the start,” Colleen M. Keller, a Metron senior analyst who worked with Stone on the Air France hunt, said by email. “Without good records, it will be very difficult for us to reconstruct and credit the current search effort.”

Even with ample data, Bayesian hunters must quantify subjective judgment, “using expert testimonies and imagination,” as Nozer Singpurwalla, a professor of risk analysis and management science at the City University of Hong Kong, put it in an email.

The Metron team outlined its success in a paper that makes plain the subjectivity inherent to the approach. For instance, the team had to account for the possibility that an earlier search covered an area including the wreckage site but missed it. So they examined each prior search, one by one.

In the second search, in June and July 2009, two U.S. Navy ships listened for acoustic signals from beacons on the flight data recorder (also called the black box) and the cockpit voice recorder. Searchers designed the ships’ path to ensure that they got within 1,730 meters (a little over a mile) of every point in the area of the Atlantic they were scanning. Metron searchers had to calculate the probability that this search failed not because it was in the wrong area but because the beacons malfunctioned. This step alone, critical in determining how likely it was that a repeat search in that area would yield the wreckage, required several intuitive estimates, or educated guesses, if you like.

First, the scientists calculated the probability the beacon would be heard within 1,730 meters as at least 90 percent. Then they capped that probability at 90 percent, based on learning from past searches that “detection estimates based on manufacturers’ specifications and operator estimates tend to be optimistic.”

“I had been burned a couple of times before by optimistic sensor performance estimates,” Stone said.

Next they had to account for the possibility the beacons had been damaged in the crash. This, too, required extrapolating from past crashes, which yielded an 80 percent probability each beacon was unscathed. But how close together were the beacons — and if one beacon was damaged, was the other more likely to be? Or was the probability of either one being damaged independent? The scientists figured there was a 25 percent chance of independence and 75 percent chance of dependence. That, in turn, yielded a probability of 77 percent that if the wreck were in that area, the beacon search would have turned it up.² And that was just one piece of one adjustment to the scientists’ estimate of where they might find the plane wreckage.

This method may sound like making up numbers, but to advocates of Bayesian techniques, applying a rigorous framework to expert subjective judgment is valuable. “This is one of the strengths of the Bayesian method,” Stone said. “We did not have a thousand examples of AF447-like crashes to guide us. This is when the Bayesian approach is most useful.”

There are far fewer than 1,000 examples of past missing airplanes to guide the current search — just 80 since 1948, according to the Aviation Safety Network. Only two planes in the last seven years disappeared for at least 10 days: Adam Air Flight 574, in 2007; and Air France Flight 447. That makes subjective judgment a crucial input into the search, whether it’s being done qualitatively, or quantitatively using Bayesian methods.

Keller said Metron isn’t involved in the Malaysia Airlines hunt. If it were, the same principles would apply: Start with all data, such as radar, visual or acoustic measurements, transmissions from the plane and so on. Then update to account for unsuccessful searches, and keep updating as new information comes in. “Bayesian search theory allows flexibility in this way and even accommodates conflicting information,” Keller said. “Nothing is discounted.”

The extra layers of complexity in the Malaysia Airlines search — the new estimates of the plane’s location, mounting evidence that a deliberate act caused the disappearance — complicate the Bayesian calculations and estimates.

Bradley Efron, a Stanford University statistician, said the complications make Bayes a bad fit for the Malaysia Airlines hunt. “Bayes’ Rule is good for refining reasonable (or at least not unreasonable) prior experience on the basis of new evidence,” Efron, who also expressed skepticism to Al Jazeera America, wrote in an email. “It is not good when new evidence changes the situation drastically.”

But Tony O’Hagan, professor emeritus of probability and statistics at the University of Sheffield in the U.K., said that’s the perfect situation for Bayesian techniques, which should make searchers most effective in adapting to changing information, so long as they properly assume from the get-go that the plane might not be in the initial search area.

Other advocates of Bayesian stats pointed to their usefulness in bringing discipline to what can be a difficult search process. Citing the work of Daniel Kahneman and Amos Tversky on decision-making, O’Hagan said, “People have two modes of thinking. There’s a quick, instinctive mode and a slow, thoughtful mode. When problems are important enough, we need to force ourselves into the second mode — and Bayesian methods are exactly what we need.” He added, “It is likely that Bayes’ theorem would adjust faster than people would tend to do using quick-mode thinking” — because people can get locked into their quick conclusions, while Bayes slowly and steadily tacks.

Arnold I. Barnett, a statistician at the MIT Sloan School of Management, worries that people who use the tools without fully understanding them may be led astray, “that the very act of quantifying a probability obscures the point that the numerical estimate is itself subject to uncertainty,” he said. “Thus the estimate might be taken more literally than is warranted.”

Despite the widespread use of Bayesian methods in searches, not every airline or government uses them. There’s no evidence that either the Malaysian government or Malaysia Airlines has used them in the current search, though they are getting help from a team of French government investigators who worked on the Air France hunt. (The Malaysian embassy in Washington, D.C., didn’t immediately respond to an emailed request for comment, and calls to the embassy on Monday yielded an automated reply that the office was closed because of snow.)

“I suspect that they just guess, like professional baseball managers used to do before ‘Moneyball,’” said Peter Thall, a biostatistician at MD Anderson Cancer Center in Houston.

CORRECTION (March 18, 11:30 a.m.): An earlier version of this article incorrectly said Metron joined the search for Air France Flight 447 two years after the crash. Metron joined earlier; a search based on its Bayesian calculations that found the wreck began two years after the crash.

CLARIFICATION: An earlier version of this article garbled an example of Bayesian inference about choosing a restaurant. The example should have said that meals were good at half of restaurants with ratings below four and a half stars, not that half of lousy meals have been good.

Footnotes

¹ Suppose you’ve eaten at 10 restaurants. Six of them have been good, and four bad. All six of the good meals have been in full restaurants. Half of the bad meals — or two meals — have been in full restaurants. So the conditional probability of a good meal, given a full restaurant, is 6/8, or 75 percent.
² The probability of 77 percent is a weighted average of two other probabilities. The first is that the search would have found the wreck, given that the two beacons’ chances of survival were independent. The second is that the search would have found the wreck if the two beacons lived or died together.

The second probability is simpler to calculate, so we’ll start there. It’s equal to the probability the search would have found the wreck if the beacons survived (0.9) multiplied by the probability they survived (0.8). So it’s 72 percent.

The first probability can be calculated as the sum of three probabilities. The first is the probability that both beacons survived, and that at least one was detected. The probability of both beacons surviving is 0.8^2, or 0.64. The probability, given their survival, that at least one was detected is one minus the probability that neither was detected. There’s a 0.1 chance of a live beacon going undetected in this search, so the chance of two live beacons going undetected is 0.1^2, or 0.01. Subtract that from one and get 0.99. Multiply by 0.64, and you get 0.6336, or 63.36 percent.

The second and third probabilities are identical: The probability that the first beacon or second beacon, respectively, survived while the other failed, and that the sole surviving beacon is detected. The chance of a specific beacon surviving while the other fails is the product of the probability of one surviving — 0.8 — and the probability of the other failing — 0.2. That yields 0.16. Then multiply that by 0.9, the probability that a single beacon is detected, and get 0.144, or 14.4 percent. That’s both the second and the third probabilities. So add them together, and add 63.36 percent, and you get 92.16 percent.

Now take the weighted average of the two: (0.75*0.72)+(0.25*0.9216), or 0.7704.
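For readers who want to check the arithmetic, the steps above can be written out in a few lines of Python. This is only a sketch that mirrors the footnote; the variable names are ours, and the inputs (0.8, 0.9 and the 25/75 weighting) are the figures quoted in the article.

```python
p_survive = 0.8        # chance each beacon survived the crash
p_detect = 0.9         # chance a working beacon was heard (the capped estimate)
w_independent = 0.25   # weight given to independent beacon survival
w_dependent = 0.75     # weight given to the beacons living or dying together

# Dependent case: the beacons survive together, and a surviving pair
# is detected with probability 0.9.
p_found_dependent = p_survive * p_detect

# Independent case, term 1: both survive and at least one is detected.
p_both = p_survive ** 2 * (1 - (1 - p_detect) ** 2)

# Independent case, terms 2 and 3: exactly one specific beacon survives
# and that lone beacon is detected.
p_one_only = p_survive * (1 - p_survive) * p_detect

p_found_independent = p_both + 2 * p_one_only

# Weighted average of the two cases.
p_found = w_dependent * p_found_dependent + w_independent * p_found_independent

print(round(p_found_dependent, 4))    # 0.72
print(round(p_found_independent, 4))  # 0.9216
print(round(p_found, 4))              # 0.7704
```

Changing the 25/75 weighting shows how much the final figure depends on that one judgment call: the answer can only range between 72 percent and about 92 percent.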
