## Tuesday, 24 November 2015

### The Red Button Problem

Liverpool University Professor Simon Maskell created the 'red button problem' as a simple but messy conundrum that can be used to elucidate the similarities and differences between different approaches to handling uncertainty, such as Bayesian inference, Dempster-Shafer, frequentism, 'fuzzy logic' and so on. We'll outline the problem here, and present our answer, which comes from a squarely Bayesian angle.

#### The Red Button Problem

You are in charge of security for a building (called 'B' in the problem, but let's call it 'The Depot'). You are concerned about the threat of a VBIED (vehicle-borne improvised explosive device, or car bomb) being used against The Depot.

An individual (called 'A' in the problem, but let's call him 'Andy'), has previously been under surveillance because of 'unstable behaviour'. He drives a white Toyota.

[Image: A white Toyota of the kind that may be owned by 'Andy']

10 minutes ago, a white Toyota was spotted on a video camera on a road 200m away from The Depot. An analyst ('Analyst 1' - let's call him 'Juan') with ten years' experience views the video footage and states that Andy is 'probably' near the Depot. An automated image recognition system, analysing the number plate, states that there is a 30% probability that the Toyota in the image is Andy's car.

5 minutes ago, a white Toyota was spotted on a video camera on a road 15km away from The Depot. A second analyst ('Analyst 2', who we'll call 'Tahu'), who is new in post, reviews the footage and concludes that it is 'improbable' that Andy is 'near The Depot'.

There is a big red button in your office, that if pressed will order The Depot to be evacuated.

[Image: A big red button yesterday]

Based on the evidence you have, should you press the button?

#### Key Uncertainties

This problem is deliberately full of ambiguity. There are many questions that, in real life, you would want to know, and probably would already know or could easily resolve, including:

• Is Andy's 'unstable behaviour' of a kind that is in any way empirically linked to VBIED attacks? Why was he under surveillance?
• How many other individuals like Andy pose a security concern?
• Which country and city is The Depot in, how easy is it to obtain the materials to manufacture a VBIED, and how prevalent are VBIED attacks?
• Is The Depot in some way a salient target, in general and for Andy?
• Do we have video feeds from the streets adjacent to The Depot, so we can scan them for the white Toyota?
• Are analysts on the lookout for Andy?
• What kind of analysis is Juan experienced in?
• To what extent is Juan already factoring in the number plate recognition (NPR) data?
• When Tahu says he thinks it's improbable that Andy is near The Depot, does he mean that 15km away (where he spotted the Toyota) is not near, and that he thinks it's Andy's car, or that he thinks 15km is near, but that it isn't Andy's car?
• Given the roads and traffic concerned, could Andy conceivably have driven at 180 km/h - a fair lick, but not physically inconceivable - and thus be in both videos? Is this even possible in a Toyota?
• Under what circumstances is it safer to evacuate The Depot rather than stay inside or perhaps retreat to the basement?
• How many people work in The Depot, and how costly is evacuation?
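The speed quoted in the list above is simple arithmetic on the two sightings. A minimal sketch, assuming the two cameras are roughly 15km apart by road (the problem only gives each camera's distance from The Depot, not from each other); the function name is illustrative:

```python
def required_speed_kmh(distance_km: float, minutes: float) -> float:
    """Average speed needed to cover distance_km in the given number of minutes."""
    return distance_km * 60 / minutes

# The sightings were 10 and 5 minutes ago, so they are 5 minutes apart.
minutes_between_sightings = 10 - 5

# Assumption: about 15 km of road separates the two cameras.
nominal_speed = required_speed_kmh(15, minutes_between_sightings)
print(f"Andy would need to average {nominal_speed:.0f} km/h")
```

If the real road distance is longer than the straight-line separation, the required speed rises accordingly, which makes the 'same car in both videos' reading even less plausible.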

Thankfully, instead of requiring concrete answers or forcing us to make cavalier assumptions, the Bayesian approach to tackling these sorts of uncertainties is simply to expose and quantify each of them, based on reasonable comparison with similar situations and drawing on all the available evidence including the existence of the problem itself. For example, although we don't know anything about Andy's 'unstable behaviour', the fact that there are analysts keeping an eye out for him, and that we (the person in charge of the red button) are involved in the problem at all, suggests that he is at least more likely than the average person to commit a VBIED attack. Similar kinds of reasoning can be applied to think about likely answers to each of the questions above, so these uncertainties can be factored into our response. In this case, though, it's largely unnecessary to do this for all of the questions above, as we gain most of what we want for a decision through order-of-magnitude judgements, using a simple model.

All analysis, particularly in the realm of business or government, adds value because it reduces uncertainty and therefore risk. The more closely one's object of analysis - the key uncertainty, or target hypothesis - maps to one's decisions, the more valuable the analysis will be. In most cases, this means that analysing the decision needs to precede the analysis of the problem.

In the case of the Red Button Problem, the decision is a simple binary one: evacuate, or don't. If we evacuate, there are two stages to think about - during the evacuation, and after the evacuation. While people are filing out of The Depot, one assumes to a remote assembly point, they might be more vulnerable to a VBIED attack. The US National Counterterrorism Centre helpfully provides some guidance on how to respond to VBIED threats. In the case of an explosive-laden sedan (e.g. a Toyota) the guidance suggests that sheltering in place is the optimal strategy at a radius of 98m or more (98m is curiously, and implausibly, specific here - why not say 100m?) but that 560m is the safest distance. Closer than 98m, evacuation is always optimal. If The Depot is located in the centre of a compound, staying inside would therefore be better. But if it's roadside, it would be better to evacuate, presumably because being inside a building at that radius would be as dangerous as being caught outside during a blast.

The decision is therefore time- and distance-sensitive. If there's a VBIED closer than 100m to The Depot, evacuation is best under any circumstances. If there's a VBIED between 100m and 560m from The Depot, sheltering in place would be best if it's going to explode soon, but evacuation would be better if there's enough time to get people to a safe distance. This might take in the order of ten minutes. Beyond 560m, you certainly shouldn't evacuate as people will be safer inside.

So we have three mutually-exclusive hypotheses that we're interested in here:

H1. There is no VBIED.
H2. There is a VBIED that is timed to explode in roughly the next ten minutes.
H3. There is a VBIED that is timed to explode in more than ten minutes.

If The Depot is near the road, evacuation is optimal under H2 and H3. If The Depot is more than 100m from the road (but less than 560m), evacuation is optimal under H3. Under all other circumstances, evacuation isn't optimal.
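The rule just stated can be written down as a small lookup. This is only a sketch of the reasoning above: the 100m and 560m thresholds are the rounded figures from the guidance, and the names (`Hypothesis`, `evacuation_is_optimal`) are invented for illustration.

```python
from enum import Enum


class Hypothesis(Enum):
    NO_BOMB = "H1"
    BOMB_SOON = "H2"   # timed to explode in roughly the next ten minutes
    BOMB_LATER = "H3"  # timed to explode in more than ten minutes


SHELTER_RADIUS_M = 100  # rounded from the 98m in the guidance
SAFE_RADIUS_M = 560


def evacuation_is_optimal(hypothesis: Hypothesis, distance_m: float) -> bool:
    """Return True if evacuating beats staying put, per the rule in the text."""
    if hypothesis is Hypothesis.NO_BOMB:
        return False
    if distance_m < SHELTER_RADIUS_M:
        # Roadside building: get out whether the bomb is imminent or not.
        return True
    if distance_m < SAFE_RADIUS_M:
        # Shelter-in-place zone: only worth leaving if there is time to get clear.
        return hypothesis is Hypothesis.BOMB_LATER
    # Beyond the safe distance people are better off inside.
    return False


for h in Hypothesis:
    print(h.value, [evacuation_is_optimal(h, d) for d in (50, 300, 600)])
```

Laying it out this way makes clear that the only awkward cell is the middle band, where the answer depends on bomb timing.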

#### Costs and Benefits

Optimal decision-making involves at the very least a comparison of costs and benefits with probabilities. A bomb might be very unlikely, but if the risk of staying put is significantly higher than the cost of evacuating, it might still be right to press the button.

In this case, the cost of evacuating, whether or not a bomb goes off, will be equal to the lost productivity from workplace absence. How much is that worth? Depending on what sort of work The Depot does, it won't be too far off £10-100 an hour per person. Assuming, in the event of a false alarm, that the area can be confirmed safe (and people return to their desks) in about an hour, and that there are (say) 100 or so people at The Depot, the cost of an evacuation would be of the order of £1,000-10,000.
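For anyone who wants to vary the guesses, the evacuation cost is just headcount times downtime times hourly value. A rough sketch using the figures above; every input is one of the text's order-of-magnitude assumptions, not data, and the function name is made up for the example:

```python
def evacuation_cost_range(staff: int, hours: float,
                          hourly_value_low: float,
                          hourly_value_high: float) -> tuple[float, float]:
    """Lost productivity (low, high) in pounds for a false-alarm evacuation."""
    return (staff * hours * hourly_value_low,
            staff * hours * hourly_value_high)


# Assumptions from the text: ~100 staff, ~1 hour lost, £10-100 per person-hour.
evac_low, evac_high = evacuation_cost_range(100, 1, 10, 100)
print(f"Evacuation cost: £{evac_low:,.0f} to £{evac_high:,.0f}")
```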

What if we don't evacuate - or we evacuate at the wrong time, and people are still walking to their muster point - and a bomb goes off? There are two things we need to know: how many deaths are likely, and how much is a death worth?

It's hard to find statistics about numbers of people killed by car bombs at various distances. But we have a few reference points. The Oklahoma bomb killed 168 people, in an office building that accommodated about 500. It was a massive bomb, and close to the building. As an upper limit, this suggests that something like one third of The Depot's occupants would be killed. Then you have other costs, such as ongoing medical costs for those injured. If The Depot were further away from the road, this figure would be lower, and we can assume that at 500m or so, the probability of being killed is negligible. So we are perhaps looking at something around 10% fatalities from a nearby VBIED if The Depot were not evacuated. For the trickier situation in the 'shelter-in-place' zone, we may have to do some handwavey guesswork if the decision turns on the relative probability of a bomb in the next 10 minutes (the evacuation time) compared to a bomb afterwards.

How much is a life worth? Although some organisations avoid putting explicit values on human lives, the 'yuck' factor aside, you have to do it somehow or you won't be able to make a decision. If you use lifetime productivity (at the lower end) or willingness-to-pay to avoid death (at the upper end), estimates of the value of a life are typically in the order of £1m-10m. Assuming there are around 100 staff, and that not evacuating when there's a bomb next to The Depot would lead to around 10% fatalities, this gives us an order-of-magnitude estimate for the maximum cost of an un-acted-upon roadside bomb of around £10m-100m.
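The same kind of back-of-the-envelope sum applies to the cost of an un-acted-upon bomb. A sketch under the text's assumptions (100 staff, 10% fatalities, £1m-10m per life); it ignores injuries and property damage, which would only push the figure up:

```python
import math

staff = 100
fatality_rate = 0.10                    # the text's guess for a nearby VBIED
value_of_life_low = 1_000_000           # pounds, lifetime-productivity end
value_of_life_high = 10_000_000         # pounds, willingness-to-pay end

expected_deaths = staff * fatality_rate
bomb_cost_low = expected_deaths * value_of_life_low
bomb_cost_high = expected_deaths * value_of_life_high

print(f"Expected deaths: about {expected_deaths:.0f}")
print(f"Cost of not evacuating: £{bomb_cost_low:,.0f} to £{bomb_cost_high:,.0f}")
```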

The ratio between the two figures - the cost of failing to evacuate when there is a bomb, and the cost of evacuation - could therefore be somewhere between 1,000:1 and 100,000:1. The inverse of this ratio gives us a 'critical probability' of a bomb (somewhere between 1 in 1,000 and 1 in 100,000), above which we should evacuate, and below which we should sit tight, for those situations in which evacuation may be optimal.
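The critical probability falls out of a simple expected-cost comparison: evacuate when the probability of a bomb times the loss avoided exceeds the cost of evacuating. A minimal sketch, assuming a risk-neutral decision-maker and that evacuation avoids the whole loss (both simplifications); the function name is illustrative:

```python
import math


def critical_probability(evacuation_cost: float, bomb_cost: float) -> float:
    """Bomb probability above which evacuating has the lower expected cost.

    Evacuate if p * bomb_cost > evacuation_cost, i.e. p > evacuation_cost / bomb_cost.
    """
    return evacuation_cost / bomb_cost


# Pair the cheapest evacuation with the costliest bomb, and vice versa,
# to get the full range implied by the order-of-magnitude estimates.
threshold_low = critical_probability(1_000, 100_000_000)
threshold_high = critical_probability(10_000, 10_000_000)

print(f"Evacuate for sure above 1 in {1 / threshold_high:,.0f}")
print(f"Sit tight for sure below 1 in {1 / threshold_low:,.0f}")
```

Between the two thresholds the answer depends on which end of the cost ranges you believe.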

#### On to the Probability

The decision analysis has given us some useful information about roughly what we need to establish. If The Depot is close to the road, we should certainly evacuate if the evidence suggests a bomb probability of more than 1 in 1000, and certainly not evacuate if it's less than 1 in 100,000. If it's in between, we need to think a bit harder but we might take the risk-averse option and evacuate anyway. If The Depot is further from the road, but less than about 500m away, we have a trickier decision that will depend on our assessment of bomb timing and relative safety inside or outside the building.

In the usual Bayesian fashion, we'll take the approach of splitting our probability estimate into a prior probability, typically using background frequencies, and a set of conditional probabilities that take into account the evidence. This provides a useful audit trail for our estimate, to identify key sensitivities in our assessment, and to focus discussion on the main areas of disagreement.

First, then, what's the background frequency of VBIED attacks? Well, clearly it depends. In Iraq, there were over 800 VBIED attacks in 2014 alone, out of around 1400 worldwide (according to the ever-useful Global Terrorism Database). But assuming The Depot isn't in a troublespot - Iraq, Yemen, Nigeria etc. - the prior probability of a VBIED will be minuscule. There are - generously - a few dozen such attacks in stable countries worldwide. There are a number of back-of-the-envelope ways we could derive a prior probability from this, but it would be of the order of 1,000,000-to-1 against, per year, that a VBIED attack would hit a particular office building. Since eight hours is roughly a thousandth of a year, that puts the odds at around 1,000,000,000-to-1 against, or longer, that a VBIED is parked outside a particular building and timed to go off within the next (say) eight hours.
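Scaling the annual figure down to an eight-hour window is a one-liner. A sketch, assuming attacks are spread evenly through the year (a simplification) and taking the one-in-a-million annual figure as given:

```python
HOURS_PER_YEAR = 365 * 24

annual_probability = 1e-6   # the text's order-of-magnitude guess per building
window_hours = 8            # roughly one working day

# For probabilities this small, scaling linearly with time is accurate enough.
prior = annual_probability * window_hours / HOURS_PER_YEAR

print(f"Prior for the next {window_hours} hours: about 1 in {1 / prior:,.0f}")
```

The result is a little under one in a billion, which is where the 1,000,000,000-to-1 figure comes from.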

The question, then, turns on the power of the evidence - the relative likelihood of that evidence under the 'bomb' and 'no bomb' hypotheses. Is this evidence sufficiently powerful to raise the odds of an imminent VBIED by a factor of 10,000 or more - to at least 100,000-to-1 - as would be needed to make evacuation potentially optimal?
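In odds form, Bayes' rule says posterior odds equal prior odds times the likelihood ratio, so the strength of evidence needed is just the ratio of target odds to prior odds. A small sketch using the round numbers above (the function name is invented for the example):

```python
def required_bayes_factor(prior: float, target: float) -> float:
    """Likelihood ratio needed to move a probability from prior to target."""
    prior_odds = prior / (1 - prior)
    target_odds = target / (1 - target)
    return target_odds / prior_odds


# Prior of about 1 in a billion; lowest evacuation threshold of 1 in 100,000.
factor = required_bayes_factor(prior=1e-9, target=1e-5)
print(f"Evidence must be about {factor:,.0f} times likelier under 'bomb'")
```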

First, we will discount Tahu's evidence entirely. We don't even know what he means. Does he mean that the car he saw 5 minutes ago (15km away) was Andy's car, and that 15km is 'not near', or that although 15km is 'near', the car he saw wasn't Andy's car and so Andy is unlikely to be near The Depot simply because there's no reason to think he would be? We don't know. Both interpretations seem equally likely, and they pull in different directions: if the car was Andy's car, we can pretty much rule out his presence at the Depot, but if it wasn't Andy's car, the other evidence becomes more important. Action point: Tahu to be enrolled on an intelligence analysis communications course.

[Image: The more of these you have, the less diagnostic they are]

The next piece of evidence is the video of the white Toyota, possibly Andy's, spotted 200m from The Depot ten minutes ago. We'll combine it with the evidence of Andy's previous instability to form a single piece of evidence:

• E: "An individual with a history of instability was in a car near The Depot ten minutes ago."

In fact, this evidence should only be believed with a probability of either 30%, or whatever Juan means by 'probably', or some other number depending on how credible the NPR system or the analyst are. But we're going to pretend that E is known with certainty. This is the most diagnostic case. If it's still insufficient to push the probability into the 'evacuate' zone, then we don't need to worry about the finer points.

So, how likely is E if there's going to be a VBIED attack? Let's assume it's pretty close to 1. Some VBIED attacks will be carried out by individuals not known to have been unstable, but let's not worry about that. The key question here is the second probability - how likely is E under the assumption that there isn't going to be a VBIED attack? The lower this probability, the more powerful the evidence.

What this boils down to is how likely it is that an unstable person will be 'near' The Depot during the vast majority of time that there isn't about to be a VBIED attack. According to this article, the Security Service ('MI5') are 'watching' 3000 potential 'Jihadists' in the UK. Let's assume, including other types of threat, that there are something like 6000 people of 'concern' in the UK. This is about 1 in 10,000 people. The security infrastructure supporting The Depot may well have a similar proportion of people covered - after all, they have intelligence analysts, collection assets and so on.

Finally, we need to guess roughly how many people are 'near' The Depot every day, and how probable it is that one of them is an individual of concern. Is it in the middle of a town, or out in the countryside? Let's give it the benefit of the doubt and assume that it's somewhere quiet, which again increases the power of the evidence. Let's say one car a minute is on the road nearby. This equates to about 500 cars during a working day.

And here's the key point: even with just 500 cars going past a day, each with just one occupant, you expect to see an individual of concern on average every twenty days (500 cars at 1 in 10,000 is 0.05 a day). In an average eight-hour stretch, the probability of seeing one of these individuals is about 5%. Even in the worst case - with a set of assumptions that make the evidence particularly diagnostic - the evidence presented raises the probability of an attack by a factor of just 20 or so (roughly 1 divided by 0.05) - nowhere near the factor of 10,000 that would be needed to make evacuation optimal at any distance.
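Putting the pieces together gives the whole update in a few lines. This is a sketch under the deliberately generous assumptions above (E certain, P(E | bomb) = 1, 500 single-occupant cars a day, 1 in 10,000 people of concern, and a prior of about 1 in a billion); none of these numbers is measured.

```python
cars_per_day = 500
fraction_of_concern = 1 / 10_000

# Expected number of individuals of concern driving past per working day.
expected_sightings = cars_per_day * fraction_of_concern
days_between_sightings = 1 / expected_sightings

# P(E | no bomb): chance that at least one of the day's drivers is of concern.
p_e_given_no_bomb = 1 - (1 - fraction_of_concern) ** cars_per_day
p_e_given_bomb = 1.0  # generous assumption from the text

likelihood_ratio = p_e_given_bomb / p_e_given_no_bomb

prior = 1e-9
posterior_odds = prior / (1 - prior) * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)

print(f"One sighting every {days_between_sightings:.0f} days on average")
print(f"Likelihood ratio: about {likelihood_ratio:.0f}")
print(f"Posterior: about 1 in {1 / posterior:,.0f}")
```

Even the lowest evacuation threshold of 1 in 100,000 is hundreds of times higher than the posterior this produces.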

#### Conclusion

Don't press the button. Andy might be nearby, but the probability that he's about to conduct a VBIED attack is negligible. Instances of 'unstable person near a building' are far, far more frequent than instances of 'VBIED attack', to a multiple that greatly exceeds the ratio of costs and risks associated with the decision problem itself. The nature of the evidence is such that it simply cannot be diagnostic enough. In real life, you'd perhaps want to pursue further investigation - perhaps eyeball the street outside to see if the car's there. But ordering an immediate evacuation would be very jumpy indeed, and your tenure as head of security would probably not be a long one.

Some professional analysts would baulk at the approach taken above. It seems too full of assumptions and guesswork. It is, of course - in real life you would have a lot more information that would help guide the decision. But the broad approach taken - to start by analysing the decision, then ask whether the evidence could conceivably be sufficient to change it - is a robust one, and might save analytical resources that would otherwise be used up on estimating a probability that would not, in fact, make any difference to anything.