Tuesday, 25 November 2014

Smoking Guns: Binary Tests with Asymmetric Diagnosticity

A binary test is one which produces one of two outputs, which are usually (but not necessarily) thought of as 'positive' or 'negative'.  Metal detectors, pass/fail quality control tests, tests of foetal gender, and university admissions tests are all binary tests.

We often want binary tests to strongly discriminate between cases.  We'd like a metal detector to have a high probability of bleeping in the presence of metal, and a low probability of bleeping in the absence of metal.  Tests with these characteristics will tend to have a broadly symmetric impact on our beliefs.  If our metal detector is doing its job, a bleep will increase the probability of metal being present by some factor, and the absence of a bleep will concomitantly reduce it by a similar factor.  If the detector has, say, a 99% chance of bleeping when metal is present, and only a 1% chance of bleeping when it isn't present, the test will strongly discriminate between cases; a bleep will increase the odds of metal by a factor of 99, and the absence of a bleep will decrease the odds that metal is there by the same factor.
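The factor of 99 can be checked in a few lines of Python. This is only a sketch: the function name `likelihood_ratios` is made up for illustration, and the 99% and 1% bleep rates are the hypothetical ones from the paragraph above.

```python
def likelihood_ratios(p_pos_if_present, p_pos_if_absent):
    """Return the odds multipliers for a positive and a negative result."""
    # A positive multiplies the odds by P(bleep | metal) / P(bleep | no metal).
    lr_positive = p_pos_if_present / p_pos_if_absent
    # A negative multiplies the odds by P(silence | metal) / P(silence | no metal).
    lr_negative = (1 - p_pos_if_present) / (1 - p_pos_if_absent)
    return lr_positive, lr_negative


lr_pos, lr_neg = likelihood_ratios(0.99, 0.01)
# lr_pos is the factor by which a bleep raises the odds;
# 1 / lr_neg is the factor by which silence lowers them.
print(round(lr_pos, 2), round(1 / lr_neg, 2))
```

When the two rates are mirror images of each other (99% and 1%), the two factors coincide, which is what makes the test symmetric.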

Not all tests are like this.  Some tests have an asymmetric impact on our beliefs.  Some tests for circulating tumour cells (CTCs), for example, strongly raise the probability of cancer when they come back positive, but lower it only weakly when they come back negative.  According to this data, just under half of patients with metastatic colorectal cancer (mCRC) tested positive for CTCs, compared to only 3% of healthy patients.  Assuming this data is right, what does this mean for the impact of this test on the probability of mCRC?

Let us suppose that a patient's symptoms, history, circumstances and so on indicate a 5% probability of mCRC.  If a subsequent CTC test is positive, the probability would rise to around 45% - a change in the odds by a factor of about fifteen (from 19:1 against to a little over 1:1 against).  But if it came back negative, the probability would only fall to about 3% - a change in the odds by a factor of a little under two.  The impacts of positive and negative results are therefore asymmetric.
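The update can be worked through with the odds form of Bayes' rule. The sketch below makes one assumption not in the source: it takes 'just under half' to mean 48%. The function name `update` is also just illustrative. The outputs are sensitive to the assumed rates, so they may differ from rounded figures quoted in the text.

```python
def update(prior, p_pos_if_h, p_pos_if_not_h, positive):
    """Posterior probability of the hypothesis after one binary test result."""
    if positive:
        lr = p_pos_if_h / p_pos_if_not_h
    else:
        lr = (1 - p_pos_if_h) / (1 - p_pos_if_not_h)
    # Convert the prior to odds, multiply by the likelihood ratio,
    # then convert back to a probability.
    posterior_odds = prior / (1 - prior) * lr
    return posterior_odds / (1 + posterior_odds)


PRIOR = 0.05          # prior probability of mCRC, from the text
P_POS_MCRC = 0.48     # ASSUMPTION: stands in for "just under half"
P_POS_HEALTHY = 0.03  # positive rate in healthy patients, from the text

after_positive = update(PRIOR, P_POS_MCRC, P_POS_HEALTHY, positive=True)
after_negative = update(PRIOR, P_POS_MCRC, P_POS_HEALTHY, positive=False)
print(f"{after_positive:.3f} {after_negative:.3f}")
```

With these inputs a positive multiplies the odds by 16, while a negative divides them by a little under two.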

"Don't wait for the translation, answer 'yes' or 'no'!"

In the realm of security and law enforcement, tests of asymmetric diagnosticity are often called 'smoking guns', apparently in homage to a Sherlock Holmes story.  Perhaps the most celebrated example in modern times is the set of photos of Cuban missile sites, taken from a U-2 spy plane, that was presented at the UN by Adlai Stevenson in 1962.  These photos made it near-certain that the USSR was putting nuclear weapons into Cuba.  But if the US had failed to get these photos, it wouldn't have proved that the USSR wasn't doing that.  Incriminating photos are asymmetrically diagnostic.

Evidence that is asymmetrically diagnostic forms an interesting and ubiquitous category.  The enduring but generally-misleading dogma that 'absence of evidence is not evidence of absence' is in fact only true (and even then only partly so) of asymmetrically diagnostic evidence.  Frustratingly though, it's easy to prove that tests like this must only rarely provide a positive result.  If positives were common, their absence would provide exonerating evidence - and we've assumed that isn't the case.  In other words, smoking guns are necessarily rare.  Not because the universe is contrary, but because of the fundamental nature of evidence itself.
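The rarity argument can be made concrete. A negative result multiplies the odds by (1 - hit rate) / (1 - false positive rate), so insisting that this multiplier stays close to 1 puts a ceiling on the hit rate. The sketch below is illustrative only: the name `max_hit_rate` and the figures fed into it are hypothetical, not taken from any real test.

```python
def max_hit_rate(false_positive_rate, min_negative_lr):
    """Largest P(positive | hypothesis true) compatible with a weak negative.

    A negative result multiplies the odds by
        (1 - hit_rate) / (1 - false_positive_rate).
    Requiring that multiplier to stay at or above min_negative_lr
    (close to 1, so a negative barely exonerates) caps hit_rate.
    """
    return 1 - min_negative_lr * (1 - false_positive_rate)


# Hypothetical figures: 1% false positives, and a negative result that
# is allowed to cut the odds by no more than 10%.
cap = max_hit_rate(false_positive_rate=0.01, min_negative_lr=0.9)
print(f"{cap:.3f}")
```

Under these assumed figures the test can come back positive in at most about 11% of the cases where the hypothesis is true, and the ceiling drops further as the negative result is made less informative.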
