Thursday, 16 July 2015

Confidence and Probability Part 8: The 'Prior Weight' Theory

Does analytical confidence measure the extent to which an assessment is based on 'background information' rather than case-specific data?

This is a tempting idea. There seems to be a qualitative difference between the following assessments:

"The probability of rain in Las Vegas tomorrow is 7%, because it rains on average there 25 days a year."


"The probability of rain in London tomorrow is 7%, because satellite imagery suggests that the weather front moving across the Atlantic is heading for Scotland and the north of England."

The first of these is based on a simple frequency (25 days out of 365 is roughly 7%). By itself this is a perfectly good way to estimate a probability in the absence of any other information, but it takes no account of anything specific to tomorrow, not even the time of year. The second statement seems to be based on situation-specific information. We might therefore have more confidence making the second kind of statement than the first.

In a previous post, we considered the possibility that this kind of confidence measures having more information. This theory is problematic in a number of ways, most particularly in that there are no powerful alternatives to probability as a measure of information, implying that in both cases above (where the probability is 7%), the information content would be the same.

The 'prior weight' theory is a subtle alternative to the 'information' theory of confidence. According to the 'prior weight' theory, confidence is associated not with quantity of information, but with the extent to which the information is subject-specific, rather than about general, background frequencies. The 'prior weight' theory therefore rests on a meaningful distinction between these two types of information.

Is it coherent?

There are a few possible ways of attempting to demarcate 'background' from 'specific' information, but we'll consider two here: prior v posterior information, and frequency v belief. These are fairly technical distinctions, and if you're not familiar with the slightly more arcane elements of probability theory, you might want to skip over this part of the discussion.

Prior / Posterior

The prior / posterior distinction comes from the theory of inference. The prior probability of a hypothesis summarises all the information you've already accounted for up to a certain point. When you get new information, it's added to the pool and your probability is transformed according to the relative likelihood of your having received that information under the hypothesis compared to its alternatives - according to Bayes' Theorem, in other words. The diagram we used in a previous post illustrates this process:

The trouble with using this distinction to separate 'background' from 'specific' information is that the line between 'prior' beliefs and new evidence is a purely practical one, used to describe the impact of new information. One would end up at the same beliefs considering the evidence in a different order, or all at once in a single bite. The prior / posterior distinction does not correspond to a fundamental distinction between types of evidence.
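The order-independence point can be checked with a small sketch. The likelihood ratios below are made up for illustration, and the code assumes the pieces of evidence are conditionally independent given the hypothesis, so that each one can be represented by a single fixed likelihood ratio. Under those assumptions, every ordering of the evidence, and the 'single bite' version, lands on exactly the same posterior.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def update(probability, likelihood_ratio):
    """One Bayesian update in odds form: posterior odds = prior odds * LR."""
    odds = probability / (1 - probability)
    posterior_odds = odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

def update_sequentially(prior, likelihood_ratios):
    """Apply the updates one at a time, in the order given."""
    probability = prior
    for lr in likelihood_ratios:
        probability = update(probability, lr)
    return probability

# Hypothetical numbers: a 7% prior and three pieces of evidence.
prior = Fraction(7, 100)
evidence = [Fraction(3), Fraction(1, 2), Fraction(5, 4)]

# Every possible ordering of the same evidence.
posteriors = {update_sequentially(prior, order) for order in permutations(evidence)}

# All the evidence at once: a single update with the combined ratio.
all_at_once = update(prior, prod(evidence))

print(posteriors, all_at_once)
```

Exact fractions are used rather than floats so that the comparison between orderings is not muddied by rounding.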

Frequency / Belief

The frequency / belief distinction tries to draw a division between evidence derived from statistical frequencies, and that derived from probabilistic judgements. This is a distinction which echoes the debate between Bayesians and frequentists, and rather tries to accommodate both schools in the same faculty. The idea is that 'background' information is derived from statistical frequencies, while 'specific' information is derived instead from Bayesian-style 'beliefs' that may not be statistical in nature. In the Las Vegas example, we were effectively picking a single day (tomorrow) from a sort of bag of days of which 7 out of 100 were rainy ones. In the London example, we were using a set of information that only applied to the specific tomorrow in question, and which had never been seen before nor would be likely to be seen again. In other words, 'specific' information is of a kind such that there is a reference class consisting of just one situation, to wit the one you are in.

The problem with this approach to dividing 'background' and 'specific' is rather involved, but it boils down to this: it's always possible to decompose supposedly-'specific' information into a conjunction of individual items of information that are associated with statistical frequencies. In fact, if we couldn't, it wouldn't be possible to form any probabilistic judgements about the situation at all. For example, suppose we are thinking about a fingerprint left at the scene of a crime. It is a unique fingerprint, never before seen and never to be repeated. But our ability to reason about it, and its relationship to (for example) other fingerprints taken elsewhere or in our database, depends on our being able to decompose the 'fingerprint' into a number of more-abstract features, such as its size, or distinctive elements in its pattern, that allow us to make comparisons with other similar objects.

To be considered data at all, in other words, superficially-unique 'specific' pieces of information must be capable of being treated as composites of several 'background' pieces of information. So the distinction between 'frequency' based information (with large reference classes) and 'belief' based information (reference classes of 1) is not supportable on a theoretical level. 
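A toy illustration of the decomposition point, with entirely made-up numbers: the feature names and frequencies below are hypothetical, and treating the features as independent is a simplifying assumption rather than a claim about real fingerprints. The sketch shows how an object that has never been seen before can still be reasoned about probabilistically, because each of its more abstract features belongs to a large reference class.

```python
from fractions import Fraction
from math import prod

# Hypothetical population frequencies for abstract features of a print.
feature_frequency = {
    "whorl pattern": Fraction(3, 10),
    "ridge count in a given band": Fraction(1, 10),
    "particular minutia layout": Fraction(1, 100),
}

def joint_frequency(frequencies):
    """Frequency of the whole combination, assuming independent features."""
    return prod(frequencies.values())

def expected_matches(frequencies, population):
    """Expected number of people in the population sharing every feature."""
    return joint_frequency(frequencies) * population

combined = joint_frequency(feature_frequency)
matches_in_town = expected_matches(feature_frequency, 1000)

print(combined, matches_in_town)
```

The combination may be rare enough that only one instance ever turns up, but the number attached to it is built entirely from 'background' frequencies.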

Other ways of distinguishing 'background' from 'specific' information

There may be as-yet unidentified, coherent ways of distinguishing background information from specific information, but it seems unlikely: given how useful any such division would be in (for example) legal or medical contexts, someone would have thought of it by now.

Does it accord with analysts' usage?

The short answer is: not in general. None of the respondents in our survey advanced it as a putative definition of 'confidence'. It's included here because of personal experience working with intelligence analysts, some of whom make use of a working distinction between 'background' evidence and 'specific' intelligence, and for whom the link to 'confidence' seems palpable.

Is it decision-relevant?

Would the distinction between 'background' and 'specific' information, even if coherent, have relevance to a decision-maker in addition to the probabilities those types of information supported? As with other theories that seek to ground 'confidence' in the nature of the evidence, the answer is 'no'. The reason, as before, is that the probability already summarises the strength of the evidence supporting a hypothesis, and optimal decision-making involves looking more-or-less only at outcomes and probabilities. It's rather like the distinction between a ton of lead and a ton of feathers: if what you're interested in is whether your truck will be able to carry it, the weight and not the composition is the important thing.
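The lead-and-feathers point can be put in expected-utility terms. This is a minimal sketch with hypothetical payoffs for an umbrella decision; the only thing it is meant to show is that the choice function takes a probability and a set of outcomes as inputs, and has no slot for where the probability came from.

```python
def expected_utility(p_rain, utilities):
    """Probability-weighted average of the utilities of the two outcomes."""
    return p_rain * utilities["rain"] + (1 - p_rain) * utilities["dry"]

def best_action(p_rain, payoff_table):
    """Pick the action with the highest expected utility."""
    return max(payoff_table, key=lambda action: expected_utility(p_rain, payoff_table[action]))

# Hypothetical payoffs: carrying an umbrella is a small fixed nuisance,
# getting caught in the rain without one is much worse.
payoffs = {
    "take umbrella": {"rain": -1, "dry": -1},
    "leave umbrella": {"rain": -10, "dry": 0},
}

# Two assessments with different evidence behind them but the same probability.
assessments = {
    "Las Vegas (annual frequency)": 0.07,
    "London (satellite imagery)": 0.07,
}

decisions = {place: best_action(p, payoffs) for place, p in assessments.items()}
print(decisions)
```

With these payoffs both assessments lead to the same choice, because the evidence type never enters the calculation.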


The idea that 'confidence' somehow captures the weight of 'background' compared to 'specific' data fails all three tests.

In the next post we'll look at our penultimate theory: the 'expected value' theory, which posits that 'confidence' measures the potential value of new information. This involves taking an economic approach to analysis, which seeks to understand its value added in terms of the impact it has on decision-making.