## Wednesday, 22 October 2014

### A Simple Base Rate Formula

A number of studies have identified 'base rate neglect' as a significant factor undermining forecasting performance, even among experts.  A 'base rate' is not easy to define precisely, but in essence it's a sort of 'information-lite' probability which you might assign to something if you had no specific information about the thing in question.  For example, since North Korea has conducted three nuclear tests in roughly the last nine years, a base rate for another test in the next year would be about one in three, or 33%.  If you're asked to make a judgement about the probability of an event in a forthcoming period of time, you should first construct a base rate, then use your knowledge of the specifics to adjust the probability up or down.  It seems simplistic, but anchoring to a base rate has been shown to improve forecasting performance significantly.

If your arithmetic is rusty, you can use the following simple formula to get a base rate for the occurrence of a defined event:

How far AHEAD are you looking?  Call this 'A'.
How far BACK can you look to identify comparable events?  Call this 'B'.  (Make sure the units are the same as for 'A' - e.g. months, years.)
What NUMBER of events of this kind have happened over this timeframe, anywhere?  Call this 'N'.
How big is the POPULATION of entities of which your subject of interest is a part?  Call this 'P'.

Your starting base rate is then given by: (A x N) / (B x P)
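
If you'd rather let a computer do the arithmetic, here is a minimal sketch of the formula in Python.  The function name `base_rate` and its parameter names are my own labels for A, B, N and P, not part of any library.  The usage line plugs in the North Korea example from earlier, assuming a look-ahead of one year and a population of one country.

```python
def base_rate(ahead, back, n_events, population=1):
    """Starting base rate: (A x N) / (B x P).

    ahead      -- A, how far ahead you are looking
    back       -- B, how far back the data go (same units as A)
    n_events   -- N, number of comparable events in that period
    population -- P, number of entities the events are spread over
    """
    if back <= 0 or population <= 0:
        raise ValueError("back and population must be positive")
    return (ahead * n_events) / (back * population)


# North Korea: 3 tests in about 9 years, looking 1 year ahead, 1 country.
north_korea = base_rate(ahead=1, back=9, n_events=3, population=1)
print(f"{north_korea:.2f}")  # 0.33
```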

For example, suppose we were interested in the probability of a successful coup in Iran in the next five years.

How far AHEAD are we looking? 5 (years)
How far BACK can we look to identify comparable events?  68 (years)
What NUMBER of events of this kind have happened over this timeframe?  223 (successful coups since 1946, according to the Center for Systemic Peace)
How big is the POPULATION of entities (countries, in this case) of which Iran is a part?  The data cover 165 countries.

The base rate is therefore: (5 x 223) / (68 x 165) = 0.099, or 9.9%, or more appropriately 'about one in ten'.
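
The same calculation can be checked in a few lines of Python.  This is just the arithmetic above written out; the variable names are illustrative.

```python
ahead_years = 5    # A: forecast horizon
back_years = 68    # B: length of the historical record
coups = 223        # N: successful coups in that record
countries = 165    # P: countries covered by the data

rate = (ahead_years * coups) / (back_years * countries)

print(f"{rate:.3f}")                     # 0.099
print(f"about 1 in {round(1 / rate)}")   # about 1 in 10
```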

Remember this is just a starting point, not a forecast.  And there isn't just one base rate for an event - it will depend on how you classify the event and how good your data are.  But doing this simple step first will help mitigate a significant bias.

(NB. If you're dealing with events that have no precedents, or if the events are relatively frequent compared to your forecast horizon, you have a different problem on your hands and shouldn't use a simple formula like the one above.)
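
To see why frequent events are a problem, note that (A x N) / (B x P) is really an expected number of events, which only approximates a probability when it is small.  The sketch below illustrates this.  It compares the simple formula with one possible alternative, the chance of at least one event if events arrive independently at a constant rate (a Poisson assumption of mine, not something the formula above relies on).  The 'frequent' figures are made up purely for illustration.

```python
import math


def linear_base_rate(ahead, back, n_events, population=1):
    """The simple formula: (A x N) / (B x P), an expected event count."""
    return (ahead * n_events) / (back * population)


def at_least_one(expected_count):
    """P(at least one event), ASSUMING independent, constant-rate (Poisson) arrivals."""
    return 1.0 - math.exp(-expected_count)


# Rare events (the coup example): the two numbers nearly agree.
rare = linear_base_rate(5, 68, 223, 165)
print(f"{rare:.3f} vs {at_least_one(rare):.3f}")

# Hypothetical frequent events: 30 events in 10 years, looking 5 years ahead.
frequent = linear_base_rate(5, 10, 30, 1)
print(frequent)                 # 15.0, which cannot be a probability
print(at_least_one(frequent))   # stays below 1
```

When the expected count is well below one, the simple formula is a fine starting point; once it approaches or exceeds one, it stops being usable as a probability at all.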