If your arithmetic is rusty, you can use the following simple formula to get a base rate for the occurrence of a defined event:
How far AHEAD are you looking? Call this 'A'.
How far BACK can you look to identify comparable events? Call this 'B'. (Make sure the units are the same as for 'A' - e.g. months, years.)
What NUMBER of events of this kind have happened over this timeframe, anywhere in the population? Call this 'N'.
How big is the POPULATION of entities of which your subject of interest is a part? Call this 'P'.
Your starting base rate is then given by: (A x N) / (B x P)
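For readers who would rather let a computer do the arithmetic, the formula can be sketched as a small Python function. The function name, argument names and the sample numbers below are illustrative assumptions, not part of the original method.

```python
def base_rate(ahead, back, n_events, population):
    """Starting base rate: (A x N) / (B x P).

    ahead      -- A, how far ahead you are looking
    back       -- B, how far back the data go (same units as A)
    n_events   -- N, number of comparable events in that window
    population -- P, number of entities the events could have happened to
    """
    if back <= 0 or population <= 0:
        raise ValueError("back and population must be positive")
    return (ahead * n_events) / (back * population)


# Hypothetical numbers, chosen only to show the call:
# looking 2 years ahead, 10 years of data, 5 events, 20 entities.
rate = base_rate(ahead=2, back=10, n_events=5, population=20)
print(rate)  # (2 * 5) / (10 * 20) = 0.05
```

Keeping the four inputs as separate named arguments makes it harder to mix up A with B, or N with P.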
For example, suppose we were interested in the probability of a successful coup in Iran in the next five years.
How far AHEAD are we looking? 5 (years)
How far BACK can we look to identify comparable events? 68 (years)
What NUMBER of events of this kind have happened over this timeframe? 223 (successful coups since 1946, according to the Center for Systemic Peace)
How big is the POPULATION of entities (countries, in this case) of which Iran is a part? The data cover 165 countries.
The base rate is therefore: (5 x 223) / (68 x 165) = 0.099, or 9.9%, or more appropriately 'about one in ten'.
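The same arithmetic can be checked in a few lines of Python. The variable names are illustrative; the numbers are the ones quoted in the example. Splitting out the intermediate 'per country per year' figure is an extra step added here to show where the result comes from.

```python
ahead_years = 5      # A
back_years = 68      # B
n_coups = 223        # N
n_countries = 165    # P

# Average number of successful coups per country per year in the data.
per_country_year = n_coups / (back_years * n_countries)

# Scale up to the five-year forecast horizon.
rate = ahead_years * per_country_year

print(f"{per_country_year:.4f}")  # 0.0199
print(f"{rate:.3f}")              # 0.099
print(f"about one in {round(1 / rate)}")  # about one in 10
```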
Remember this is just a starting point, not a forecast. And there isn't just one base rate for an event - it will depend on how you classify the event and how good your data are. But doing this simple step first will help mitigate a significant bias.
(NB. If you're dealing with events that have no precedents, or with events that are frequent relative to your forecast horizon, so that the formula returns a value close to or above 1, you have a different problem on your hands and shouldn't use a simple formula like the one above.)
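To see why frequent events break the simple formula, the sketch below compares it with one common alternative: treating events as arriving at a constant average rate (a Poisson assumption) and asking for the chance of at least one. This alternative is an assumption introduced here for illustration only; it is not part of the method described above, and the 'frequent' numbers are made up.

```python
import math


def simple_rate(ahead, back, n_events, population):
    # The simple formula: (A x N) / (B x P).
    return (ahead * n_events) / (back * population)


def at_least_one(ahead, back, n_events, population):
    # Assumption: events follow a Poisson process with the same average rate.
    # The simple formula's value is then the expected number of events,
    # and P(at least one) = 1 - exp(-expected).
    expected = simple_rate(ahead, back, n_events, population)
    return 1.0 - math.exp(-expected)


# Rare events (the coup example): the two answers are close.
print(f"{simple_rate(5, 68, 223, 165):.3f}")   # 0.099
print(f"{at_least_one(5, 68, 223, 165):.3f}")  # 0.095

# Hypothetical frequent events: the simple formula exceeds 1,
# so it can no longer be read as a probability.
print(f"{simple_rate(5, 10, 300, 100):.3f}")   # 1.500
print(f"{at_least_one(5, 10, 300, 100):.3f}")  # 0.777
```

When the simple formula gives a small number, the two approaches nearly agree, which is why it works as a quick starting point for rare events.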