Probabilistic forecasts for anomaly detection


3 July 2024


International Symposium on Forecasting, Dijon, France


A very inaccurate forecast is sometimes the result of a poor forecasting model, but it can also arise when an unusual observation occurs. I will discuss the latter situation, where a good forecasting model can be used to identify anomalies. The approach is to produce a probabilistic forecast and compute “density scores”, defined as the negative log likelihood of each observation under its forecast distribution. The density score measures how anomalous an observation is, given the forecast density: a large score indicates that the observation is unlikely, and so is a potential anomaly, while typical values have low scores. A Generalized Pareto Distribution is fitted to the largest density scores to estimate the probability of each observation being an anomaly. Applications to pharmaceutical scripts and mortality data will illustrate the ideas using the fable and weird R packages.
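As a rough illustration of the steps described in the abstract, the following Python sketch computes density scores and then fits a Generalized Pareto Distribution to the largest of them. It is not the fable or weird API and not necessarily how the talk implements the method. It assumes Normal one-step forecast distributions, a threshold at the 90% quantile of the scores, and a simple method-of-moments GPD fit, and the function names `density_scores` and `gpd_tail_probs` are hypothetical.

```python
import math
import random
import statistics


def density_scores(y, mu, sigma):
    """Negative log density of each observation under an assumed
    Normal(mu_t, sigma_t) forecast distribution."""
    return [
        0.5 * math.log(2 * math.pi * s ** 2) + (obs - m) ** 2 / (2 * s ** 2)
        for obs, m, s in zip(y, mu, sigma)
    ]


def gpd_tail_probs(scores, threshold_quantile=0.90):
    """Estimate P(S > s) for every score s.

    Scores above the threshold use a GPD fitted to the exceedances
    (method of moments, an illustrative choice that needs shape < 0.5).
    Scores at or below the threshold fall back to the empirical proportion.
    """
    n = len(scores)
    u = sorted(scores)[int(threshold_quantile * n)]  # high threshold
    excess = [s - u for s in scores if s > u]
    m = statistics.fmean(excess)
    v = statistics.variance(excess)
    xi = 0.5 * (1 - m * m / v)          # GPD shape estimate
    beta = 0.5 * m * (m * m / v + 1)    # GPD scale estimate
    p_exceed = len(excess) / n          # P(S > u)

    probs = []
    for s in scores:
        if s <= u:
            # Not in the tail: empirical proportion of scores at least this large.
            probs.append(sum(t >= s for t in scores) / n)
            continue
        z = 1 + xi * (s - u) / beta
        if z <= 0:
            surv = 0.0                   # beyond the fitted upper endpoint
        elif abs(xi) < 1e-9:
            surv = math.exp(-(s - u) / beta)  # exponential limit as xi -> 0
        else:
            surv = z ** (-1 / xi)
        probs.append(p_exceed * surv)
    return probs


# Simulated example: 500 observations that match the forecast, plus one anomaly.
random.seed(1)
y = [random.gauss(0, 1) for _ in range(500)]
y[250] = 8.0                             # injected anomaly (made-up value)
mu = [0.0] * len(y)                      # assumed forecast means
sigma = [1.0] * len(y)                   # assumed forecast standard deviations

scores = density_scores(y, mu, sigma)
probs = gpd_tail_probs(scores)
most_anomalous = min(range(len(y)), key=lambda i: probs[i])
print(most_anomalous, probs[most_anomalous])
```

A small tail probability marks an observation as a potential anomaly. Note that in this sketch the injected anomaly is itself included in the GPD fit, which inflates the estimated tail; a more careful implementation might use a more robust fitting procedure.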

