The thief package for R: Temporal HIErarchical Forecasting

Date

22 August 2016

Topics
forecasting
R
hts
seasonality
time series

I have a new R package available to do temporal hierarchical forecasting, based on my paper with George Athanasopoulos, Nikolaos Kourentzes and Fotios Petropoulos. (Guess the odd guy out there!)

It is called “thief” - an acronym for Temporal HIErarchical Forecasting. The idea is to take a seasonal time series, and compute all possible temporal aggregations that result in an integer number of observations per year. For example, a quarterly time series is aggregated to biannual and annual; while a monthly time series is aggregated to 2-monthly, quarterly, 4-monthly, biannual and annual. Each of the resulting time series are forecast, and then the forecasts are reconciled using the hierarchical reconciliation algorithm described in our paper.

It turns out that this tends to give better forecasts, even though no new information has been added, especially for time series with long seasonal periods. It also allows different types of forecasts for different forecast horizons to be combined in a consistent manner.

The main function is thief which can be used like any of the forecast functions from the forecast package. For example:

library(thief)
library(ggplot2)
fc <- thief(USAccDeaths)
autoplot(fc)

Under the hood, the thief function computes the temporal aggregates, then calculates all forecasts (using ets by default), and finally reconciles the forecasts using the approach described in our paper. The reconciled forecasts corresponding to the original series are returned.

There are options for computing forecasts using other methods such as auto.arima, the Theta method, or a user-defined forecast function.

There are also functions for doing each part separately: tsaggregates will compute all aggregated time series, reconcilethief will reconcile a list of time series (or a list of forecast objects) of different levels of temporal aggregation.

Here is an example using these lower-level functions. This example is in the paper, and uses weekly data (which we treat as having seasonal period of 52).

# Original series: Total Weekly Emergency Admissions in the UK
total <- AEdemand[,12]

# Construct all temporal aggregates
totalagg <- tsaggregates(total)
autoplot(totalagg, main="Total Emergency Admissions")

# Compute base forecasts
base <- list()
for(i in seq_along(totalagg))
  base[[i]] <- forecast(auto.arima(totalagg[[i]]),
                 h=2*frequency(totalagg[[i]]), level=80)

# Reconcile forecasts
reconciled <- reconcilethief(base)

#Plot original and reconciled forecasts
par(mfrow=c(2,3), mar=c(3,3,1,0))
for(i in 6:1)
{
  plot(reconciled[[i]], main=names(totalagg)[i],
       xlim=c(2010.5,2017.5), flwd=1)
  lines(base[[i]]$mean, col="red")
}

The red line represents the original base forecasts and the blue line the reconciled forecasts. Observe how different the base forecasts are across the aggregation levels. At the annual level there is no trend captured in the forecast due to the limited fitting sample, and there is a weak trend captured in the weekly forecasts due to the noisy data. But the trend is quite strong at the intermediate temporal aggregation levels. Similarly seasonality is captured relatively accurately in the quarterly, monthly and bi-weekly levels, but not at the weekly level. Once these are reconciled, the trend at the annual and weekly levels are stronger, and the weekly seasonality looks more reasonable.

For now, the package is available on github only. It will probably migrate to CRAN soon.