Derivations of forecast variance for benchmark methods

8 December 2022 · Topics: forecasting

In my forecasting textbook coauthored with George Athanasopoulos, we provide formulas for the forecast variances of four simple benchmark forecasting methods, but we don’t explain where they come from. So here are the derivations.

We assume that the residuals from the method are uncorrelated and homoscedastic, with mean 0 and variance \sigma^2. Let y_1,\dots,y_T denote the time series observations, and let \hat{y}_{T+h|T} be the estimated forecast mean (or point forecast) at horizon h. Then we can write
\begin{equation}
y_t = \hat{y}_{t|t-1} + \varepsilon_t, \tag{1}
\end{equation}
where \{\varepsilon_t\} is a white noise process. Let \hat\sigma_h^2 be the estimated h-step forecast variance.
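These variances are what give prediction intervals their width: under Gaussian errors, a central interval is \hat{y}_{T+h|T} \pm z \hat\sigma_h. Here is a minimal sketch of that last step (the helper name is mine, not from the book; it assumes numpy and scipy are available):

```python
import numpy as np
from scipy.stats import norm

def prediction_interval(point_forecast, sigma2_h, level=0.95):
    """Gaussian prediction interval from a point forecast and an
    h-step forecast variance (illustrative helper, not from the book)."""
    z = norm.ppf(0.5 + level / 2)           # e.g. 1.96 for a 95% interval
    half_width = z * np.sqrt(sigma2_h)
    return point_forecast - half_width, point_forecast + half_width
```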

Random walk/Naive method

For a random walk, Equation 1 suggests that the appropriate model is y_t = y_{t-1} + \varepsilon_t. Therefore
\begin{align*}
y_{T+1} &= y_T + \varepsilon_{T+1} \\
y_{T+2} &= y_{T+1} + \varepsilon_{T+2} = y_T + \varepsilon_{T+1} + \varepsilon_{T+2}\\
\vdots \\
y_{T+h} &= y_T + \sum_{i=1}^h \varepsilon_{T+i}.
\end{align*}
Consequently
\begin{align*}
\hat{y}_{T+h|T} &= E(y_{T+h} \mid y_1,\dots,y_T) = y_T\\
\text{and}\qquad \hat{\sigma}^2_{h} &= V(y_{T+h} \mid y_1,\dots,y_T) = h\sigma^2.
\end{align*}
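The result is easy to check by simulation. A quick sketch (my own, assuming numpy), comparing the empirical variance of naive forecast errors against h\sigma^2:

```python
import numpy as np

# Monte Carlo check (my own sketch): for a random walk, the error of the
# naive forecast y_T at horizon h should have variance h * sigma^2.
rng = np.random.default_rng(1)
sigma, T, h, reps = 1.0, 100, 5, 20_000
errors = np.empty(reps)
for r in range(reps):
    y = np.cumsum(rng.normal(0.0, sigma, T + h))  # simulated random walk
    errors[r] = y[T + h - 1] - y[T - 1]           # y_{T+h} minus forecast y_T
print(np.var(errors), h * sigma**2)               # both approximately 5.0
```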

Seasonal naive method

Here the model is y_t = y_{t-m} + \varepsilon_t, where m is the seasonal period. Thus
\begin{align*}
y_{T+1} &= y_{T+1-m} + \varepsilon_{T+1} \\
y_{T+2} &= y_{T+2-m} + \varepsilon_{T+2} \\
\vdots \\
y_{T+m} &= y_{T} + \varepsilon_{T+m} \\
y_{T+m+1} &= y_{T+1} + \varepsilon_{T+m+1} = y_{T+1-m} + \varepsilon_{T+1} + \varepsilon_{T+m+1}\\
\vdots\\
y_{T+2m} &= y_{T+m} + \varepsilon_{T+2m} = y_{T} + \varepsilon_{T+m} + \varepsilon_{T+2m}\\
y_{T+2m+1} &= y_{T+m+1} + \varepsilon_{T+2m+1} = y_{T+1-m} + \varepsilon_{T+1} + \varepsilon_{T+m+1} + \varepsilon_{T+2m+1}\\
\vdots\\
y_{T+h} &= y_{T+h-m(k+1)} + \sum_{i=0}^{k} \varepsilon_{T+h+m(i-k)},
\end{align*}
where k is the integer part of (h-1)/m (i.e., the number of complete years in the forecast period prior to time T+h). Therefore
\begin{align*}
\hat{y}_{T+h|T} &= E(y_{T+h} \mid y_1,\dots,y_T) = y_{T+h-m(k+1)}\\
\text{and}\qquad \hat{\sigma}^2_{h} &= V(y_{T+h} \mid y_1,\dots,y_T) = (k+1)\sigma^2.
\end{align*}
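The only fiddly part is k. For example, with monthly data (m = 12), horizons h = 1,\dots,12 reuse the last observed year and have variance \sigma^2, while h = 13,\dots,24 have variance 2\sigma^2. A short check of the indexing (illustrative only):

```python
m = 12                        # monthly data, so one "year" per seasonal cycle
for h in [1, 12, 13, 24, 25]:
    k = (h - 1) // m          # complete years before time T + h
    print(f"h={h:2d}  k={k}  variance = {k + 1} * sigma^2")
# h=1..12 -> k=0 (variance sigma^2); h=13..24 -> k=1 (2 sigma^2); h=25 -> k=2
```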

Mean method

The model underpinning the mean method is y_t = c + \varepsilon_t for some constant c to be estimated. The least-squares estimate of c is the mean, \hat{c} = \bar{y} = (y_1+\dots+y_T)/T. Thus,
\begin{align*}
y_{T+1} &= c + \varepsilon_{T+1} \\
y_{T+2} &= c + \varepsilon_{T+2} \\
\vdots \\
y_{T+h} &= c + \varepsilon_{T+h}.
\end{align*}
Therefore, since V(\bar{y}) = \sigma^2/T,
\begin{align*}
\hat{y}_{T+h|T} &= E(y_{T+h} \mid y_1,\dots,y_T) = \hat{c} = \bar{y}\\
\text{and}\qquad \hat{\sigma}^2_{h} &= V(y_{T+h} \mid y_1,\dots,y_T) = V(\hat{c}) + \sigma^2 = (T^{-1} + 1)\sigma^2.
\end{align*}
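The extra T^{-1}\sigma^2 term reflects the uncertainty in estimating c by \bar{y}. A simulation sketch (mine, not from the book) for h = 1:

```python
import numpy as np

# Simulation sketch (not from the book): with the mean method, the one-step
# error is (c - ybar) + eps_{T+1}, so its variance should be (1/T + 1) * sigma^2.
rng = np.random.default_rng(2)
c, sigma, T, reps = 10.0, 1.0, 50, 20_000
errors = np.empty(reps)
for r in range(reps):
    y = c + rng.normal(0.0, sigma, T + 1)   # T observations plus y_{T+1}
    errors[r] = y[-1] - y[:-1].mean()       # y_{T+1} minus forecast ybar
print(np.var(errors), (1 / T + 1) * sigma**2)  # both approximately 1.02
```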

Drift method

For a random walk with drift, y_{t} = c + y_{t-1} + \varepsilon_t. Therefore,
\begin{align*}
y_{T+1} &= c + y_T + \varepsilon_{T+1} \\
y_{T+2} &= c + y_{T+1} + \varepsilon_{T+2} = 2c + y_T + \varepsilon_{T+1} + \varepsilon_{T+2}\\
\vdots \\
y_{T+h} &= hc + y_T + \sum_{i=1}^h \varepsilon_{T+i}.
\end{align*}
Now the least-squares estimate of c is \hat{c} = (y_{T}-y_1)/(T-1), so that \hat{c} - c = \frac{1}{T-1}\sum_{t=2}^T \varepsilon_t and V(\hat{c}) = \sigma^2/(T-1). Therefore
\begin{align*}
\hat{y}_{T+h|T} &= E(y_{T+h} \mid y_1,\dots,y_T) = h\hat{c} + y_T\\
\text{and}\qquad \hat{\sigma}^2_{h} &= V(y_{T+h} \mid y_1,\dots,y_T) = h^2 V(\hat{c}) + h\sigma^2 = \frac{h^2\sigma^2}{T-1} + h\sigma^2.
\end{align*}
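Note that the h^2 factor comes from multiplying the estimation error \hat{c} - c by h, so the intervals widen faster than for the plain random walk. A simulation check (my own sketch, assuming numpy):

```python
import numpy as np

# Simulation sketch (mine): the drift-method forecast error at horizon h
# should have variance h^2 * sigma^2 / (T - 1) + h * sigma^2.
rng = np.random.default_rng(3)
c, sigma, T, h, reps = 0.5, 1.0, 60, 4, 20_000
errors = np.empty(reps)
for r in range(reps):
    y = np.cumsum(c + rng.normal(0.0, sigma, T + h))  # random walk with drift
    chat = (y[T - 1] - y[0]) / (T - 1)                # least-squares drift estimate
    errors[r] = y[T + h - 1] - (h * chat + y[T - 1])  # error of h*chat + y_T
print(np.var(errors), h**2 * sigma**2 / (T - 1) + h * sigma**2)  # both ~4.27
```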