Non-Gaussian forecasting using fable
library(tidyverse)
library(tsibble)
library(lubridate)
library(feasts)
library(fable)
In my previous post about the new fable package, we saw how fable can produce forecast distributions, not just point forecasts. All my examples used Gaussian (normal) distributions, so in this post I want to show how non-Gaussian forecasting can be done.
As an example, we will use eating-out expenditure in my home state of Victoria.
vic_cafe <- tsibbledata::aus_retail %>%
  filter(
    State == "Victoria",
    Industry == "Cafes, restaurants and catering services"
  ) %>%
  select(Month, Turnover)
vic_cafe %>%
  autoplot(Turnover) + ggtitle("Monthly turnover of Victorian cafes")
Forecasting with transformations
Clearly the variance is increasing with the level of the series, so we will consider modelling a Box-Cox transformation of the data.
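As a reminder, for strictly positive data such as this series, the Box-Cox transformation is
\[ w_t = \begin{cases} \log(y_t) & \text{if } \lambda = 0, \\ (y_t^\lambda - 1)/\lambda & \text{otherwise,} \end{cases} \]
so \(\lambda = 0\) corresponds to a log transformation, while \(\lambda = 1\) leaves the shape of the data unchanged.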
vic_cafe %>% autoplot(box_cox(Turnover, lambda = 0.2))
The variance now looks more homogeneous across the series, allowing us to fit an additive model. I chose the value of \(\lambda=0.2\) by eye, but you can use the guerrero function for an automated approach.
vic_cafe %>% features(Turnover, guerrero)
## # A tibble: 1 x 1
## lambda_guerrero
## <dbl>
## 1 0.124
It suggests something slightly smaller, but I will stick with 0.2.
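If you would rather use the automated estimate, you can extract it and pass it along. A minimal sketch (the lambda object name is my own):
lambda <- vic_cafe %>%
  features(Turnover, guerrero) %>%
  pull(lambda_guerrero)
vic_cafe %>% autoplot(box_cox(Turnover, lambda))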
Now to fit a model. For this post I will use ETS models.
fit <- vic_cafe %>%
  model(ets = ETS(box_cox(Turnover, 0.2)))
fit
## # A mable: 1 x 1
## ets
## <model>
## 1 <ETS(A,A,A)>
An ETS(A,A,A), or additive Holt-Winters model, has been selected for the transformed data.
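If you want to inspect the estimated smoothing parameters, report() will print a model summary (output omitted here):
fit %>% report()
We can produce forecasts in the usual way.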
fc <- fit %>% forecast(h = "3 years")
fc
## # A fable: 36 x 4 [1M]
## # Key: .model [1]
## .model Month Turnover .mean
## <chr> <mth> <dist> <dbl>
## 1 ets 2019 Jan t(N(13, 0.02)) 608.
## 2 ets 2019 Feb t(N(13, 0.028)) 563.
## 3 ets 2019 Mar t(N(13, 0.036)) 629.
## 4 ets 2019 Apr t(N(13, 0.044)) 615.
## 5 ets 2019 May t(N(13, 0.052)) 613.
## 6 ets 2019 Jun t(N(13, 0.061)) 593.
## 7 ets 2019 Jul t(N(13, 0.069)) 624.
## 8 ets 2019 Aug t(N(13, 0.077)) 640.
## 9 ets 2019 Sep t(N(13, 0.085)) 630.
## 10 ets 2019 Oct t(N(13, 0.093)) 642.
## # … with 26 more rows
Note that the distributions are given as transformed normal, denoted by t(N). The point forecast (in column Turnover) is the mean of this distribution. The back-transformation and bias adjustment are done automatically.
One particularly clever part of the package (thanks to Mitchell O’Hara-Wild) is that you can use any transformation in the model() function, and the bias adjustment is computed based on a Taylor series expansion using numerical derivatives. So you will always get the approximate mean as the point forecast, even when using some exotic transformation for which you have no analytic expression for the bias.
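The bias adjustment here is the familiar second-order approximation \(\mathrm{E}[f(X)] \approx f(\mu) + \frac{\sigma^2}{2} f''(\mu)\), where \(f\) is the back-transformation, with the second derivative obtained numerically. To try your own transformation, fabletools provides new_transformation(), which pairs a transformation with its inverse. Here is a sketch using a scaled logit; the bounds of 0 and 1200 are illustrative choices for this series, not part of the original analysis:
scaled_logit <- function(x, lower = 0, upper = 1200) {
  log((x - lower) / (upper - x))
}
inv_scaled_logit <- function(x, lower = 0, upper = 1200) {
  (upper - lower) * exp(x) / (1 + exp(x)) + lower
}
my_scaled_logit <- new_transformation(scaled_logit, inv_scaled_logit)
fit_logit <- vic_cafe %>%
  model(ets = ETS(my_scaled_logit(Turnover)))
Returning to the Box-Cox model, we can plot the forecasts.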
fc %>% autoplot(vic_cafe)
Bootstrapped prediction intervals
In the preceding analysis, there was still a normality assumption for the residuals of the model applied to the transformed data. If you want to avoid that as well, you can use bootstrapped intervals. These are constructed from simulated future sample paths where the residuals are resampled as possible future errors.
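In other words, each future error is drawn (with replacement) from the model's own residuals, which here live on the transformed scale. A conceptual sketch of the resampling step (generate() handles all of this for you):
# Residuals are on the Box-Cox scale because the model was fitted to
# transformed data.
res <- residuals(fit)$.resid
future_errors <- sample(res, size = 36, replace = TRUE) # 3 years of months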
We can simulate future sample paths using the generate() function.
sim <- fit %>% generate(h = "3 years", times = 5, bootstrap = TRUE)
sim
## # A tsibble: 180 x 4 [1M]
## # Key: .model, .rep [5]
## .model Month .rep .sim
## <chr> <mth> <chr> <dbl>
## 1 ets 2019 Jan 1 583.
## 2 ets 2019 Feb 1 600.
## 3 ets 2019 Mar 1 679.
## 4 ets 2019 Apr 1 669.
## 5 ets 2019 May 1 671.
## 6 ets 2019 Jun 1 624.
## 7 ets 2019 Jul 1 644.
## 8 ets 2019 Aug 1 689.
## 9 ets 2019 Sep 1 702.
## 10 ets 2019 Oct 1 690.
## # … with 170 more rows
Here we have generated five possible sample paths for future months. The .rep variable provides a new key for the tsibble. The back-transformation of the sample paths is handled automatically. If we had multiple models, each would be used to generate future sample paths provided the corresponding generate function existed. (In the current version of fable, we have not yet implemented this for ARIMA models.)
The plot below shows the five sample paths along with the last few years of historical data.
vic_cafe %>%
  filter(year(Month) >= 2008) %>%
  ggplot(aes(x = Month)) +
  geom_line(aes(y = Turnover)) +
  geom_line(aes(y = .sim, colour = as.factor(.rep)), data = sim) +
  ggtitle("Monthly turnover of Victorian cafes") +
  guides(col = FALSE)
Prediction intervals are calculated using percentiles of the future sample paths. This is all built into the forecast() function so you do not need to call generate() directly.
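To see roughly what happens under the hood, here is a sketch that computes a 95% interval by hand from simulated paths (the 1000 repetitions and the object names are my own choices):
sim_paths <- fit %>% generate(h = "3 years", times = 1000, bootstrap = TRUE)
sim_paths %>%
  as_tibble() %>%
  group_by(Month) %>%
  summarise(
    lower = quantile(.sim, 0.025),
    upper = quantile(.sim, 0.975)
  )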
fc <- fit %>% forecast(h = "3 years", bootstrap = TRUE)
fc
## # A fable: 36 x 4 [1M]
## # Key: .model [1]
## .model Month Turnover .mean
## <chr> <mth> <dist> <dbl>
## 1 ets 2019 Jan t(sample[5000]) 608.
## 2 ets 2019 Feb t(sample[5000]) 563.
## 3 ets 2019 Mar t(sample[5000]) 628.
## 4 ets 2019 Apr t(sample[5000]) 615.
## 5 ets 2019 May t(sample[5000]) 612.
## 6 ets 2019 Jun t(sample[5000]) 593.
## 7 ets 2019 Jul t(sample[5000]) 624.
## 8 ets 2019 Aug t(sample[5000]) 640.
## 9 ets 2019 Sep t(sample[5000]) 630.
## 10 ets 2019 Oct t(sample[5000]) 642.
## # … with 26 more rows
Notice that the forecast distribution is now represented as a transformed simulation with 5000 sample paths. This default number can be modified via the times argument of forecast().
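For example, to use 10000 sample paths instead (at the cost of extra computation time):
fit %>% forecast(h = "3 years", bootstrap = TRUE, times = 10000)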
fc %>% autoplot(vic_cafe) +
  ggtitle("Monthly turnover of Victorian cafes")
In this example, the resulting forecast intervals are almost identical to those obtained when we assumed the residuals were normally distributed.
Accuracy calculations
We can check whether the bootstrapping helped by comparing the continuous ranked probability score (CRPS) values from both approaches after doing a training/test set split.
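The CRPS measures the distance between the whole forecast distribution \(F\) and the realised value \(y\),
\[ \text{CRPS}(F, y) = \int_{-\infty}^{\infty} \big(F(x) - \mathbb{1}\{x \ge y\}\big)^2 \, dx, \]
so smaller values are better; for a point forecast it reduces to the absolute error.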
train <- vic_cafe %>% filter(year(Month) <= 2014)
fit <- train %>%
  model(ets = ETS(box_cox(Turnover, 0.2)))
fit %>%
  forecast(h = "4 years", bootstrap = FALSE) %>%
  accuracy(vic_cafe,
    measures = distribution_accuracy_measures
  )
## # A tibble: 1 x 4
## .model .type percentile CRPS
## <chr> <chr> <dbl> <dbl>
## 1 ets Test 24.6 24.4
fit %>%
  forecast(h = "4 years", bootstrap = TRUE) %>%
  accuracy(vic_cafe,
    measures = distribution_accuracy_measures
  )
## # A tibble: 1 x 4
## .model .type percentile CRPS
## <chr> <chr> <dbl> <dbl>
## 1 ets Test 24.7 24.5
In this case it makes almost no difference which of the two approaches is used, so the non-bootstrap approach is preferred because it is much faster.