Non-Gaussian forecasting using fable

library(tidyverse)
library(tsibble)
library(lubridate)
library(feasts)
library(fable)
In my previous post about the new fable package, we saw how fable can produce forecast distributions, not just point forecasts. All my examples used Gaussian (normal) distributions, so in this post I want to show how non-Gaussian forecasting can be done.
As an example, we will use eating-out expenditure in my home state of Victoria.
vic_cafe <- tsibbledata::aus_retail %>%
  filter(
    State == "Victoria",
    Industry == "Cafes, restaurants and catering services"
  ) %>%
  select(Month, Turnover)
vic_cafe %>%
  autoplot(Turnover) + ggtitle("Monthly turnover of Victorian cafes")
Forecasting with transformations
Clearly the variance is increasing with the level of the series, so we will consider modelling a Box-Cox transformation of the data.
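For positive data such as turnover, the Box-Cox transformation with parameter \lambda is

$$
w_t =
\begin{cases}
\log(y_t), & \lambda = 0,\\
(y_t^\lambda - 1)/\lambda, & \lambda \neq 0,
\end{cases}
$$

so values of \lambda below 1 compress large values more than small ones, stabilising the variance.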
vic_cafe %>% autoplot(box_cox(Turnover, lambda = 0.2))
The variance now looks more homogeneous across the series, allowing us to fit an additive model. I chose the value \lambda = 0.2 by eye, but you can use the guerrero function for an automated approach.
vic_cafe %>% features(Turnover, guerrero)
# A tibble: 1 × 1
lambda_guerrero
<dbl>
1 0.173
It suggests something slightly smaller, but I will stick with 0.2.
Now to fit a model. For this post I will use ETS models.
fit <- vic_cafe %>%
  model(ets = ETS(box_cox(Turnover, 0.2)))
fit
# A mable: 1 x 1
ets
<model>
1 <ETS(A,A,A)>
An ETS(A,A,A), or additive Holt-Winters model, has been selected for the transformed data.
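For reference (this is standard notation, not fable output), the additive Holt-Winters model in innovations state space form is

$$
\begin{aligned}
y_t &= \ell_{t-1} + b_{t-1} + s_{t-m} + \varepsilon_t,\\
\ell_t &= \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t,\\
b_t &= b_{t-1} + \beta\varepsilon_t,\\
s_t &= s_{t-m} + \gamma\varepsilon_t,
\end{aligned}
$$

where y_t is the Box-Cox transformed turnover, m = 12 for monthly data, and \varepsilon_t is Gaussian white noise. We can produce forecasts in the usual way.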
fc <- fit %>% forecast(h = "3 years")
fc
# A fable: 36 x 4 [1M]
# Key: .model [1]
.model Month Turnover .mean
<chr> <mth> <dist> <dbl>
1 ets 2019 Jan t(N(13, 0.02)) 608.
2 ets 2019 Feb t(N(13, 0.028)) 563.
3 ets 2019 Mar t(N(13, 0.036)) 629.
4 ets 2019 Apr t(N(13, 0.044)) 615.
5 ets 2019 May t(N(13, 0.052)) 613.
6 ets 2019 Jun t(N(13, 0.061)) 593.
7 ets 2019 Jul t(N(13, 0.069)) 624.
8 ets 2019 Aug t(N(13, 0.077)) 640.
9 ets 2019 Sep t(N(13, 0.085)) 630.
10 ets 2019 Oct t(N(13, 0.093)) 642.
# … with 26 more rows
Note that the distributions are given as transformed normal, denoted by t(N). The point forecast (in the Turnover column) is the mean of this distribution. The back-transformation and bias adjustment are done automatically.
One particularly clever part of the package (thanks to Mitchell O'Hara-Wild) is that you can use any transformation in the model() function, and the bias adjustment is computed based on a Taylor series expansion using numerical derivatives. So you will always get the approximate mean as the point forecast, even when using some exotic transformation for which you have no analytic expression for the bias.
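To see what that adjustment involves, here is a rough sketch (illustrative only, not fable's internal code) of a second-order Taylor bias adjustment applied to the first forecast above. The printed distribution parameters are rounded, so the result only roughly matches the .mean column.

# Sketch of second-order bias adjustment (not fable's internal code).
# For Y ~ N(mu, sigma2) on the transformed scale and back-transformation g,
# E[g(Y)] is approximately g(mu) + 0.5 * sigma2 * g''(mu).
g <- function(y, lambda = 0.2) (lambda * y + 1)^(1 / lambda)  # inverse Box-Cox
mu <- 13        # rounded mean printed for 2019 Jan
sigma2 <- 0.02  # rounded variance printed for 2019 Jan
h <- 1e-4
g2 <- (g(mu + h) - 2 * g(mu) + g(mu - h)) / h^2  # numerical second derivative
g(mu) + 0.5 * sigma2 * g2  # about 605 with these rounded inputs; fable reports 608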
fc %>% autoplot(vic_cafe)
Bootstrapped prediction intervals
In the preceding analysis, there was still a normality assumption for the residuals of the model applied to the transformed data. If you want to avoid that as well, you can use bootstrapped intervals. These are constructed from simulated future sample paths where the residuals are resampled as possible future errors.
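To make the resampling idea concrete, here is a minimal sketch (the same idea generate() uses, though not its actual implementation): future errors are drawn with replacement from the model's innovation residuals.

# Minimal sketch of the resampling idea behind bootstrapped sample paths
res <- residuals(fit) %>% pull(.resid)  # innovation residuals on the transformed scale
future_errors <- sample(res, size = 36, replace = TRUE)  # three years of monthly errors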
We can simulate future sample paths using the generate() function.
sim <- fit %>% generate(h = "3 years", times = 5, bootstrap = TRUE)
sim
# A tsibble: 180 x 5 [1M]
# Key: .model, .rep [5]
.model .rep Month .innov .sim
<chr> <chr> <mth> <dbl> <dbl>
1 ets 1 2019 Jan -0.145 583.
2 ets 1 2019 Feb 0.324 600.
3 ets 1 2019 Mar 0.173 679.
4 ets 1 2019 Apr 0.0908 669.
5 ets 1 2019 May 0.0546 671.
6 ets 1 2019 Jun -0.122 624.
7 ets 1 2019 Jul -0.116 644.
8 ets 1 2019 Aug 0.117 689.
9 ets 1 2019 Sep 0.165 702.
10 ets 1 2019 Oct -0.0674 690.
# … with 170 more rows
Here we have generated five possible sample paths for future months. The .rep variable provides a new key for the tsibble. The back-transformation of the sample paths is handled automatically. If we had multiple models, each would be used to generate future sample paths, provided the corresponding generate() method existed. (In the current version of fable, we have not yet implemented this for ARIMA models.)
The plot below shows the five sample paths along with the last few years of historical data.
vic_cafe %>%
  filter(year(Month) >= 2008) %>%
  ggplot(aes(x = Month)) +
  geom_line(aes(y = Turnover)) +
  geom_line(aes(y = .sim, colour = as.factor(.rep)), data = sim) +
  ggtitle("Monthly turnover of Victorian cafes") +
  guides(colour = "none")
Prediction intervals are calculated using percentiles of the future sample paths. This is all built into the forecast() function, so you do not need to call generate() directly.
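For intuition, here is roughly what happens at a single horizon, using the five paths simulated earlier (forecast() does the same with many more paths): an 80% interval is just the 10th and 90th percentiles of the simulated values.

# 80% interval for January 2019 from the simulated paths
# (only 5 paths here; forecast() defaults to 5000)
sim %>%
  filter(Month == yearmonth("2019 Jan")) %>%
  pull(.sim) %>%
  quantile(probs = c(0.1, 0.9))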
fc <- fit %>% forecast(h = "3 years", bootstrap = TRUE)
fc
# A fable: 36 x 4 [1M]
# Key: .model [1]
.model Month Turnover .mean
<chr> <mth> <dist> <dbl>
1 ets 2019 Jan t(sample[5000]) 608.
2 ets 2019 Feb t(sample[5000]) 563.
3 ets 2019 Mar t(sample[5000]) 628.
4 ets 2019 Apr t(sample[5000]) 615.
5 ets 2019 May t(sample[5000]) 612.
6 ets 2019 Jun t(sample[5000]) 593.
7 ets 2019 Jul t(sample[5000]) 624.
8 ets 2019 Aug t(sample[5000]) 640.
9 ets 2019 Sep t(sample[5000]) 630.
10 ets 2019 Oct t(sample[5000]) 642.
# … with 26 more rows
Notice that the forecast distribution is now represented as a transformed simulation with 5000 sample paths. This default number can be modified via the times argument of forecast().
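For example, to double the number of simulated paths:

fit %>% forecast(h = "3 years", bootstrap = TRUE, times = 10000)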
fc %>% autoplot(vic_cafe) +
  ggtitle("Monthly turnover of Victorian cafes")
In this example, the resulting prediction intervals are almost identical to those obtained when we assumed the residuals were normally distributed.
Accuracy calculations
We can check whether the bootstrapping helped by comparing the CRPS values from both approaches after doing a training/test set split.
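For a forecast distribution with CDF F and realised value y, the continuous ranked probability score is

$$
\text{CRPS}(F, y) = \int_{-\infty}^{\infty}\left(F(x) - \mathbf{1}\{x \ge y\}\right)^2 dx,
$$

so it rewards distributions that concentrate probability mass near the observed value, and smaller values are better.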
train <- vic_cafe %>% filter(year(Month) <= 2014)
fit <- train %>%
  model(ets = ETS(box_cox(Turnover, 0.2)))
fit %>%
  forecast(h = "4 years", bootstrap = FALSE) %>%
  accuracy(vic_cafe,
    measures = distribution_accuracy_measures
  )
# A tibble: 1 × 4
.model .type percentile CRPS
<chr> <chr> <dbl> <dbl>
1 ets Test 24.6 24.4
fit %>%
  forecast(h = "4 years", bootstrap = TRUE) %>%
  accuracy(vic_cafe,
    measures = distribution_accuracy_measures
  )
# A tibble: 1 × 4
.model .type percentile CRPS
<chr> <chr> <dbl> <dbl>
1 ets Test 24.7 24.5
In this case it makes almost no difference which of the two approaches is used, so the non-bootstrap approach is preferred because it is much faster.