Forecast v7 (part 2)
The tslm function is designed to fit linear models to time series data. It is intended to approximately mimic lm (and calls lm to do the estimation), but to package the output to remember the ts attributes. It also handles some predictor variables automatically, notably trend and season. The re-write means that tslm now handles functions as predictors, including fourier:
deaths.lm <- tslm(mdeaths ~ trend + fourier(mdeaths,3))
mdeaths.fcast <- forecast(deaths.lm, data.frame(fourier(mdeaths,3,36)))
autoplot(mdeaths.fcast)
fourier now takes 3 arguments. The first is the series, which is only used to grab the seasonal period and the tsp attribute. The second argument K is the number of Fourier harmonics to compute. If the third argument h is NULL (the default), the function returns Fourier terms for the times of the historical observations. But if h is a positive integer, the function returns Fourier terms for the next h time periods after the end of the historical data.
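To make the role of the three arguments concrete, here is a minimal base R sketch of how such terms can be built. The helper fourier_sketch is hypothetical and is not the package's fourier; it assumes the observations are indexed 1, 2, ..., n and that 2K is smaller than the seasonal period.

```r
# Hypothetical helper for illustration only; NOT the forecast package's fourier().
fourier_sketch <- function(x, K, h = NULL) {
  m <- frequency(x)                 # seasonal period taken from the series
  n <- length(x)
  stopifnot(K >= 1, 2 * K < m)      # assumption: avoid the degenerate K = m/2 case
  # Historical times when h is NULL, otherwise the next h periods after the data
  times <- if (is.null(h)) seq_len(n) else n + seq_len(h)
  terms <- lapply(seq_len(K), function(k) {
    cbind(sin(2 * pi * k * times / m), cos(2 * pi * k * times / m))
  })
  out <- do.call(cbind, terms)
  colnames(out) <- paste0(rep(c("S", "C"), K), rep(seq_len(K), each = 2), "-", m)
  out
}

hist_terms   <- fourier_sketch(mdeaths, K = 3)          # one row per observation
future_terms <- fourier_sketch(mdeaths, K = 3, h = 36)  # one row per future period
dim(hist_terms)
dim(future_terms)
```

Each harmonic contributes one sine and one cosine column, so K = 3 gives six predictors in both cases; only the time index changes between the historical and the future terms.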
The lm function has long allowed a matrix to be passed as the response, with independent linear models fitted to each column. The new tslm function also allows this now.
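The lm behaviour that tslm mirrors can be seen with base R alone. This is only a sketch using the mdeaths and fdeaths series that ship with R; the tslm call in the final comment is an assumption about the analogous usage and is not run here.

```r
# Two response columns: monthly male and female deaths (datasets shipped with R)
y <- cbind(mdeaths = as.numeric(mdeaths), fdeaths = as.numeric(fdeaths))
tt <- seq_len(nrow(y))        # simple linear time trend

fit_both <- lm(y ~ tt)        # one call, one regression per column
coef(fit_both)                # coefficient matrix: rows are terms, columns are series

# The first column gives the same coefficients as fitting that series on its own
fit_m <- lm(y[, "mdeaths"] ~ tt)
all.equal(unname(coef(fit_both)[, "mdeaths"]), unname(coef(fit_m)))

# With forecast loaded, the analogous call is assumed to be:
#   tslm(cbind(mdeaths, fdeaths) ~ trend + season)
```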
Bias adjustment for Box-Cox transformations
Almost all modelling and forecasting functions in the package allow Box-Cox transformations to be applied before the model is fitted, and for the forecasts to be back transformed. This will give median forecasts on the original scale, as I’ve explained before.
There is now an option to adjust the forecasts so they are means rather than medians, by setting biasadj=TRUE whenever the forecasts are computed. I will probably make this the default in some future version, but for now the default is biasadj=FALSE so the forecasts are actually medians.
library(fpp, quietly=TRUE)
fit <- ets(eggs, model="AAN", lambda=0)
fc1 <- forecast(fit, biasadj=TRUE, h=20, level=95)
fc2 <- forecast(fit, biasadj=FALSE, h=20)
cols <- c("Mean"="#0000ee","Median"="#ee0000")
autoplot(fc1) + ylab("Price") + xlab("Year") +
  autolayer(fc2, PI=FALSE, series="Median") +
  autolayer(fc1, PI=FALSE, series="Mean") +
  guides(fill=FALSE) +
  scale_colour_manual(name="Forecasts",values=cols)
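The difference between the two point forecasts can be sketched in base R. The helpers inv_boxcox and inv_boxcox_mean are hypothetical names, and the adjustment shown is the usual second-order approximation for the mean of a back-transformed forecast, assuming the forecast distribution on the transformed scale is symmetric with variance sigma2. It illustrates the idea and is not a copy of the package internals.

```r
# Plain inverse Box-Cox transformation: the median on the original scale
# (assuming a symmetric forecast distribution on the transformed scale)
inv_boxcox <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

# Approximate mean on the original scale; sigma2 is the forecast variance
# on the transformed scale
inv_boxcox_mean <- function(w, lambda, sigma2) {
  if (lambda == 0) {
    exp(w) * (1 + sigma2 / 2)
  } else {
    inv_boxcox(w, lambda) *
      (1 + sigma2 * (1 - lambda) / (2 * (lambda * w + 1)^2))
  }
}

w <- 5          # point forecast on the log scale (lambda = 0)
sigma2 <- 0.04  # forecast variance on the log scale

median_fc  <- inv_boxcox(w, lambda = 0)
mean_fc    <- inv_boxcox_mean(w, lambda = 0, sigma2 = sigma2)
exact_mean <- exp(w + sigma2 / 2)  # exact mean if the log-scale forecast is normal

c(median = median_fc, adjusted = mean_fc, exact = exact_mean)
```

For a log transformation the adjusted forecast always sits above the median, and the gap grows with the forecast variance, so the two lines drift apart as the horizon increases.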
A new Ccf function
Cross-correlations can now be computed using Ccf, which mimics ccf except that the axes are more informative.
The Acf function now handles multivariate time series, with cross-correlation functions computed as well as the ACFs of each series.
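A short usage sketch, assuming the forecast package is installed and using the mdeaths and fdeaths series that ship with R. Plotting is switched off here so that only the returned objects are inspected.

```r
library(forecast)

# Cross-correlations between two univariate series
cc <- Ccf(mdeaths, fdeaths, plot = FALSE)

# Acf applied to a bivariate series: ACFs of each series plus cross-correlations
both <- cbind(mdeaths, fdeaths)
ac <- Acf(both, plot = FALSE)

class(cc)
dim(ac$acf)   # lags x series x series
```

Dropping plot = FALSE draws the corresponding plots instead.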
Covariates in neural net AR models
The nnetar function allows neural networks to be applied to time series data by building a nonlinear autoregressive model. A new feature allows additional inputs to be included in the model.
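As a sketch of how extra inputs might be supplied, the example below passes a covariate through the xreg argument of nnetar and again when forecasting. The choice of fdeaths as a covariate for mdeaths and the train/test split are assumptions made purely for illustration; in practice the future covariate values must be known or forecast separately.

```r
library(forecast)
set.seed(1)   # network weights start from random values

# Training period: 1974-1978; the last year of the covariate is held back
y_train <- window(mdeaths, end = c(1978, 12))
x_train <- matrix(as.numeric(window(fdeaths, end = c(1978, 12))),
                  ncol = 1, dimnames = list(NULL, "fdeaths"))
x_test  <- matrix(as.numeric(window(fdeaths, start = c(1979, 1))),
                  ncol = 1, dimnames = list(NULL, "fdeaths"))

fit <- nnetar(y_train, xreg = x_train)   # lagged y plus the covariate as inputs
fc  <- forecast(fit, xreg = x_test)      # horizon follows the rows of xreg
length(fc$mean)
```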
Better subsetting of time series
subset.ts allows quite sophisticated subsetting of a time series. For example:
## Time Series:
## Start = 1965.5
## End = 1994.5
## Frequency = 1
##  6633 6730 6946 6915 7190 7105 6840 7819 7045 5540 5906 5505 5318 5466 5696
##  5341 5464 5129 5524 6080 6540 6339 6590 6077 5146 5127 5222 4954 5309 6396
This is now substantially more robust than it used to be.
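Another small sketch of seasonal subsetting, assuming the forecast package is installed. It uses the monthly mdeaths series that ships with R rather than the series shown in the output above.

```r
library(forecast)

# Keep only the November observations of a monthly series
nov <- subset(mdeaths, month = "November")
nov

# The same values extracted by hand in base R, which loses the ts attributes
by_hand <- mdeaths[cycle(mdeaths) == 11]
```

The result is still a time series, with one observation per year, which is what makes the subset convenient to plot or model directly.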
The next major release will probably be around the end of 2016. On the to-do list are:
In-sample multi-step fitted values. Currently fitted returns in-sample one-step forecasts. A new argument to fitted will allow multi-step forecasts of the training data.
Applying fitted models to new data sets. A related issue is to take an estimated model and apply it to some new data without re-estimating parameters. This is already possible with ets models. It will be extended to many more model types.
Better choice of seasonal differencing. Currently auto.arima does a pretty good job at finding the orders of a model, and the number of first-differences required, but it does not handle seasonal differences well. It often selects 0 differences, when I think it should select 1 difference. So I tend to over-ride the automatic choice with auto.arima(x, D=1). I will attempt to find some better tests of seasonal unit roots than those that are currently implemented.
Prediction intervals for NNAR forecasts. The forecasts obtained using a NNAR model (via the nnetar function) do not have prediction intervals because there is no underlying stochastic model on which to base them. However, there are ways of computing the uncertainty using simulation, and I hope to implement something like that for the next version.
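To give a flavour of what simulation-based intervals involve, here is a generic base R sketch. It is not the method planned for the package: a linear autoregression on the log of the lynx series stands in for the neural network, and future sample paths are generated by feeding resampled residuals back through the fitted one-step model. Every name and setting below is an assumption chosen for illustration.

```r
set.seed(123)

ly <- log(as.numeric(lynx))   # lynx ships with R; logs keep simulated values positive
n  <- length(ly)

# Stand-in one-step model: AR(2) fitted by least squares
d   <- data.frame(y = ly[3:n], lag1 = ly[2:(n - 1)], lag2 = ly[1:(n - 2)])
fit <- lm(y ~ lag1 + lag2, data = d)
b   <- unname(coef(fit))
res <- residuals(fit)

h     <- 10    # forecast horizon
nsim  <- 500   # number of simulated future paths
paths <- matrix(NA_real_, nrow = nsim, ncol = h)

for (i in seq_len(nsim)) {
  last <- c(ly[n], ly[n - 1])   # two most recent values
  for (j in seq_len(h)) {
    # One-step prediction plus a residual drawn at random from the training errors
    nxt <- b[1] + b[2] * last[1] + b[3] * last[2] + sample(res, 1)
    paths[i, j] <- nxt
    last <- c(nxt, last[1])
  }
}

# Pointwise 95% intervals on the original scale from the simulated paths
pi95 <- apply(exp(paths), 2, quantile, probs = c(0.025, 0.975))
pi95
```

The same loop works for any model that can produce a one-step prediction from recent values, which is why the approach suits models without an explicit stochastic form.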