### Hyndsight

Thoughts on research, forecasting, statistics, and other distractions.

# Subject ▸ R

## Saving ts objects as csv files

Occasionally R might not be the tool you want to use (hard to believe, but apparently that happens). Then you may need to export some data from R via a csv file. When the data is stored as a ts object, the time index can easily get lost. So I wrote a little function to make this easier, using the tsibble package to do almost all of the work in looking after the time index.
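To see why the time index gets lost, recall that a ts object stores only the values plus a start time and a frequency; the dates themselves are implicit. Here is a rough Python sketch of the idea of rebuilding that index before writing a csv file. It is not the function described above and does not use tsibble; the names `ts_index` and `ts_to_csv`, the column names, and the sample numbers are all made up for illustration.

```python
import csv
import io


def ts_index(start_year, start_period, frequency, n):
    """Rebuild explicit (year, period) labels for n regularly spaced observations.

    Assumes periods are numbered 1..frequency within each year.
    """
    labels = []
    for i in range(n):
        offset = (start_period - 1) + i
        labels.append((start_year + offset // frequency, offset % frequency + 1))
    return labels


def ts_to_csv(values, start_year, start_period, frequency):
    """Return CSV text with the time index written out as explicit columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["year", "period", "value"])  # hypothetical column names
    index = ts_index(start_year, start_period, frequency, len(values))
    for (year, period), value in zip(index, values):
        writer.writerow([year, period, value])
    return buffer.getvalue()


# Illustrative monthly values starting in November 1949 (made-up numbers).
csv_text = ts_to_csv([112, 118, 132, 129], start_year=1949, start_period=11, frequency=12)
print(csv_text)
```

The point of writing the year and period as columns is that whatever tool reads the file no longer needs to know the start time or the frequency.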

## Seasonal decomposition of short time series

Many users have tried to do a seasonal decomposition with a short time series, and hit the error “Series has less than two periods”. The problem is that the usual methods of decomposition (e.g., decompose and stl) estimate seasonality using at least as many degrees of freedom as there are seasonal periods. So you need at least two observations per seasonal period to be able to distinguish seasonality from noise.
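The degrees-of-freedom argument can be seen in a deliberately simplified sketch. It estimates seasonality as the mean of each seasonal position and ignores trend entirely, so it is not what decompose or stl actually do; the function names and the quarterly numbers are invented for illustration.

```python
def seasonal_means(values, period):
    """Average the observations that fall in each seasonal position.

    Assumes len(values) >= period, so every position has at least one value.
    """
    buckets = [[] for _ in range(period)]
    for i, v in enumerate(values):
        buckets[i % period].append(v)
    return [sum(b) / len(b) for b in buckets]


def residuals(values, period):
    """What is left over after removing the estimated seasonal means."""
    means = seasonal_means(values, period)
    return [v - means[i % period] for i, v in enumerate(values)]


# One year of quarterly data: one parameter per observation.
one_year = [5.0, 9.0, 4.0, 7.0]
print(residuals(one_year, 4))  # every residual is zero

# Two years: each seasonal mean now averages two observations.
two_years = [5.0, 9.0, 4.0, 7.0, 6.0, 8.0, 5.0, 9.0]
print(residuals(two_years, 4))  # non-zero residuals appear
```

With a single observation per seasonal position, the seasonal estimate reproduces the data exactly, so there is nothing left to call noise. Only with a second observation per position is there any information for separating the two.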

## A forecast ensemble benchmark

Forecasting benchmarks are very important when testing new forecasting methods, to see how well they perform against some simple alternatives. Every week I get sent papers proposing new forecasting methods that fail to do better than even the simplest benchmark. They are rejected without review. Typical benchmarks include the naïve method (especially for finance and economic data), the seasonal naïve method (for seasonal data), an automatically selected ETS model, and an automatically selected ARIMA model.
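For readers who have not met them, the two simplest benchmarks can be sketched in a few lines of Python. This is only an illustration of the naïve and seasonal naïve ideas, not the forecast package implementation; the function names, the toy quarterly data, and the use of mean absolute error as the comparison are my assumptions.

```python
def naive(history, h):
    """Naive forecast: repeat the last observed value for all h steps."""
    return [history[-1]] * h


def seasonal_naive(history, period, h):
    """Seasonal naive forecast: repeat the last full seasonal cycle."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(h)]


def mae(actual, forecast):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Made-up quarterly data with a strong seasonal pattern.
history = [10, 20, 30, 40, 12, 22, 32, 42]
actual = [14, 24, 34, 44, 16, 26]

print(mae(actual, naive(history, 6)))
print(mae(actual, seasonal_naive(history, 4, 6)))
```

On seasonal data like this, the seasonal naïve forecast has the smaller error, which is why it is the natural benchmark there. A proposed method that cannot beat these few lines has not yet shown it is worth its extra complexity.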

## Forecasting in NYC: 25-27 June 2018

In late June, I will be in New York to teach my 3-day workshop on Forecasting using R. Tickets are available at Eventbrite. This is the first time I’ve taught this workshop in the US, having previously run it in the Netherlands and Australia. It will be based on the 2nd edition of my book “Forecasting: Principles and Practice” with George Athanasopoulos. All participants will get a print version of the book.

## Upcoming talks: May-July 2018

First semester teaching is nearly finished, and that means conference season for me. Here are some talks I’m giving in the next two months. Click the links for more details.

- Melbourne, Australia. 28 May: Panel discussion: Forecasting models, the uncertainties and associated risk.
- Boulder, Colorado, USA. 17-20 June: International Symposium on Forecasting. I’ll be talking about “Tidy forecasting in R”.
- New York, USA. 21 June: Feature-based time series analysis. New York Open Statistical Programming Meetup, eBay NYC.

## forecast v8.3 now on CRAN

The latest version of the forecast package for R is now on CRAN. This is the version used in the 2nd edition of my forecasting textbook with George Athanasopoulos. So readers should now be able to replicate all examples in the book using only CRAN packages. A few new features of the forecast package may be of interest. A more complete Changelog is also available.

mstl() handles multiple seasonality

STL decomposition was designed to handle a single type of seasonality, but modern data often involves several seasonal periods (e.

## A brief history of time series forecasting competitions

Prediction competitions are now so widespread that it is often forgotten how controversial they were when first held, and how influential they have been over the years. To keep this exercise manageable, I will restrict attention to time series forecasting competitions, where only the history of the data is available when producing forecasts.

Nottingham studies

The earliest non-trivial study of time series forecast accuracy was probably by David Reid as part of his PhD at the University of Nottingham (1969).

## R package for M4 Forecasting Competition

The M4 forecasting competition is well under way, and a few of my PhD students have been working on submissions. Pablo Montero-Manso, Carla Netto, and Thiyanga Talagala have made an R package containing all of the data (100,000 time series), which should make it substantially easier for other contestants to load the data into R in order to compute forecasts. Grab the package from this GitHub repository. For more details about the M4 competition see this post or go to the M4 website.

The official guidelines for the M4 competition have now been published, and there have been several developments since my last post on this. There is now a prize for prediction interval accuracy using a scaled version of the Mean Interval Score. If the $100(1-\alpha)$% prediction interval for time $t$ is given by $[L_{t},U_{t}]$, for $t=1,\dots,h$, then the MIS is defined as

$$
\text{MIS} = \frac{1}{h}\sum_{t=1}^{h} \left[ (U_t-L_t) + \frac{2}{\alpha}(L_t-Y_t)1(Y_t < L_t) + \frac{2}{\alpha}(Y_t-U_t)1(Y_t > U_t) \right]
$$

where $Y_t$ is the observation at time $t$.
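The MIS definition translates almost directly into code. The Python sketch below computes only the unscaled score exactly as written in the formula; the scaling used for the competition prize is not shown because it is not defined here. The function name and the example numbers are illustrative assumptions.

```python
def mean_interval_score(y, lower, upper, alpha):
    """Unscaled Mean Interval Score for 100(1 - alpha)% prediction intervals.

    y, lower and upper are equal-length sequences over the forecast horizon.
    """
    h = len(y)
    total = 0.0
    for y_t, l_t, u_t in zip(y, lower, upper):
        total += u_t - l_t  # interval width: wide intervals are penalised
        if y_t < l_t:
            total += (2.0 / alpha) * (l_t - y_t)  # observation below the interval
        elif y_t > u_t:
            total += (2.0 / alpha) * (y_t - u_t)  # observation above the interval
    return total / h


# Made-up 95% intervals (alpha = 0.05); the third observation falls outside.
y = [10.0, 12.0, 20.0]
lower = [8.0, 9.0, 10.0]
upper = [12.0, 13.0, 15.0]
print(mean_interval_score(y, lower, upper, alpha=0.05))
```

The two indicator terms in the formula can never both be active for the same $t$, which is why an `if`/`elif` is enough. Note how heavily a miss is penalised: with $\alpha = 0.05$, each unit by which the observation falls outside the interval costs 40 units of score, so the single miss dominates the result in this example.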