### Hyndsight

Thoughts on research, forecasting, statistics, and other distractions.

# Subject ▸ Reproducible Research

## A brief history of time series forecasting competitions

Prediction competitions are now so widespread that it is often forgotten how controversial they were when first held, and how influential they have been over the years. To keep this exercise manageable, I will restrict attention to time series forecasting competitions, where only the history of the data is available when producing forecasts.

### Nottingham studies

The earliest non-trivial study of time series forecast accuracy was probably by David Reid as part of his PhD at the University of Nottingham (1969).

## R package for M4 Forecasting Competition

The M4 forecasting competition is well under way, and a few of my PhD students have been working on submissions. Pablo Montero-Manso, Carla Netto, and Thiyanga Talagala have made an R package containing all of the data (100,000 time series), which should make it substantially easier for other contestants to load the data into R in order to compute forecasts. Grab the package from this GitHub repository. For more details about the M4 competition see this post or go to the M4 website.

## M4 Forecasting Competition update

The official guidelines for the M4 competition have now been published, and there have been several developments since my last post on this. There is now a prize for prediction interval accuracy using a scaled version of the Mean Interval Score. If the $100(1-\alpha)$% prediction interval for time $t$ is given by $[L_{t},U_{t}]$, for $t=1,\dots,h$, then the MIS is defined as $$\frac{1}{h}\sum_{t=1}^{h} \left[ (U_t-L_t) + \frac{2}{\alpha}(L_t-Y_t)1(Y_t < L_t) + \frac{2}{\alpha}(Y_t-U_t)1(Y_t > U_t) \right]$$ where $Y_t$ is the observation at time $t$.
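The MIS formula above can be sketched in a few lines of code. This is a minimal illustration in Python rather than R, and it computes only the unscaled MIS as defined here, not the scaled version used for the prize. The function name `mean_interval_score` and its argument names are hypothetical and are not part of any competition software.

```python
def mean_interval_score(y, lower, upper, alpha):
    """Unscaled Mean Interval Score for 100(1 - alpha)% prediction intervals.

    y, lower, upper are equal-length sequences holding the observations Y_t
    and the interval limits L_t and U_t for t = 1, ..., h.
    """
    h = len(y)
    if h == 0 or len(lower) != h or len(upper) != h:
        raise ValueError("y, lower and upper must be non-empty and of equal length")
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")

    total = 0.0
    for y_t, l_t, u_t in zip(y, lower, upper):
        score = u_t - l_t                          # interval width
        if y_t < l_t:
            score += (2 / alpha) * (l_t - y_t)     # observation below the interval
        elif y_t > u_t:
            score += (2 / alpha) * (y_t - u_t)     # observation above the interval
        total += score
    return total / h


# Four 50% intervals of width 4; the last two observations fall outside by 1.
mis = mean_interval_score(
    y=[10, 12, 15, 7],
    lower=[8, 9, 10, 8],
    upper=[12, 13, 14, 12],
    alpha=0.5,
)
print(mis)
```

Narrow intervals are rewarded through the width term, while each observation falling outside its interval adds a penalty proportional to the size of the miss and to $2/\alpha$, so misses cost more at higher coverage levels.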

## Some new time series packages

This week I have finished preliminary versions of two new R packages for time series analysis. The first (tscompdata) contains several large collections of time series that have been used in forecasting competitions; the second (tsfeatures) is designed to compute features from univariate time series data. For now, both are only on GitHub. I will probably submit them to CRAN after they’ve been tested by a few more people.

### tscompdata

There are already two packages containing forecasting competition data: Mcomp (containing the M and M3 competition data) and Tcomp (containing the tourism competition data).

## M4 Forecasting Competition: response from Spyros Makridakis

Following my post on the M4 competition yesterday, Spyros Makridakis sent me these comments for posting here.

I would like to thank Rob, my friend and co-author, for his insightful remarks concerning the upcoming M4 competition. As Rob says, the two of us have talked a great deal about competitions and I certainly agree with him about the “ideal” forecasting competition. In this reply, I will explain why I have deviated from the “ideal”, mostly for practical reasons and to ensure higher participation.

## M4 Forecasting Competition

The “M” competitions organized by Spyros Makridakis have had an enormous influence on the field of forecasting. They focused attention on which models produced good forecasts, rather than on the mathematical properties of those models. Spyros deserves congratulations for changing the landscape of forecasting research through this series of competitions. Makridakis & Hibon (JRSSA, 1979) was the first serious attempt at a large empirical evaluation of forecast methods.

## rOpenSci OzUnconference coming to Melbourne

For the second year running, there will be an rOpenSci OzUnconference in Australia. This one will be held in Melbourne, on 26-27 October 2017. Unlike regular conferences, there are no talks and there is no pre-determined agenda. It brings together scientists, developers, and open data enthusiasts from academia, industry, government, and non-profits for a few days to work on R-related projects. The agenda is mostly decided during the conference itself, and involves participants dividing into small groups to work on the projects of most interest to them.

## Monash Rmarkdown templates on github

Rmarkdown templates for staff and students in my department are now available on GitHub. For a thesis, fork the repository MonashThesis.

For other templates, install the R package MonashEBSTemplates. This provides templates for

• beamer slides
• working papers
• exams
• letters

## The Australian Macro Database

AusMacroData is a new website that encourages and facilitates the use of quantitative, publicly available Australian macroeconomic data. The Australian Macro Database hosted at ausmacrodata.org provides a user-friendly front end for searching among over 40,000 economic variables and is loosely based on similar international sites such as the Federal Reserve Economic Database (FRED). In total, data on 40,304 variables are available for download from AusMacroData. The majority of variables are sourced from the Australian Bureau of Statistics (ABS) and include data on national accounts, balance of payments and trade, housing and finance, labour force, and consumer price indices.