The GitHub page for the forecast package currently shows the following information.

Note the downloads figure: 264K/month. I know the package is popular, but that seems crazy. Also, the downloads figure on GitHub only counts the downloads from the RStudio mirror, and ignores downloads from the other 125 mirrors around the world.

# The hidden benefits of open-source software

I’ve been having discussions with colleagues and university administration about the best way for universities to manage home-grown software.

The traditional business model for software is that we build software and sell it to everyone willing to pay. Very often, that leads to a software company spin-off that has little or nothing to do with the university that nurtured the development. Think MATLAB, S-Plus, Minitab, SAS and SPSS, all of which grew out of universities or research institutions. This model has repeatedly been shown to stifle research development, channel funds away from the institutions where the software was born, and add to research costs for everyone.

I argue that the open-source model is a much better approach both for research development and for university funding. Under the open-source model, we build software, and make it available for anyone to use and adapt under an appropriate licence. This approach has many benefits that are not always appreciated by university administrators.

# Big Data for Official Statistics Competition

This is a new competition being organized by Eurostat. The first phase involves nowcasting economic indicators at the national and European levels, including unemployment, HICP, Tourism and Retail Trade and some of their variants.

The main goal of the competition is to discover promising methodologies and data sources that could, now or in the future, be used to improve the production of official statistics in the European Statistical System.

The organizers seem to have been encouraged by the success of Kaggle and other data science competition platforms. Unfortunately, they have chosen not to give any prizes other than an invitation to give a conference presentation or poster, which hardly seems likely to attract many good participants.

The deadline for registration is 10 January 2016. The duration of the competition is roughly a year (including about a month for evaluation).

See the call for participation for more information.

# Piecewise linear trends

I prepared the following notes for a consulting client, and I thought they might be of interest to some other people too.

Let $y_t$ denote the value of the time series at time $t$, and suppose we wish to fit a trend with correlated errors of the form
$$y_t = f(t) + n_t,$$
where $f(t)$ represents the possibly nonlinear trend and $n_t$ is an autocorrelated error process.
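As a rough illustration of this setup, the sketch below simulates a series with a piecewise linear $f(t)$ and AR(1) errors, then recovers the trend by ordinary least squares on a "hinge" basis. The knot location, coefficients, and the two-step approach (OLS first, then an AR(1) coefficient estimated from the residuals) are all assumptions made for the example; a joint fit of the regression and the error process would usually be preferable, and nothing here should be read as the method used in the notes themselves.

```python
import numpy as np

def piecewise_linear_design(t, knots):
    """Design matrix with columns: intercept, t, and (t - knot)_+ for each knot."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t]
    cols += [np.maximum(t - k, 0.0) for k in knots]
    return np.column_stack(cols)

# Simulated data; every number below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 200
t = np.arange(1, n + 1)
knot = 100  # hypothetical change point in the trend
true_f = 10 + 0.5 * t - 0.8 * np.maximum(t - knot, 0)

# Autocorrelated errors n_t generated as an AR(1) process.
phi = 0.6
e = rng.normal(scale=1.0, size=n)
n_t = np.zeros(n)
for i in range(1, n):
    n_t[i] = phi * n_t[i - 1] + e[i]

y = true_f + n_t

# Step 1: estimate the trend coefficients by ordinary least squares.
X = piecewise_linear_design(t, [knot])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 2: estimate the AR(1) coefficient from the OLS residuals.
resid = y - X @ beta
phi_hat = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])

print("trend coefficients:", beta)
print("estimated AR(1) coefficient:", phi_hat)
```

The second coefficient is the slope before the knot, and the third is the change in slope after it, so the slope after the knot is their sum.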

# forecast package v6.2

It is a while since I last updated the CRAN version of the forecast package, so I uploaded the latest version (6.2) today. The GitHub version remains the most up-to-date version and is already two commits ahead of the CRAN version.

This update is mostly bug fixes and additional error traps. The full ChangeLog is listed below.

# Stanford seminar

I gave a seminar at Stanford today. Slides are below. It was definitely the most intimidating audience I’ve faced, with Jerome Friedman, Trevor Hastie, Brad Efron, Persi Diaconis, Susan Holmes, David Donoho and John Chambers all present (and probably other famous names I’ve missed).

I’ll be giving essentially the same talk at UC Davis on Thursday.

# Chinese R conference

I will be speaking at the Chinese R conference in Nanchang, to be held on 24-25 October, on “Forecasting Big Time Series Data using R”.

Details (for those who can read Chinese) are at china-r.org.

# Upcoming talks in California

I’m back in California for the next couple of weeks, and will give the following talk at Stanford and UC Davis.

### Optimal forecast reconciliation for big time series data

Time series can often be naturally disaggregated in a hierarchical or grouped structure. For example, a manufacturing company can disaggregate total demand for their products by country of sale, retail outlet, product type, package size, and so on. As a result, there can be millions of individual time series to forecast at the most disaggregated level, plus additional series to forecast at higher levels of aggregation.

A common constraint is that the disaggregated forecasts need to add up to the forecasts of the aggregated data. Adjusting the forecasts so that they satisfy this constraint is known as forecast reconciliation. I will show that the optimal reconciliation method involves fitting an ill-conditioned linear regression model where the design matrix has one column for each of the series at the most disaggregated level. For problems involving huge numbers of series, the model is impossible to estimate using standard regression algorithms. I will also discuss some fast algorithms for fitting this model that make it practicable in business contexts.
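To make the regression view concrete, here is a minimal sketch for a toy hierarchy with one total and two bottom-level series. The matrix `S`, the forecast values, and the use of unweighted least squares are assumptions chosen for illustration; the method discussed in the talk may weight the regression differently, and a dense solver like this one would not scale to millions of series.

```python
import numpy as np

# Hypothetical hierarchy: Total = A + B.
# Rows of S are (Total, A, B); columns are the bottom-level series (A, B),
# so the design matrix has one column per most-disaggregated series.
S = np.array([
    [1.0, 1.0],  # Total
    [1.0, 0.0],  # A
    [0.0, 1.0],  # B
])

# Illustrative base forecasts that do not add up: 100 != 60 + 30.
y_hat = np.array([100.0, 60.0, 30.0])

# Regress the base forecasts on S by ordinary least squares.
beta, *_ = np.linalg.lstsq(S, y_hat, rcond=None)

# Reconciled forecasts for every level of the hierarchy.
y_tilde = S @ beta

print("bottom-level estimates:", beta)
print("reconciled forecasts:", y_tilde)
```

Because the reconciled forecasts are formed as `S @ beta`, the total is the sum of the two bottom-level values by construction, whatever the base forecasts were.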

# International Symposium on Forecasting: Spain 2016

June 19-22, 2016
Santander, Spain – Palace of La Magdalena

The International Symposium on Forecasting (ISF) is the premier forecasting conference, attracting the world’s leading forecasting researchers, practitioners, and students. Through a combination of keynote speaker presentations, academic sessions, workshops, and social programs, the ISF provides many excellent opportunities for networking, learning, and fun.

### Speakers:

Greg Allenby, The Ohio State University, USA
Todd Clark, Federal Reserve Bank of Cleveland, USA
José Duato, Polytechnic University of Valencia, Spain
Robert Fildes, Lancaster University, United Kingdom
Edward Leamer, UCLA Anderson, USA
Henrik Madsen, Technical University of Denmark
Adrian Raftery, University of Washington, USA

### Important Dates

Invited Session Proposals: January 31 2016
Abstract Submissions: March 16 2016
Early Registration Ends: May 15 2016