# “Forecasting with R” short course in Eindhoven

I will be giving my 3-day short course/workshop on “Forecasting with R” in Eindhoven (Netherlands) from 19 to 21 October.

Register here

# Stanford seminar

I gave a seminar at Stanford today. Slides are below. It was definitely the most intimidating audience I’ve faced, with Jerome Friedman, Trevor Hastie, Brad Efron, Persi Diaconis, Susan Holmes, David Donoho and John Chambers all present (and probably other famous names I’ve missed).

I’ll be giving essentially the same talk at UC Davis on Thursday. Continue reading →

# Upcoming talks in California

I’m back in California for the next couple of weeks, and will give the following talk at Stanford and UC Davis.

### Optimal forecast reconciliation for big time series data

Time series can often be naturally disaggregated in a hierarchical or grouped structure. For example, a manufacturing company can disaggregate total demand for their products by country of sale, retail outlet, product type, package size, and so on. As a result, there can be millions of individual time series to forecast at the most disaggregated level, plus additional series to forecast at higher levels of aggregation.

A common constraint is that the disaggregated forecasts need to add up to the forecasts of the aggregated data. Adjusting the forecasts so that they satisfy this constraint is known as forecast reconciliation. I will show that the optimal reconciliation method involves fitting an ill-conditioned linear regression model where the design matrix has one column for each of the series at the most disaggregated level. For problems involving huge numbers of series, the model is impossible to estimate using standard regression algorithms. I will also discuss some fast algorithms for fitting this model that make it practical to use in business contexts.
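To make the regression view concrete, here is a minimal sketch in Python for a toy hierarchy with one total and two bottom-level series. It assumes ordinary least squares (equal weights on all series), which is a simplification and not necessarily the weighting used in the talk. The matrix `S`, the helper names and the base forecasts are all hypothetical illustrations, not part of any package.

```python
from fractions import Fraction

# Summing matrix S for a toy hierarchy: Total = A + B.
# Rows: Total, A, B. Columns: the bottom-level series A and B.
S = [[1, 1],
     [1, 0],
     [0, 1]]

def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    # Exact inverse of a 2x2 matrix using rational arithmetic.
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def reconcile_ols(base):
    """Regress the base forecasts on S, then map the coefficients back through S."""
    St = transpose(S)
    beta = matmul(inverse_2x2(matmul(St, S)), matmul(St, [[f] for f in base]))
    return [row[0] for row in matmul(S, beta)]

# Hypothetical base forecasts for Total, A, B; note that 40 + 50 != 100.
base = [100, 40, 50]
reconciled = reconcile_ols(base)
print([float(v) for v in reconciled])
```

After reconciliation the total equals the sum of the two bottom-level forecasts. With millions of bottom-level series, the matrix being inverted has one row and column per bottom-level series, which is why a direct approach like this one does not scale.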

# Seminars in Taiwan

I’m currently visiting Taiwan and I’m giving two seminars while I’m here: one at the National Tsing Hua University in Hsinchu, and the other at Academia Sinica in Taipei. Details are below for those who might be nearby. Continue reading →

# hts with regressors

The hts package for R allows for forecasting hierarchical and grouped time series data. The idea is to generate forecasts for all series at all levels of aggregation without imposing the aggregation constraints, and then to reconcile the forecasts so they satisfy the aggregation constraints. (An introduction to reconciling hierarchical and grouped time series is available in this Foresight paper.)

The base forecasts can be generated using any method, with ETS models and ARIMA models provided as options in the forecast.gts() function. As ETS models do not allow for regressors, you will need to choose ARIMA models if you want to include regressors. Continue reading →

# Specifying complicated groups of time series in hts

With the latest version of the hts package for R, it is now possible to specify rather complicated grouping structures relatively easily.

All aggregation structures can be represented as hierarchies or as cross-products of hierarchies. For example, a hierarchical time series may be based on geography: country, state, region, store. Often there is also a separate product hierarchy: product groups, product types, packet size. Forecasts of all the different types of aggregation are required; e.g., product type A within region X. The aggregation structure is a cross-product of the two hierarchies.

This framework includes even apparently non-hierarchical data: consider the simple case of a time series of deaths split by sex and state. We can consider sex and state as two very simple hierarchies with only one level each. Then we wish to forecast the aggregates of all combinations of the two hierarchies.
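As a rough illustration of the deaths example, the following Python sketch enumerates every aggregate obtained by either keeping or summing over each factor. It only covers single-level factors such as sex and state, the counts are made up, and the function name `grouped_aggregates` is hypothetical. It is not how the hts package represents groups.

```python
from itertools import product

# Hypothetical bottom-level counts keyed by (sex, state).
deaths = {
    ("F", "NSW"): 10,
    ("M", "NSW"): 12,
    ("F", "VIC"): 7,
    ("M", "VIC"): 9,
}

def grouped_aggregates(bottom):
    """Sum bottom-level values over every keep/aggregate choice per factor.

    A factor that is aggregated over is shown as "*" in the result key.
    """
    n_factors = len(next(iter(bottom)))
    out = {}
    for keep in product([False, True], repeat=n_factors):
        for key, value in bottom.items():
            label = tuple(k if kept else "*" for k, kept in zip(key, keep))
            out[label] = out.get(label, 0) + value
    return out

agg = grouped_aggregates(deaths)
for label in sorted(agg):
    print(label, agg[label])
```

Here `("*", "*")` is the grand total, `("F", "*")` is all female deaths, `("*", "NSW")` is all deaths in one state, and the fully specified keys are the original bottom-level series.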

Any number of separate hierarchies can be combined in this way. Non-hierarchical factors such as sex can be treated as single-level hierarchies. Continue reading →

# Hierarchical forecasting with hts v4.0

A new version of my hts package for R is now on CRAN. It has been completely rewritten from scratch; not a single line of code survived. There are some minor syntax changes, but the biggest changes are in speed and scope. This version is many times faster than the previous version and can handle hundreds of thousands of time series without complaining. Continue reading →