I’m back in California for the next couple of weeks, and will give the following talk at Stanford and UC-Davis.
Optimal forecast reconciliation for big time series data
Time series can often be naturally disaggregated into a hierarchical or grouped structure. For example, a manufacturing company can disaggregate total demand for its products by country of sale, retail outlet, product type, package size, and so on. As a result, there can be millions of individual time series to forecast at the most disaggregated level, plus additional series to forecast at higher levels of aggregation.
A common constraint is that forecasts of the disaggregated series need to add up to forecasts of the aggregated series; adjusting the forecasts so that this holds is known as forecast reconciliation. I will show that the optimal reconciliation method involves fitting an ill-conditioned linear regression model in which the design matrix has one column for each series at the most disaggregated level. For problems involving huge numbers of series, the model is impossible to estimate using standard regression algorithms. I will also discuss some fast algorithms that make the method practical to implement in business contexts.
Stanford: 4:30pm, Tuesday 6th October.
UC-Davis: 4:10pm, Thursday 8th October.
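For a flavour of what optimal reconciliation looks like in practice, here is a minimal sketch using the hts package for R. The toy hierarchy, simulated data, and settings below are my own illustrative assumptions, not examples from the talk.

```r
library(hts)

# Four bottom-level quarterly series in a small hierarchy:
# Total -> two groups -> two series per group.
set.seed(1)
bts <- ts(matrix(rnorm(64, mean = 10), ncol = 4), frequency = 4)
y <- hts(bts, nodes = list(2, c(2, 2)))

# method = "comb" applies the optimal-combination (regression-based)
# reconciliation; ETS models produce the base forecasts at every level.
fc <- forecast(y, h = 8, method = "comb", fmethod = "ets")

# All levels of the reconciled forecasts: Total, the two groups,
# then the four bottom-level series (7 columns in total).
allf <- aggts(fc)
```

By construction, the reconciled bottom-level forecasts sum exactly to the group and total forecasts, which is the aggregation constraint described above.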
I’ve always struggled with using plotmath via the expression function in R for adding mathematical notation to axes or legends. For some reason, the most obvious way to write something never seems to work for me, and I end up relying on trial and error through far too many iterations.
So I am very happy to see the new latex2exp package, which translates LaTeX expressions into a form suitable for R graphs. This is going to save me time and frustration!
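A small example of the idea: latex2exp provides a TeX() function whose output can be passed anywhere R expects a plotmath expression. The particular labels below are my own, chosen just for illustration.

```r
library(latex2exp)

# TeX() converts a LaTeX string into a plotmath expression,
# so you can write axis labels and titles in familiar LaTeX.
plot(rnorm(100), type = "l",
     xlab = TeX("$t$"),
     ylab = TeX("$y_t = \\alpha + \\beta x_t + \\epsilon_t$"),
     main = TeX("Simulated noise, $\\epsilon_t \\sim N(0, \\sigma^2)$"))
```

This avoids guessing at plotmath syntax: you write the LaTeX you already know, and TeX() does the translation.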
There are some tools that I use regularly, and I would like my research students and post-docs to learn them too. Here are some great online tutorials that might help.
Last week I gave a talk in the Yahoo! Big Thinkers series. The video of the talk is now online and embedded below.
Every now and then a commercial software vendor claims on social media that their software is so much better than the forecast package for R, without providing any details.
There are lots of reasons why you might select a particular software solution, and R isn’t for everyone. But anyone claiming superiority should at least provide some evidence rather than make unsubstantiated claims.
The anomalous package provides some tools to detect unusual time series in a large collection of time series. This is joint work with Earo Wang (an honours student at Monash) and Nikolay Laptev (from Yahoo Labs). Yahoo is interested in detecting unusual patterns in server metrics.
This week I uploaded a new version of the forecast package to CRAN. As there were a lot of changes, I decided to increase the version number to 6.0.
The changes are all outlined in the ChangeLog file as usual. I will highlight some of the more important changes since v5.0 here.
Yahoo Labs has just released an interesting new data set useful for research on detecting anomalies (or outliers) in time series data. There are many contexts in which anomaly detection is important. For Yahoo, the main use case is in detecting unusual traffic on Yahoo servers.
I spend much of my day sitting in front of a screen, coding or writing. To limit the strain on my eyes, I use a dark theme as much as possible. That is, I write with light-colored text on a dark background. I don’t know why this is not the default in more software, as it makes a big difference after a few hours of writing.
Most of the time, I am writing using either Sublime Text, RStudio or TeXstudio. Each of them can be set to use a dark theme with syntax coloring to highlight structural features in the text.