Jane Frazier spoke at our research team meeting today on “Reproducibility in computational research”. We had a very stimulating and lively discussion about the issues involved. One interesting idea was that reproducibility is on a scale, and we can all aim to move further along the scale towards making our own research more reproducible. For example:
- Can you reproduce your results tomorrow on the same computer with the same software installed?
- Could someone else on a different computer reproduce your results with the same software installed?
- Could you reproduce your results in 3 years’ time, after some of your software environment may have changed?
Think about what changes you need to make to move one step further along the reproducibility continuum, and do it.
Jane’s slides and handout are below.
I’m back in California for the next couple of weeks, and will give the following talk at Stanford and UC Davis.
Optimal forecast reconciliation for big time series data
Time series can often be naturally disaggregated in a hierarchical or grouped structure. For example, a manufacturing company can disaggregate total demand for their products by country of sale, retail outlet, product type, package size, and so on. As a result, there can be millions of individual time series to forecast at the most disaggregated level, plus additional series to forecast at higher levels of aggregation.
A common constraint is that the disaggregated forecasts need to add up to the forecasts of the aggregated data. This is known as forecast reconciliation. I will show that the optimal reconciliation method involves fitting an ill-conditioned linear regression model where the design matrix has one column for each of the series at the most disaggregated level. For problems involving huge numbers of series, the model is impossible to estimate using standard regression algorithms. I will also discuss some fast algorithms for fitting this model that make it practicable to use in business contexts.
Stanford: 4:30pm, Tuesday 6th October.
UC Davis: 4:10pm, Thursday 8th October.
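The regression view of reconciliation described in the abstract can be sketched on a toy hierarchy. This is a minimal illustration only, assuming the simplest ordinary least squares variant with a two-series hierarchy (Total = A + B); the summing matrix `S`, the function name `reconcile_ols` and the base forecasts are all made up for the example, and the method in the talk may weight the series differently and uses much faster algorithms for large problems.

```python
import numpy as np

def reconcile_ols(S, base_forecasts):
    """Regress the base forecasts on the summing matrix S and return fitted values.

    S has one row per series (all levels) and one column per bottom-level series.
    The fitted values S @ beta are coherent by construction.
    """
    beta, *_ = np.linalg.lstsq(S, base_forecasts, rcond=None)
    return S @ beta

# Hypothetical hierarchy: Total = A + B, rows ordered as [Total, A, B].
S = np.array([
    [1.0, 1.0],  # Total
    [1.0, 0.0],  # A
    [0.0, 1.0],  # B
])

# Hypothetical base forecasts that do not add up: 40 + 50 != 100.
base = np.array([100.0, 40.0, 50.0])

reconciled = reconcile_ols(S, base)
print(reconciled)                                   # coherent forecasts
print(reconciled[0] - reconciled[1] - reconciled[2])  # approximately zero
```

With millions of bottom-level series, `S` has millions of columns, which is why a dense least squares solve like this one stops being feasible.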
This week, I am teaching my Business Analytics class about the bias-variance trade-off. For some reason, the proof is not contained in either ESL or ISL, even though it is quite simple. I also discovered that the proof currently provided on Wikipedia makes little sense in places.
So I wrote my own for the class. It is longer than strictly necessary, so that there are no jumps that might confuse students.
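For reference, the standard textbook form of the decomposition is sketched below. It is not necessarily the version written for the class. It assumes $y = f(x) + \varepsilon$ with $\mathrm{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and that $\varepsilon$ at the new point $x$ is independent of the fitted $\hat{f}(x)$; all of this notation is introduced here for illustration.

```latex
\begin{align*}
\mathrm{E}\big[(y - \hat{f}(x))^2\big]
  &= \mathrm{E}\big[(f(x) - \hat{f}(x) + \varepsilon)^2\big] \\
  &= \mathrm{E}\big[(f(x) - \hat{f}(x))^2\big]
     + 2\,\mathrm{E}[\varepsilon]\,\mathrm{E}\big[f(x) - \hat{f}(x)\big]
     + \mathrm{E}[\varepsilon^2] \\
  &= \mathrm{E}\big[(f(x) - \hat{f}(x))^2\big] + \sigma^2 . \\[1ex]
\mathrm{E}\big[(f(x) - \hat{f}(x))^2\big]
  &= \mathrm{E}\Big[\big(f(x) - \mathrm{E}[\hat{f}(x)]
     + \mathrm{E}[\hat{f}(x)] - \hat{f}(x)\big)^2\Big] \\
  &= \big(f(x) - \mathrm{E}[\hat{f}(x)]\big)^2
     + \mathrm{E}\Big[\big(\hat{f}(x) - \mathrm{E}[\hat{f}(x)]\big)^2\Big] \\
  &\qquad + 2\big(f(x) - \mathrm{E}[\hat{f}(x)]\big)\,
     \mathrm{E}\big[\mathrm{E}[\hat{f}(x)] - \hat{f}(x)\big] \\
  &= \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] . \\[1ex]
\text{Hence}\quad
\mathrm{E}\big[(y - \hat{f}(x))^2\big]
  &= \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2 .
\end{align*}
```

The cross term in the second block vanishes because $f(x) - \mathrm{E}[\hat{f}(x)]$ is a constant and $\mathrm{E}\big[\mathrm{E}[\hat{f}(x)] - \hat{f}(x)\big] = 0$.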
At the recent International Symposium on Forecasting, held in Riverside, California, Tilmann Gneiting gave a great talk on “Evaluating forecasts: why proper scoring rules and consistent scoring functions matter”. It will be the subject of an IJF invited paper in due course.
One of the things he talked about was the “Murphy diagram” for comparing forecasts, as proposed in Ehm et al. (2015). Here’s how it works for comparing mean forecasts.
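A rough sketch of the computation behind a Murphy diagram for mean forecasts follows. It assumes the elementary scoring function for the mean takes the form $S_\theta(x, y) = |y - \theta|$ when $\min(x, y) \le \theta < \max(x, y)$ and $0$ otherwise (up to a constant factor), where $x$ is the forecast and $y$ the observation; check Ehm et al. for the exact definition. The function names and the data are hypothetical. The diagram plots the average score against $\theta$ for each forecasting method, and a method whose curve lies below another's everywhere is preferred under every consistent scoring function for the mean.

```python
import numpy as np

def elementary_score(theta, forecast, actual):
    """Assumed elementary score for the mean at threshold theta (elementwise)."""
    lo = np.minimum(forecast, actual)
    hi = np.maximum(forecast, actual)
    # Score is |actual - theta| only when theta lies between forecast and actual.
    return np.where((lo <= theta) & (theta < hi), np.abs(actual - theta), 0.0)

def murphy_curve(thetas, forecasts, actuals):
    """Average elementary score at each theta: one curve of the Murphy diagram."""
    return np.array([elementary_score(t, forecasts, actuals).mean() for t in thetas])

# Hypothetical observations and two competing sets of mean forecasts.
actuals = np.array([1.0, 2.0, 3.0])
forecasts_a = np.array([1.5, 2.0, 2.0])
forecasts_b = np.array([3.0, 0.0, 5.0])

thetas = np.linspace(0.0, 5.0, 101)
curve_a = murphy_curve(thetas, forecasts_a, actuals)
curve_b = murphy_curve(thetas, forecasts_b, actuals)

# Method A dominates method B here if its curve is never above B's.
print(np.all(curve_a <= curve_b))
```

Plotting `curve_a` and `curve_b` against `thetas` (for example with matplotlib) gives the diagram itself.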
For the next few weeks I am travelling in North America and will be giving the following talks.
The Yahoo talk will be streamed live.
I’ll post slides on my main site after each talk.
This week I uploaded a new version of the forecast package to CRAN. As there were a lot of changes, I decided to increase the version number to 6.0.
The changes are all outlined in the ChangeLog file as usual. I will highlight some of the more important changes since v5.0 here.
We are now advertising for various positions in applied statistics, operations research and applied mathematics.
These jobs are with MAXIMA (the Monash Academy for Cross & Interdisciplinary Mathematical Applications).
Please do not send any questions to me (I won’t answer). Click above and follow the instructions.
I was recently interviewed as part of a promotion for the Monash Business School. The interviews can be watched below if anyone is interested. The titles chosen weren’t my idea.
I’m currently attending the one-day workshop on this topic at QUT in Brisbane. This morning I spoke on “Visualizing and forecasting big time series data”. My slides are here.
The talks are being streamed.
Big data is now endemic in business, industry, government, environmental management, medical science, social research and so on. One of the commensurate challenges is how to effectively model and analyse these data.
This workshop will bring together national and international experts in statistical modelling and analysis of big data, to share their experiences, approaches and opinions about future directions in this field.
This poem was written by David Goddard from the Monash University Department of Epidemiology and Preventive Medicine. It is reproduced here with his permission. The poem won the inaugural Monash University poetry competition and will soon be published in an anthology of contemporary poetry.