ODI looking for young postgrad statisticians

The Overseas Development Institute Fellowship Scheme sends young postgraduate statisticians (and economists) to work in the public sectors of developing countries in Africa, the Caribbean and the Pacific on two-year contracts. This is a great way to develop skills and gain experience working within a developing country’s government. And you get to live in a fascinating place!

The application process for the 2016-2018 Fellowship Scheme is now open, and applications close on 17 December 2015.

Essential criteria:

  • degree in statistics, economics, or a related field
  • postgraduate degree qualification
  • ability to commit to a two-year assignment

Application is via the online application form.

Read some first-hand experiences of current and former Fellows.


Big Data for Official Statistics Competition

This is a new competition organized by Eurostat. The first phase involves nowcasting economic indicators at the national and European level, including unemployment, the HICP, tourism and retail trade, and some of their variants.

The main goal of the competition is to discover promising methodologies and data sources that could, now or in the future, be used to improve the production of official statistics in the European Statistical System.

The organizers seem to have been encouraged by the success of Kaggle and other data science competition platforms. Unfortunately, they have chosen not to give any prizes other than an invitation to give a conference presentation or poster, which hardly seems likely to attract many good participants.

The deadline for registration is 10 January 2016. The duration of the competition is roughly a year (including about a month for evaluation).

See the call for participation for more information.

Reproducibility in computational research

Jane Frazier spoke at our research team meeting today on “Reproducibility in computational research”. We had a very stimulating and lively discussion about the issues involved. One interesting idea was that reproducibility is on a scale, and we can all aim to move further along the scale towards making our own research more reproducible. For example

  • Can you reproduce your results tomorrow on the same computer with the same software installed?
  • Could someone else on a different computer reproduce your results with the same software installed?
  • Could you reproduce your results in 3 years' time, after parts of your software environment have changed?
  • etc.

Think about what changes you need to make to move one step further along the reproducibility continuum, and do it.
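One concrete step along that continuum is to record the software environment alongside your results. Here is a minimal sketch in Python; the function name, file name, and package list are illustrative, not from Jane's talk.

```python
# Sketch: save a manifest of the software environment next to your results,
# so a future reader (including you) knows what produced them.
import json
import platform
import sys
from importlib import metadata

def environment_manifest(packages):
    """Return a dict describing the Python environment for the given packages."""
    versions = {}
    for pkg in packages:
        try:
            versions[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            versions[pkg] = "not installed"
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }

# Write the manifest alongside the analysis output.
manifest = environment_manifest(["numpy", "pandas"])
with open("environment.json", "w") as f:
    json.dump(manifest, f, indent=2)
```

Committing a file like this with each set of results costs almost nothing and moves you at least one step along the scale.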

Jane’s slides and handout are below.

Upcoming talks in California

I’m back in California for the next couple of weeks, and will be giving the following talk at Stanford and UC Davis.

Optimal forecast reconciliation for big time series data

Time series can often be naturally disaggregated in a hierarchical or grouped structure. For example, a manufacturing company can disaggregate total demand for their products by country of sale, retail outlet, product type, package size, and so on. As a result, there can be millions of individual time series to forecast at the most disaggregated level, plus additional series to forecast at higher levels of aggregation.

A common constraint is that the disaggregated forecasts must add up to the forecasts of the aggregated data; adjusting forecasts so that they satisfy this constraint is known as forecast reconciliation. I will show that the optimal reconciliation method involves fitting an ill-conditioned linear regression model in which the design matrix has one column for each series at the most disaggregated level. For problems involving a huge number of series, the model is impossible to estimate using standard regression algorithms. I will also discuss some fast algorithms that make the method practicable in business contexts.
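The regression formulation can be illustrated on a toy hierarchy. The sketch below uses Python rather than the fast algorithms the talk is about: it shows only the simplest (OLS) reconciliation, in which incoherent base forecasts are projected onto the space of forecasts that add up correctly.

```python
# OLS forecast reconciliation on a toy two-level hierarchy: Total = A + B.
# The summing matrix S maps the bottom-level series to every series.
import numpy as np

# Rows: Total, A, B; columns: bottom-level series A, B.
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Incoherent base forecasts: Total = 10, but A + B = 4 + 5 = 9.
y_hat = np.array([10.0, 4.0, 5.0])

# Regress the base forecasts on S, then aggregate the fitted
# bottom-level values back up the hierarchy.
beta, *_ = np.linalg.lstsq(S, y_hat, rcond=None)
y_tilde = S @ beta

# The reconciled forecasts are coherent: y_tilde[0] == y_tilde[1] + y_tilde[2].
```

With millions of bottom-level series, S has millions of columns and the least-squares problem above becomes ill-conditioned and too large for standard algorithms, which is the situation the talk addresses.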

Stanford: 4:30pm, Tuesday 6th October.
UC Davis: 4:10pm, Thursday 8th October.

The bias-variance decomposition

This week, I am teaching my Business Analytics class about the bias-variance trade-off. For some reason, the proof is not contained in either ESL (The Elements of Statistical Learning) or ISL (An Introduction to Statistical Learning), even though it is quite simple. I also discovered that the proof currently provided on Wikipedia makes little sense in places.

So I wrote my own for the class. It is longer than necessary to ensure there are no jumps that might confuse students.
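The result being proved can be stated compactly. Assuming the usual setup, y = f(x) + ε with E[ε] = 0 and Var(ε) = σ², and an estimator f̂ trained on data independent of (x, y), the argument runs:

```latex
\begin{align*}
E\big[(y - \hat{f}(x))^2\big]
  &= E\big[(f(x) + \varepsilon - \hat{f}(x))^2\big] \\
  &= E\big[(f(x) - \hat{f}(x))^2\big]
     + 2\,E\big[\varepsilon\big]\,E\big[f(x) - \hat{f}(x)\big]
     + E\big[\varepsilon^2\big] \\
  &= E\big[(f(x) - \hat{f}(x))^2\big] + \sigma^2 \\
  &= \big(f(x) - E[\hat{f}(x)]\big)^2
     + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2 \\
  &= \mathrm{Bias}\big[\hat{f}(x)\big]^2
     + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2,
\end{align*}
```

where the cross term vanishes because ε is independent of f̂(x) with mean zero, and the second-to-last step expands E[(f − f̂)²] around E[f̂(x)], with f(x) treated as fixed. The class notes fill in these steps more slowly.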

Murphy diagrams in R

At the recent International Symposium on Forecasting, held in Riverside, California, Tilmann Gneiting gave a great talk on “Evaluating forecasts: why proper scoring rules and consistent scoring functions matter”. It will be the subject of an IJF invited paper in due course.

One of the things he talked about was the “Murphy diagram” for comparing forecasts, as proposed in Ehm et al (2015). Here’s how it works for comparing mean forecasts.
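The idea can be sketched in a few lines. Following Ehm et al (2015), every consistent scoring function for the mean is a mixture of "elementary" scores indexed by a threshold θ; one common normalization is S_θ(x, y) = |y − θ| when θ lies between the forecast x and the outcome y, and 0 otherwise (the constant factor does not affect comparisons). The Murphy diagram plots the average elementary score against θ, one curve per forecaster. A Python sketch, with illustrative function names:

```python
# Sketch of the Murphy diagram computation for mean forecasts, following
# the elementary scoring functions of Ehm et al (2015).
import numpy as np

def elementary_score(x, y, theta):
    """Elementary score for the mean functional: |y - theta| when theta
    lies between forecast x and outcome y, and 0 otherwise."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    between = (np.minimum(x, y) <= theta) & (theta < np.maximum(x, y))
    return np.where(between, np.abs(y - theta), 0.0)

def murphy_curve(x, y, thetas):
    """Average elementary score at each theta: one curve of the diagram."""
    return np.array([elementary_score(x, y, t).mean() for t in thetas])
```

If one forecaster's curve lies below another's at every θ, it dominates under every consistent scoring function for the mean, not just under squared error.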

North American seminars: June 2015

For the next few weeks I am travelling in North America and will be giving the following talks.

  • 19 June: Southern California Edison, Rosemead CA.
    “Probabilistic forecasting of peak electricity demand”.
  • 23 June: International Symposium on Forecasting, Riverside CA.
    “MEFM: An R package for long-term probabilistic forecasting of electricity demand”.
  • 25 June: Google, Mountain View, CA.
    “Automatic algorithms for time series forecasting”.
  • 26 June: Yahoo, Sunnyvale, CA.
    “Exploring the boundaries of predictability: what can we forecast, and when should we give up?”
  • 30 June: Workshop on Frontiers in Functional Data Analysis, Banff, Canada.
    “Exploring the feature space of large collections of time series”.

The Yahoo talk will be streamed live.

I’ll post slides on my main site after each talk.