Big Data for Official Statistics Competition

This is a new competition organized by Eurostat. The first phase involves nowcasting economic indicators at the national and European levels, including unemployment, HICP, tourism and retail trade, along with some of their variants.

The main goal of the competition is to discover promising methodologies and data sources that could, now or in the future, be used to improve the production of official statistics in the European Statistical System.

The organizers seem to have been encouraged by the success of Kaggle and other data science competition platforms. Unfortunately, they have chosen not to give any prizes other than an invitation to give a conference presentation or poster, which hardly seems likely to attract many good participants.

The deadline for registration is 10 January 2016. The duration of the competition is roughly a year (including about a month for evaluation).

See the call for participation for more information.

A time series classification contest

Amongst today’s email was one from someone running a private competition to classify time series. Here are the essential details.

The data are measurements from a medical diagnostic machine which takes one measurement every second. After 32 to 1000 seconds, the time series must be classified into one of two classes. Some pre-classified training data is provided. It is not necessary to classify all the test data, but you do need relatively high accuracy on whatever you do classify. So you could find a subset of more easily classifiable test time series, and leave the rest of the test data unclassified.
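The option of leaving hard cases unclassified can be sketched as a classifier that abstains when it is not confident. The following is only an illustration under assumed details: the two summary features, the nearest-centroid rule, the `min_margin` threshold and the toy data are all hypothetical and are not taken from the competition.

```python
from statistics import mean


def features(series):
    """Summarise a variable-length series by its mean and mean absolute change."""
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    return (mean(series), mean(diffs))


def fit_centroids(train_series, train_labels):
    """Average the feature vectors of each class in the training data."""
    centroids = {}
    for label in set(train_labels):
        rows = [features(s) for s, y in zip(train_series, train_labels) if y == label]
        centroids[label] = tuple(mean(col) for col in zip(*rows))
    return centroids


def classify_or_abstain(series, centroids, min_margin):
    """Return the nearest class, or None when the two classes are too close to call."""
    x = features(series)
    dist = {
        label: sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5
        for label, c in centroids.items()
    }
    ranked = sorted(dist, key=dist.get)
    margin = dist[ranked[1]] - dist[ranked[0]]
    return ranked[0] if margin >= min_margin else None


# Toy training data (hypothetical): series of different lengths, two classes.
train_series = [[0.0, 0.1] * 16, [0.1, 0.2] * 20, [1.0, 1.2] * 16, [0.9, 1.1] * 25]
train_labels = ["A", "A", "B", "B"]
centroids = fit_centroids(train_series, train_labels)

clear_case = [0.05, 0.15] * 20      # looks like class A
ambiguous_case = [0.5, 0.65] * 30   # sits between the two classes

predictions = [
    classify_or_abstain(s, centroids, min_margin=0.3)
    for s in (clear_case, ambiguous_case)
]
```

Raising `min_margin` classifies fewer series but should make the classified subset more accurate, which is the trade-off the competition rules allow.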

Prediction competitions

Competitions have a long history in forecasting and prediction, and have been instrumental in focusing research attention on methods that work well in practice. In the forecasting community, the M competition and M3 competition have been particularly influential. The data mining community has the annual KDD Cup, which has drawn attention to a wide range of prediction problems and associated methods. Recent KDD Cups have been hosted on Kaggle.

In my research group meeting today, we discussed our (limited) experiences in competing in some Kaggle competitions, and we reviewed the following two papers which describe two prediction competitions:

  1. Athanasopoulos and Hyndman (IJF 2011). The value of feedback in forecasting competitions. [preprint version]
  2. Roy et al (2013). The Microsoft Academic Search Dataset and KDD Cup 2013.


GEFCom 2014 energy forecasting competition is underway

GEFCom 2014 is the most advanced energy forecasting competition ever organized, both in terms of the data involved, and in terms of the way the forecasts will be evaluated.

So everyone interested in energy forecasting should head over to the competition webpage and start forecasting.

This time, the competition is hosted on CrowdANALYTIX rather than Kaggle.

Highlights of GEFCom2014:

  • An upgraded edition from GEFCom2012
  • Four tracks: electric load, electricity price, wind power and solar power forecasting.
  • Probabilistic forecasting: contestants are required to submit 99 quantiles for each step throughout the forecast horizon.
  • Rolling forecasting: incremental data sets are being released on a weekly basis to forecast the next period of interest.
  • Prizes for winning teams and institutions: up to 3 teams from each track will be recognized as winning teams; top institutions with multiple well-performing teams will be recognized as winning institutions.
  • Global participation: 200+ people from 40+ countries have already signed up to the GEFCom2014 interest list.
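To make the probabilistic forecasting requirement concrete, here is a minimal sketch of producing 99 quantiles for a single forecast step and scoring them. It assumes the forecast distribution is represented by simulated future values, and it uses the pinball (quantile) loss, a common scoring rule for quantile forecasts. The highlights above do not say how submissions are scored, so the metric, the function names and the simulated data are all assumptions for illustration.

```python
import random

# The 99 quantile levels 0.01, 0.02, ..., 0.99.
QUANTILE_LEVELS = [q / 100 for q in range(1, 100)]


def empirical_quantiles(samples, levels=QUANTILE_LEVELS):
    """Quantiles of simulated future values, using linear interpolation."""
    xs = sorted(samples)
    n = len(xs)
    out = []
    for p in levels:
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        out.append(xs[lo] + (pos - lo) * (xs[hi] - xs[lo]))
    return out


def pinball_loss(actual, quantiles, levels=QUANTILE_LEVELS):
    """Average pinball loss over all quantile levels for one observation."""
    total = 0.0
    for p, q in zip(levels, quantiles):
        if actual >= q:
            total += p * (actual - q)        # forecast quantile was too low
        else:
            total += (1 - p) * (q - actual)  # forecast quantile was too high
    return total / len(levels)


# Hypothetical simulated load values for one forecast step.
rng = random.Random(1)
samples = [rng.gauss(100, 10) for _ in range(5000)]
forecast = empirical_quantiles(samples)

loss_near = pinball_loss(100.0, forecast)  # outcome near the centre of the forecast
loss_far = pinball_loss(150.0, forecast)   # outcome far out in the tail
```

Lower values are better. An outcome near the centre of the forecast distribution scores a smaller loss than one far outside it, so the score rewards quantiles that are both well calibrated and sharp.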

Tao Hong (the main organizer) has a few tips on his blog that you should read before starting.


New jobs in business analytics at Monash

We have an exciting new initiative at Monash University with some new positions in business analytics. This is part of a plan to strengthen our research and teaching in the data science/computational statistics area. We are hoping to make multiple appointments, at junior and senior levels. These are five-year appointments, but we hope that the positions will continue after that if we can secure suitable funding.