A time series classification contest

Amongst today’s email was one from someone running a private competition to classify time series. Here are the essential details.

The data are measurements from a medical diagnostic machine which takes 1 measurement every second, and after 32–1000 seconds, the time series must be classified into one of two classes. Some pre-classified training data is provided. It is not necessary to classify all the test data, but you do need to have relatively high accuracy on what is classified. So you could find a subset of more easily classifiable test time series, and leave the rest of the test data unclassified.
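One way to think about leaving the harder series unclassified is a simple confidence threshold: only assign a class when the predicted probability is far enough from 0.5. The sketch below is my own illustration, not the contest's rules or scoring; the function names, the threshold of 0.9, and the probabilities are all hypothetical, and the probabilities could come from any classifier.

```python
def selective_classify(probabilities, threshold=0.9):
    """Label a series only when the predicted probability is confident.

    probabilities: estimated P(class 1) for each test series.
    Returns a list containing 1, 0, or None (left unclassified).
    """
    labels = []
    for p in probabilities:
        if p >= threshold:
            labels.append(1)
        elif p <= 1 - threshold:
            labels.append(0)
        else:
            labels.append(None)  # too uncertain, so abstain
    return labels


def coverage_and_accuracy(labels, truth):
    """Fraction of series classified, and accuracy on that subset."""
    decided = [(l, t) for l, t in zip(labels, truth) if l is not None]
    coverage = len(decided) / len(labels)
    if not decided:
        return coverage, float("nan")
    accuracy = sum(l == t for l, t in decided) / len(decided)
    return coverage, accuracy


# Hypothetical predicted probabilities and true classes, for illustration only.
probs = [0.97, 0.55, 0.08, 0.40, 0.93, 0.02]
truth = [1, 0, 0, 1, 1, 0]
labels = selective_classify(probs, threshold=0.9)
coverage, accuracy = coverage_and_accuracy(labels, truth)
```

Raising the threshold trades coverage for accuracy: fewer series get a label, but the ones that do should be the easier cases.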

Prediction competitions

Competitions have a long history in forecasting and prediction, and have been instrumental in focusing research attention on methods that work well in practice. In the forecasting community, the M competition and M3 competition have been particularly influential. The data mining community have the annual KDD Cup, which has generated attention on a wide range of prediction problems and associated methods. Recent KDD Cups are hosted on Kaggle.

In my research group meeting today, we discussed our (limited) experiences in competing in some Kaggle competitions, and we reviewed the following two papers which describe two prediction competitions:

  1. Athanasopoulos and Hyndman (IJF 2011). The value of feedback in forecasting competitions. [preprint version]
  2. Roy et al. (2013). The Microsoft Academic Search Dataset and KDD Cup 2013.


GEFCom 2014 energy forecasting competition is underway

GEFCom 2014 is the most advanced energy forecasting competition ever organized, both in terms of the data involved, and in terms of the way the forecasts will be evaluated.

So everyone interested in energy forecasting should head over to the competition webpage and start forecasting: www.gefcom.org.

This time, the competition is hosted on CrowdANALYTIX rather than Kaggle.

Highlights of GEFCom2014:

  • An upgraded edition of GEFCom2012.
  • Four tracks: electric load, electricity price, wind power and solar power forecasting.
  • Probabilistic forecasting: contestants are required to submit 99 quantiles for each step throughout the forecast horizon.
  • Rolling forecasting: incremental data sets are being released on a weekly basis to forecast the next period of interest.
  • Prizes for winning teams and institutions: up to 3 teams from each track will be recognized as the winning teams; top institutions with multiple well-performing teams will be recognized as the winning institutions.
  • Global participation: 200+ people from 40+ countries have already signed up to the GEFCom2014 interest list.
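To make the 99-quantile requirement concrete, here is a minimal sketch of one common way to score quantile forecasts, the pinball (quantile) loss, averaged over the quantiles 0.01 to 0.99. I am assuming this scoring rule purely for illustration; the highlights above do not say how the forecasts will be evaluated, and the function names and numbers below are hypothetical.

```python
def pinball_loss(y, q_forecast, tau):
    """Pinball loss for one observation y and one forecast of the tau-quantile."""
    if y >= q_forecast:
        return tau * (y - q_forecast)
    return (1 - tau) * (q_forecast - y)


def mean_pinball_loss(y, quantile_forecasts):
    """Average the pinball loss over the quantiles 0.01, 0.02, ..., 0.99."""
    taus = [k / 100 for k in range(1, 100)]
    if len(quantile_forecasts) != len(taus):
        raise ValueError("expected 99 quantile forecasts")
    losses = [pinball_loss(y, q, t) for q, t in zip(quantile_forecasts, taus)]
    return sum(losses) / len(losses)


# Hypothetical forecast for a single time step: quantiles spread evenly
# from 101 to 199 (say, load in MW), scored against an observed value of 150.
forecast = [100 + k for k in range(1, 100)]
score = mean_pinball_loss(150.0, forecast)
```

Lower scores are better. A forecast whose quantiles are too wide or too narrow is penalized even when its median is right, which is what distinguishes probabilistic scoring from a point-forecast error measure.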

Tao Hong (the main organizer) has a few tips on his blog that you should read before starting.


New jobs in business analytics at Monash

We have an exciting new initiative at Monash University with some new positions in business analytics. This is part of a plan to strengthen our research and teaching in the data science/computational statistics area. We are hoping to make multiple appointments, at junior and senior levels. These are five-year appointments, but we hope that the positions will continue after that if we can secure suitable funding.

Crowd sourcing forecasts

Forecasting Ace is looking for participants to develop improved methods for predicting future events and outcomes. Their goal is to develop methods for aggregating many individual judgments in a manner that yields more accurate predictions than any one person or small group alone could provide. Potential applications of the system include forecasting economic conditions, political changes, technological development and medical breakthroughs.