GEFCom 2014 energy forecasting competition is underway

GEF­Com 2014 is the most advanced energy fore­cast­ing com­pe­ti­tion ever orga­nized, both in terms of the data involved, and in terms of the way the fore­casts will be evaluated.

So everyone interested in energy forecasting should head over to the competition webpage and start forecasting: www.gefcom.org.

This time, the com­pe­ti­tion is hosted on Crow­d­AN­A­LYTIX rather than Kag­gle.

High­lights of GEFCom2014:

  • An upgraded edition of GEFCom2012
  • Four tracks: elec­tric load, elec­tric­ity price, wind power and solar power forecasting.
  • Prob­a­bilis­tic fore­cast­ing: con­tes­tants are required to sub­mit 99 quan­tiles for each step through­out the fore­cast horizon.
  • Rolling forecasting: incremental data sets are being released on a weekly basis to forecast the next period of interest.
  • Prizes for winning teams and institutions: up to 3 teams from each track will be recognized as winning teams; top institutions with multiple well-performing teams will be recognized as winning institutions.
  • Global participation: 200+ people from 40+ countries have already signed up to the GEFCom2014 interest list.
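
For readers new to probabilistic forecasting, here is a minimal sketch of how a set of 99 quantile forecasts for one time step could be scored. It assumes the pinball (quantile) loss, which is a common score for quantile forecasts; this post does not say which score the organizers use, so check the competition page. The function names are made up for illustration.

```python
def pinball_loss(y, quantile_forecast, tau):
    """Pinball loss of one quantile forecast at level tau (0 < tau < 1)."""
    if y >= quantile_forecast:
        # Observation above the forecast quantile: penalty weighted by tau.
        return tau * (y - quantile_forecast)
    # Observation below the forecast quantile: penalty weighted by 1 - tau.
    return (1.0 - tau) * (quantile_forecast - y)


def mean_pinball_loss(y, quantile_forecasts):
    """Average pinball loss over the 99 levels 0.01, 0.02, ..., 0.99.

    quantile_forecasts must hold 99 values, ordered by level (assumed layout).
    """
    levels = [k / 100 for k in range(1, 100)]
    if len(quantile_forecasts) != len(levels):
        raise ValueError("expected exactly 99 quantile forecasts")
    total = sum(
        pinball_loss(y, q, tau) for q, tau in zip(quantile_forecasts, levels)
    )
    return total / len(levels)
```

Lower is better. Averaging this over every step of the forecast horizon would give a single score per submission.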

Tao Hong (the main orga­nizer) has a few tips on his blog that you should read before starting.

 

New jobs in business analytics at Monash

We have an exciting new initiative at Monash University with some new positions in business analytics. This is part of a plan to strengthen our research and teaching in the data science/computational statistics area. We are hoping to make multiple appointments, at junior and senior levels. These are five-year appointments, but we hope that the positions will continue after that if we can secure suitable funding.

Crowd sourcing forecasts

Fore­cast­ing Ace is look­ing for par­tic­i­pants to develop improved meth­ods for pre­dict­ing future events and out­comes. Their goal is to develop meth­ods for aggre­gat­ing many indi­vid­ual judg­ments in a man­ner that yields more accu­rate pre­dic­tions than any one per­son or small group alone could pro­vide. Poten­tial appli­ca­tions of the sys­tem include fore­cast­ing eco­nomic con­di­tions, polit­i­cal changes, tech­no­log­i­cal devel­op­ment and med­ical break­throughs.

Tourism forecasting competition ends

And the win­ners are … Jeremy Howard and Lee C Baker. (See my ear­lier post for infor­ma­tion about the competition.)

Jeremy describes his approach to seasonal time series in a blog post on Kaggle.com. Lee described his approach to annual time series in an earlier post.

A few lessons that come out of this:

  • For data from a sin­gle indus­try, using a global trend (i.e., esti­mated across all series) can be useful.
  • Com­bin­ing fore­casts is a good idea. (This les­son seems to be re-​​learned in every fore­cast­ing competition!)
  • The MASE can be very sen­si­tive to a few series, and to opti­mize MASE it is worth con­cen­trat­ing on these. (This is actu­ally not a good mes­sage for fore­cast­ing over­all, as we want good fore­casts for all series. Maybe we need to find a met­ric with sim­i­lar prop­er­ties to MASE but with a less skewed distribution.)
  • Out­lier removal before fore­cast­ing can be effec­tive. (This is an inter­est­ing result as out­lier removal algo­rithms used in the M3 com­pe­ti­tion did not help fore­cast accuracy.)
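
To illustrate the second lesson, here is a minimal sketch of the simplest kind of forecast combination: an equal-weight average of the forecasts from several methods at each horizon. This is not the winners' procedure (their write-ups describe that); the function name and the equal weighting are assumptions made for illustration.

```python
def combine_forecasts(forecasts_by_method):
    """Equal-weight average of several methods' forecasts for one series.

    forecasts_by_method: list of forecast lists, one per method, all covering
    the same horizons in the same order.
    """
    horizons = {len(f) for f in forecasts_by_method}
    if len(horizons) != 1:
        raise ValueError("need at least one method, all with equal horizons")
    n_methods = len(forecasts_by_method)
    # zip(*...) groups the forecasts of all methods for each horizon step.
    return [sum(step) / n_methods for step in zip(*forecasts_by_method)]


combined = combine_forecasts([
    [100.0, 110.0],  # e.g. method A
    [120.0, 130.0],  # e.g. method B
    [110.0, 90.0],   # e.g. method C
])
```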

Jeremy and Lee receive $500 for their efforts and they have decided to donate their prize money to the Fred Hol­lows Foun­da­tion. $500 will restore vision to 20 peo­ple. They will also write up their meth­ods in more detail for the Inter­na­tional Jour­nal of Fore­cast­ing. I am hope­ful that Philip Brier­ley of team Sali Mali (who did very well in the sec­ond stage of the com­pe­ti­tion) will also write a short expla­na­tion of his meth­ods for the IJF.

Thanks to every­one who par­tic­i­pated in the com­pe­ti­tion. Thanks also to Anthony Gold­bloom from Kag­gle for host­ing the com­pe­ti­tion. Kag­gle is a won­der­ful plat­form for pre­dic­tion com­pe­ti­tions and I hope it will be used for many more com­pe­ti­tions of this type in the future.

Tourism forecasting competition results: part one

The first stage of the tourism forecasting competition on Kaggle has finished. This stage involved forecasting 518 annual time series. Twenty-one teams beat our Theta method benchmark, which is a great result and well beyond our expectations. Congratulations to Lee Baker for winning stage one.

I am yet to learn what meth­ods the top teams were using, but we hope to write up a paper for the IJF describ­ing the results. Of course, the win­ning team (over­all) gets to write their own dis­cus­sion paper for the IJF.

Stage 2 of the competition is now open and involves forecasting 366 monthly time series and 427 quarterly time series. In this case, the best result in our paper for the monthly data came from the automatic ARIMA algorithm (Hyndman & Khandakar, 2008) with a MASE of 1.38. For quarterly data, the ETS(A,Ad,A) model performed slightly better than our ARIMA algorithm, with a MASE of 1.43. Let's see how much better everyone else can do! Head over to Kaggle and get the data. Entries close on 31 October 2010.

The tourism forecasting competition

Recently I wrote a paper enti­tled “The tourism fore­cast­ing com­pe­ti­tion” in which we (i.e., George Athana­sopou­los, Haiyan Song, Doris Wu and I) com­pared var­i­ous fore­cast­ing meth­ods on a rel­a­tively large set of tourism-​​related time series. The paper has been accepted for pub­li­ca­tion in the Inter­na­tional Jour­nal of Fore­cast­ing. (When I sub­mit a paper to the IJF it is always han­dled by another edi­tor. In this case, Mike Clements han­dled the paper and it went through sev­eral revi­sions before it was finally accepted. Just to show the process is unbi­ased, I have had a paper rejected by the jour­nal dur­ing the period I have been Editor-​​in-​​Chief.)

We are now open­ing up the com­pe­ti­tion to any­one who thinks they can do bet­ter than the best meth­ods we imple­mented in the paper. Meth­ods will be eval­u­ated based on the small­est MASE (Mean Absolute Scaled Error) — see Hyn­d­man & Koehler (2006) for details of this statistic.
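
As a rough guide to the statistic, here is a minimal sketch of MASE for a single series: the mean absolute forecast error divided by the in-sample mean absolute error of the naive forecast. The function name and the seasonal lag argument m are assumptions for illustration (m=1 gives the one-step naive scaling); the competition's exact scaling is defined by the paper and the Kaggle evaluation page, not by this sketch.

```python
def mase(train, actual, forecast, m=1):
    """Mean Absolute Scaled Error for one series.

    train:    in-sample observations used to compute the scaling factor
    actual:   out-of-sample observations
    forecast: forecasts for the same periods as `actual`
    m:        lag of the naive benchmark (1 = naive, e.g. 12 = seasonal naive)
    """
    if len(train) <= m:
        raise ValueError("training series must be longer than the lag m")
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty, equal length")
    # In-sample mean absolute error of the (seasonal) naive method.
    scale = sum(
        abs(train[t] - train[t - m]) for t in range(m, len(train))
    ) / (len(train) - m)
    if scale == 0:
        raise ValueError("scaling factor is zero (constant training series)")
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale
```

A value below 1 means the forecasts were, on average, more accurate than the in-sample naive forecasts. The competition score would then be an average of this quantity over all series.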

To make it interesting, there is a prize. The overall winner will collect $AUD500 and will be invited to contribute a discussion paper to the International Journal of Forecasting describing their methodology and giving their results, provided either the monthly MASE results are better than 1.38, the quarterly results are better than 1.43 or the yearly results are better than 2.28. These thresholds are the results of the best-performing methods in the analysis of these data described in Athanasopoulos et al. (2010). In other words, the winner has to beat the best results in this paper for at least one of the three sets of series. It will also be necessary that the winner be able to describe their method clearly, in sufficient detail to enable replication and in a form suitable for the International Journal of Forecasting. The paper would appear in the April 2011 issue of the IJF.
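
The threshold condition can be written as a small check. This is only an illustrative sketch: the thresholds are the ones quoted above, "better" is read as a strictly smaller MASE, and the function and dictionary names are made up.

```python
# Best MASE values reported in the paper, by data frequency.
BENCHMARK_MASE = {"monthly": 1.38, "quarterly": 1.43, "yearly": 2.28}


def beats_benchmark(results):
    """True if the entry beats the paper on at least one set of series.

    results: dict mapping "monthly", "quarterly" and/or "yearly" to the
    entry's MASE on that set of series. Missing sets are ignored.
    """
    return any(
        freq in results and results[freq] < threshold
        for freq, threshold in BENCHMARK_MASE.items()
    )


# Worse on monthly and quarterly data, but better on yearly data.
eligible = beats_benchmark({"monthly": 1.40, "quarterly": 1.50, "yearly": 2.20})
```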

The competition is being hosted by the innovative folks at kaggle.com. Head over to kaggle.com/tourism1 to get the data and enter the competition.

The com­pe­ti­tion will be in two stages. Stage 1 involves only the annual data — 518 time series. You need to sub­mit fore­casts of the next four obser­va­tions for each series before 20 Sep­tem­ber 2010. Stage 2 will involve the monthly and quar­terly data and will begin after Stage 1 closes.

Good luck!