GEFCom2014 is the most advanced energy forecasting competition ever organized, both in terms of the data involved and in terms of the way the forecasts will be evaluated. So everyone interested in energy forecasting should head over to the competition webpage and start forecasting: www.gefcom.org. This time, the competition is hosted on CrowdANALYTIX rather than Kaggle.

Highlights of GEFCom2014:

- An upgraded edition of GEFCom2012.
- Four tracks: electric load, electricity price, wind power and solar power forecasting.
- Probabilistic forecasting: contestants are required to submit 99 quantiles for each step throughout the forecast horizon.
- Rolling forecasting: incremental data sets are being released on a weekly basis to forecast the next period of interest.
- Prizes for winning teams and institutions: up to 3 teams from each track will be recognized as the winning teams; top institutions with multiple well-performing teams will be recognized as the winning institutions.
- Global participation: 200+ people from 40+ countries have already signed up to the GEFCom2014 interest list.

Tao Hong (the main organizer) has a few tips on his blog that you should read before starting.
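To make the "99 quantiles for each step" requirement concrete, here is a minimal sketch of how a set of quantile forecasts for a single step could be scored. The post does not name the scoring rule, so the pinball (quantile) loss used here is an assumption; it is a standard choice for evaluating quantile forecasts. The function names `pinball_loss` and `mean_pinball_loss` are hypothetical and chosen only for illustration.

```python
def pinball_loss(actual, quantile_forecast, tau):
    """Pinball (quantile) loss for one quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if actual >= quantile_forecast:
        # Forecast too low: penalise in proportion to tau.
        return tau * (actual - quantile_forecast)
    # Forecast too high: penalise in proportion to (1 - tau).
    return (1.0 - tau) * (quantile_forecast - actual)


def mean_pinball_loss(actual, quantile_forecasts):
    """Average pinball loss over the 99 quantiles 0.01, 0.02, ..., 0.99.

    Assumes quantile_forecasts[k - 1] is the forecast of the k/100 quantile.
    """
    if len(quantile_forecasts) != 99:
        raise ValueError("expected 99 quantile forecasts")
    total = 0.0
    for k, q in enumerate(quantile_forecasts, start=1):
        total += pinball_loss(actual, q, k / 100.0)
    return total / 99.0
```

Averaging `mean_pinball_loss` over every step in the forecast horizon would give a single score for a submission, with lower values being better. Whether the competition aggregates in exactly this way is not stated in the post.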
Posts Tagged ‘kaggle’:
We have an exciting new initiative at Monash University with some new positions in business analytics. This is part of a plan to strengthen our research and teaching in the data science/computational statistics area. We are hoping to make multiple appointments, at junior and senior levels. These are five-year appointments, but we hope that the positions will continue after that if we can secure suitable funding.
The GEFCom2012 competition was a great success, with several innovative forecasting methods introduced. These have been published in the IJF as follows:
The International Journal of Forecasting is calling for papers on probabilistic energy forecasting. Here are the details (taken from Tao Hong’s blog).
Forecasting competitions are a great way to test new methods and obtain a realistic evaluation of how good they are. So I’m delighted that the IEEE is organizing an energy forecasting competition as outlined by Tao Hong below.
It is good to see forecasting algorithms getting some mainstream exposure on ABC Catalyst. Update: See also this great talk by Jeremy Howard, a data scientist from Melbourne and now part of Kaggle.
Forecasting Ace is looking for participants to develop improved methods for predicting future events and outcomes. Their goal is to develop methods for aggregating many individual judgments in a manner that yields more accurate predictions than any one person or small group alone could provide. Potential applications of the system include forecasting economic conditions, political changes, technological development and medical breakthroughs.
And the winners are … Jeremy Howard and Lee C Baker. (See my earlier post for information about the competition.) Jeremy describes his approach to seasonal time series in a blog post on Kaggle.com. Lee described his approach to annual time series in an earlier post.

A few lessons that come out of this:

- For data from a single industry, using a global trend (i.e., estimated across all series) can be useful.
- Combining forecasts is a good idea. (This lesson seems to be re-learned in every forecasting competition!)
- The MASE can be very sensitive to a few series, and to optimize MASE it is worth concentrating on these. (This is actually not a good message for forecasting overall, as we want good forecasts for all series. Maybe we need to find a metric with similar properties to MASE but with a less skewed distribution.)
- Outlier removal before forecasting can be effective. (This is an interesting result as outlier removal algorithms used in the M3 competition did not help forecast accuracy.)

Jeremy and Lee receive 500 will restore vision to 20 people. They will also write up their methods in more detail for the International Journal
The first stage of the tourism forecasting competition on Kaggle has finished. This stage involved forecasting 518 annual time series. Twenty-one teams beat our Theta method benchmark, which is a great result and well beyond our expectations. Congratulations to Lee Baker for winning stage one. I am yet to learn what methods the top teams were using, but we hope to write up a paper for the IJF describing the results. Of course, the winning team (overall) gets to write their own discussion paper for the IJF. Stage 2 of the competition is now open and involves forecasting 366 monthly time series and 427 quarterly time series. In this case, the best result in our paper for the monthly data was the automatic ARIMA algorithm (Hyndman & Khandakar, 2008) with a MASE of 1.38. For quarterly data, the ETS(A,Ad,A) model performed slightly better than our ARIMA algorithm with a MASE of 1.43. Let’s see how much better everyone else can do! Head over to Kaggle and get the data. Entries close on 31 October 2010.
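For anyone wanting to check their own results against the MASE figures quoted above, here is a minimal sketch of the statistic for a single series. It assumes the scaling factor is the in-sample mean absolute error of the naive forecast at lag `m` (with `m=1` for annual data and the seasonal period otherwise); the function name `mase` and its arguments are hypothetical, and the exact scaling and averaging used in the competition should be checked against the paper.

```python
def mase(train, actual, forecast, m=1):
    """Mean Absolute Scaled Error for one series.

    train    : in-sample observations used to compute the scaling factor
    actual   : out-of-sample observations
    forecast : forecasts for the same periods as `actual`
    m        : lag of the naive benchmark (assumed: 1, or the seasonal period)
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    if len(train) <= m:
        raise ValueError("training series must be longer than the lag m")

    # In-sample mean absolute error of the lag-m naive forecast.
    naive_errors = [abs(train[t] - train[t - m]) for t in range(m, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("scaling factor is zero; MASE is undefined for this series")

    # Out-of-sample mean absolute error, expressed relative to the scale.
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return (sum(errors) / len(errors)) / scale
```

A value below 1 means the forecasts have, on average, smaller absolute errors than the in-sample naive forecasts for that series. A competition-level score would then combine the per-series values, for example by averaging them.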
Recently I wrote a paper entitled “The tourism forecasting competition” in which we (i.e., George Athanasopoulos, Haiyan Song, Doris Wu and I) compared various forecasting methods on a relatively large set of tourism-related time series. The paper has been accepted for publication in the International Journal of Forecasting. (When I submit a paper to the IJF it is always handled by another editor. In this case, Mike Clements handled the paper and it went through several revisions before it was finally accepted. Just to show the process is unbiased, I have had a paper rejected by the journal during the period I have been Editor-in-Chief.) We are now opening up the competition to anyone who thinks they can do better than the best methods we implemented in the paper. Methods will be evaluated based on the smallest MASE (Mean Absolute Scaled Error) — see Hyndman & Koehler (2006) for details of this statistic. To make it interesting, there is a prize. The overall winner will collect $AUD500 and will be invited to contribute a discussion paper to the International Journal of Forecasting describing their methodology and giving their results, provided either the monthly MASE results are better than 1.38, the quarterly results are better than 1.43 or the yearly results are