Xiaozhe Wang¹, Kate A. Smith-Miles¹ and Rob J. Hyndman²
Neurocomputing, 72 (2009), 2581–2594.
1. Faculty of Information Technology, Monash University, Clayton VIC 3800, Australia.
2. Department of Econometrics and Business Statistics, Monash University, VIC 3800, Australia.
Abstract. Many statistical models and computational algorithms are available for univariate forecasting. In practice, this abundance of choice makes it difficult to select the most appropriate technique, especially for users without sufficient forecasting expertise. This study focuses on rule induction for forecasting method selection, based on an understanding of the nature of the historical data. We present a novel approach for selecting a forecasting method for a univariate time series from its measurable data characteristics; the approach combines elements of data mining, meta-learning, clustering, classification and statistical measurement. We conducted a large-scale empirical study of over 300 time series using four of the most popular forecasting methods. To provide a rich portrait of the global characteristics of univariate time series, we extracted measures of a comprehensive set of features, including trend, seasonality, periodicity, serial correlation, skewness, kurtosis, nonlinearity, self-similarity and chaos. Both supervised and unsupervised learning methods are used to learn the relationship between the characteristics of a time series and the suitability of each forecasting method, yielding recommendation rules as well as visualizations in the feature space. A weighting scheme derived from the induced rules is also used to improve forecasting accuracy through combined forecasting models.
Keywords: Meta-learning; Rule induction; Univariate time series; Data characteristics; Clustering.