Archive for category Papers in conference proceedings

Exploratory graphics for functional data

Han Lin Shang and Rob J Hyndman

Department of Econometrics and Business Statistics, Monash University, Clayton, Australia

Interface 2010: Computing Science and Statistics, Seattle, Washington, June 16–19, 2010

Abstract
We survey some graphical tools for visualizing large sets of functional data represented by smooth curves. These graphical tools include the phase-plane plot, singular value decomposition plot, rainbow plot, and functional variants of the bagplot and the highest density region boxplot. The latter two techniques utilize the first two robust principal component scores, Tukey’s halfspace location depth and highest density regions.

The computer code and datasets are collected in the rainbow package for R, which is available at the Comprehensive R Archive Network (CRAN).

Keywords: Highest density regions, Kernel density estimation, Robust principal component analysis, Singular value decomposition, Tukey’s halfspace location depth.
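The halfspace location depth mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the curves have already been reduced to bivariate scores (the `scores` list below is made-up toy data) and approximates the depth by scanning a grid of directions. The function name `halfspace_depth` and its parameters are hypothetical; this is not the rainbow package's implementation.

```python
import math

def halfspace_depth(point, cloud, n_directions=360):
    """Approximate Tukey halfspace depth of `point` relative to `cloud`.

    For each direction u on a grid, count the points in the closed
    halfplane {x : (x - point) . u >= 0}; the depth is the smallest count.
    """
    px, py = point
    depth = len(cloud)
    for k in range(n_directions):
        theta = 2.0 * math.pi * k / n_directions
        ux, uy = math.cos(theta), math.sin(theta)
        count = sum(
            1 for (x, y) in cloud
            if (x - px) * ux + (y - py) * uy >= 0.0
        )
        depth = min(depth, count)
    return depth

# Toy bivariate scores (assumed data): four corners and a central point.
scores = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0), (0.0, 0.0)]
depths = {p: halfspace_depth(p, scores) for p in scores}

# The deepest point plays the role of a bivariate median.
deepest = max(scores, key=lambda p: depths[p])
```

In this toy set the central point has the greatest depth and the corners the least, which is the ordering a bagplot-style display uses to separate central from outlying observations.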

Short-term load forecasting based on a semi-parametric additive model

Shu Fan and Rob J Hyndman
20th Australasian Universities Power Engineering Conference, 5–8 December 2010, University of Canterbury, Christchurch, New Zealand

Abstract
Short-term load forecasting is an essential instrument in power system planning, operation and control. Many operating decisions are based on load forecasts, such as dispatch scheduling of generating capacity, reliability analysis, and maintenance planning for the generators. Overestimation of electricity demand leads to conservative operation: too many units are started up or excess energy is purchased, supplying an unnecessary level of reserve. Conversely, underestimation may result in risky operation, with insufficient spinning reserve, leaving the system in a region vulnerable to disturbances.

In this paper, semi-parametric additive models are proposed to estimate the relationships between demand and the driver variables. Specifically, the inputs for these models are calendar variables, lagged actual demand observations, and historical and forecast temperature traces for one or more sites in the target power system. The proposed methodology has been used to forecast half-hourly electricity demand up to seven days ahead for power systems in the Australian National Electricity Market. The performance of the methodology is validated via out-of-sample experiments with real data from the power system, as well as through on-site operation by the system operator.
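As a rough illustration, an additive model with the three groups of inputs listed above could take the generic form below. All symbols are assumptions introduced here, not notation from the paper: $y_t$ is demand at time $t$ (possibly after a transformation), $h$ captures calendar effects, $f$ captures temperature effects through the temperature variables $w_t$, and each $c_k$ is a smooth function of demand lagged by $k$ periods.

```latex
y_t = h(t) + f(w_t) + \sum_{k=1}^{K} c_k(y_{t-k}) + \varepsilon_t
```

The model is "semi-parametric" in the sense that some terms may be estimated as smooth functions while others (for example, calendar effects) enter parametrically. The exact specification used by the authors may differ from this sketch.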

Functionalization of microarray devices: process optimization using a multiobjective PSO and multiresponse MARS modeling

L. Villanova, P. Falcaro, D. Carta, I. Poli, R. J. Hyndman, K. Smith-Miles
2010 IEEE Congress on Evolutionary Computation, July 18–23, Barcelona, Spain

Abstract: An evolutionary approach for the optimization of microarray coatings produced via sol-gel chemistry is presented. The aim of the methodology is to address the challenging aspects of the problem: a high-dimensional variable space, constraints on the independent variables, multiple responses, expensive or time-consuming experimental trials, and the expected complexity of the functional relationships between independent and response variables. The proposed approach iteratively selects a set of experiments by combining a multiobjective Particle Swarm Optimization (PSO) and a multiresponse Multivariate Adaptive Regression Splines (MARS) model. At each iteration of the algorithm the selected experiments are implemented and evaluated, and the system response is used as feedback for the selection of the new trials. The best coating identified using the described methodology shows relevant improvements over the best coating obtained by changing one variable at a time. The proposed evolutionary approach is shown to be a useful methodology for process optimization with great promise for industrial applications.
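To make the PSO ingredient concrete, here is a minimal Python sketch of a basic particle swarm with box constraints. It is deliberately simplified: it is single-objective rather than multiobjective, it uses no MARS surrogate, and the objective `toy_response` stands in for an experimental response that would really come from laboratory trials. All names and parameter values are assumptions, not taken from the paper.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iter=100, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Basic single-objective PSO; positions are clipped to `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # personal bests
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm best
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Enforce the constraints on the independent variables.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

def toy_response(x):
    """Hypothetical response surface with its optimum at (0.3, 0.7)."""
    return (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2

best_x, best_y = pso_minimize(toy_response, bounds=[(0.0, 1.0), (0.0, 1.0)])
```

In the setting the abstract describes, each evaluation is an expensive experiment, so a fitted model of the responses would presumably guide which candidate points are worth running; the sketch evaluates the objective directly only because the toy function is free to compute.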

Nonparametric time series forecasting with dynamic updating

Han Lin Shang and Rob J Hyndman (2009)
18th World IMACS/MODSIM Congress, Cairns, Australia, 13–17 July 2009.

Dimension reduction for clustering time series using global characteristics

Wang, X.¹, Smith, K.A.¹, and Hyndman, R.J.² (2005)
  1. School of Business Systems, Monash University, Clayton VIC 3800, Australia.
  2. Department of Econometrics and Business Statistics, Monash University, VIC 3800, Australia.
Lecture Notes in Computer Science, Volume 3516, April 2005, Pages 792–795.
Proceedings. Computational Science — ICCS 2005: 5th International Conference, Atlanta, GA, USA, May 22–25, 2005.

Abstract: Existing methods for time series clustering that rely on the actual data values can become impractical, since they do not easily handle datasets with high dimensionality, missing values, or different lengths. In this paper, a dimension reduction method is proposed that replaces the raw data with some global measures of time series characteristics. These measures are then clustered using a self-organizing map. The proposed approach has been tested using benchmark time series previously reported for time series clustering, and is shown to yield useful and robust clustering.
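The dimension reduction idea can be sketched in Python as follows. Each series, whatever its length, is replaced by a short fixed-length vector of global measures. The three measures used here (linear trend slope, lag-1 autocorrelation, skewness) are illustrative choices and not necessarily those used in the paper; the self-organizing map step is omitted.

```python
from statistics import mean, pstdev

def linear_trend_slope(series):
    """Least-squares slope of the series against the time index 0..n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = mean(series)
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def lag1_autocorrelation(series):
    """Sample autocorrelation at lag 1 (0.0 for a constant series)."""
    m = mean(series)
    den = sum((v - m) ** 2 for v in series)
    if den == 0:
        return 0.0
    num = sum((series[t] - m) * (series[t - 1] - m)
              for t in range(1, len(series)))
    return num / den

def skewness(series):
    """Population skewness (0.0 for a constant series)."""
    m, s = mean(series), pstdev(series)
    if s == 0:
        return 0.0
    return sum(((v - m) / s) ** 3 for v in series) / len(series)

def global_features(series):
    """Map a series of any length (at least 2) to a fixed-length vector."""
    return [linear_trend_slope(series),
            lag1_autocorrelation(series),
            skewness(series)]

# Two toy series of different lengths become comparable 3-vectors.
short_series = [1.0, 2.0, 3.0, 4.0, 5.0]
long_series = [2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0]
features = [global_features(s) for s in (short_series, long_series)]
```

Because every series maps to a vector of the same length, any standard clustering method can then be applied to `features`, regardless of the original series lengths.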

Robust forecasting of mortality and fertility rates: a functional data approach

Hyndman, R.J., and Ullah, M.S. (2005)
Invited paper, Demographic Forecasting session, 55th session of the International Statistical Institute, Sydney, Australia, April 2005

Abstract: We propose a new method for forecasting age-specific mortality and fertility rates observed over time. We combine ideas from functional data analysis, nonparametric smoothing and robust statistics to form a methodology that is widely applicable to any functional time series data, and age-specific mortality and fertility in particular. Our approach provides a modelling framework that is easily adapted to allow for constraints and other information. The model used can be considered a generalization of the Lee-Carter model commonly used in mortality and fertility forecasting. The methodology is applied to Australian fertility data.

Keywords: forecasting, mortality, fertility, functional data
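For orientation, the relationship to the Lee-Carter model can be written out as follows. The notation is assumed here rather than taken from the abstract: $m_{x,t}$ is the rate at age $x$ in year $t$, and $y_t(x)$ is the (suitably transformed) rate curve for year $t$.

```latex
% Lee-Carter model: a single age pattern b_x driven by one time index k_t
\log m_{x,t} = a_x + b_x k_t + \varepsilon_{x,t}

% Functional generalization: K basis functions with time-varying coefficients
y_t(x) = \mu(x) + \sum_{k=1}^{K} \beta_{t,k}\, \phi_k(x) + e_t(x)
```

Setting $K = 1$ and $y_t(x) = \log m_{x,t}$ gives a model of Lee-Carter form, with $\mu(x)$, $\phi_1(x)$ and $\beta_{t,1}$ playing the roles of $a_x$, $b_x$ and $k_t$. Forecasts of the curves then follow from forecasting the coefficient series $\beta_{t,k}$.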

Statistical methodological issues in studies of air pollution and respiratory disease

Erbas, B. and Hyndman, R.J. (2001)
16th International Workshop on Statistical Modelling, Odense, Denmark. 2–6 July, 2001.

Abstract: Epidemiological studies have consistently shown short-term associations between levels of air pollution and respiratory disease in countries with diverse populations, geographical locations, and varying levels of air pollution and climate. The aims of this paper are: (1) to assess the sensitivity of the observed pollution effects to model specification, with particular emphasis on the inclusion of seasonally adjusted covariates; and (2) to study the effect of air pollution on respiratory disease in Melbourne, Australia.

Keywords: air pollution, autocorrelation, generalized additive models, respiratory disease, seasonal adjustment

Nonparametric additive regression models for binary time series

Hyndman, R.J. (1999)
Published in the Proceedings, 1999 Australasian Meeting of the Econometric Society, 7–9 July 1999, University of Technology, Sydney.

Abstract: I consider models for binary time series, starting with autoregression models and then developing generalizations of them which allow nonparametric additive covariates. I show that several apparently different binary AR(1) models are equivalent. Three possible nonparametric additive regression models which allow for autocorrelation are considered; one is a generalization of an ARX model, the other two are generalizations of a regression model with AR errors. One of the models is applied to two data sets: IBM stock transactions and Melbourne’s rainfall. The fitted models show that stock transaction occurrences are more likely if there have been large transactions in the previous time period. They also show that the Southern Oscillation Index does not provide a strong predictor of rainfall occurrence in Melbourne, contrary to current meteorological practice.

Keywords: ARX models; autocorrelated errors; autocorrelation; binary time series; generalized additive model; generalized linear model; logistic regression; non-Gaussian time series; smoothing with correlated errors; time series regression.
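One way to see how apparently different binary AR(1) models can coincide is that a first-order binary process is fully described by two transition probabilities, P(Y_t = 1 | Y_{t-1} = 0) and P(Y_t = 1 | Y_{t-1} = 1). The Python sketch below converts a linear-probability parameterization into a logistic one and checks that both give the same transition probabilities. It is an illustration under that assumption, with hypothetical names and parameter values, not a restatement of the paper's argument.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    return math.log(p / (1.0 - p))

def linear_to_logistic(alpha, beta):
    """Convert P(Y_t=1 | y_prev) = alpha + beta*y_prev
    into the equivalent form logistic(a + b*y_prev)."""
    p0, p1 = alpha, alpha + beta   # transition probabilities from 0 and 1
    a = logit(p0)
    b = logit(p1) - a
    return a, b

# Hypothetical parameters: P(1 | 0) = 0.2 and P(1 | 1) = 0.7.
alpha, beta = 0.2, 0.5
a, b = linear_to_logistic(alpha, beta)

linear_probs = [alpha + beta * y for y in (0, 1)]
logistic_probs = [logistic(a + b * y) for y in (0, 1)]
```

With only two possible values of the lagged response, either parameterization can reproduce any pair of transition probabilities in (0, 1), so the two forms describe the same process. The distinction starts to matter once continuous covariates enter the model.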

Calculating the odds

Hyndman, R.J. (1987)
In Faces of gambling, Proceedings of the second national conference of the National Association for Gambling Studies (1986), ed. Michael Walker, pp. 139–152.