Archive for category Refereed papers

Phenological change detection while accounting for abrupt and gradual trends in satellite image time series

Jan Verbesselt1, Rob J Hyndman2, Achim Zeileis3, Darius Culvenor1
  1. Remote sensing team, CSIRO Sustainable Ecosystems, Private Bag 10, Melbourne VIC 3169, Australia
  2. Department of Econometrics and Business Statistics, Monash University, Melbourne VIC 3800, Australia
  3. Institute for Statistics, Leopold-Franzens-Universität Innsbruck, 6020 Innsbruck, Austria
Remote Sensing of Environment, to appear.

Abstract
A challenge in phenology studies is understanding what constitutes significant phenological change amidst background variation (e.g. noise) and ecosystem disturbances (e.g. fires). The majority of phenological studies have focussed on extracting critical points in the seasonal growth cycle (e.g. start-of-spring), without exploiting the full temporal detail. Moreover, the high degree of phenological variability between years demonstrates the necessity of distinguishing long term phenological change from temporal variability. Here, we evaluate the phenological change detection ability of a method for detecting Breaks For Additive Seasonal and Trend (BFAST). BFAST integrates the decomposition of time series into trend, seasonal, and noise components with methods for detecting change within time series. BFAST detects significant phenological changes within time series by exploiting the full time series without needing to derive phenological metrics. The times and numbers of trend and phenological changes are iteratively estimated by fitting piecewise robust linear models, of which the parameters are used to characterize change by its magnitude and direction. We tested BFAST by simulating 16-day Normalized Difference Vegetation Index (NDVI) time series with varying amounts of seasonality and noise, containing abrupt disturbances (e.g. fires) and long term phenological changes. This revealed that BFAST is able to accurately detect the number and timing of phenological changes within time series while accounting for disturbances (e.g. fires) and noise. The simulation study also showed that the phenological change detection is influenced by the signal to noise ratio of the time series. Application of the method to 16-day NDVI MODIS images from 2000 until 2009 for a forested study area in south eastern Australia confirmed these results.
Phenological change is more easily detected in grasslands, where the seasonal amplitude is larger than 0.3 NDVI, than in evergreen forests, where the seasonal amplitude is approximately 0.1 NDVI, when noise levels are the same. BFAST presents a novel approach for the detection of significant long term phenological changes within full time series, which is necessary to study spatio-temporal patterns in land cover phenology and to distinguish change from interannual variability in a global change context. The method can be applied to other disciplines dealing with seasonal time series data, such as biology, hydrology, and climatology, to detect and characterize change within time series. The methods described in this study are available in the BFAST package for R (R Development Core Team, 2009).

Keywords: seasonal change, phenology, change detection, time series, disturbance, climate change, remote sensing, NDVI, MODIS.

Working paper

Online paper

Forecasting age-​​related changes in breast cancer mortality among white and black US women

Farah Yasmeen, Rob J Hyndman and Bircan Erbas
Cancer Epidemiology, to appear

Abstract:
The disparity in breast cancer mortality rates among white and black US women is widening, with higher mortality rates among black women. We apply functional time series models to age-specific breast cancer mortality rates for each group of women, and forecast their mortality curves using exponential smoothing state-space models with damping.

The data were obtained from the Surveillance, Epidemiology and End Results (SEER) program of the US. Mortality data were obtained from the National Center for Health Statistics (NCHS), available on the SEER*Stat database. We use annual unadjusted breast cancer mortality rates from 1969 to 2004 in 5-year age groups (45–49, 50–54, 55–59, 60–64, 65–69, 70–74, 75–79, 80–84). Age-specific mortality curves were obtained using nonparametric smoothing methods. The curves are then decomposed using functional principal components, and we fit functional time series models with four basis functions for each population separately. The curves from each population are forecast and prediction intervals are calculated.

Twenty-year forecasts indicate an overall decline in future breast cancer mortality rates for both groups of women. This decline is steeper among white women aged 55–73 and black women aged 60–84. For black women under 55 years of age, the forecast rates are relatively stable, indicating no significant change in future breast cancer mortality rates among young black women in the next 20 years.

Keywords: Breast cancer mortality, racial and ethnic disparities, screening, trends, forecasting, functional data analysis

Working paper

Published paper

Nonparametric time series forecasting with dynamic updating

Han Lin Shang and Rob J Hyndman
Mathematics and Computers in Simulation (2010), to appear.

Abstract
We present a nonparametric method to forecast a seasonal univariate time series, and propose four dynamic updating methods to improve point forecast accuracy. Our methods consider a seasonal univariate time series as a functional time series. We propose first to reduce the dimensionality by applying functional principal component analysis to the historical observations, and then to use univariate time series forecasting and functional principal component regression techniques. When data in the most recent year are partially observed, we improve point forecast accuracy using dynamic updating methods. We also introduce a nonparametric approach to construct prediction intervals of updated forecasts, and compare the empirical coverage probability with an existing parametric method. Our approaches are data-driven and computationally fast, and hence feasible for real-time, high-frequency dynamic updating. The methods are demonstrated using monthly sea surface temperatures from 1950 to 2008.
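The two-stage idea in this abstract (reduce dimension with principal components, forecast the scores, then update with a partially observed year) can be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the function names are hypothetical, the data are simulated, simple exponential smoothing with a fixed weight stands in for whatever univariate forecaster is actually used, and the update shown is a plain ordinary least squares regression on the truncated basis functions.

```python
import numpy as np

def fit_fpca(curves, n_components):
    """PCA of yearly curves (rows = years, columns = points within the year)."""
    mean_curve = curves.mean(axis=0)
    centred = curves - mean_curve
    # Rows of vt are the principal component curves (basis functions).
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    basis = vt[:n_components]      # shape (n_components, n_points)
    scores = centred @ basis.T     # shape (n_years, n_components)
    return mean_curve, basis, scores

def forecast_next_curve(mean_curve, basis, scores, alpha=0.3):
    """Forecast next year's curve by smoothing each score series.

    Simple exponential smoothing with a fixed alpha is a stand-in
    (assumption) for a properly fitted univariate time series model.
    """
    level = scores[0].copy()
    for row in scores[1:]:
        level = alpha * row + (1 - alpha) * level
    return mean_curve + level @ basis

def ols_update(mean_curve, basis, observed):
    """Predict the rest of a year from its first k observed points.

    The observed deviations from the mean are regressed (OLS) on the
    basis functions truncated to the first k points.
    """
    k = observed.size
    beta, *_ = np.linalg.lstsq(basis[:, :k].T, observed - mean_curve[:k], rcond=None)
    return mean_curve[k:] + beta @ basis[:, k:]

# Simulated monthly "temperatures": seasonal cycle + yearly offset + noise.
rng = np.random.default_rng(0)
months = np.arange(12)
cycle = 25 + 3 * np.cos(2 * np.pi * months / 12)
curves = cycle + rng.normal(0, 0.5, size=(30, 1)) + rng.normal(0, 0.2, size=(30, 12))

mean_curve, basis, scores = fit_fpca(curves, n_components=2)
next_year = forecast_next_curve(mean_curve, basis, scores)
rest_of_year = ols_update(mean_curve, basis, observed=curves[-1, :6])
```

Ridge or penalized least squares variants would replace the `lstsq` call with a regularized solve; they are omitted here to keep the sketch short.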

Keywords: Functional time series, Functional principal component analysis, Ordinary least squares, Penalized least squares, Ridge regression, Sea surface temperatures, Seasonal time series.

Working paper

Published paper

The tourism forecasting competition

George Athanasopoulos, Rob J Hyndman, Haiyan Song and Doris Wu
International Journal of Forecasting (2011) 27(2), to appear

Abstract: We evaluate the performance of various methods for forecasting tourism demand. The data used include 366 monthly series, 427 quarterly series and 518 yearly series, all supplied to us by tourism bodies or by academics from previous tourism forecasting studies. The forecasting methods implemented in the competition are univariate and multivariate time series approaches, and econometric models. This forecasting competition differs from previous competitions in several ways: (i) we concentrate only on tourism demand data; (ii) we include approaches with explanatory variables; (iii) we evaluate the forecast interval coverage as well as point forecast accuracy; (iv) we observe the effect of temporal aggregation on forecasting accuracy; and (v) we consider the mean absolute scaled error as an alternative forecasting accuracy measure. We find that pure time series approaches provide more accurate forecasts for tourism data than models with explanatory variables. For seasonal data we implement three fully automated pure time series algorithms that generate accurate point forecasts and two of these also produce forecast coverage probabilities which are satisfactorily close to the nominal rates. For annual data we find that Naïve forecasts are hard to beat.
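For readers unfamiliar with the mean absolute scaled error mentioned in (v), a minimal sketch follows. It assumes the usual definition, in which out-of-sample absolute errors are scaled by the in-sample mean absolute error of the (seasonal) naïve method; the function name, the argument `m` (seasonal period, 1 for annual data) and the toy numbers are illustrative and not taken from the paper.

```python
import numpy as np

def mase(train, actual, forecast, m=1):
    """Mean absolute scaled error.

    Scale = in-sample MAE of the naive forecast that repeats the value
    observed m periods earlier (m=1 gives the ordinary naive method).
    """
    train = np.asarray(train, dtype=float)
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    scale = np.mean(np.abs(train[m:] - train[:-m]))
    return float(np.mean(np.abs(actual - forecast)) / scale)

# Toy annual series: the naive forecast repeats the last training value.
train = [10, 12, 11, 13, 12, 14]
actual = [15, 14]
naive_forecast = [14, 14]
score = mase(train, actual, naive_forecast)  # 0.5 / 1.6 = 0.3125
```

Values below 1 indicate that the forecasts are, on average, more accurate than in-sample one-step naïve forecasts.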

Keywords: ARIMA, exponential smoothing, state space model, time varying parameter model, dynamic regression, autoregressive distributed lag model, vector autoregression.

Working paper

Online paper

See my blog for an opportunity to beat us, have your method published in the International Journal of Forecasting and win $500!

Rainbow plots, bagplots and boxplots for functional data

Rob J Hyndman and Han Lin Shang
Journal of Computational and Graphical Statistics (2010), 19(1), 29–45.

Abstract: We propose new tools for visualizing large numbers of functional data in the form of smooth curves or surfaces. The proposed tools include functional versions of the bagplot and boxplot, and make use of the first two robust principal component scores, Tukey’s data depth and highest density regions.

By-​​products of our graphical displays are outlier detection methods for functional data. We compare these new outlier detection methods with existing methods for detecting outliers in functional data and show that our methods are better able to identify the outliers.

Keywords: Highest density regions, Robust principal component analysis, Kernel density estimation, Outlier detection, Tukey’s halfspace depth.

Online paper

Working paper

R package

Using functional data analysis models to estimate future time trends of age-​​specific breast cancer mortality for the United States and England-​​Wales

Bircan Erbas1, Muhammad Akram2, Dorota M Gertig3, Dallas English4,5, John L. Hopper5, Anne M Kavanagh6 and Rob J Hyndman2
Journal of Epidemiology (2010), 20(2), 159–165.
  1. School of Public Health, La Trobe University, Bundoora, 3086 Australia
  2. Business and Economic Forecasting Unit, Monash University, Clayton, 3800, Australia.
  3. Victoria Cytology Service Inc, Carlton, 3053 Australia.
  4. Cancer Epidemiology Centre, The Cancer Council Victoria, Carlton 3053 Australia.
  5. Centre for MEGA Epidemiology, The University of Melbourne, Parkville 3053 Australia.
  6. Key Centre for Women’s Health in Society, School of Population Health, The University of Melbourne, Parkville, 3053 Australia.
ABSTRACT

Background: Mortality/​incidence predictions are used for planning public health resources and need to accurately reflect age-​​related changes through time. We present a new forecasting model to estimate future trends in age-​​related breast cancer mortality for the United States and England-​​Wales.

Material and methods: We use functional data analysis techniques to model breast cancer mortality-​​age relationships in the United States from 1950 to 2001 and England-​​Wales from 1950 to 2003, and estimate 20-​​year predictions using a new forecasting method.

Results: In the United States, trends for women aged 45–54 years have continued to decline since 1980. In contrast, trends in women aged 60–84 years increased in the 1980s and declined in the 1990s. For England-Wales, trends for women aged 45 to 74 years slightly increased prior to 1980, but declined thereafter. The greatest age-related changes for both countries were during the 1990s. For both the United States and England-Wales, trends are expected to decline and then stabilize, with the greatest decline in women aged 60–70 years. Forecasts suggest relatively stable trends for women over 75 years.

Conclusions: Predictions of age-related changes in mortality/incidence can be used for planning and targeting programs for specific age groups. Currently, these models are being extended to incorporate other variables that may influence age-related changes in mortality/incidence trends. In their current form, these models will be most useful for modelling and projecting future trends of diseases where there has been very little advancement in treatment and minimal cohort effects, such as lethal cancers.

Key words: breast cancer, forecasting, functional-​​data-​​analysis models, mortality trends

Online paper

Detecting trend and seasonal changes in satellite image time series

Jan Verbesselt1, Rob J Hyndman2, Glenn Newnham1, Darius Culvenor1
Remote Sensing of Environment (2010), 114(1), 106–115.
  1. Remote sensing team, CSIRO Sustainable Ecosystems, Private Bag 10, Melbourne VIC 3169, Australia
  2. Department of Econometrics and Business Statistics, Monash University, Melbourne VIC 3800, Australia
Abstract

A wealth of remotely sensed time series covering large areas is now available to the earth science community. Change detection methods are often not capable of detecting land cover changes within time series that are heavily influenced by seasonal climatic variations. Detecting change within the trend and seasonal components of time series enables the detection of different types of changes. Changes occurring in the trend component indicate disturbances (e.g., insect attack), while changes occurring in the seasonal component indicate phenological changes (e.g., change in land cover type). An approach is proposed for automated change detection in time series by detecting and characterizing Breaks For Additive Seasonal and Trend (BFAST). BFAST integrates the decomposition of time series into trend, seasonal, and remainder components with methods for detecting significant change within time series. BFAST iteratively estimates the time and number of changes, and characterizes change by its magnitude and direction. We tested BFAST by simulating 16-day composites of Normalized Difference Vegetation Index (NDVI) time series with varying amounts of seasonality and noise, and by adding abrupt changes at different times and magnitudes. This revealed that BFAST can robustly detect change with different magnitudes (>0.1 NDVI) within time series with different noise levels (0.01–0.07 σ) and seasonal amplitudes (0.1–0.5 NDVI). Additionally, BFAST was applied to 16-day NDVI MODIS (Moderate Resolution Imaging Spectroradiometer) composites for a forested study area in south eastern Australia. This showed that BFAST is able to detect and characterize spatial and temporal changes in a forested landscape. BFAST is developed as a generic change detection approach, and can be applied to time series without the need to normalize for specific land cover types, select a reference period, or define a threshold or change trajectory.
The method can be used to detect and characterize changes within time series or can be integrated within monitoring frameworks and used as an alarm system to flag when and where significant changes occur.
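To give a feel for the kind of simulation experiment described in this abstract, here is a deliberately simplified sketch. It is not BFAST: it assumes exactly one abrupt change, uses ordinary (not robust) least squares, models the season with a single harmonic, and finds the break by exhaustive search over candidate positions rather than by iterative structural change tests. All function names and parameter values are illustrative.

```python
import numpy as np

def simulate_ndvi(n_years=6, per_year=23, amplitude=0.3, noise_sd=0.02,
                  break_at=70, break_size=-0.2, seed=0):
    """Simulate a 16-day NDVI-like series (23 values per year) with
    a harmonic season, Gaussian noise and one abrupt drop in level."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_years * per_year)
    season = amplitude * np.sin(2 * np.pi * t / per_year)
    step = np.where(t >= break_at, break_size, 0.0)
    return t, 0.6 + season + step + rng.normal(0.0, noise_sd, t.size)

def fit_segments(t, y, per_year, brk):
    """Least squares fit: one harmonic season shared by the whole series,
    plus a separate intercept and slope before and after index brk."""
    after = (t >= brk).astype(float)
    X = np.column_stack([
        1 - after, (1 - after) * t,          # trend before the break
        after, after * t,                    # trend after the break
        np.sin(2 * np.pi * t / per_year),
        np.cos(2 * np.pi * t / per_year),
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)

def detect_single_break(t, y, per_year=23, min_seg=12):
    """Return the break index with the smallest residual sum of squares
    and the estimated jump in the trend at that index."""
    candidates = range(min_seg, t.size - min_seg)
    best = min(candidates, key=lambda b: fit_segments(t, y, per_year, b)[1])
    coef, _ = fit_segments(t, y, per_year, best)
    jump = (coef[2] + coef[3] * best) - (coef[0] + coef[1] * best)
    return best, float(jump)

t, y = simulate_ndvi()
estimated_break, estimated_jump = detect_single_break(t, y)
```

The sign of `estimated_jump` gives the direction of the change and its absolute value the magnitude, in the same spirit as the characterization described above. For real analyses, the BFAST package for R should be used.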

Online paper

Density forecasting for long-​​term peak electricity demand

Rob J Hyndman and Shu Fan
IEEE Transactions on Power Systems, 2010, 25(2), 1142–1153

Abstract: Long-term electricity demand forecasting plays an important role in planning for future generation facilities and transmission augmentation. In a long-term context, planners must adopt a probabilistic view of potential peak demand levels. Density forecasts (providing estimates of the full probability distributions of the possible future values of the demand) are therefore more helpful than point forecasts, and are necessary for utilities to evaluate and hedge the financial risk accrued by demand variability and forecasting uncertainty. This paper proposes a new methodology to forecast the density of long-term peak electricity demand.

Peak electricity demand in a given season is subject to a range of uncertainties, including underlying population growth, changing technology, economic conditions, prevailing weather conditions (and the timing of those conditions), as well as the general randomness inherent in individual usage. It is also subject to some known calendar effects due to the time of day, day of week, time of year, and public holidays.

We describe a comprehensive forecasting solution in this paper. First, we use semi-​​parametric additive models to estimate the relationships between demand and the driver variables, including temperatures, calendar effects and some demographic and economic variables. Then we forecast the demand distributions using a mixture of temperature simulation, assumed future economic scenarios, and residual bootstrapping. The temperature simulation is implemented through a new seasonal bootstrapping method with variable blocks.

The proposed methodology has been used to forecast the probability distribution of annual and weekly peak electricity demand for South Australia since 2007. We evaluate the performance of the methodology by comparing the forecast results with the actual demand of the summer 2007/08.

Keywords: Long-​​term demand forecasting, density forecast, time series, simulation.

Online article

The vector innovations structural time series framework: a simple approach to multivariate forecasting

Ashton de Silva1, Rob J Hyndman2 and Ralph D Snyder2
Statistical Modelling (2010), to appear.
  1. School of Economics, Finance and Marketing, RMIT, VIC 3000, Australia.
  2. Department of Econometrics and Business Statistics, Monash University, VIC 3800, Australia.

Abstract: The vector innovations structural time series framework is proposed as a way of modelling a set of related time series. Like all multivariate approaches, the aim is to exploit potential inter-series dependencies to improve the fit and forecasts. The model is based around an unobserved vector of components representing features such as the level and slope of each time series. Equations that describe the evolution of these components through time are used to represent the inter-temporal dependencies. The approach is illustrated on a bivariate data set comprising Australian exchange rates of the UK pound and US dollar. The forecasting accuracy of the new modelling framework is compared to other common uni- and multivariate approaches in an experiment using time series from a large macroeconomic database.

Keywords: vector innovations structural time series, state space model, multivariate time series, exponential smoothing, forecast comparison, vector autoregression.

Download pdf file

Exponential smoothing and non-​​negative data

Md. Akram1, Rob J. Hyndman1 and J. Keith Ord2
Australian and New Zealand Journal of Statistics (2009), 51(4), 415–432.
  1. Department of Econometrics and Business Statistics, Monash University, VIC 3800, Australia.
  2. McDonough School of Business, Georgetown University, Washington, DC 20057, USA.

Abstract: The most common forecasting methods in business are based on exponential smoothing and the most common time series in business are inherently non-negative. Therefore it is of interest to consider the properties of the potential stochastic models underlying exponential smoothing when applied to non-negative data. We explore exponential smoothing state space models for non-negative data under various assumptions about the innovations, or error, process.

We first demonstrate that prediction distributions from some commonly used state space models may have an infinite variance beyond a certain forecasting horizon. For multiplicative error models which do not have this flaw, we show that sample paths will converge almost surely to zero even when the error distribution is non-​​Gaussian. We propose a new model with similar properties to exponential smoothing, but which does not have these problems, and we develop some distributional properties for our new model.
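The convergence-to-zero result can be illustrated with a small simulation. The sketch below assumes a local-level model with multiplicative errors, in which the level evolves as l_t = l_{t-1}(1 + αe_t), and uses a shifted lognormal error so that e_t has mean zero and 1 + e_t stays positive. These specific choices (the error distribution, α = 0.8, the horizon) are illustrative assumptions rather than the settings used in the paper. Because log(1 + αe_t) has negative mean by Jensen's inequality, the log-level drifts downwards even though the level has constant expectation.

```python
import numpy as np

def simulate_levels(n_paths=200, horizon=2000, alpha=0.8, s=0.5, seed=1):
    """Simulate levels of a multiplicative-error local-level model.

    l_t = l_{t-1} * (1 + alpha * e_t), with l_0 = 1 and
    e_t = exp(z_t) - 1, z_t ~ N(-s^2/2, s^2), so E[e_t] = 0 and e_t > -1.
    Returns an array of shape (n_paths, horizon).
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(-0.5 * s**2, s, size=(n_paths, horizon))
    e = np.exp(z) - 1.0
    # Work on the log scale to avoid numerical problems with long products.
    log_level = np.cumsum(np.log1p(alpha * e), axis=1)
    return np.exp(log_level)

levels = simulate_levels()
final_levels = levels[:, -1]
largest_final_level = float(final_levels.max())
```

In this setup every simulated path ends many orders of magnitude below its starting level of 1, which is the behaviour the abstract describes for non-Gaussian multiplicative errors.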

We then explore the implications of our results for inference, and compare the short-​​term forecasting performance of the various models using data on the weekly sales of over three hundred items of costume jewelry.

The main findings of the research are that the Gaussian approximation is adequate for estimation and one-​​step-​​ahead forecasting. However, as the forecasting horizon increases, the approximate prediction intervals become increasingly problematic.  When the model is to be used for simulation purposes, a suitably specified scheme must be employed.

Keywords: forecasting; time series; exponential smoothing; positive-​​valued processes; seasonality; state space models.

Online paper