Rob J Hyndman¹, Earo Wang¹ and Nikolay Laptev²

  1. Monash Business School, Monash University, Clayton, Victoria, Australia.
  2. Yahoo Labs, Sunnyvale, California, USA

It is becoming increasingly common for organizations to collect very large amounts of data over time, and to need to detect unusual or anomalous time series. For example, Yahoo has banks of mail servers that are monitored over time. Many measurements on server performance are collected every hour for each of thousands of servers. We wish to identify servers that are behaving unusually.

We compute a vector of features for each time series, with each feature measuring one characteristic of the series, such as lag correlation, strength of seasonality, or spectral entropy. We then apply a principal component decomposition to the features and run bivariate outlier detection methods on the first two principal components. This allows the most unusual series, as judged by their feature vectors, to be identified. The bivariate outlier detection methods used are based on highest density regions and \(\alpha\)-hulls.
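The pipeline described above can be illustrated with a minimal Python sketch. This is not the associated R package or the authors' implementation. The three features, the seasonal period of 24, the kernel bandwidth `h`, and the simulated data are all assumptions chosen for illustration. Ranking points by a fixed-bandwidth kernel density estimate is a simplified stand-in for the highest density region method, and \(\alpha\)-hulls are not covered.

```python
import math
import random

def acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom if denom > 0 else 0.0

def spectral_entropy(x):
    """Normalised Shannon entropy of the periodogram (naive O(n^2) DFT)."""
    n = len(x)
    mean = sum(x) / n
    power = []
    for k in range(1, n // 2 + 1):
        re = sum((x[t] - mean) * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum((x[t] - mean) * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power.append(re * re + im * im)
    total = sum(power)
    if total == 0:
        return 0.0
    probs = [p / total for p in power]
    return -sum(p * math.log(p) for p in probs if p > 0) / math.log(len(probs))

def features(x, period=24):
    """Illustrative feature vector: lag-1 ACF, seasonal-lag ACF, spectral entropy."""
    return [acf(x, 1), acf(x, period), spectral_entropy(x)]

def standardise(rows):
    """Centre each feature column and scale it to unit standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def top_two_components(z, iters=500):
    """Two leading eigenvectors of the covariance matrix (power iteration + deflation)."""
    n, d = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)] for i in range(d)]
    vecs = []
    for _ in range(2):
        v = [1.0 / (i + 1) for i in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0:
                break
            v = [c / norm for c in w]
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        vecs.append(v)
        # Deflate: remove the component just found before searching for the next one.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return vecs

def kde_scores(points, h=0.5):
    """Gaussian kernel density estimate at each 2-D point (fixed bandwidth h, an assumption)."""
    scores = []
    for a, b in points:
        s = sum(math.exp(-((a - c) ** 2 + (b - d) ** 2) / (2 * h * h)) for c, d in points)
        scores.append(s / (len(points) * 2 * math.pi * h * h))
    return scores

# Simulated data: 29 "hourly" series with a daily cycle plus one white-noise series.
rng = random.Random(1)
n_obs, period = 96, 24
series = [[math.sin(2 * math.pi * t / period) + rng.gauss(0, 0.3) for t in range(n_obs)]
          for _ in range(29)]
series.append([rng.gauss(0, 1) for _ in range(n_obs)])  # the deliberately unusual series

z = standardise([features(x, period) for x in series])
components = top_two_components(z)
pcs = [[sum(r[i] * v[i] for i in range(len(r))) for v in components] for r in z]
density = kde_scores(pcs)

# The series lying in the lowest-density part of the PC1-PC2 plane is flagged as most unusual.
most_unusual = min(range(len(density)), key=density.__getitem__)
print("Most unusual series index:", most_unusual)
```

Sorting all series by `density` in ascending order gives a ranking from most to least unusual, which is the kind of output the method is meant to provide when thousands of servers are monitored.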

Download working paper

Associated R package

  Tag: highest density regions

9 posts
June 1st, 2015

Large-scale unusual time series detection

Rob J Hyndman, Earo Wang and Nikolay Laptev

October 19th, 2013

hdrcde package for R

The hdrcde package provides tools for computation of highest density regions in one and two dimensions, kernel estimation of univariate […]

March 10th, 2011

Improved interval estimation of long run response from a dynamic linear model: a highest density region approach

Jae H. Kim¹, Iain Fraser² and Rob J. Hyndman¹. Department of Econometrics and Business Statistics, Monash University, VIC 3800, […]

August 3rd, 2010

Exploratory graphics for functional data

Han Lin Shang and Rob J Hyndman. Department of Econometrics and Business Statistics, Monash University, Clayton, Australia. Interface 2010: Computing Science […]

March 1st, 2010

Rainbow plots, bagplots and boxplots for functional data

Rob J Hyndman and Han Lin Shang. Journal of Computational and Graphical Statistics (2010), 19(1), 29-45. Abstract: We propose new […]

June 19th, 2008

Bagplots, boxplots and outlier detection for functional data

Australian Statistics Conference. Melbourne, July 2008. When: June 19-21, 2008 Where: First International Workshop on Functional and Operatorial Statistics, Toulouse […]

May 15th, 2008

Bagplots, boxplots and outlier detection for functional data

Rob J Hyndman and Han Lin Shang (2008) In Dabo-Niang, S., and Ferraty, F. (eds), Functional and Operatorial Statistics, chap […]

April 1st, 2007

Half-life estimation based on the bias-corrected bootstrap: a highest density region approach

Computational Statistics and Data Analysis (2007), 51(7), 3418-3432. Jae H. Kim, Param Silvapulle and Rob J. Hyndman. Abstract: The half-life […]

July 16th, 1996

Estimating and visualizing conditional densities

Journal of Computational and Graphical Statistics (1996), 5, 315-336. Rob J Hyndman¹ and David Bashtannyk¹. Abstract: We consider the kernel […]

July 16th, 1996

Computing and graphing highest density regions

American Statistician (1996), 50, 120-126. Rob J Hyndman¹. Abstract: Many statistical methods involve summarizing a probability distribution by a region of […]

July 16th, 1995

Highest density forecast regions for non-linear and non-normal time series models

Journal of Forecasting (1995), 14, 431-441. Rob J Hyndman. Abstract: Many modern time series methods, such as those involving non-linear models […]

December 17th, 1992

Continuous-time threshold autoregressive modelling

Rob J Hyndman (1992). PhD thesis, The University of Melbourne. Abstract: This thesis considers continuous time autoregressive processes defined by […]