Sample quantiles in statistical packages

The American Statistician (1996), 50(4), 361-365.

Rob J Hyndman (1) and Yanan Fan (2)

  1. Department of Econometrics and Business Statistics, Monash University, Clayton VIC 3800, Australia.
  2. School of Mathematics and Statistics, University of NSW.

Abstract: There are a large number of different definitions used for sample quantiles in statistical computer packages. Often within the same package one definition will be used to compute a quantile explicitly while other definitions may be used when producing a boxplot, a probability plot or a QQ-plot. We compare the most commonly implemented sample quantile definitions by writing them in a common notation and investigating their motivation and some of their properties. We argue that there is a need to adopt a standard definition for sample quantiles so that the same answers are produced by different packages and within each package. We conclude by recommending that the median-unbiased estimator is used since it has most of the desirable properties of a quantile estimator and can be defined independently of the underlying distribution.

Keywords: sample quantiles, percentiles, quartiles, statistical computer packages.

R code: The quantile() function in R from version 2.0.0 onwards implements all the methods in this paper.
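
For readers without R to hand, the following is a minimal Python sketch of the median-unbiased definition recommended in the abstract, which corresponds to type = 8 in R's quantile(). The function name quantile_type8 is hypothetical, and the formula (plotting position p_k = (k - 1/3)/(n + 1/3), linear interpolation between adjacent order statistics) follows the R documentation rather than anything stated on this page. It is an illustration only, not the R implementation.

```python
import math


def quantile_type8(data, p):
    """Median-unbiased sample quantile (R's quantile(x, type = 8)).

    Hypothetical helper for illustration; assumes 0 <= p <= 1 and
    a non-empty sequence of numbers.
    """
    if not data:
        raise ValueError("data must be non-empty")
    if not 0 <= p <= 1:
        raise ValueError("p must lie in [0, 1]")

    x = sorted(data)
    n = len(x)

    # Position on the 1-based order-statistic scale: n*p + m, with m = (p + 1)/3.
    h = n * p + (p + 1) / 3
    j = math.floor(h)   # index of the lower order statistic
    g = h - j           # interpolation weight toward the upper one

    # Clamp indices so extreme p values fall back to the sample minimum/maximum.
    lo = min(max(j, 1), n)
    hi = min(max(j + 1, 1), n)

    return (1 - g) * x[lo - 1] + g * x[hi - 1]


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5]
    for prob in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(prob, quantile_type8(sample, prob))
```

In R, the comparable call would be quantile(x, probs = p, type = 8). Small floating-point differences are possible, since R applies a tolerance when computing the index.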

Errata:

  • Table 1, p361. P2 should have lower bound equal to ⌊np⌋.
  • p363, left column. P2 is satisfied if and only if α≥0 and β≤1.

Thanks to Eric Langford and Alan Dorfman for pointing out the errors. 8 May 2007.

Article on JSTOR

 