This paper demonstrates that the performance of outlier detection methods depends sensitively on both the data normalization scheme employed and the characteristics of the dataset. Recasting the challenge of understanding these dependencies as an algorithm selection problem, we perform the first instance space analysis of outlier detection methods. This analysis enables the strengths and weaknesses of unsupervised outlier detection methods to be visualized, and yields insights into which method and normalization scheme is most likely to perform best on a given dataset.
On normalization and algorithm selection for unsupervised outlier detection
Sevvandi Kandanaarachchi, Mario A Muñoz, Rob J Hyndman and Kate Smith-Miles
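
The sensitivity to normalization described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy dataset, the choice of min-max and median-IQR scaling as the two schemes, and the nearest-neighbour distance used as the outlier score. It only shows that the same detector can flag a different point as the top outlier once the normalization changes.

```python
import math
import statistics


def min_max(column):
    """Scale a feature to [0, 1] using its minimum and maximum."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]


def median_iqr(column):
    """Centre a feature on its median and scale by its interquartile range."""
    q1, _, q3 = statistics.quantiles(column, n=4)
    med = statistics.median(column)
    return [(v - med) / (q3 - q1) for v in column]


def normalize(points, scheme):
    """Apply a per-feature normalization scheme to a list of points."""
    columns = [scheme(list(col)) for col in zip(*points)]
    return list(zip(*columns))


def nn_scores(points):
    """Outlier score: Euclidean distance to the nearest other point."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def top_outlier(points, scheme):
    """Index of the highest-scoring point after normalization."""
    scores = nn_scores(normalize(points, scheme))
    return max(range(len(points)), key=scores.__getitem__)


# Hypothetical toy data: a bulk of nine points, then
#   index 9:  isolated along the first feature,
#   index 10: isolated along the second feature,
#   indices 11-12: a tight distant pair that stretches the second feature's range.
points = [(float(i), 0.1 * i) for i in range(9)]
points += [(12.0, 0.4), (4.0, 2.4), (4.0, 10.0), (4.2, 10.0)]

print("min-max    top outlier:", top_outlier(points, min_max))     # 9
print("median-IQR top outlier:", top_outlier(points, median_iqr))  # 10
```

In this toy case the distant pair inflates the range of the second feature but has less effect on its interquartile range, so median-IQR scaling gives that feature more weight than min-max scaling does, and the top-ranked point changes.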