The box-and-whisker plot, introduced by Tukey (1977), is one of the most popular graphical methods in descriptive statistics. However, the fences of Tukey’s boxplot do not depend on the sample size, yielding the so-called “one-size-fits-all” fences for outlier detection. Although sample-size-adjusted boxplots have been proposed in the literature, most of them are either not easy to implement or lack justification. As another common rule for outlier detection, Chauvenet’s criterion uses the sample mean and standard deviation to perform the test, but these statistics are themselves sensitive to the outliers present in the data, so the criterion is not robust. In this paper, by combining Tukey’s boxplot and Chauvenet’s criterion, we introduce a new boxplot, namely the Chauvenet-type boxplot, with the fence coefficient determined by an exact control of the outside rate per observation. Our new outlier criterion not only maintains the simplicity of the boxplot from a practical perspective, but also serves as a robust version of Chauvenet’s criterion. A simulation study and a real data analysis on the civil service pay adjustment in Hong Kong demonstrate that the Chauvenet-type boxplot performs extremely well regardless of the sample size, and can therefore be highly recommended for practical use to replace both Tukey’s boxplot and Chauvenet’s criterion. Lastly, to increase the visibility of the work, a user-friendly R package named ‘ChauBoxplot’ has also been officially released on CRAN.
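For readers unfamiliar with the two classical rules mentioned in the abstract, the following is a minimal Python sketch of each, using only the standard library. It is an illustration under stated assumptions, not the method of the paper: the function names `tukey_outliers` and `chauvenet_outliers` are hypothetical, the quartiles use Python's "inclusive" convention (other software may compute them differently), and Chauvenet's criterion is taken in its textbook form, which flags an observation when the expected number of equally extreme values under a fitted normal distribution is below 0.5. The Chauvenet-type boxplot itself is not implemented here, since the abstract does not give its fence coefficient.

```python
from statistics import NormalDist, mean, quantiles, stdev


def tukey_outliers(data, k=1.5):
    """Flag points outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR].

    The coefficient k is fixed (1.5 by convention) and does not
    depend on the sample size.
    """
    # Assumption: "inclusive" quartile convention; other conventions exist.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


def chauvenet_outliers(data):
    """Textbook Chauvenet's criterion.

    Flag x when n * P(|Z| >= |x - mean| / sd) < 0.5, with Z standard
    normal. The mean and sd are computed from all points, including
    any outliers, which is the source of the non-robustness.
    """
    n = len(data)
    m, s = mean(data), stdev(data)
    std_normal = NormalDist()
    flagged = []
    for x in data:
        z = abs(x - m) / s
        two_sided_tail = 2.0 * (1.0 - std_normal.cdf(z))
        if n * two_sided_tail < 0.5:
            flagged.append(x)
    return flagged


if __name__ == "__main__":
    # Made-up illustrative sample with one gross outlier.
    sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 25.0]
    print("Tukey:", tukey_outliers(sample))
    print("Chauvenet:", chauvenet_outliers(sample))
```

Note that the sample size enters only the Chauvenet rule, through `n`, while the Tukey rule uses the same `k` for every sample; the abstract describes the new boxplot as combining the two.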
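Tiejun Tong is a Professor in the Department of Mathematics at Hong Kong Baptist University. He received his PhD from the University of California, Santa Barbara in 2005, conducted postdoctoral research at Yale University from 2005 to 2007, and served as an Assistant Professor at the University of Colorado Boulder from 2007 to 2010. Since 2010 he has been with the Department of Mathematics at Hong Kong Baptist University. His main research interests include nonparametric regression models, high-dimensional data analysis, meta-analysis, and evidence-based medicine. He has published more than 100 papers in leading international journals, including JASA, Biometrika, Statistical Science, JMLR, and Nature Communications.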