Graphical models are popular tools for exploring relationships among a set of variables. The Gaussian graphical model (GGM) is an important class of graphical models in which variables are represented by nodes and the conditional dependence among them by the edges of a graph. In many real applications, we are interested in detecting hubs in graphical models, that is, nodes with a significantly higher degree of connectivity than non-hub nodes. A typical strategy for hub detection is to estimate the graphical model and then use the estimated graph to identify hubs. Despite its simplicity, the success of this strategy relies on the accuracy of the estimated graph. In this paper, we directly target the estimation of hubs, without the need to estimate the graph. We establish a novel connection between the presence of hubs in a graphical model and the spectral decomposition of the underlying covariance matrix. Based on this connection, we propose the method of inverse principal components for hub detection (IPC-HD). Both consistency and convergence rates are established for IPC-HD. Our simulation study demonstrates that, for hub detection, the proposed method outperforms existing methods in the literature and is fast to compute. Our application to a prostate cancer gene expression dataset detects several hub genes with close connections to tumor development.
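Zhao Junlong is a professor at the School of Statistics, Beijing Normal University. Zhao's research covers mathematical statistics and machine learning, including high-dimensional data analysis, statistical machine learning, and robust statistics. Zhao has published more than sixty papers in statistics journals, with some results appearing in leading international statistics journals such as JRSSB, AOS, JASA, Biometrika, and JBES, and has led several projects funded by the National Natural Science Foundation of China.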