Transfer learning enhances model performance by utilizing knowledge from related domains, particularly when labeled data is scarce. While existing research addresses transfer learning under various distribution shifts in independent settings, handling dependencies in networked data remains challenging. To address this challenge, we propose a high-dimensional transfer learning framework based on network convolutional regression (NCR), inspired by the success of graph convolutional networks (GCNs). The NCR model incorporates random network structure by allowing each node’s response to depend on its features and the aggregated features of its neighbors, capturing local dependencies effectively. Our methodology includes a two-step transfer learning algorithm that addresses domain shift between source and target networks, along with a source detection mechanism to identify informative domains. Theoretically, we analyze the lasso estimator in the context of a random graph based on the Erdős–Rényi model assumption, demonstrating that transfer learning improves convergence rates when informative sources are present. Empirical evaluations, including simulations and a real-world application using Sina Weibo data, demonstrate substantial improvements in prediction accuracy, particularly when labeled data in the target domain is limited.
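To make the NCR setup concrete, the following is a minimal, hypothetical sketch (not the paper's implementation): it generates an Erdős–Rényi graph, lets each node's response depend on its own features plus the row-normalized aggregate of its neighbors' features, and fits the augmented design by ordinary least squares; the paper instead uses a lasso estimator in high dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5  # illustrative sizes, not from the paper

# Erdős–Rényi adjacency matrix (symmetric, no self-loops)
A = (rng.random((n, n)) < 0.05).astype(float)
A = np.triu(A, 1)
A = A + A.T

# Row-normalize so each node averages over its neighbors
# (one plausible aggregation choice, assumed here)
deg = A.sum(axis=1, keepdims=True)
W = np.divide(A, deg, out=np.zeros_like(A), where=deg > 0)

X = rng.normal(size=(n, p))
beta_own, beta_nbr = rng.normal(size=p), rng.normal(size=p)

# Response depends on own features and aggregated neighbor features
y = X @ beta_own + (W @ X) @ beta_nbr + 0.1 * rng.normal(size=n)

# Fit on the augmented design [X, WX]; a lasso would replace lstsq
# when p is large relative to n
Z = np.hstack([X, W @ X])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
```

With low noise and many more nodes than coefficients, `coef[:p]` recovers `beta_own` closely; transfer learning enters when a related source network supplies additional observations of the same form.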
Danyang Huang is a professor at the School of Statistics, Renmin University of China, a Wu Yuzhang Young Scholar, and director of the Beijing Consumption Big Data Monitoring Sub-laboratory of the university's National Governance Big Data and Artificial Intelligence Innovation Platform. She has led research projects including a General Program grant of the National Natural Science Foundation of China and a Key Project of the Beijing Social Science Fund, was selected for the Beijing Association for Science and Technology Young Talent Support Program, and has received support from the Beijing Outstanding Talent Training Program. Her research focuses on the theory of complex network models and large-scale data computation, with particular attention to applications of statistical theory in the digital development of small and medium-sized enterprises. She has published more than thirty papers in leading journals including JRSSB, JASA, JOE, and JBES. Her sole-authored monograph, 《大规模网络数据分析与空间自回归模型》 (Large-Scale Network Data Analysis and Spatial Autoregressive Models), appeared on JD.com's statistics bestseller list. She has won multiple provincial- and ministerial-level teaching awards, including a second prize in the Beijing Young University Teachers' Teaching Skills Competition and a Most Popular with Students award.