A unified momentum-based paradigm of decentralized SGD for non-convex models and heterogeneous data
Speaker: Du Haizhou (杜海舟)
Venue: Tencent Meeting, ID: 8898648458, Password: 115119
Time: Tuesday, December 17, 2024, 09:00-10:00
Host: Xu Dongpo (徐东坡)
Abstract:
Emerging distributed applications have recently boosted the development of decentralized machine learning, especially in AIoT and edge computing. In real-world scenarios, non-convexity and data heterogeneity are common problems that cause inefficiency, performance degradation, and stalled progress. Most studies address only one of these issues and lack a more general framework that has been proven optimal. To this end, we propose a unified paradigm called UMP, which comprises two algorithms, D-SUM and GT-DSUM, that combine the momentum technique with decentralized stochastic gradient descent (SGD). The former provides a convergence guarantee for general non-convex objectives, while the latter extends it with gradient tracking, which estimates the global optimization direction to mitigate data heterogeneity (i.e., distribution drift). By choosing different parameters, UMP covers most momentum-based variants built on the classical heavy ball or Nesterov's acceleration. In theory, we rigorously analyze the convergence of both approaches for non-convex objectives; in practice, extensive experiments demonstrate an improvement in model accuracy of up to 57.6% compared to other methods.
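As a rough illustration of the ingredients named in the abstract (decentralized updates, heavy-ball momentum, gradient tracking), the following is a minimal toy sketch. It is not the D-SUM or GT-DSUM algorithm from the talk. It assumes a ring of nodes, scalar quadratic local objectives, and exact rather than stochastic gradients so the run is deterministic; the function names `ring_mixing_matrix`, `mix`, and `run` and all parameter values are hypothetical.

```python
# Toy sketch only (NOT the talk's D-SUM / GT-DSUM): heavy-ball momentum with
# gossip averaging and optional gradient tracking, on scalar quadratics.
# Assumption: exact local gradients are used instead of stochastic ones.

def ring_mixing_matrix(n):
    """Doubly stochastic matrix for a ring (n >= 3): weight 1/3 on self and each neighbour."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def mix(W, v):
    """One gossip round: every node takes a weighted average over its neighbours."""
    n = len(v)
    return [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]


def run(targets, steps=500, lr=0.1, beta=0.5, tracking=True):
    """Node i holds f_i(x) = 0.5 * (x - targets[i])**2, so its gradient is x - targets[i].

    Different targets per node stand in for heterogeneous data.
    """
    n = len(targets)
    W = ring_mixing_matrix(n)
    x = [0.0] * n                               # local model copies
    m = [0.0] * n                               # local heavy-ball momentum buffers
    g = [x[i] - targets[i] for i in range(n)]   # local gradients at the current x
    y = list(g)                                 # gradient trackers, initialised to local gradients
    for _ in range(steps):
        d = y if tracking else g                # direction: tracked estimate or raw local gradient
        m = [beta * m[i] + d[i] for i in range(n)]
        x = mix(W, [x[i] - lr * m[i] for i in range(n)])
        g_new = [x[i] - targets[i] for i in range(n)]
        y_mixed = mix(W, y)
        # Tracker update: mix with neighbours, then add the change in the local gradient.
        y = [y_mixed[i] + g_new[i] - g[i] for i in range(n)]
        g = g_new
    return x


if __name__ == "__main__":
    targets = [0.0, 1.0, 5.0, 10.0]             # heterogeneous local optima; global optimum is their mean
    print("with tracking:   ", run(targets, tracking=True))
    print("without tracking:", run(targets, tracking=False))
```

In this toy setting, the tracked variant should bring every node to the mean of the targets, while the untracked variant with a constant step size leaves the nodes at different values. That gap is the kind of heterogeneity-induced drift that gradient tracking is meant to reduce.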
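Speaker bio:
Du Haizhou (杜海舟), Ph.D., is an associate professor and master's supervisor at Shanghai University of Electric Power, and has made long-term and short-term academic visits to Yale University (USA) and the University of Strathclyde (UK). Dr. Du currently serves on the technical program committees (TPC) of ICML, AAAI, and IJCAI, three CCF Class A international conferences in artificial intelligence, and reviews for the top artificial intelligence journals TMC, KBS, SMC, TNNLS, and AIJ. Dr. Du is also a senior member of the China Computer Federation (CCF), an executive member of the Technical Committee on Distributed Computing and Systems and of the Technical Committee on Data and Networks, a member of the Youth Working Committee of the Chinese Association for Artificial Intelligence, a member of IEEE and ACM, deputy head of the artificial intelligence question-setting group for the Shanghai Information Technology Proficiency Examination, and a review expert for the Shanghai Science and Technology Awards. Dr. Du's research currently focuses on distributed machine learning, big data analytics, knowledge graphs, and network performance optimization. In the past five years, Dr. Du has published more than 50 papers, including 10 in top artificial intelligence venues such as NeurIPS, ECML, and AI and in CAS SCI Zone 1 journals, and more than 20 accepted at international conferences recognized by the CCF as high-level; two received international conference best paper awards. Dr. Du has won the Shanghai Science and Technology Progress Award second prize once and third prize twice, and holds more than ten granted invention patents and software copyrights.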