Global Convergence of Block Coordinate Descent in Deep Learning

Speaker: Jinshan Zeng (曾锦山)

Venue: Lecture Hall 104, School of Mathematics and Statistics

Time: Monday, April 15, 2019, 09:20-10:00


Abstract:

Deep learning has attracted extensive attention due to its great empirical success. The efficiency of block coordinate descent (BCD) methods in deep neural network (DNN) training has recently been demonstrated, but theoretical studies of their convergence properties remain limited because of the highly nonconvex nature of DNN training. In this paper, we provide a general methodology for establishing provable convergence guarantees for this family of methods. In particular, for most commonly used DNN training models, involving both two- and three-splitting schemes, we establish global convergence to a critical point at an O(1/k) rate, where k is the number of iterations. The results extend to general loss functions with Lipschitz continuous gradients and to deep residual networks (ResNets). Our key development adds several new elements to the Kurdyka-Łojasiewicz inequality framework, which enable us to carry out a global convergence analysis of BCD in the general scenario of deep learning.
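As a concrete illustration of the splitting idea mentioned in the abstract, the following is a minimal numerical sketch (not the authors' implementation) of BCD under a three-splitting reformulation for a one-hidden-layer ReLU network with squared loss. The penalty weight gamma, the toy data, and the use of exact block minimization (rather than the proximal and prox-linear updates analyzed in the paper) are all illustrative assumptions.

# Minimal BCD sketch, assuming the three-splitting penalized objective
#   F = 1/2 ||W2 V1 - Y||^2 + gamma/2 ||V1 - relu(U1)||^2 + gamma/2 ||U1 - W1 X||^2
# with auxiliary variables U1 ~ W1 X (pre-activation) and V1 ~ relu(U1) (activation).
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: X is (d, m), Y is (c, m). Sizes are arbitrary.
d, h, c, m = 5, 8, 2, 100
X = rng.standard_normal((d, m))
Y = np.maximum(rng.standard_normal((c, d)) @ X, 0.0)

gamma = 1.0  # quadratic-penalty weight coupling the splitting variables (assumed)
relu = lambda t: np.maximum(t, 0.0)

# Initialize all blocks.
W1 = rng.standard_normal((h, d)) * 0.1
U1 = W1 @ X
V1 = relu(U1)
W2 = rng.standard_normal((c, h)) * 0.1

def relu_prox(v, a):
    """Elementwise minimizer of (v - relu(u))**2 + (u - a)**2 over u."""
    u_pos = np.maximum((v + a) / 2.0, 0.0)   # candidate on the branch u >= 0
    u_neg = np.minimum(a, 0.0)               # candidate on the branch u < 0 (relu(u) = 0)
    f_pos = (v - relu(u_pos))**2 + (u_pos - a)**2
    f_neg = v**2 + (u_neg - a)**2
    return np.where(f_pos <= f_neg, u_pos, u_neg)

def objective():
    return (0.5 * np.sum((W2 @ V1 - Y)**2)
            + 0.5 * gamma * np.sum((V1 - relu(U1))**2)
            + 0.5 * gamma * np.sum((U1 - W1 @ X)**2))

for k in range(50):
    # V1-block: strongly convex least squares, solved exactly.
    A = W2.T @ W2 + gamma * np.eye(h)
    V1 = np.linalg.solve(A, W2.T @ Y + gamma * relu(U1))
    # U1-block: separable nonconvex subproblem with an elementwise closed form.
    U1 = relu_prox(V1, W1 @ X)
    # W-blocks: plain least-squares fits given the auxiliary variables.
    W1 = np.linalg.lstsq(X.T, U1.T, rcond=None)[0].T
    W2 = np.linalg.lstsq(V1.T, Y.T, rcond=None)[0].T
    if k % 10 == 0:
        print(f"iter {k:3d}  penalized objective = {objective():.4f}")

Because every block is minimized exactly here, the penalized objective is monotonically nonincreasing; the convergence theory described in the abstract adds proximal terms and the Kurdyka-Łojasiewicz machinery to guarantee global convergence of the iterates themselves.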


About the speaker:

Jinshan Zeng is a distinguished professor and supervisor of master's students in the School of Computer and Information Engineering at Jiangxi Normal University. He received his Ph.D. from the Department of Mathematics at Xi'an Jiaotong University in 2015, visited the Department of Mathematics at the University of California, Los Angeles from November 2013 to November 2014, and visited the Department of Mathematics at the Hong Kong University of Science and Technology from April 2017 to March 2018 and from August 2018 to February 2019. He has published more than thirty SCI-indexed papers, including nearly ten in IEEE Transactions journals; one of his papers received the Best Paper Award at the 2018 International Congress of Chinese Mathematicians. His papers have been cited more than 560 times over the past five years, with a single paper cited up to 230 times. He is the principal investigator of one National Natural Science Foundation of China grant and a participant in several others. He serves as a Review Editor for Frontiers in Applied Mathematics and Statistics and as a reviewer for several mainstream international journals, and served on the program committee of the IEEE CYBER 2017 conference. His research interests include nonconvex optimization, distributed optimization, and machine learning.
