Hidden Traveling Waves bind Working Memory Variables in Recurrent Neural Networks
Posted: March 20, 2024, 12:37

Speaker: Arjun Karuvally

Venue: Zoom Meeting ID: 89725288701, Password: 240327

Time: Wednesday, March 27, 2024, 08:30-09:30

Hosts: 高忆先, 祖建

Abstract:

Traveling waves are a fundamental phenomenon in the brain and play a crucial role in short-term information storage. In this work, we leverage the concept of traveling wave dynamics within a neural lattice to formulate a theoretical model of neural working memory, study its properties, and examine its real-world implications for practical neural networks. We first investigate the model’s capabilities in representing and learning state histories, which are vital for learning in history-dependent dynamical systems. The findings reveal that the wave memory stores external information and enhances the learning process by addressing the diminishing-gradient problem. To understand the model’s real-world applicability, we explore two cases of the wave theory: a linear boundary condition and a non-linear, self-attention-driven boundary condition. Experiments reveal that the linear case emerges in Recurrent Neural Networks (RNNs) trained on history-dependent dynamical systems via the backpropagation algorithm, showing that RNNs leverage traveling waves as a working-memory store. Conversely, the non-linear scenario parallels the autoregressive loop of an attention-only transformer. Collectively, our findings suggest the broader relevance of traveling waves in AI and their potential for advancing our understanding and improving neural network architectures.
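
To make the wave picture concrete, here is a minimal NumPy sketch of a traveling-wave memory on a 1-D lattice with the two boundary conditions the abstract describes. The shift-register formulation, the softmax readout, and the function names are illustrative assumptions, not the talk’s exact equations.

```python
import numpy as np

# Minimal sketch of a traveling-wave working memory on a 1-D neural lattice.
# The shift-register dynamics and softmax readout are illustrative
# assumptions, not the exact model presented in the talk.

def linear_step(lattice, external_input):
    """Linear boundary condition: the wave shifts one site per step and
    the boundary injects new external information at site 0."""
    new = np.empty_like(lattice)
    new[1:] = lattice[:-1]        # wave travels one site per time step
    new[0] = external_input       # boundary writes the incoming symbol
    return new

def nonlinear_step(lattice, query):
    """Non-linear boundary condition: the boundary value is generated from
    the wave's own contents via a softmax readout, loosely analogous to the
    autoregressive loop of an attention-only transformer (this readout is a
    hypothetical stand-in)."""
    attn = np.exp(query * lattice)
    attn /= attn.sum()            # attention weights over lattice sites
    new = np.empty_like(lattice)
    new[1:] = lattice[:-1]
    new[0] = attn @ lattice       # next value is read out of memory itself
    return new

# Storing a short input history in the wave (linear case):
lattice = np.zeros(8)
for x in [0.3, -1.0, 0.7, 0.2]:
    lattice = linear_step(lattice, x)
print(lattice)  # [ 0.2  0.7 -1.  0.3  0.  0.  0.  0. ]
```

In the linear case the lattice literally holds the recent input history laid out along the wave, most recent symbol first; in the non-linear case the boundary regenerates its own next input from the stored contents, which is the autoregressive behavior the abstract likens to an attention-only transformer.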

Speaker Biography:

I am a PhD candidate in Computer Science at the University of Massachusetts Amherst. Through my research, I develop models of memory to analyze and improve neural network systems. My contributions include a new model of sequence memory that generalizes the Hopfield energy paradigm to the temporal setting, published at the International Conference on Machine Learning (ICML) 2023. Recently, I have been interested in developing working-memory models to study how practical neural networks process information, enabling researchers to open the black box of neural systems. My contributions from this line of research include a theory that uses linear subspaces to store recent history, published as “Episodic Memory Theory for the Mechanistic Interpretation of Recurrent Neural Networks” at the Topology, Algebra and Geometry in ML workshop 2023, and a mechanism to …
