Analyses of Neural Network Training Dynamics: Blessings of Width, Curses of Depth

Speaker: Lenaic Chizat

Location: 60 Fifth Avenue, Room 150

Date: Friday, May 26, 2023

In this talk, I will present several results on the large-width limit of gradient-based algorithms for training artificial neural networks (NNs). I will first review the case of two-layer NNs, where this asymptotic regime has led to new insights, including connections to mean-field analyses of interacting particle systems and a rich qualitative understanding of various training regimes. Next, I will turn to the case of deeper NNs, where the resulting non-linear dynamics appear much less tractable, notably because of the intricate way the random matrices of the initialization interact during training. I will elaborate on the case of deep linear NNs, where our joint work with M. Colombo, X. Fernández-Real and A. Figalli has resulted in a complete description of the dynamics (https://arxiv.org/abs/2211.16980). I will argue that this simplified scenario illustrates the structure of the randomness in the general case, serving as a valuable toy model for developing intuition.
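As a small illustration of the deep linear setting discussed above (a sketch, not the speaker's code): a deep linear NN composes only matrix multiplications, so gradient descent on the end-to-end map can be written in a few lines. The depth, width, learning rate, target matrix, and initialization scale below are all illustrative choices.

```python
import numpy as np

# Toy sketch: gradient descent on a deep *linear* network
# f(x) = W_L ... W_1 x, fitted to a random target matrix A.
rng = np.random.default_rng(0)
d, depth, width, lr, steps = 4, 3, 32, 0.05, 500
A = rng.standard_normal((d, d))          # target linear map

# Gaussian initialization scaled by 1/sqrt(fan-in), mirroring standard
# width-dependent initialization schemes.
dims = [d] + [width] * (depth - 1) + [d]
Ws = [rng.standard_normal((dims[k + 1], dims[k])) / np.sqrt(dims[k])
      for k in range(depth)]

def loss(Ws):
    P = np.eye(d)
    for W in Ws:
        P = W @ P                        # end-to-end map W_L ... W_1
    return 0.5 * np.linalg.norm(P - A) ** 2

init_loss = loss(Ws)
for _ in range(steps):
    # prefix[k] = W_k ... W_1 (with prefix[0] = identity)
    prefix = [np.eye(d)]
    for W in Ws:
        prefix.append(W @ prefix[-1])
    E = prefix[-1] - A                   # residual of the end-to-end map
    # Gradient w.r.t. W_k is (W_L...W_{k+1})^T E (W_{k-1}...W_1)^T;
    # accumulate the suffix product W_L ... W_{k+1} from the top down.
    suffix = np.eye(d)
    grads = [None] * depth
    for k in reversed(range(depth)):
        grads[k] = suffix.T @ E @ prefix[k].T
        suffix = suffix @ Ws[k]
    for k in range(depth):
        Ws[k] -= lr * grads[k]

final_loss = loss(Ws)
print(f"loss: {init_loss:.3f} -> {final_loss:.6f}")
```

Even in this linear toy model, the gradient of each factor couples all the random initialization matrices above and below it, which hints at why the finite-width dynamics of deep networks is hard to analyze.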