Searching for the Implicit Bias of Deep Learning

Speaker: Matus Telgarsky

Location: 60 Fifth Avenue, Room C15

Date: Thursday, March 2, 2023

What makes deep learning special? Why is it effective in so many settings where other models fail? This talk will present recent progress from three perspectives. The first result is approximation-theoretic: deep networks can compactly represent functions that require exponential size in shallow networks, decision trees, and other classical models. Second, I will show that their statistical generalization ability, that is, their ability to perform well on unseen test data, correlates with their prediction margins, a classical notion of confidence. Finally, in the majority of the talk, I will discuss how the preceding two perspectives interact with optimization: specifically, how standard descent methods are implicitly biased towards models with good generalization. Here I will present two approaches: the strong implicit bias, which studies convergence to specific well-structured objects, and the weak implicit bias, which merely ensures that certain good properties eventually hold, but admits a more flexible proof technique.
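
As a concrete instance of the first perspective, one well-known depth-separation construction uses the triangle map; this is a sketch for intuition and may not be the exact example used in the talk. With ReLU activation $\sigma(z) = \max\{0, z\}$,

\[
\Delta(x) \;=\; 2\min\{x,\ 1-x\} \;=\; 2\,\sigma(x) \;-\; 4\,\sigma\!\left(x - \tfrac{1}{2}\right)
\quad \text{on } [0,1],
\]

so the $k$-fold composition $\Delta^{k} = \Delta \circ \cdots \circ \Delta$ is computed exactly by a ReLU network of depth $O(k)$ with $O(k)$ nodes, yet it oscillates with $2^{k-1}$ peaks, and matching all of these oscillations forces any fixed-depth shallow network or decision tree to have size exponential in $k$.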
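For the second perspective, the margin notion can be formalized as follows; this is one standard normalization, and the talk's precise definition may differ. Given a network $f_W$ and examples $(x_i, y_i)$ with $y_i \in \{-1, +1\}$, the normalized minimum margin is

\[
\bar\gamma(W) \;=\; \min_{1 \le i \le n}\ \frac{y_i\, f_W(x_i)}{\rho(W)},
\]

where $\rho(W)$ is a scale normalizer, for instance a product of layer norms. Classical margin-based generalization bounds scale inversely with $\bar\gamma$, so larger normalized margins predict better test performance.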
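The strong implicit bias has a clean classical instance in the linear case, included here as an illustration rather than as the talk's general statement: on linearly separable data, gradient descent on the logistic loss diverges in norm but converges in direction to the maximum-margin (hard-margin SVM) separator (Soudry et al., 2018). With iterates $w_t$,

\[
\mathcal{L}(w) \;=\; \frac{1}{n}\sum_{i=1}^{n} \log\!\big(1 + \exp(-y_i \langle w, x_i\rangle)\big),
\qquad
\frac{w_t}{\lVert w_t\rVert} \;\longrightarrow\; \operatorname*{arg\,max}_{\lVert u\rVert \le 1}\ \min_{1 \le i \le n}\ y_i \langle u, x_i\rangle .
\]

So even though nothing in the objective mentions margins, the optimizer itself selects the well-structured maximum-margin solution; the talk's strong/weak distinction concerns how far such statements extend beyond this linear setting.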