Searching for the Implicit Bias of Deep Learning
Time and Location: March 02, 2023 at 3:30PM; 60 Fifth Avenue, Room 7th Floor
Speaker: Matus Telgarsky, University of Illinois, Urbana-Champaign
What makes deep learning special, and why is it effective in so many settings where other models fail? This talk will present recent progress from three perspectives. The first is approximation-theoretic: deep networks can easily represent phenomena that would require exponentially large shallow networks, decision trees, and other classical models. The second is statistical: their generalization ability, namely their ability to perform well on unseen test data, is correlated with their prediction margins, a classical notion of confidence. Finally, comprising the majority of the talk, I will discuss the interaction of the preceding two perspectives with optimization: specifically, how standard descent methods are implicitly biased towards models with good generalization. Here I will present two approaches: the strong implicit bias, which studies convergence to specific well-structured objects, and the weak implicit bias, which merely ensures that certain good properties eventually hold but admits a more flexible proof technique.
Matus Telgarsky is an assistant professor at the University of Illinois, Urbana-Champaign, specializing in deep learning theory. He was fortunate to receive a PhD at UCSD under Sanjoy Dasgupta. Other highlights include co-founding the Midwest ML Symposium (MMLS) with Po-Ling Loh in 2017; receiving a 2018 NSF CAREER award; and organizing two Simons Institute programs, one on deep learning theory (summer 2019) and one on generalization (fall 2024). Matus is very happy to be visiting NYC again, where he pursued undergraduate studies in Violin Performance at The Juilliard School.