MIC Seminar: Precise Asymptotic Analysis of Sobolev Training for Random Feature Models
Speaker: Kate Fisher
Location: 60 Fifth Avenue
Date: Monday, November 10, 2025
Gradient data is widely relevant in science and engineering, so it is natural to use it when training neural networks. However, it remains unclear at a theoretical level how Sobolev training, the augmentation of regression loss functions with gradient data, performs in high dimensions and when the model is highly overparameterized. Studies of training with squared loss on function data have demonstrated that overparameterized networks can memorize noisy data yet outperform their underparameterized counterparts when generalizing to new test cases. This talk presents a precise asymptotic characterization of a random feature (RF) model under Sobolev training. Our analysis is carried out in the proportional asymptotics limit, where the number of trainable parameters, the input dimension, and the number of training data points jointly tend to infinity. Because this limit implies an infinite spatial dimension, we sketch the gradients onto a finite-dimensional subspace in the training loss, a technique consistent with practical applications of Sobolev training. By combining the replica method from statistical physics with linearization strategies from operator-valued free probability, we obtain a finite-dimensional, efficiently solvable description of the generalization capabilities of the RF network, which allows us to identify the conditions under which gradient data improve generalization.
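As an illustrative sketch (not taken from the abstract), a Sobolev training objective of the kind described, with gradient observations sketched onto a finite-dimensional subspace, might take the form

\mathcal{L}(\theta) = \frac{1}{n}\sum_{i=1}^{n}\left[\big(f_\theta(x_i) - y_i\big)^2 + \lambda\,\big\|S^\top\big(\nabla f_\theta(x_i) - g_i\big)\big\|_2^2\right],

where y_i and g_i denote the observed function values and gradients, S is a sketching matrix onto the finite-dimensional subspace, and \lambda weights the gradient term; this notation is assumed for illustration and is not the speaker's.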
Bio: Kate Fisher is a graduate student at the Center for Computational Science and Engineering at MIT, working with Youssef Marzouk. Her research explores the theory of generalization and uncertainty in machine learning, as well as the practical construction of uncertainty-aware surrogates for multi-scale materials simulation.