MiC Seminar: Towards understanding sharpness-aware minimization

Speaker: Nicolas Flammarion

Location: 60 Fifth Avenue, Room 150

Date: Monday, November 28, 2022

Sharpness-Aware Minimization (SAM) is a recent training method that relies on worst-case weight perturbations and significantly improves generalization in various settings. In this talk, we theoretically analyze its implicit bias for diagonal linear networks. We prove that, for a certain class of problems, SAM always chooses a solution with better generalization properties than standard gradient descent, and that this effect is amplified when using m-sharpness. We further study the properties of the implicit bias on non-linear networks empirically. Finally, we provide convergence results for SAM on non-convex objectives when used with stochastic gradients. We illustrate these results empirically for deep networks and discuss their relation to the generalization behavior of SAM.
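As background for the talk, the SAM update can be sketched in a few lines: take a first-order worst-case (ascent) perturbation of the weights within a small ball, then descend using the gradient evaluated at the perturbed point. The sketch below is a minimal illustration on a toy least-squares problem, assuming an L2 perturbation ball of radius `rho`; the function names and hyperparameter values are illustrative, not from the talk.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM update for weights w given a gradient oracle grad_fn."""
    # Ascent step: first-order worst-case perturbation within an L2 ball of radius rho.
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descent step: use the gradient evaluated at the perturbed weights.
    return w - lr * grad_fn(w + eps)

# Toy least-squares problem: L(w) = 0.5 * ||X w - y||^2.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5)) / np.sqrt(50)  # roughly well-conditioned design
w_star = rng.standard_normal(5)
y = X @ w_star
grad = lambda w: X.T @ (X @ w - y)

w = np.zeros(5)
for _ in range(500):
    w = sam_step(w, grad)
```

On this convex toy problem SAM behaves much like gradient descent; the differences in implicit bias analyzed in the talk arise in overparameterized settings such as diagonal linear networks.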