Homepage of Levent Sagun

As of October 2017, I am a postdoctoral fellow jointly at ENS Paris and CEA-Saclay in Paris, and at EPFL in Lausanne. The position is part of the Simons Collaboration on Cracking the Glass Problem. In May 2017, I finished my PhD at the Courant Institute of Mathematical Sciences, and over the summer of 2016 I did a research internship at FAIR.

Here are my CV and Google Scholar page. Code for some of the projects below is available at [GitHub]


Email: levent [dot] sagun [at] epfl [dot] ch

Research Interests

Probability, statistical mechanics, and deep learning from the energy landscape point of view. Applications of machine learning in the social sciences.

Publications


Explorations on high dimensional landscapes [arXiv]
Levent Sagun, V. Ugur Guney, Gerard Ben Arous, Yann LeCun
ICLR 2015 Workshop Poster

Universal halting times in optimization and machine learning [AMS] [arXiv]
Levent Sagun, Thomas Trogdon, Yann LeCun
Quart. Appl. Math. 76 (2018), 289-301
ICML 2016 Optimization Workshop

Early Predictability of Asylum Court Decisions [SSRN] [ICAIL]
Matthew Dunn, Levent Sagun, Hale Sirin, Daniel Chen
ICAIL 2017, Proceedings of the 16th edition of the International Conference on Artificial Intelligence and Law, pages 233-236

Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond [arXiv] [OpenReview]
Levent Sagun, Leon Bottou, Yann LeCun
Preprint, 2016

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys [arXiv] [OpenReview]
Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann LeCun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina
ICLR 2017 Conference Paper

SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine [arXiv][data]
Matthew Dunn, Levent Sagun, Mike Higgins, Ugur Guney, Volkan Cirik, Kyunghyun Cho
Preprint, 2017

Perspective: Energy Landscapes for Machine Learning [arXiv] [PCCP]
Andrew J. Ballard, Ritankar Das, Stefano Martiniani, Dhagash Mehta, Levent Sagun, Jacob D. Stevenson, David J. Wales
Physical Chemistry Chemical Physics, 19, 12585-12603, 2017

Empirical Analysis of the Hessian of Over-Parametrized Neural Networks [arXiv] [OpenReview]
Levent Sagun, Utku Evci, Ugur Guney, Yann Dauphin, Leon Bottou
ICLR 2018 Workshop Poster

Comparing Dynamics: Deep Neural Networks versus Glassy Systems [arXiv]
Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann LeCun, Matthieu Wyart, Giulio Biroli
ICML 2018 Conference Paper

Easing non-convex optimization with neural networks [OpenReview]
David Lopez-Paz, Levent Sagun
ICLR 2018 Workshop Poster

The jamming transition as a paradigm to understand the loss landscape of deep neural networks [arXiv]
Mario Geiger, Stefano Spigler, Stephane d'Ascoli, Levent Sagun, Marco Baity-Jesi, Giulio Biroli, Matthieu Wyart
Preprint, 2018

A jamming transition from under- to over-parametrization affects loss landscape and generalization [arXiv]
Stefano Spigler, Mario Geiger, Stephane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart
Integration of Deep Learning Theories, NeurIPS Workshop 2018

Scaling description of generalization with number of parameters in deep learning [arXiv]
Mario Geiger, Arthur Jacot, Stefano Spigler, Franck Gabriel, Levent Sagun, Stephane d'Ascoli, Giulio Biroli, Clement Hongler, Matthieu Wyart
Preprint, 2019

A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks [arXiv]
Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban
Preprint, 2019

Teaching


Statistical and Mathematical Methods, Center for Data Science at NYU [fall 2015, fall 2016]
Machine Learning, Center for Data Science at NYU [spring 2016]
Theory of Probability, Courant Institute [fall 2016, fall 2014]
Probability and Statistics, Courant Institute [spring 2015]
Introduction to Mathematical Analysis, Courant Institute [spring 2014]
Written Exam Workshop, Courant Institute [fall 2013]


A short Twitter thread on duplicate images in CIFAR-100 with different labels:

1 - Rethinking about "Understanding deep learning requires rethinking generalization" @beenwrekt and colleagues after some curious results of ResNets on CIFAR100 that required measuring the training accuracy: It was never 100%!

— Levent Sagun (@leventsagun) September 16, 2018

Here are the index pairs of the duplicates, using the default ordering of torchvision.datasets.CIFAR100 (training set only; I didn't check the test set for duplicates):

[[4348, 30931], [8393, 36874], [8599, 22657], [9012, 31128], [16646, 31828], [17688, 41874], [18461, 46752], [20635, 32666], [23947, 33638], [25218, 46851], [27737, 47636], [28293, 41860], [30418, 47806], [31227, 34187]]
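Exact duplicates like these can be found by bucketing images on their raw bytes. A minimal sketch (the function name `find_duplicate_pairs` is mine, not from the original analysis; the toy array below stands in for the real image tensor):

```python
from collections import defaultdict

import numpy as np


def find_duplicate_pairs(images):
    """Return sorted index pairs of byte-identical images.

    Images are grouped by their raw bytes, so only exact duplicates
    (same pixels, same dtype) are reported -- no near-duplicate detection.
    """
    buckets = defaultdict(list)
    for idx, img in enumerate(images):
        buckets[np.asarray(img).tobytes()].append(idx)
    pairs = []
    for indices in buckets.values():
        # Emit every pair within a bucket (most buckets hold one image).
        for i in range(len(indices)):
            for j in range(i + 1, len(indices)):
                pairs.append([indices[i], indices[j]])
    return sorted(pairs)


# Toy demonstration: images 0 and 2 are identical, image 1 differs.
toy = np.stack([np.zeros((2, 2)), np.ones((2, 2)), np.zeros((2, 2))])
print(find_duplicate_pairs(toy))  # → [[0, 2]]
```

On the real dataset one would load `train = torchvision.datasets.CIFAR100(root='.', train=True, download=True)` and pass `train.data` (a uint8 array of shape (50000, 32, 32, 3)) to the function; whether a duplicate pair carries conflicting labels can then be checked against `train.targets`.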