CDS Seminar: Empirical Bayes in Action, and How Pretraining/Transformer Solves It

Speaker: Yanjun Han (NYU)

Location: 60 Fifth Avenue, 7th Floor Open Space

Date: Friday, March 27, 2026

Empirical Bayes (EB) provides a principled framework for estimating the prior distribution directly from the observed data, and for borrowing strength across a large number of related estimation problems. In the first part of this talk, I will give a concrete example of using EB for the classical problem of distribution estimation. This perspective yields an estimator that achieves optimal instance-wise risk in a competitive framework and ultimately bests the Good–Turing estimator, the classical "gold standard", in both theory and practice.
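For readers unfamiliar with the Good–Turing estimator mentioned above, here is a minimal sketch of its best-known ingredient, the missing-mass estimate (the function name and sample data are illustrative, not from the talk):

```python
from collections import Counter

def good_turing_missing_mass(sample):
    """Good-Turing estimate of the total probability of symbols
    never seen in the sample: the fraction of the sample that
    consists of singletons (symbols observed exactly once)."""
    counts = Counter(sample)
    n1 = sum(1 for c in counts.values() if c == 1)  # number of singletons
    return n1 / len(sample)

# "c" and "d" each appear once, so the estimated unseen mass is 2/7.
sample = ["a", "b", "a", "c", "d", "a", "b"]
print(good_turing_missing_mass(sample))
```

The talk's EB estimator is claimed to improve on this classical rule; the sketch only shows the baseline being compared against.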

In the second part, I’ll explain why pretraining offers a modern approach to solving EB problems. We formalize recent empirical evidence that transformers pretrained on synthetic data perform strongly on EB tasks by showing the existence of universal priors under which a pretrained estimator achieves near-optimal regret uniformly over arbitrary test distributions. From this perspective, the pretrained estimator is performing hierarchical Bayesian inference: adaptation to unknown test priors arises through posterior contraction, and length generalization (when the test sequence exceeds the training length) corresponds to inference under a fractional posterior. Numerical experiments with pretrained transformers support these theoretical predictions.
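As a toy instance of the hierarchical Bayesian inference described above, consider a Beta–Bernoulli model, where the posterior predictive adapts to the data as the sequence grows (posterior contraction). This model and all names below are illustrative, not the speaker's construction:

```python
def posterior_predictive(seq, a=1.0, b=1.0):
    """P(next symbol = 1 | seq) under a Beta(a, b) prior on the
    unknown bias; a = b = 1 recovers Laplace's rule of succession."""
    heads = sum(seq)
    return (heads + a) / (len(seq) + a + b)

# With more observations, the prediction contracts toward the
# empirical frequency, regardless of the prior's exact shape.
print(posterior_predictive([1, 1, 0, 1]))          # (3 + 1) / (4 + 2)
print(posterior_predictive([1, 1, 0, 1] * 100))    # close to 3/4
```

A pretrained sequence model that matches such posterior predictives on long inputs would, in the talk's framing, be performing this kind of adaptation implicitly.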
 
Bio: Yanjun Han is an assistant professor of mathematics and data science, jointly appointed in the Courant Institute and CDS at NYU. He is broadly interested in the mathematics of data science, including statistics, online learning and bandits, information theory, and machine learning. Before joining NYU, he received his Ph.D. in electrical engineering from Stanford, and was a Simons postdoctoral fellow at UC Berkeley and a Norbert Wiener fellow at MIT.