CILVR Seminar: Toward Improved Sample Efficiency: From Active Learning to Language Modeling

Speaker: Jordan Ash

Location: 60 Fifth Avenue, 7th floor open space
Videoconference link: https://nyu.zoom.us/j/98889023391

Date: Wednesday, December 3, 2025

This talk is about exploration as a means of improving sample efficiency. I’ll first discuss VeSSAL, a powerful and flexible batch active learning algorithm designed for the streaming setting. VeSSAL takes advantage of model representations to prioritize the labeling of data most productive for model fitting. I’ll then show how its underlying principles—motivated by foundational approaches to sequential decision making—can be used to improve the sample efficiency of language models in both post-training and inference. I’ll argue that this classical, representation-based perspective offers promising insights for constructing more efficient contemporary learning systems.

Bio: Jordan Ash is a principal researcher at Microsoft Research in New York City. His research focuses on uncertainty quantification and sequential decision-making methods for deep learning and generative modeling. He earned his PhD in Computer Science from Princeton University in 2020, where he was advised by Ryan P. Adams.