Understanding Language Models through Discovery and by Design

Speaker: John Hewitt

Location: 60 Fifth Avenue, Room 150

Date: Tuesday, February 20, 2024

Whereas we understand technologies like airplanes or microprocessors well enough to fix them when they break, our tools for fixing modern language models are coarse. This is because, despite language models' increasing ubiquity and utility, we understand little about how they work. In this talk, I will present two lines of research toward a deep, actionable understanding of language models, one that allows us to discover how they work and fix them when they fail. In the first line, I will present structural probing methods for discovering the learned structure of language models, finding evidence that models learn structures such as linguistic syntax. In the second line, I will show how we can understand complex models by design, through the new Backpack neural architecture, which gives us precise tools for fixing models.