A mathematical perspective on transformers
Speaker: Philippe Rigollet
Location: 60 Fifth Avenue, Room 150
Date: Thursday, October 26, 2023
In just five years since their introduction, Transformers have revolutionized large language models and the broader field of deep learning. Central to this transformative success is the groundbreaking self-attention mechanism. In this presentation, I'll introduce a mathematical framework that casts this mechanism as a mean-field interacting particle system, revealing a desirable long-time clustering behavior. This perspective leads to a trove of fascinating questions with unexpected connections to Kuramoto oscillators, sphere packing, and Wasserstein gradient flows.