CDS Seminar: Information constrained emergent communication
Speaker: Noga Zaslavsky
Location: 60 Fifth Avenue, Room 150
Date: Friday, April 25, 2025
While Large Language Models (LLMs) are transforming AI, they are also limited in important ways that are unlikely to be resolved with more data or compute. For example, LLMs require massive amounts of training data that do not exist for many languages, they are not grounded in the way humans perceive and act in the world, and they offer little insight into the origins of language and how languages evolve over time. In this talk, I propose a complementary approach. Specifically, I address the question: How can a human-like lexicon emerge in interactive neural agents, without any human supervision? To this end, I present a novel information-theoretic framework for emergent communication in artificial agents, which leverages recent empirical findings that human languages evolve under pressure to efficiently compress meanings into words via the Information Bottleneck (IB) principle. I show that this framework: (1) can give rise to human-like semantic systems in deep-learning agents, with an emergent signal-embedding space that resembles word embeddings; (2) yields better convergence rates and out-of-domain generalization than earlier emergent communication methods; and (3) allows us to bridge local, context-sensitive pragmatic interactions and the emergence of a global, non-contextualized lexicon. Taken together, this line of work advances our understanding of language evolution in both humans and machines, and more generally suggests that fundamental information-theoretic principles may underlie intelligent systems.
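The IB trade-off referenced in the abstract balances the complexity of a lexicon, I(M;W) (how much the words W encode about the meanings M), against its informativeness, I(W;U) (how much the words convey about the intended referents U). A minimal numerical sketch of that objective is below; the distribution names, shapes, and the beta weighting are illustrative assumptions, not the speaker's actual implementation:

```python
import numpy as np

def mutual_information(joint):
    """I(X;Y) in bits from a joint distribution p(x, y) given as a 2-D array."""
    px = joint.sum(axis=1, keepdims=True)  # marginal p(x)
    py = joint.sum(axis=0, keepdims=True)  # marginal p(y)
    mask = joint > 0                       # skip zero-probability cells
    return float((joint[mask] * np.log2(joint[mask] / (px @ py)[mask])).sum())

def ib_objective(p_m, q_w_given_m, p_u_given_m, beta):
    """IB-style objective: complexity I(M;W) minus beta * informativeness I(W;U).

    p_m:          p(m), shape (n_meanings,)
    q_w_given_m:  encoder q(w|m), shape (n_meanings, n_words)
    p_u_given_m:  p(u|m), shape (n_meanings, n_referents)
    Assumes U and W are conditionally independent given M.
    """
    p_mw = p_m[:, None] * q_w_given_m                     # joint p(m, w)
    p_wu = q_w_given_m.T @ (p_m[:, None] * p_u_given_m)   # joint p(w, u)
    complexity = mutual_information(p_mw)       # I(M;W)
    informativeness = mutual_information(p_wu)  # I(W;U)
    return complexity - beta * informativeness

# Toy example: two equiprobable meanings, a lossless one-word-per-meaning
# encoder, and referents that deterministically match meanings.
p_m = np.array([0.5, 0.5])
identity = np.eye(2)
print(ib_objective(p_m, identity, identity, beta=2.0))  # 1 - 2*1 = -1.0
```

Sweeping beta traces out the trade-off: low beta favors a compressed (few-word) lexicon, high beta favors an informative one, which is the pressure the talk argues shapes human semantic systems.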