CDS & MBM Seminar: Minds, Brains, Machines and Center for Data Science Event on Interpretability
Speaker: David Bau, Grace Lindsay
Location: 60 Fifth Avenue, Room 150
Date: Friday, November 1, 2024
In this talk, we discuss recent work on interpreting and understanding the explicit structure of learned computations within large deep network models. We examine the localization of factual knowledge within transformer language models, and discuss how these insights can be used to edit the behavior of LLMs and multimodal diffusion models. We then discuss recent findings on the structure of computations underlying in-context learning, and how these lead to insights about the representation and composition of functions within LLMs. Finally, time permitting, we discuss the technical challenges of doing interpretability research in a world where the most powerful models are available only via API, and we describe a National Deep Inference Fabric that will offer an API standard enabling transparent scientific research on large-scale AI.