NLP and Text-as-Data Speaker Series: Understanding Incremental Processing in LLMs with Temporal Feature Analysis

Speaker: Aaron Mueller

Location: Center for Data Science, 60 Fifth Avenue
Videoconference link: https://nyu.zoom.us/j/93934334443

Date: Thursday, February 5, 2026

Language is processed incrementally; this gives rise to a rich temporal structure at multiple scales. Many methods exist for decomposing continuous language model activations into discrete human-interpretable features, but they implicitly assume that features are time-invariant. In this talk, I will show how the non-stationarity of language model representations leads to predictable pathologies in current feature extraction methods. I will then introduce temporal feature analysis (TFA), a method inspired by predictive coding principles in neuroscience. By decomposing activations into context-dependent and stimulus-driven components, TFA recovers event boundaries, evolving syntactic representations, and context-sensitive representational structures. Using garden-path sentences as a case study, we obtain interventional evidence that models represent mutually incompatible syntactic features before disambiguation, and we observe that TFA recovers the correct parse after disambiguation. I will conclude by discussing the inherent limitations of feature extraction methods and mentioning ongoing work to ground them in causal formalisms. Ultimately, by matching the inductive biases of analysis techniques to the phenomena they are meant to explain, we can develop more faithful and testable predictions about the conceptual structures of neural systems.
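The abstract describes the decomposition only at a high level. As a rough illustration of what splitting activations into a context-dependent (predictable-from-history) part and a stimulus-driven (residual) part could look like, the sketch below predicts each activation from a short window of preceding activations and treats the residual as stimulus-driven. This is a toy construction, not the TFA method presented in the talk; the window length, the linear predictor, and the ridge penalty are all assumptions made for illustration.

```python
import numpy as np

# Illustrative sketch only: split activations into a part linearly predictable
# from recent history ("context-dependent") and the leftover residual
# ("stimulus-driven"), loosely in the spirit of predictive coding.
# NOT the TFA algorithm from the talk; all choices below are assumptions.

def split_context_vs_stimulus(acts, window=3, ridge=1e-3):
    """acts: (T, d) array of language-model activations for one sequence.

    Returns (context, stimulus), each (T, d), where context[t] is the part of
    acts[t] predictable from the previous `window` activations via ridge
    regression, and stimulus[t] = acts[t] - context[t].
    """
    T, d = acts.shape
    # Lagged design matrix: row t holds acts[t-window:t], flattened.
    X = np.zeros((T, window * d))
    for t in range(T):
        for k in range(1, window + 1):
            if t - k >= 0:
                X[t, (k - 1) * d : k * d] = acts[t - k]
    # Ridge regression acts ~ X:  W = (X^T X + ridge * I)^-1 X^T acts
    A = X.T @ X + ridge * np.eye(window * d)
    W = np.linalg.solve(A, X.T @ acts)
    context = X @ W
    stimulus = acts - context
    return context, stimulus

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(20, 8))  # 20 tokens, 8-dim stand-in activations
    ctx, stim = split_context_vs_stimulus(acts)
    print(ctx.shape, stim.shape)  # (20, 8) (20, 8)
```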


Bio: Aaron Mueller is an assistant professor of Computer Science and, by courtesy, of Data Science at Boston University. His research centers on developing language modeling methods, evaluations, and analysis techniques inspired by causal and linguistic principles, and applying these to precisely control and improve the generalization of natural language processing systems. He completed his Ph.D. at Johns Hopkins University, and was a Zuckerman postdoctoral fellow at Northeastern and the Technion. His work has been published in ML and NLP venues (such as ICML, ACL, and EMNLP) and has won awards at TMLR and ACL. He is a recurring co-organizer of the BlackboxNLP and BabyLM workshops, and has recently been featured in IEEE Spectrum (2024) and MIT Technology Review (2025).