Events
Data Science Lunch Seminar: In-context Language Learning
Speaker: Ekin Akyürek
Location: 60 Fifth Avenue, 7th floor common area
Date: Wednesday, April 3, 2024
Abstract: Language models exhibit in-context learning (ICL): the ability to learn new tasks from just a few examples presented in their prompt. Prior work has studied ICL through the lens of simple learning problems like linear regression, but there remains a gap between these settings and the rich language-generation capabilities of real language models. In this talk, I will discuss a new model problem for understanding ICL: in-context learning of formal languages (ICLL). In ICLL, language models are presented with example strings from a probabilistic language and must generate additional strings from that same language. Focusing on regular languages sampled from random finite automata, we study the behavior of a variety of sequence models on the ICLL task. We show that Transformers significantly outperform recurrent and convolutional models on these tasks. Moreover, we find evidence that their ability to do so relies on specialized “n-gram heads” (higher-order variants of induction heads) that compute input-conditional next-token distributions. Finally, we show that hard-wiring these heads into neural models improves performance not only on formal language learning but also on modeling of real natural-language text, improving the perplexity of 340M-parameter models by up to 1.14 points (6.7%) on the SlimPajama dataset.
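For readers unfamiliar with the setup, the sketch below illustrates roughly what an ICLL prompt could look like: a random finite automaton is sampled, example strings are generated by walking its transitions, and the strings are concatenated into a prompt that a sequence model is asked to continue. The automaton construction, string format, and all parameter choices here are illustrative assumptions, not details from the talk or the underlying paper.

```python
# Illustrative sketch of an ICLL-style prompt (assumptions, not the talk's exact setup).
import random

def random_automaton(num_states=4, alphabet="abc", seed=0):
    """From each state, each symbol moves to a uniformly random next state."""
    rng = random.Random(seed)
    return {(s, a): rng.randrange(num_states)
            for s in range(num_states) for a in alphabet}

def sample_string(automaton, alphabet="abc", stop_prob=0.2, rng=None):
    """Random walk on the automaton: emit symbols until a random stopping point."""
    rng = rng or random.Random()
    state, chars = 0, []
    while True:
        a = rng.choice(alphabet)
        chars.append(a)
        state = automaton[(state, a)]
        if rng.random() < stop_prob:
            return "".join(chars)

def build_icll_prompt(n_examples=8, seed=0):
    """Join example strings; the model should continue with strings from the same language."""
    rng = random.Random(seed)
    automaton = random_automaton(seed=seed)
    examples = [sample_string(automaton, rng=rng) for _ in range(n_examples)]
    return " ".join(examples) + " "

print(build_icll_prompt())
```

In this toy version every string over the alphabet is accepted, so the in-context structure comes only from the statistics of the random walk; the talk studies richer probabilistic regular languages where the model must infer which continuations are valid.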