NLP/Text-as-Data Seminar: "Lipstick on a Pig: Using Language Models as Few-Shot Learners"

Speaker: Sameer Singh

Location: 60 Fifth Avenue, 7th Floor Open Space

Date: Thursday, March 24, 2022

Current NLP pipelines increasingly rely on pre-trained models, in particular language models whose representations we can adapt to new tasks with minimal effort. For example, language models have achieved exceptional few-shot performance on natural language understanding and reasoning tasks simply by framing these tasks as "fill in the blank" prompts. These gains grow with model and dataset size, suggesting that directly using language models may dominate the future of NLP applications. However, the goals of language modeling are not precisely what we need from few-shot learners, and it is vital to understand this gap.
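
To make the "fill in the blank" framing concrete, here is a minimal sketch of cloze-style prompting for sentiment classification, assuming the Hugging Face transformers library; the prompt template and the label words ("great"/"terrible") are illustrative choices, not the setup used in the work presented.

```python
# Minimal sketch: sentiment classification as a "fill in the blank"
# prompt over a masked language model. Template and label words are
# illustrative assumptions, not the speaker's actual configuration.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

review = "The movie was a complete waste of two hours."
prompt = f"{review} Overall, it was a [MASK] film."

# "Verbalizer": candidate blank-fillers that stand in for task labels.
verbalizer = {"great": "positive", "terrible": "negative"}

# Score only the candidate label words and pick the most probable one.
candidates = fill_mask(prompt, targets=list(verbalizer))
best = max(candidates, key=lambda c: c["score"])
print(verbalizer[best["token_str"]])  # expected: "negative"
```

The same idea extends to few-shot use by prepending a handful of labeled examples to the prompt before the query.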

In this talk, I will describe some of our work characterizing the differences between language modeling and few-shot learning. I will show that language modeling comes with crucial shortcomings for few-shot adaptation and describe a simple approach to address them. Then, focusing on numerical reasoning, I will show that the reasoning ability of language models depends strongly on simple statistics of the pretraining corpus: models answer far more accurately for terms that appear frequently in the pretraining data. These results suggest that language modeling alone may not be sufficient to learn robust reasoners, and that we need to take the pretraining data into account when interpreting few-shot evaluation results.
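
As a rough illustration of this kind of analysis (a sketch over made-up inputs, not the actual evaluation code), the snippet below buckets numerical-reasoning test questions by how often their key term appears in the pretraining corpus and reports accuracy per bucket:

```python
# Sketch: relate few-shot accuracy to pretraining term frequency.
# Input is a list of (corpus_frequency, is_correct) pairs, one per
# test question; the toy numbers below are invented for illustration.
from collections import defaultdict
from math import log10

def accuracy_by_frequency(results):
    """Group questions into coarse log-scale frequency buckets and
    compute accuracy within each bucket."""
    hits, totals = defaultdict(int), defaultdict(int)
    for freq, correct in results:
        bucket = int(log10(freq + 1))  # 1 ~ tens, 2 ~ hundreds, ...
        totals[bucket] += 1
        hits[bucket] += correct        # True counts as 1
    return {b: hits[b] / totals[b] for b in sorted(totals)}

# Made-up results showing the reported trend: questions about rarer
# terms (low buckets) are answered less accurately.
toy = [(50, False), (80, False), (900, False), (1_200, True),
       (40_000, True), (75_000, True), (900_000, True)]
print(accuracy_by_frequency(toy))
# -> {1: 0.0, 2: 0.0, 3: 1.0, 4: 1.0, 5: 1.0}
```

A plot of these per-bucket accuracies against frequency is the kind of evidence the talk uses to argue that few-shot evaluation results must be interpreted relative to the pretraining data.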