How to improve the robustness and interpretability of Natural Language Inference models
Speaker: Joe Stacey
Location: 60 Fifth Avenue, 7th Floor Open Space, CDS
Date: Wednesday, July 23, 2025
Abstract: Joe's talk will have two parts: 1) discussing how to improve the robustness of fine-tuned Natural Language Inference (NLI) models, and 2) introducing a method for creating inherently interpretable NLI models that provide faithful explanations for each prediction.

In the first part, Joe will talk about different strategies to improve robustness, including training with natural language explanations, and using LLMs to generate out-of-distribution data for fine-tuning. Joe will also discuss why debiasing models is often not an effective solution, and how model robustness methods can also be applied to large-scale closed-source LLMs.

The second part of the talk will introduce atomic inference, an approach that involves decomposing a task into discrete atoms, before making predictions for each atom and combining these atom-level predictions using interpretable rules. Joe will share why he's excited by these methods, and will discuss some open challenges that remain for future work.
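For readers unfamiliar with atomic inference, the sketch below illustrates the aggregation step only. It assumes the hypothesis has already been decomposed into atoms and that each atom has received an NLI label from some model; the names (`Atom`, `aggregate`) and the specific combination rule are illustrative assumptions rather than Joe's exact formulation.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Atom:
    text: str   # a single fact/span decomposed from the hypothesis
    label: str  # atom-level NLI prediction against the premise

def aggregate(atoms: List[Atom]) -> str:
    """Combine atom-level predictions with a simple interpretable rule:
    contradicted if any atom is contradicted, entailed only if every
    atom is entailed, and neutral otherwise."""
    labels = [a.label for a in atoms]
    if "contradiction" in labels:
        return "contradiction"
    if all(label == "entailment" for label in labels):
        return "entailment"
    return "neutral"

if __name__ == "__main__":
    atoms = [
        Atom("A man is outdoors.", "entailment"),
        Atom("The man is riding a horse.", "neutral"),
    ]
    # Prints "neutral"; the single neutral atom explains the prediction.
    print(aggregate(atoms))
```

Because the final label follows mechanically from the atom-level predictions, the atoms that trigger the rule serve as a faithful explanation of the prediction.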
Bio: Joe is an Apple AI/ML scholar in the 5th year of his PhD at Imperial College London, supervised by Marek Rei. Joe's research involves creating more robust and interpretable NLP models, focusing on the task of Natural Language Inference. Prior to his PhD, Joe worked as a consultant and taught maths in a challenging secondary school.