CDS Seminar: Equipping LLMs for Complex Knowledge Scenarios: Interaction and Retrieval

Speaker: Eunsol Choi

Location: 60 Fifth Avenue, Room 150

Date: Friday, January 24, 2025

Language models are increasingly used as an interface for gathering information. Yet trusting the answers they generate is risky, as they often contain incorrect or misleading information. Why does this happen? We identify two key issues: (1) ambiguous and underspecified user questions and (2) imperfect knowledge in LMs, especially for long-tail or recent events. To address the first issue, we propose a system that interacts with users to clarify their intent before answering. By simulating the expected outcomes of future conversation turns, we reward LMs for asking clarifying questions rather than answering immediately. In the second part of the talk, I will discuss the state of retrieval augmentation, which is often lauded as the path to providing up-to-date, relevant knowledge to LMs. While its success is evident in scenarios where a single gold document exists, incorporating information from a diverse set of documents remains challenging for both retrieval systems and LMs. Together, these threads highlight key research directions for building reliable LMs that answer information-seeking questions.
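The core idea of rewarding clarification by simulating future turns can be illustrated with a minimal toy sketch. This is not the speaker's implementation; all names (`CANDIDATE_INTENTS`, `answer_fn`, `quality_fn`, `should_clarify`) and the scoring scheme are illustrative assumptions. It only shows why, when a query admits several plausible intents, the simulated downstream reward of asking first can exceed that of answering immediately.

```python
from statistics import mean

# Toy setup: an ambiguous query with two plausible user intents (hypothetical example).
CANDIDATE_INTENTS = ["python the language", "python the snake"]

def answer_fn(intent):
    # Placeholder "model": answers precisely when the intent is known, hedges otherwise.
    if intent is None:
        return "generic answer covering both meanings"
    return f"targeted answer about {intent}"

def quality_fn(answer, intent):
    # Placeholder scorer: a targeted answer serves its intent better than a generic one.
    return 1.0 if intent in answer else 0.4

def should_clarify(intents, clarify_cost=0.1):
    """Compare simulated reward of answering now vs. asking a clarifying question first.

    Answering now commits to one generic answer scored against every plausible intent;
    clarifying first lets the model tailor its answer per intent, minus a small cost
    for the extra conversational turn.
    """
    answer_now = mean(quality_fn(answer_fn(None), i) for i in intents)
    clarify_first = mean(quality_fn(answer_fn(i), i) for i in intents) - clarify_cost
    return clarify_first > answer_now

if __name__ == "__main__":
    # Here the per-intent answers outweigh the turn cost, so clarifying is rewarded.
    print(should_clarify(CANDIDATE_INTENTS))  # True
```

In a real training setup the placeholder answerer and scorer would be replaced by model rollouts and a learned or task-based reward, but the comparison between "answer now" and "clarify, then answer" is the same in spirit.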