Events
Learning from Naturally Occurring Human Feedback
Speaker: Ge Gao
Location: 60 Fifth Avenue, Room 650
Date: Thursday, October 3, 2024
Recent advances in Reinforcement Learning from Human Feedback (RLHF) have demonstrated the effectiveness of comparison-based feedback provided by paid annotators. However, this feedback is expensive to collect and rarely arises in real-world model deployment. In this talk, I will present an alternative: learning from feedback that occurs naturally during user interaction, such as the edits users make to the outputs of AI writing assistants. This type of feedback emerges organically in practical applications and offers a more authentic reflection of user preferences. The challenge, however, lies in its implicit nature: user edits do not directly express the underlying preference and are often nuanced, which leaves them open to diverse interpretations. I will first introduce the PRELUDE framework, which formulates preference learning and the interaction process as a cost-minimization problem. Then, I will present CIPHER, a simple yet effective algorithm that uses large language models (LLMs) to infer context-dependent preferences from user edits.
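To make the idea of learning from user edits concrete, the sketch below illustrates one possible loop in that spirit: an LLM is asked to summarize the preference implied by each (draft, user revision) pair, the summary is stored with its context, and stored preferences from similar contexts are retrieved to condition future generations. This is not the PRELUDE/CIPHER implementation; the `llm` callable, the prompt strings, and the bag-of-words retrieval are all illustrative assumptions.

```python
# Illustrative sketch of learning a latent writing preference from user edits.
# Assumptions (not from the talk): a text-in/text-out `llm` wrapper, simple
# bag-of-words retrieval over past contexts, and hand-written prompts.

from collections import Counter
from typing import Callable, List, Tuple


class EditPreferenceLearner:
    def __init__(self, llm: Callable[[str], str], top_k: int = 3):
        self.llm = llm              # hypothetical LLM wrapper: prompt -> completion
        self.top_k = top_k
        self.memory: List[Tuple[Counter, str]] = []  # (context features, inferred preference)

    def _features(self, text: str) -> Counter:
        return Counter(text.lower().split())

    def _similarity(self, a: Counter, b: Counter) -> float:
        overlap = sum((a & b).values())
        total = sum(a.values()) + sum(b.values())
        return 2 * overlap / total if total else 0.0

    def retrieve(self, context: str) -> List[str]:
        """Return preferences inferred in the most similar past contexts."""
        feats = self._features(context)
        scored = sorted(self.memory,
                        key=lambda m: self._similarity(feats, m[0]),
                        reverse=True)
        return [pref for _, pref in scored[: self.top_k]]

    def generate(self, context: str) -> str:
        """Generate a draft conditioned on preferences retrieved for this context."""
        prefs = self.retrieve(context)
        pref_block = "\n".join(f"- {p}" for p in prefs) or "- (none yet)"
        prompt = ("Follow these inferred user preferences when writing:\n"
                  f"{pref_block}\n\nTask:\n{context}")
        return self.llm(prompt)

    def observe_edit(self, context: str, draft: str, edited: str) -> None:
        """After the user revises a draft, infer and store the implied preference."""
        prompt = ("A user revised an AI-written draft. In one sentence, describe "
                  "the writing preference implied by the revision.\n"
                  f"Draft:\n{draft}\n\nUser revision:\n{edited}")
        self.memory.append((self._features(context), self.llm(prompt)))
```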