CILVR Seminar: PILAF: Optimal Human Preference Sampling for Reward Modeling

Date: Wednesday, February 12, 2025, 2PM
Location: 60FA, 7th floor open space
Speaker: Prof. Yaqi Duan

Watch the recording here.