Measuring and Leveraging Inconsistency in LLMs

Speaker: Xiang Lisa Li

Location: 60 Fifth Avenue, Room 204

Date: Friday, September 6, 2024

Consistency measures the invariance of a model's outputs across different inputs, and it serves as an effective self-supervision signal. Inconsistency within a model provides training signals for model improvement, and inconsistency across models can reveal model weaknesses. In this talk, we will discuss two ways of leveraging these intra- and inter-model inconsistencies.
We first study generator and validator responses as a form of intra-model consistency. For example, ChatGPT correctly answers “what is 7+8” with 15, but when asked “7+8=15, True or False” it responds with “False”. This inconsistency between generating and validating an answer is prevalent in state-of-the-art language models. To address this problem, we propose consistency fine-tuning to align models’ generator and validator responses. Our approach improves both generator and validator accuracy (by 16% and 6.3%, respectively) without using any labeled data.

Second, we leverage inter-model inconsistency to construct better benchmarks. We observe that existing benchmarks yield highly correlated model rankings. To discover model weaknesses that are not revealed by existing benchmarks, we propose constructing datasets that optimize for inconsistency in model rankings, and we use a language model to automatically construct reliable datasets that optimize this objective. The scalability of LM-generated benchmarks makes it possible to test fine-grained categories and tail knowledge, creating datasets that are on average 27% more novel and discovering knowledge gaps in state-of-the-art models.
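To make the generator–validator probe concrete, here is a minimal, self-contained sketch (not the speaker's code): `fake_model`, `gv_consistent`, and the prompt formats are illustrative assumptions, with the toy model scripted to reproduce the 7+8 example from the abstract.

```python
# Minimal sketch of a generator-validator consistency probe.
# `fake_model` stands in for a real chat-model API and is hard-coded
# to mimic the abstract's 7+8 example; it is not the paper's code.

def fake_model(prompt: str) -> str:
    """Toy stand-in for an LLM call, scripted to reproduce the example."""
    if prompt == "Q: what is 7+8\nA:":
        return "15"       # generator pass: correct answer
    if prompt == "7+8=15. True or False?":
        return "False"    # validator pass: contradicts the generator
    return ""

def gv_consistent(ask, question: str, statement_template: str) -> bool:
    """Return True if the model's validator pass endorses its own generated answer."""
    answer = ask(f"Q: {question}\nA:").strip()                      # generator pass
    statement = statement_template.format(answer=answer)
    verdict = ask(f"{statement} True or False?").strip().lower()    # validator pass
    return verdict.startswith("true")

print(gv_consistent(fake_model, "what is 7+8", "7+8={answer}."))    # -> False (inconsistent)
```

Consistency fine-tuning, as described in the abstract, would use such inconsistent generator–validator pairs as a training signal rather than merely flagging them.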
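The ranking-correlation observation can be illustrated with a standard rank-correlation statistic. The sketch below uses Spearman correlation on made-up scores; the abstract does not specify the exact metric or data, so treat both as assumptions. A benchmark optimized for ranking inconsistency would drive this value down.

```python
# Sketch: quantify how similar two benchmarks' model rankings are.
# Scores are illustrative placeholders, not real evaluation results.
from scipy.stats import spearmanr

models = ["model_a", "model_b", "model_c", "model_d"]
scores_benchmark_1 = [0.81, 0.74, 0.69, 0.55]   # accuracy on benchmark 1
scores_benchmark_2 = [0.78, 0.72, 0.70, 0.51]   # accuracy on benchmark 2

rho, _ = spearmanr(scores_benchmark_1, scores_benchmark_2)
print(f"Spearman rank correlation between model rankings: {rho:.2f}")
```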