
With a Little Push, NLI Models can Robustly and Efficiently Predict Faithfulness
Abstract
Conditional language models have made great strides in generating grammatical and relevant text. However, they still tend to generate unfaithful output that is not supported by their input. These unfaithful generations jeopardize trust in real-world applications such as summarization, motivating the need for automatic faithfulness measures. NLI models seem attractive for this task, since they solve a closely related problem that comes with a wealth of prior research and data. However, recent research suggests that NLI models require costly additional machinery to perform reliably across datasets, e.g., running inference on a Cartesian product of input and generated sentences, or supporting them with a question-generation/answering step. In this work, we take an alternative view and show that monolithic NLI models can outperform more complex measures when robustness-oriented data augmentation is combined with uncertainty-reducing inference procedures. We propose three modifications: (1) augmenting NLI training data to reflect the divergence between the definition of entailment in NLI and that of faithfulness in dialog, (2) using both entailment and contradiction probabilities, and (3) applying Monte-Carlo dropout during inference. On the comprehensive TRUE benchmark, which combines faithfulness datasets across different domains and tasks, our approach strongly improves over a vanilla NLI model and significantly outperforms all other methods, while showing favourable computational cost.
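
The following is a minimal illustrative sketch (not the exact implementation described later in the paper) of how modifications (2) and (3) can be realized with an off-the-shelf NLI model: dropout is kept active at inference time, class probabilities are averaged over several stochastic forward passes, and entailment and contradiction probabilities are combined into a single faithfulness score. The checkpoint name and the particular combination p(entail) - p(contradict) are assumptions made for illustration.

```python
# Sketch: faithfulness scoring with an off-the-shelf NLI model,
# Monte-Carlo dropout at inference, and a combined entail/contradict score.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "microsoft/deberta-large-mnli"  # assumed checkpoint; any 3-way NLI model works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

# Resolve label indices from the model config instead of hard-coding them.
label2id = {k.lower(): v for k, v in model.config.label2id.items()}
ENTAIL, CONTRADICT = label2id["entailment"], label2id["contradiction"]


def enable_mc_dropout(module: torch.nn.Module) -> None:
    """Keep dropout layers in training mode so inference is stochastic."""
    for m in module.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()


@torch.no_grad()
def faithfulness_score(source: str, generation: str, n_samples: int = 10) -> float:
    model.eval()
    enable_mc_dropout(model)
    inputs = tokenizer(source, generation, return_tensors="pt", truncation=True)
    # Average class probabilities over Monte-Carlo dropout samples.
    probs = torch.stack(
        [torch.softmax(model(**inputs).logits, dim=-1).squeeze(0) for _ in range(n_samples)]
    ).mean(dim=0)
    # Illustrative combination of entailment and contradiction probabilities.
    return (probs[ENTAIL] - probs[CONTRADICT]).item()


# Example: score whether a generated sentence is supported by its source.
print(faithfulness_score(
    "The company reported record profits in 2020.",
    "The company lost money in 2020.",
))
```

Averaging over dropout samples reduces the variance of the predicted probabilities; the exact number of samples and score combination used in our experiments are specified in the method section.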