Institut für Computerlinguistik

Title: Neural Coherence Modeling: An Entity-based Approach

Speaker: Sungho Jeon (HITS)

Abstract

Recent neural coherence models process the input document using large-scale pretrained language models. These models compute local coherence, that is, semantic relations between items in adjacent sentences, based on words and even subwords. They achieve state-of-the-art performance on several tasks. However, neural models that capture focus in this end-to-end fashion do not explain how they work. Jeon and Strube (2020) were the first to show that such a model sometimes captures undesirable items as focus, such as subword components or function words. In these cases, the model may capture a focus that differs from the author's intention, and it may be benefiting from shallow heuristics present in the dataset. From a linguistic perspective, these items should not play a role. This problem did not occur with pre-neural models of coherence, since they compute coherence on the basis of entities. In this work, we propose a neural coherence model that follows the pre-neural linguistic insights of entity-based modeling. We compute coherence on the basis of entities by constraining our neural model to capture focus on noun phrases. This gives better insight into the behaviour of the neural model and thus better explainability. Our evaluations show that it also outperforms previous models on three downstream tasks.