
TLEX: An Exact Approach to Timeline Extraction from Natural-Language-Derived Temporal Graphs

Abstract

A timeline gives a total ordering of the events and times presented in a text or text collection, and is useful for a number of natural language understanding tasks. However, qualitative temporal graphs that can be derived directly from text (e.g., TimeML annotations) explicitly reveal only partial orderings of events and times, and prior approaches to timeline extraction have fallen short in both expressivity and performance. I demonstrate a new, exact, complete solution to timeline extraction from temporal graphs derived from natural language, which we call TLEX (TimeLine EXtraction). TLEX adapts prior work on solving point algebra problems to transform (TimeML) temporal annotations into a collection of timelines arranged in a trunk-and-branch structure. Like prior work, TLEX checks the consistency of the temporal graph; however, it adds two novel capabilities. First, it identifies the specific relations involved in an inconsistency (which can then be manually corrected). Second, it identifies sections of the timelines that have indeterminate order, information critical for downstream tasks such as aligning events from different timelines. Importantly, inconsistency and indeterminacy can be used as quality measures of temporal graph annotations, and we show a non-linear relationship between temporal graph quality and timeline quality. I sketch the formal proof of TLEX's correctness and describe an experimental evaluation over 385 gold-standard annotated texts from four corpora. I also demonstrate our most recent work, which shows how the timelines can be combined with duration extraction to compute estimates of the overall duration of event sequences. Finally, I describe our reference implementations of TLEX in Java and Python.
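
To make the point-algebra idea concrete, here is a minimal sketch of how qualitative constraints between time points might be checked for consistency and turned into an ordering. It is not the TLEX algorithm or its reference implementation: the function name `order_points`, the constraint format `(a, rel, b)` with `rel` in `{"<", "="}`, and the point names such as `e1.start` are all assumptions made for illustration. The sketch merges points declared simultaneous, orders the merged classes with a level-by-level topological sort, and reports the constraints left over when a cycle makes the graph inconsistent.

```python
from collections import defaultdict


def order_points(constraints):
    """Order time points under '<' and '=' constraints (illustrative only).

    constraints: list of (a, rel, b) tuples, rel being '<' or '='.
    Returns (levels, conflicts):
      levels    - list of levels; each level is a list of classes of
                  simultaneous points. Classes sharing a level are not
                  ordered relative to each other by the constraints.
      conflicts - constraints among points that could not be ordered
                  (a superset of those lying on a cycle); empty if consistent.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Step 1: register every point and merge points declared simultaneous.
    for a, rel, b in constraints:
        ra, rb = find(a), find(b)
        if rel == "=":
            parent[ra] = rb

    # Step 2: build strict-precedence edges between the merged classes.
    succ = defaultdict(set)
    indeg = {find(p): 0 for p in list(parent)}
    for a, rel, b in constraints:
        if rel == "<":
            ra, rb = find(a), find(b)
            if rb not in succ[ra]:
                succ[ra].add(rb)
                indeg[rb] += 1

    members = defaultdict(list)
    for p in list(parent):
        members[find(p)].append(p)

    # Step 3: Kahn's algorithm, peeling off one level at a time.
    frontier = sorted(r for r, d in indeg.items() if d == 0)
    levels, done = [], set()
    while frontier:
        levels.append([sorted(members[r]) for r in frontier])
        nxt = []
        for r in frontier:
            done.add(r)
            for s in succ[r]:
                indeg[s] -= 1
                if indeg[s] == 0:
                    nxt.append(s)
        frontier = sorted(nxt)

    # Step 4: anything never reached sits on or behind a cycle.
    conflicts = [c for c in constraints
                 if find(c[0]) not in done and find(c[2]) not in done]
    return levels, conflicts


if __name__ == "__main__":
    # Hypothetical input: e1 is before e2, and e3 starts when e1 starts.
    demo = [
        ("e1.start", "<", "e1.end"),
        ("e1.end", "<", "e2.start"),
        ("e2.start", "<", "e2.end"),
        ("e3.start", "=", "e1.start"),
        ("e3.start", "<", "e3.end"),
    ]
    print(order_points(demo))
```

In this toy input the end of e1 and the end of e3 land on the same level, because nothing constrains their relative order. That is only a rough indicator of the kind of indeterminacy the abstract refers to, since points on different levels can also be unordered.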