ONTOLOGY MODELING FOR RITUAL STRUCTURE RESEARCH (2009 – 2013)

Introduction

A project funded within the Sonderforschungsbereich SFB 619: "Ritual Dynamics"

Joint project of

Researchers:

  • Nils Reiter, Department of Computational Linguistics
  • Anand Mishra, South-Asia Institute, Department of Classical Indology, University of Heidelberg
  • Oliver Hellwig, South-Asia Institute, Department of Classical Indology, University of Heidelberg

Outcomes

Domain-Adaptable Deep Linguistic Processing Pipeline

We have developed a UIMA-based processing pipeline that automatically annotates texts with information at various linguistic levels. The following list gives the analysis levels and the tool we integrated for each:

  1. Tokenization: Heuristically, based on character classes
  2. Sentence splitting: MorphAdorner
  3. Part-of-speech tagging: OpenNLP
  4. Chunking: OpenNLP
  5. Lemmatization: Stanford CoreNLP
  6. Dependency parsing: Mate Parser
  7. Word sense disambiguation: UKB
  8. Coreference resolution: BART
  9. Semantic role labeling: Semafor
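
The first step of the list, heuristic tokenization based on character classes, can be illustrated with a minimal sketch. This is not the project's UIMA component: the function names `char_class` and `tokenize` and the exact class inventory (letter, digit, space, other) are assumptions made for illustration only.

```python
import unicodedata
from typing import List, Optional


def char_class(ch: str) -> str:
    """Map a character to a coarse class (assumed inventory, for illustration)."""
    if ch.isspace():
        return "space"
    category = unicodedata.category(ch)
    if category[0] in ("L", "M"):  # letters and combining marks (diacritics)
        return "letter"
    if category[0] == "N":  # any kind of number
        return "digit"
    return "other"  # punctuation, symbols, ...


def tokenize(text: str) -> List[str]:
    """Split text wherever the character class changes."""
    tokens: List[str] = []
    current = ""
    current_class: Optional[str] = None
    for ch in text:
        cls = char_class(ch)
        if cls == "space":
            # Whitespace ends the current token and is dropped.
            if current:
                tokens.append(current)
            current, current_class = "", None
        elif cls == "other":
            # Each punctuation or symbol character is a token of its own.
            if current:
                tokens.append(current)
            tokens.append(ch)
            current, current_class = "", None
        elif cls == current_class:
            current += ch
        else:
            # Class changed (e.g. letter -> digit): start a new token.
            if current:
                tokens.append(current)
            current, current_class = ch, cls
    if current:
        tokens.append(current)
    return tokens


print(tokenize("He gives the dakṣiṇā, then leaves."))
# ['He', 'gives', 'the', 'dakṣiṇā', ',', 'then', 'leaves', '.']
```

Treating combining marks as letters keeps transliterated Sanskrit words such as dakṣiṇā in one token, whether the diacritics are precomposed or not.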

Adaptation

Many components have been adapted to the ritual domain by retraining statistical models; see Reiter et al. (2010), Frank et al. (2012) and Reiter (2013) for details.

Discourse Representation

The outcome of the processing pipeline is an XML-based discourse representation tailored to our needs. The following class diagram shows the most important classes.

Search

We provide a search tool that allows searching for n-grams of events and inspecting individual results as well as aggregated statistics. The aggregated statistics show the relative position of the search terms within their ritual sequence. The following picture shows the position distribution for the event subsequence giving the dakṣiṇā:
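
The idea behind the n-gram search and its position statistics can be sketched as follows. This is a simplified illustration, not the search tool itself: events are reduced to plain string labels, the example sequences are invented, and the function names `find_ngram` and `relative_positions` are assumptions. The relative position is taken here to be 0.0 for a match at the very start of a sequence and 1.0 for a match at the very end.

```python
from typing import List, Sequence


def find_ngram(sequence: Sequence[str], ngram: Sequence[str]) -> List[int]:
    """Return the start indices of all occurrences of ngram in sequence."""
    n = len(ngram)
    if n == 0:
        return []
    target = tuple(ngram)
    return [
        start
        for start in range(len(sequence) - n + 1)
        if tuple(sequence[start:start + n]) == target
    ]


def relative_positions(
    sequences: Sequence[Sequence[str]], ngram: Sequence[str]
) -> List[float]:
    """Collect the relative position of every match across all sequences."""
    positions: List[float] = []
    for sequence in sequences:
        # Number of possible start offsets minus one; 0 if the n-gram
        # covers the whole sequence.
        span = len(sequence) - len(ngram)
        for start in find_ngram(sequence, ngram):
            positions.append(start / span if span > 0 else 0.0)
    return positions


# Invented event sequences, for illustration only.
rituals = [
    ["invite", "wash", "offer", "give", "leave"],
    ["wash", "offer", "give", "pay", "bless", "leave"],
]
print(relative_positions(rituals, ["offer", "give"]))
```

A histogram over the returned values would give a position distribution of the kind described above.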

Visualisation Tools

Based on the integrated discourse representation produced by the processing pipeline, we developed a number of visualisation tools that let researchers inspect passages of interest in a targeted way.

Entity Graph

The entity graph shows participants of rituals in a graph-based form. Each participant is represented as a vertex in a graph. Two vertices are connected if they appear in the same event. The vertices are directly linked to their appearance in the source texts, as shown on the right.
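
The construction rule for the entity graph (one vertex per participant, an edge whenever two participants share an event) can be sketched as below. The input format, a list of events that each list their participant identifiers, and the function name `build_entity_graph` are assumptions for illustration. The links from vertices back to the source texts are not modelled here.

```python
from itertools import combinations
from typing import Dict, Iterable, Set


def build_entity_graph(events: Iterable[Iterable[str]]) -> Dict[str, Set[str]]:
    """Build an undirected co-occurrence graph as an adjacency mapping."""
    graph: Dict[str, Set[str]] = {}
    for participants in events:
        unique = sorted(set(participants))
        # Every participant becomes a vertex, even without neighbours.
        for participant in unique:
            graph.setdefault(participant, set())
        # Connect each pair of participants that share this event.
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Invented events, for illustration only.
events = [
    ["priest", "host"],
    ["host", "guest", "priest"],
    ["cow"],
]
graph = build_entity_graph(events)
print(sorted(graph["priest"]))
# ['guest', 'host']
```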

Alignment Graph

The alignment graph shows alignments between event sequences. Each node (connected in red or blue) represents an event in one of the sequences, showing the frame name in bold and the lemma in parentheses. Furthermore, each frame is connected with its frame element fillers, to the left or right respectively. The colors of the frame element fillers represent discourse entities. The links between two events are shown in the middle and represent alignments between the events. The two sequences can be moved interactively in this web-based visualisation.
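
To make the notion of an alignment between two event sequences concrete, the sketch below links events with identical frame names using Needleman-Wunsch global alignment. This is a generic textbook technique chosen for illustration; it is not claimed to be the alignment algorithm used in the project. The scoring values, the function name `align_events` and the example frame sequences are assumptions.

```python
from typing import List, Sequence, Tuple


def align_events(
    seq_a: Sequence[str],
    seq_b: Sequence[str],
    match: int = 1,
    mismatch: int = -1,
    gap: int = -1,
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) of events linked by a global alignment."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j] = best score for aligning seq_a[:i] with seq_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # pair the two events
                score[i - 1][j] + gap,      # skip an event in seq_a
                score[i][j - 1] + gap,      # skip an event in seq_b
            )
    # Trace back from the bottom-right cell and keep only identical pairs.
    links: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            if seq_a[i - 1] == seq_b[j - 1]:
                links.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    links.reverse()
    return links


# Invented frame sequences, for illustration only.
left = ["Arriving", "Giving", "Ingestion", "Departing"]
right = ["Giving", "Commerce_pay", "Departing"]
print(align_events(left, right))
# [(1, 0), (3, 2)]
```

Each returned pair corresponds to one link drawn between the two sequences in a visualisation of this kind.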

Connectivity Graph

This graph shows, for each node, its connectivity to the other sequence, with nodes ordered as they appear in their sequence. Higher scores mean that the node (and its context) is more directly connected to the other sequence. The highest scores are marked with their id. The areas enclosed in dotted lines show a subsequence of events that occurs in the middle of one sequence and at the end of the other.

Dissertation

Nils Reiter: Discovering Structural Similarities in Narrative Texts using Event Alignment Algorithms, 2013, defended.

Talks

Publications