Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik


Semantic NLP – from a discourse perspective


Degree program    Module code      Credits (LP)
BA-2010           AS-CL            8 LP
Master            SS-CL, SS-TAC    8 LP
Magister          -                -

Lecturer          Anette Frank
Course type       Hauptseminar (advanced seminar)
First session     17.10.2011
Time and place    Mon, 16:15-17:45, INF 327 / SR 4


Prerequisites:
  • Formal semantics
  • Statistics


Course requirements:
  • Regular attendance
  • Presentation
  • Term paper
  • Earning "Activity Points"

Note: The seminar is planned to continue as a "Forschungsseminar" (research seminar, MA-Forschungsmodul) in summer term 2012, and it serves as preparation for taking part in that follow-up.


Semantic NLP has achieved considerable success in core areas such as word sense disambiguation, semantic parsing, and coreference resolution. However, current methods suffer from fundamental shortcomings:

  1. Semantic NLP is mainly restricted to the sentence level and does not sufficiently exploit discourse-level information. As a result, semantic processing modules neglect significant parts of the semantic information that texts encode through inter-sentential meaning and inference relations.
  2. Today's semantic NLP components focus mainly on overt linguistic information. They therefore also neglect the significant parts of a text's semantic information that are not stated explicitly and are only accessible through inference relations.
  3. Semantic NLP systems are typically realized in pipeline architectures. Such systems fail to represent and exploit dependencies between semantic analysis levels, and thus propagate errors from one module to the next.

Current research investigates learning techniques that capture dependencies between individual NLP component models through 'joint inference'. These models explicitly represent dependencies between linguistic analysis levels, such as the interaction between semantic entity type information and the decisions of a coreference classifier. By modeling such dependencies explicitly, they try to overcome the limited performance of individual semantic processing models in classical pipeline architectures.
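
The contrast between a pipeline and joint inference can be illustrated with a minimal sketch. It is not a method taken from the seminar materials: the two mentions, all scores, and the function names `pipeline` and `joint` are invented for illustration, and the exhaustive search stands in for the more scalable inference techniques used in practice. The only assumption modeled is that coreferent mentions must share one entity type.

```python
from itertools import product

TYPES = ("PERSON", "LOCATION")

# Hypothetical local classifier scores, invented for illustration.
type_scores = {
    "Washington": {"PERSON": 0.55, "LOCATION": 0.45},
    "the city": {"PERSON": 0.10, "LOCATION": 0.90},
}
# Hypothetical score that the two mentions corefer.
coref_score = {True: 0.8, False: 0.2}


def pipeline(type_scores, coref_score):
    """Each component decides on its own; no decision is revised later."""
    types = {m: max(s, key=s.get) for m, s in type_scores.items()}
    coref = max(coref_score, key=coref_score.get)
    return types, coref


def joint(type_scores, coref_score):
    """Score all combined assignments; coreferent mentions must share a type."""
    m1, m2 = type_scores
    best, best_score = None, -1.0
    for t1, t2, coref in product(TYPES, TYPES, (True, False)):
        if coref and t1 != t2:
            continue  # hard consistency constraint between the two levels
        score = type_scores[m1][t1] * type_scores[m2][t2] * coref_score[coref]
        if score > best_score:
            best, best_score = ({m1: t1, m2: t2}, coref), score
    return best


pipeline_result = pipeline(type_scores, coref_score)
joint_result = joint(type_scores, coref_score)
print(pipeline_result)
print(joint_result)
```

With these invented scores, the pipeline labels "Washington" as PERSON and "the city" as LOCATION while still declaring them coreferent, which is inconsistent. The joint search instead revises the weaker type decision and labels both mentions as LOCATION.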

The seminar investigates perspectives of joint inference for discourse-oriented semantic NLP by interfacing local and global semantic processing models: word sense disambiguation (WSD), named entity classification (NEC) and semantic role labeling (SRL) as prima facie local semantic phenomena, and their interaction with coreference resolution as a global, discourse-level semantic resolution process.

For each of these semantic NLP models, we will analyze state-of-the-art methods, weaknesses of traditional models, and perspectives of joint inference from a discourse perspective. We will also discuss dedicated discourse coherence models such as entity grids, discourse-level applications such as textual inference, and annotation models suited for joint inference tasks (e.g. OntoNotes).

In the introductory part, the seminar offers a survey of classical corpus-based semantic analysis methods in the areas of WSD, NEC, SRL and coreference resolution. For each of these, we contrast different learning frameworks (supervised, semi-supervised, unsupervised) and specific learning algorithms (e.g. graph-based clustering), as well as (heuristic) data acquisition techniques and evaluation methods.
The later part will focus on modeling interaction phenomena using joint inference techniques.

» Course overview and materials
