Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik

From Information Extraction to 'Deep' Semantic Parsing

Course Description

Degree program | Module code | Credits (LP)
BA-2010 | AS-CL | 8 LP
NBA | AS-CL | 8 LP
Master | SS-CL, SS-TAC | 8 LP
Magister | - | -

Instructor: Anette Frank
Course type: Hauptseminar (advanced seminar)
First session: 24.10.2013
Time and place: Thu, 11:15–12:45, INF 327 / SR 5 (SR)
Commitment period: 02.12.2013 – 17.01.2014

Course Requirements

  • regular and active participation
  • presentation
  • term paper or project

Content

Open Information Extraction (OpenIE), Never-Ending Language Learning (NELL), Machine Reading: these terms describe recent developments in web-scale information extraction, knowledge acquisition and relation learning in the era of 'big data'.

This trend goes hand in hand with the evolving 'Web of Linked Data'. Starting from automatically harvested knowledge bases such as DBpedia, large-scale IE techniques are used to induce novel ontological resources that link knowledge about entities and relations to natural language expressions (see, e.g., YAGO, PATTY).

We will start by reviewing the techniques and limitations of current large-scale IE approaches. In particular, we will study techniques that aim at the automatic induction of (ontological) resources linking facts to language, and how to apply the acquired knowledge to enhanced 'deep' semantic parsing.

We will focus on aspects such as:

  • relation and sense clustering
  • linking relations to syntactic configurations
  • induction of semantic types

On the basis of this review, we will discuss how this novel paradigm can be exploited further for the automated induction of linguistic knowledge for enhanced NLP, using data sets such as Google syntactic n-grams. Seminar participants will be encouraged to develop their own ideas and scenarios for linking knowledge to language for enhanced NLP.

Course Overview

Seminar Schedule

Date | Session | Papers | Presenter
24.10. | Introduction | Slides | Anette Frank
31.10. (moved to 7.11.) | State of the Art OpenIE | Fader et al. 2011; Etzioni et al. 2011; (opt/bgr:) Banko et al. 2007 | Elefterios Matios
7.11. | Analysis of State of the Art: Shallow vs. Deep; Limitations of web-scale OpenIE | Mesquita et al. 2013; Weikum et al. 2012 | Anette Frank
14.11. | Distant Supervision | Mintz et al. 2009; Zhang et al. 2013 | Anette Frank
14.11., 13:30, SR 2, INF 327 | Semantic Methods in Relation Extraction | Yao et al. 2012; Chan & Roth 2011; alt: Plank & Moschitti 2013 | Ofer Bronstein
28.11. | More Semantic Methods in Relation Extraction | Lao et al. 2012; Nakashole et al. 2012 | Mirjam Eppinger
5.12. | From Relations to Events and Parsing | McClosky et al. 2011; (opt.) Singh et al.; Goldberg & Orwant 2012 | Tri Duc Nghiem
12.12. | Events in Context | Gerber & Chai 2010/2012; (opt.) Silberer & Frank 2012; Chambers & Jurafsky 2009 | Thomas Haider
19.12. | Unsupervised Event Schema/Frame Induction | Cheung et al. 2013; Chambers 2013 | Joachim Bingel
9.01. | Event Coreference Resolution | Bejan & Harabagiu 2010; Lee et al. 2012 | group discussion
16.01. | Cross-document Event Alignment and Search | Roth & Frank 2012; Glavaš & Šnajder 2013 | Chen Li
23.01. | Applications I: Semantic parsing for instruction-giving | Zettlemoyer & Collins 2007; Artzi & Zettlemoyer 2013 | Mareike Hartmann
30.01. | Applications II: Multi-Document Summarization | Chi et al. 2013 | Zoe Bylinovich
30.01., 13:30, SR7 INF 325 | Wrap-up / Discussion | |

Literature

  • Nakashole, Weikum and Suchanek (2012). PATTY: A Taxonomy of Relational Patterns with Semantic Types. EMNLP 2012.
  • Lao, Subramanya, Pereira and Cohen (2012). Reading The Web with Learned Syntactic-Semantic Inference Rules. EMNLP 2012.
  • Weikum et al. (2012). Big Data Methods for Computational Linguistics. IEEE.
  • Goldberg and Orwant (2013). A Dataset of Syntactic-Ngrams over Time from a Very Large Corpus of English Books. *SEM 2013.

More literature will be provided at the beginning of the seminar.
