Distributional Semantics Beyond Word Meaning

Course Description

Degree program	Module code	Credit points (LP)
BA-2010	AS-CL	8 LP
NBA	AS-CL	8 LP
Master	SS-CL, SS-TAC	8 LP
Magister	-	-

Instructor	Matthias Hartung
Course type	Advanced seminar (Hauptseminar)
First session	22.04.2013
Time and place	Mon, 09:15–10:45, INF 328 / SR 25 (SR)
Commitment period	20.05.–13.07.2013

Prerequisites

Formal Foundations
Formal Semantics
Statistics

Assessment

Presentation
Term paper
Earning "Activity Points" over the course of the semester

Content

Distributional semantics is a current research direction within computational linguistics and its neighboring disciplines (cognitive science, psychology) that takes the "distributional hypothesis" (Harris 1951) as its basic assumption. According to this hypothesis, the semantic similarity of linguistic units (words, phrases) is related to the similarity of their distribution over particular linguistic contexts, which can be obtained from empirical corpora.
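The distributional hypothesis can be illustrated with a minimal sketch. The toy corpus, the window size, and the function names below are assumptions made for this illustration only; they are not taken from the course materials. Each word is represented by counts of the words that occur near it, and two words are compared by the cosine of their count vectors.

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented for illustration (not course material).
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the cat",
    "the car needs fuel",
]

def cooccurrence_vectors(sentences, window=2):
    """For every word, count the words found within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse count vectors (Counter objects)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vectors = cooccurrence_vectors(CORPUS)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
print(f"cat/dog: {sim_cat_dog:.3f}  cat/car: {sim_cat_car:.3f}")
```

In this toy setting "cat" and "dog" share more contexts than "cat" and "car", so their cosine is higher. Realistic models would use large corpora and a weighting scheme instead of raw counts.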

Distributional models were originally intended to represent the meaning of individual words. In this seminar we will deal primarily with more recent research that investigates how the meaning of linguistic units beyond word boundaries (i.e. phrases, sentences) can be captured in distributional models. These approaches thus operate at the interface between distributional and formal semantics, where two questions and their mutual dependence are of particular interest: Which aspects of meaning are essential for meaning representation at the word level? Which operators and functions are suitable for modeling the compositionality of phrases and sentences distributionally?
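As a rough illustration of what "operators and functions" for composition can look like, the sketch below combines two word vectors in three ways: vector addition and component-wise multiplication (both are among the operations examined by Mitchell & Lapata 2010), and application of a matrix to a vector, in the spirit of treating adjectives as matrices (Baroni & Zamparelli 2010). The three-dimensional space, all numeric values, and the names are hypothetical and chosen only to keep the example readable.

```python
# Hypothetical 3-dimensional context space; every number below is invented.
CONTEXTS = ["animal", "color", "vehicle"]
red = [0.1, 0.9, 0.2]
car = [0.0, 0.3, 0.9]

def additive(u, v):
    """Compose two vectors by component-wise addition."""
    return [a + b for a, b in zip(u, v)]

def multiplicative(u, v):
    """Compose two vectors by component-wise multiplication."""
    return [a * b for a, b in zip(u, v)]

def apply_matrix(matrix, v):
    """Compose by applying a matrix (list of rows) to a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

# Invented matrix standing in for an adjective that acts as a function
# on noun vectors; in the literature such matrices are learned from corpus data.
RED_MATRIX = [
    [1.0, 0.0, 0.0],
    [0.2, 1.0, 0.5],
    [0.0, 0.0, 1.0],
]

red_car_add = additive(red, car)
red_car_mul = multiplicative(red, car)
red_car_fun = apply_matrix(RED_MATRIX, car)

for name, vec in [("additive", red_car_add),
                  ("multiplicative", red_car_mul),
                  ("matrix", red_car_fun)]:
    print(name, [round(x, 2) for x in vec])
```

Addition and multiplication are symmetric, so they cannot distinguish word order, whereas the matrix variant treats the adjective as a function and the noun as its argument. This asymmetry is one reason function-based approaches are discussed at the interface with formal semantics.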

Course Overview

Seminar Schedule

Date	Session	Presenter	Literature	Materials
22.04.	Introduction	MHa		Slides
29.04.	Organizational matters, formation of the seminar group	Plenary
06.05.	Foundations	MHa		Slides
13.05.	Structured Vector Spaces; Latent Dirichlet Allocation; Singular Value Decomposition	Franziska; Chen; Benjamin	Erk & Padó (2008, 2009); Steyvers & Griffiths (2007); Martin & Berry (2007)	Slides; Slides; Slides
20.05.	No session (Pentecost)
27.05.	Contextualization of vector representations	Lyuba; Eric	Thater et al. (2010, 2011); van de Cruys et al. (2011)	Slides; Slides
03.06.	Multi-Prototype Models; Regular polysemy	Patrick; Christoph	Reisinger & Mooney (2010); Boleda et al. (2012a)	Slides; Slides
10.06.	Vector Mixture Models	Madeline	Mitchell & Lapata (2009, 2010)	Slides
17.06.	Functional application	Joachim	Baroni et al. (ms.; ch. 3.1.1-3.5)	Slides
24.06.	Categorial grammar; Compositional Matrix Space Models	Mengfei; Michael	Clark (ms.); Rudolph & Giesbrecht (2010)	Slides; Slides
01.07.	Higher Order Modification; Adverb classification (short)	Damian; Joachim	Boleda et al. (2012b)
08.07.	Cancelled (illness)
15.07.	Entailment; Intensionality (short); Adjective modification (short)	Dustin; Madeline; Damian	Baroni et al. (2012); Boleda et al. (2013); Baroni & Zamparelli (2010)	Slides
22.07.	Multimodality; Text-Image Relatedness (short); Final discussion	Eric; Michael; Plenary	Silberer & Lapata (2012); Leong & Mihalcea (2011)	Slides

Additional Materials

Video lecture on topic models
Benjamin Heinzerling's demo on rank correlation coefficients

Literature

M. Baroni & A. Lenci (2010): Distributional Memory. A General Framework for Corpus-based Semantics, in: Computational Linguistics 36 (4): 673-721.
M. Baroni & R. Zamparelli (2010): Nouns are Vectors, Adjectives are Matrices: Representing Adjective-Noun Constructions in Semantic Space, in: Proceedings of EMNLP: 1183-1193.
M. Baroni, R. Bernardi, N. Do & C. Shan (2012): Entailment Above the Word Level in Distributional Semantics, in: Proceedings of EACL: 23-32.
M. Baroni, R. Bernardi & R. Zamparelli (ms.): Frege in Space. A Program for Compositional Distributional Semantics. [PDF]
D. Blei, A. Ng & M. Jordan (2003): Latent Dirichlet Allocation, in: Journal of Machine Learning Research 3: 993-1022.
G. Boleda, M. Baroni, N. The Pham & L. McNally (2013): Intensionality was only alleged. On adjective-noun composition in distributional semantics, in: Proceedings of IWCS: 35-46.
G. Boleda, S. Padó & J. Utt (2012a): Regular Polysemy. A Distributional Model, in: Proceedings of *SEM: 151-160.
G. Boleda, E.M. Vecchi, M. Cornudella & L. McNally (2012b): First order vs. higher order modification in distributional semantics, in: Proceedings of EMNLP/CoNLL: 1223-1233.
E. Bruni, G.B. Tran & M. Baroni (2011): Distributional Semantics from Text and Images, in: Proceedings of the GEMS Workshop: 22-32.
S. Clark (ms.): Type-driven Syntax and Semantics for Composing Meaning Vectors. [PDF]
K. Erk (2012): Vector Space Models of Word Meaning and Phrase Meaning. A Survey, in: Language and Linguistics Compass 6 (10): 635-53. [PDF]
K. Erk & S. Padó (2008): A Structured Vector Space Model for Word Meaning in Context, in: Proceedings of EMNLP.
K. Erk & S. Padó (2009): Paraphrase Assessment in Structured Vector Space. Exploring Parameters and Datasets, in: Proceedings of the GEMS Workshop.
Y. Feng & M. Lapata (2010): Visual Information in Semantic Representation, in: Proceedings of NAACL: 91-99.
E. Grefenstette & M. Sadrzadeh (2011): Experimental Support for a Categorical Compositional Distributional Model of Meaning, in: Proceedings of EMNLP.
M. Hartung & A. Frank (2010): A Structured Vector Space Model for Hidden Attribute Meaning in Adjective-Noun Phrases, in: Proceedings of COLING.
M. Hartung & A. Frank (2011a): Exploring Supervised LDA Models for Assigning Attributes to Adjective-Noun Phrases, in: Proceedings of EMNLP.
M. Hartung & A. Frank (2011b): Assessing Interpretable Attribute-related Meaning Representations for Adjective-Noun Phrases in a Similarity Prediction Task, in: Proceedings of the GEMS Workshop.
A. Lenci, S. Montemagni & V. Pirrelli (2006): Acquiring and Representing Meaning. Theoretical and Computational Perspectives, in: A. Lenci, S. Montemagni & V. Pirrelli (eds.): Acquisition and Representation of Word Meaning. Theoretical and Computational Perspectives. Istituti Editoriali e Poligrafici Internazionali. Pisa: 19-66.
A. Lenci (2008): Distributional Semantics in Linguistic and Cognitive Research, in: Italian Journal of Linguistics 20 (1): 1-31.
K. Lund & C. Burgess (1996): Producing High-dimensional Semantic Spaces from Lexical Co-occurrence, in: Behavior Research Methods, Instruments, & Computers 28: 203-208.
D. Martin & M. Berry (2007): Mathematical Foundations Behind Latent Semantic Analysis, in: T. Landauer et al. (eds.): Handbook of Latent Semantic Analysis. London: 35-55.
J. Mitchell & M. Lapata (2009): Language Models Based on Semantic Composition, in: Proceedings of EMNLP: 430-439.
J. Mitchell & M. Lapata (2010): Composition in Distributional Models of Semantics, in: Cognitive Science 34: 1388-1429.
D. Ó Séaghdha (2010): Latent Variable Models of Selectional Preference, in: Proceedings of ACL: 435-444.
S. Padó & M. Lapata (2007): Dependency-based Construction of Semantic Space Models, in: Computational Linguistics 33 (2): 161-199.
J. Reisinger & R. Mooney (2010): Multi-Prototype Vector-Space Models of Word Meaning, in: Proceedings of NAACL: 109-117.
S. Rudolph & E. Giesbrecht (2010): Compositional Matrix-Space Models of Language, in: Proceedings of ACL: 907-916.
M. Sahlgren (2008): The Distributional Hypothesis, in: Italian Journal of Linguistics 20 (1): 33-54.
C. Silberer & M. Lapata (2012): Grounded Models of Semantic Representation, in: Proceedings of EMNLP/CoNLL: 1423-1433.
M. Steyvers & T. Griffiths (2007): Probabilistic Topic Models, in: T. Landauer et al. (eds.): Handbook of Latent Semantic Analysis. London.
P. Turney (2006): Similarity of Semantic Relations, in: Computational Linguistics 32 (3): 379-416.
P. Turney (2008): The Latent Relation Mapping Engine. Algorithm and Experiments, in: Journal of Artificial Intelligence Research 33: 615-655.
P. Turney (2012): Domain and Function. A Dual-Space Model of Semantic Relations and Compositions, in: Journal of Artificial Intelligence Research 44: 533-585.
P. Turney & P. Pantel (2010): From Frequency to Meaning. Vector Space Models for Semantics, in: Journal of Artificial Intelligence Research 37: 141-188.
S. Thater, H. Fürstenau & M. Pinkal (2010): Contextualizing Semantic Representations Using Syntactically Enriched Vector Models, in: Proceedings of ACL.
S. Thater, H. Fürstenau & M. Pinkal (2011): Word Meaning in Context. A Simple and Effective Vector Model, in: Proceedings of IJCNLP.
T. van de Cruys, T. Poibeau & A. Korhonen (2011): Latent Vector Weighting for Word Meaning in Context, in: Proceedings of EMNLP: 1012-1022.
J. Washtell & K. Markert (2009): A Comparison of Windowless and Window-Based Computational Association Measures as Predictors of Syntagmatic Human Associations, in: Proceedings of EMNLP: 628-637.
J. Washtell (2009): Co-dispersion: A Windowless Approach to Lexical Association, in: Proceedings of EACL: 861-869.
J. Weeds, D. Weir & D. McCarthy (2004): Characterising Measures Of Lexical Distributional Similarity, in: Proceedings of COLING.
D. Widdows (2003): Geometry and Meaning. CSLI Publications, Stanford.
D. Widdows & T. Cohen (2010): The Semantic Vectors Package. New Algorithms and Public Tools for Distributional Semantics, in: Proceedings of the Fourth IEEE International Conference on Semantic Computing.
S. Wu & W. Schuler (2011): Structured Composition of Semantic Vectors, in: Proceedings of the International Conference on Computational Semantics, Oxford, UK, 2011.

The list will be updated continuously during the seminar.