Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik

Computational Morphology

Course description

Degree programme    Module code      Credits (LP)
BA-2010[100%|75%]   CS-CL            6 LP
BA-2010[50%]        BS-CL            6 LP
BA-2010[25%]        BS-fL            4 LP
BA-2010             AS-FL            8 LP
NBA[100%|75%]       CS-CL            6 LP
NBA[50%|25%]        BS-CL, BS-FL     4 LP
NBA                 AS-FL            8 LP
Master              SS-CL, SS-FAL    8 LP
Magister            -                -
Instructor          Jan Snajder
Course type         Proseminar or Hauptseminar (participant's choice)
First session       17.04.2013
Time and place      Wed, 16:15-17:45, INF 325 / SR 23 (SR)
Commitment period   20.05.-13.07.2013

Prerequisites

  • Introduction to Linguistics
  • Introduction to Computational Linguistics
  • Formal Foundations of Computational Linguistics: Mathematical Foundations

Course requirements

  • Active and regular participation
  • Weekly preparatory reading of the papers
  • Proseminar: written report based on the presentation
  • Hauptseminar: written report on an extended version of the presentation, or a practical implementation of the approach including evaluation and documentation (in each case by arrangement)

Language of instruction: English (presentation/term paper in German or English)

Content

Computational morphology studies the theories and methods for the computational analysis and synthesis of word forms. Morphological analysis is a prerequisite for many natural language processing tasks, such as part-of-speech tagging, parsing, and other higher-level tasks. It also plays an important role in many information retrieval tasks, especially for morphologically complex languages such as Finnish, many other European languages, and most Slavic and Semitic languages.

In this seminar course we will read and discuss the relevant literature on computational morphology. We will begin with an overview of the field and outline the basic tasks (analysis, segmentation, clustering, stemming, and lemmatization), approaches, and challenges. We will then look at different aspects of computational morphology, ranging from the basic models (finite-state models and alternatives) to morphology learning (supervised and unsupervised), evaluation methodologies, and applications (in information retrieval and other fields). We will consider both inflectional and derivational morphology, the latter being of interest from a lexical semantics point of view as well. We will put an emphasis on unsupervised morphology learning, which is particularly relevant for the many under-resourced languages, and discuss recent developments in this line of research. This seminar course is appropriate for advanced undergraduate and graduate students.

Course overview

Seminar schedule

Date    Session                             Presenter            Literature/Materials
17.4.   Introduction and overview           Jan Snajder          Slides
24.4.   No session
1.5.    No session
8.5.    Student presentations               David Grimm          Hammarström and Borin (2011) (slides)
                                            Patrick Claus        Pirkola (2001), Kurimo et al. (2011) (slides)
15.5.   Student presentations               Julian Hitschler     Goldsmith (2001) (slides)
22.5.   Student presentations               Sabrina Mänz         Yarowsky and Wicentowski (2000), Yarowsky et al. (2001) (slides)
29.5.   Student presentations               Felix Krauss         Schone and Jurafsky (2001) (slides)
                                            Isabell Wolter       Baroni et al. (2002) (slides)
5.6.    Student presentations               Madeline Remse       Creutz and Lagus (2002), Creutz and Lagus (2005) (slides)
12.6.   Student presentations               Lena Maldacker       Kazakov and Manandhar (2001) (slides)
19.6.   Student presentations               Atilla Azgin         Monson et al. (2008), Monson et al. (2009) (slides)
26.6.   Student presentations               Max Bacher           Majumder et al. (2007) (slides)
                                            Sven Feuchtmüller    Paik et al. (2011) (slides)
3.7.    Student presentations               Catarina Cramer      Poon et al. (2009) (slides)
10.7.   Student presentations               Chen Li              Naradowsky and Toutanova (2011)
17.7.   Student presentations               Leo Born             Xu and Croft (1998) (slides)
                                            Hans-Martin Ramsl    Dreyer and Eisner (2011) (slides)
24.7.   Student presentations and wrap-up   Angela Schneider     Can and Manandhar (2012)
        Wrap-up                             Jan Snajder          Slides

Literature

Textbook:

  • Brian Roark, Richard Sproat (2008). Computational Approaches to Morphology and Syntax. Oxford University Press.

Papers:

  • Galvez, C. and de Moya-Anegon, F. and Solana, V.H. (2005). Term conflation methods in information retrieval: non-linguistic and linguistic approaches. Journal of Documentation, 61(4), pp. 520-547.
  • Goldsmith, J. (2001). Unsupervised learning of the morphology of a natural language. Computational Linguistics, 27(2), pp. 153-198.
  • Hammarström, H. and Borin, L. (2011). Unsupervised learning of morphology. Computational Linguistics, 37(2), pp. 309-350.
  • Karttunen, L. and Beesley, K.R. (2005). Twenty-five years of finite-state morphology. Inquiries Into Words, a Festschrift for Kimmo Koskenniemi on his 60th Birthday, pp. 71-83.
  • Koskenniemi, K. (1984). A general computational model for word-form recognition and production. Proceedings of the 10th International Conference on Computational Linguistics, pp. 178-181.
  • Kurimo, M. and Virpioja, S. and Turunen, V. and Lagus, K. (2010). Morpho Challenge competition 2005-2010: Evaluations and results. Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology, pp. 87-95.
  • Pirkola, A. (2001). Morphological typology of languages for IR. Journal of Documentation, 57(3), pp. 330-358.

Further literature will be announced at the beginning of the seminar.

