Ruprecht-Karls-Universität Heidelberg

Parallel Corpora in Language Processing - Overview of Materials


Topic Area I: Alignment

 
Topic: Materials
Manual word alignment: I. Dan Melamed (1998): Blinker Annotation Style Guidelines. Technical Report, Columbia University.
Evaluation of word alignments: Alexander Fraser, Daniel Marcu: Measuring Word Alignment Quality for Statistical Machine Translation. Computational Linguistics, 33(3): 293-303.
J. Véronis (2000): Evaluation of parallel text alignment systems - The ARCADE project. In J. Véronis (Ed.), Parallel text processing: Alignment and use of translation corpora (pp. 369-388). Dordrecht: Kluwer Academic Publishers.
Statistical word alignment (the "IBM models"): H.J. Och and H. Ney (2003): A systematic comparison of various statistical alignment models. Computational Linguistics.
Heuristic word alignment: E. Pianta and L. Bentivogli (2004): Knowledge intensive word alignment with KNOWA. COLING 2004.
J. Tiedemann (2003): Combining clues for word alignment. EACL 2003.
J. Tiedemann (2004): Word to word alignment strategies. In Proceedings of COLING 2004.
 
 

Topic Area II: Translation and Parallelism

 
Topic: Materials
Translation equivalence/parallelism: Kitty M. van Leuven-Zwart. 1990. Translation and original: Similarities and dissimilarities, I. Target, 1(2):151-181.
Parallelism in syntax: R. Hwa, P. Resnik, A. Weinberg, C. Cabezas, O. Kolak: Bootstrapping parsers via syntactic projection across parallel texts. Natural Language Engineering, Volume 11, Issue 03. 2005.
Parallelism in lexical semantics, 1: L. Cyrus: Building a resource for studying translation shifts. Proc. LREC 2006, Genoa, May 24th-26th, 2006, pp. 1240-1245.
Parallelism in lexical semantics, 2: S. Pado, K. Erk: Translation Shifts and Frame-Semantic Mismatches: A Corpus Analysis. International Journal of Corpus Linguistics. Accepted for publication.
 
 

Topic Area III: Inducing Target-Language Knowledge from Unannotated Source Corpora

 
Topic: Materials
Multiword expressions: Sina Zarrieß and Jonas Kuhn: Exploiting Translational Correspondences for Pattern-Independent MWE Identification. In Proceedings of the 2009 Workshop on Multiword Expressions.
Word senses: Mona T. Diab, Philip Resnik: An Unsupervised Method for Word Sense Tagging using Parallel Corpora. Proceedings of ACL 2002.
Paraphrases: Colin Bannard, Chris Callison-Burch: Paraphrasing with Bilingual Parallel Corpora. Proceedings of ACL 2005.
 
 

Topic Area IV: Inducing Target-Language Knowledge from Annotated Source Corpora

 
Topic: Materials
Chunks, parts of speech: David Yarowsky, Grace Ngai: Inducing Multilingual POS Taggers and NP Bracketers via Robust Projection Across Aligned Corpora. Proceedings of NAACL 2001.
Word senses: Luisa Bentivogli and Emanuele Pianta: Exploiting parallel texts in the creation of multilingual semantically annotated resources: the MultiSemCor Corpus. Natural Language Engineering, Special Issue on Parallel Texts.
Semantic roles: S. Pado, M. Lapata: Cross-lingual Annotation Projection of Semantic Roles. Journal of Artificial Intelligence Research 36, 307-340. 2009.
Temporal information: K. Spreyer and A. Frank (2008): Projection-based Acquisition of a Temporal Labeller. Proceedings of the 3rd International Joint Conference on Natural Language Processing.
 
 

Topic Area V: Comparable and Non-Parallel Corpora

 
Topic: Materials
Paraphrases from comparable corpora: Regina Barzilay, Lillian Lee: Bootstrapping Lexical Choice via Multiple-Sequence Alignment. Proceedings of EMNLP 2003.
Translations in non-parallel texts: Reinhard Rapp: Identifying Word Translation in Non-Parallel Texts. Proceedings of ACL 1995.
Pascale Fung and Percy Cheung: Mining Very-Non-Parallel Corpora: Parallel Sentence and Lexicon Extraction via Bootstrapping and EM. Proceedings of EMNLP 2004.
Induction of selectional preferences without parallel corpora: Yves Peirsman and Sebastian Pado: Cross-lingual Induction of Selectional Preferences with Bilingual Vector Spaces. Proceedings of NAACL 2010.
Ideas for further topics are welcome!