Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik

Speech Recognition and Speech Translation

Course Description

Degree program    Module code      Credit points
BA-2010           AS-CL            8 LP
NBA               AS-CL            8 LP
Master            SS-CL, SS-TAC    8 LP
ÜK                                 4 LP

Instructors: Stefan Riezler and Sariya Karimova
Course type: Advanced seminar (Hauptseminar) with exercise session (Übung)
First session: 24 April 2018
Time and place: Tue, 11:15-12:45, INF 327 / SR 3 (SR)
                Thu, 16:15-17:45, INF 326 / SR 28 (SR)
Commitment deadline: tbd.

Prerequisites

Master: Foundations of probability theory and statistics
Bachelor: Successful completion of the courses "Formal Foundations of Computational Linguistics: Mathematical Foundations" and "Statistical Methods for Computational Linguistics"

Course Requirements

- Regular attendance at the seminar and the exercise session

- Completion of the exercise assignments

- Presentation, including preparation of discussion questions

- Term paper and/or implementation project

Content

Automatic speech recognition (ASR) and machine translation (MT) are among the hardest problems in NLP, yet they are also among the few success stories in the field, owing to the availability of large amounts of training data "in the wild". ASR and MT were also among the first applications in which deep learning was shown to be beneficial. The combination of large amounts of real-world data and sophisticated machine learning technology makes both topics interesting research problems.

The seminar will start with introductory lectures on both topics, in preparation for an even harder problem, the automatic translation of speech input, and its specific challenges. These include disfluencies in speech input, error propagation in pipelines that combine ASR with MT, and the challenge of translating speech directly from acoustic signals.

Possible topics of the seminar include:
- basics of phonetics and acoustic models
- basics of automatic speech recognition
- basics of neural machine translation
- in-depth readings of research papers on speech translation
- practical exercises for all discussed topics

Literature

Jurafsky & Martin (2008). Speech and Language Processing. Prentice Hall.

Holmes & Holmes (2001). Speech Synthesis and Recognition. Taylor & Francis.

Ladefoged (2006). Elements of Acoustic Phonetics. University of Chicago Press.

Rabiner & Schafer (2007). Introduction to Digital Speech Processing. now publications.

Goldberg (2015). A Primer on Neural Network Models for Natural Language Processing. https://arxiv.org/abs/1510.00726

Cho (2015). Natural Language Understanding with Distributed Representation. https://arxiv.org/abs/1511.07916

Neubig (2017). Neural Machine Translation and Sequence-to-sequence Models: A Tutorial. https://arxiv.org/abs/1703.01619

