
Speech Recognition and Speech Translation
Course Description
Degree program | Module code | Credits (LP) |
---|---|---|
BA-2010 | AS-CL | 8 LP |
NBA | AS-CL | 8 LP |
Master | SS-CL, SS-TAC | 8 LP |
ÜK | | 4 LP |
Instructors | Stefan Riezler and Sariya Karimova |
Course type | Advanced seminar with exercise session (Hauptseminar/Übung) |
First session | 24 April 2018 |
Time and place | Tue, 11:15–12:45, INF 327 / SR 3 (SR); Thu, 16:15–17:45, INF 326 / SR 28 (SR) |
Commitment deadline | tbd. |
Prerequisites
Master: Foundations of probability theory and statistics
Bachelor: Successful completion of the courses "Formal Foundations of Computational Linguistics:
Mathematical Foundations" and "Statistical Methods for Computational Linguistics"
Course Requirements
- Regular attendance at the seminar and exercise sessions
- Completion of the exercise assignments
- Presentation, including preparation of discussion questions
- Term paper and/or implementation project
Content
Automatic speech recognition (ASR) and machine translation (MT) are among the hardest
problems in NLP, yet they are also among the few success stories in our field, thanks to
the availability of large amounts of training data "in the wild". Furthermore, ASR and
MT were among the first applications in which deep learning methods were shown
to be beneficial. The combination of large amounts of real-world data and sophisticated
machine learning technology makes both topics interesting research problems.
The seminar will start with introductory lectures on both topics, with the goal of
preparing for an even harder problem, the automatic translation of speech input, and its
specific challenges. These include disfluencies in speech input, error propagation
in pipelines that combine ASR with MT, and the challenge of translating speech directly
from acoustic signals.
Possible topics of the seminar include:
- basics of phonetics and acoustic models
- basics of automatic speech recognition
- basics of neural machine translation
- in-depth readings of research papers on speech translation
- practical exercises for all discussed topics
Literature
Jurafsky & Martin (2008). Speech and Language Processing. Prentice Hall.
Holmes & Holmes (2001). Speech Synthesis and Recognition. Taylor & Francis.
Ladefoged (2006). Elements of Acoustic Phonetics. University of Chicago Press.
Rabiner & Schafer (2007). Introduction to Digital Speech Processing. Now Publishers.
Goldberg (2015). A Primer on Neural Network Models for Natural Language Processing.
https://arxiv.org/abs/1510.00726
Cho (2015). Natural Language Understanding with Distributed Representation.
https://arxiv.org/abs/1511.07916
Neubig (2017). Neural Machine Translation and Sequence-to-sequence Models: A Tutorial.
https://arxiv.org/abs/1703.01619