Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik


Recent Advances In Sequence-To-Sequence Learning

Module Description

Course               Module Abbreviation   Credit Points
BA-2010[100%|75%]    CS-CL                 6 LP
BA-2010[50%]         BS-CL                 6 LP
BA-2010[25%]         BS-AC                 4 LP
BA-2010              AS-CL                 8 LP
Master               SS-CL, SS-TAC         8 LP

Lecturer: Tsz Kin Lam
Module Type: Proseminar / Hauptseminar
Language: English
First Session: 24.10.2019
Time and Place: Thursday, 14:15-15:45, INF 326 / SR 27
End of Commitment Period: 21.01.2020

Prerequisites for Participation

•Basic knowledge of probability, statistics, and linear algebra, e.g., as covered in Mathematical Foundations and Statistical Methods for Computational Linguistics.

•Basic knowledge of neural networks, e.g., as covered in Neural Networks: Architectures and Applications for NLP.


•Regular and active attendance at the seminar

•Reading the remaining papers on the seminar reading list

•Presentation of paper(s) from the seminar reading list

•Implementation project


Neural Sequence-To-Sequence Learning (Seq2Seq) is about learning mappings between input and output sequences using neural networks. The “sequence” can take many different forms, e.g., financial time series, DNA, audio signals, text, and images. The two sequences can be of the same or of different modalities, leading to applications such as textual or spoken machine translation, text summarisation, caption generation, and semantic parsing.
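
As a point of reference, one common way to formalise such a mapping is sketched here. It is a standard textbook formulation and not something taken from the seminar materials. The symbols are assumptions introduced for illustration: x = (x_1, ..., x_S) is the input sequence, y = (y_1, ..., y_T) is the output sequence, and θ denotes the network parameters. The conditional probability of the output is factorised from left to right:

```latex
p_\theta(y \mid x) \;=\; \prod_{t=1}^{T} p_\theta\bigl(y_t \mid y_{<t},\, x\bigr),
\qquad y_{<t} = (y_1, \dots, y_{t-1})
```

Under this assumption, training typically maximises the log-likelihood of paired examples (x, y), and generation produces one output token at a time.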

Seq2Seq is a fully data-driven approach. It has been shown empirically to outperform traditional statistical learning approaches in many scenarios, especially on large-scale problems. It is also end-to-end, which naturally avoids the error propagation that occurs in cascaded systems.

In this seminar, we will discuss recent advances in Seq2Seq learning along three main themes centred on Machine Translation. We start with (1) neural network architectures for Seq2Seq, e.g., ConvSeq2Seq, Transformer, and RNMT+. We then move to (2) low-resource settings, with a strong focus on leveraging unpaired data, e.g., fusion techniques, back-translation, and dual learning. Finally, (3) we break with the left-to-right generation order and turn to a novel regime called non-autoregressive neural sequence generation, e.g., the Levenshtein Transformer and the Insertion Transformer.
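
To make the contrast in theme (3) concrete, the following is a minimal sketch of left-to-right (autoregressive) decoding next to a simple "predict the length, then fill all positions independently" style of non-autoregressive decoding. It is a deliberate simplification and does not implement the Levenshtein Transformer or the Insertion Transformer. All function names are hypothetical, and the toy "model" (which just upper-cases the source tokens) stands in for a trained network.

```python
from typing import Callable, List, Sequence

EOS = "</s>"  # end-of-sequence marker (an assumption of this sketch)


def greedy_autoregressive(
    src: Sequence[str],
    next_token: Callable[[Sequence[str], Sequence[str]], str],
    max_len: int = 20,
) -> List[str]:
    """Left-to-right decoding: each step sees the tokens generated so far."""
    out: List[str] = []
    for _ in range(max_len):
        tok = next_token(src, out)  # conditions on the source and the prefix
        if tok == EOS:
            break
        out.append(tok)
    return out


def parallel_non_autoregressive(
    src: Sequence[str],
    predict_length: Callable[[Sequence[str]], int],
    token_at: Callable[[Sequence[str], int], str],
) -> List[str]:
    """Non-autoregressive decoding: fix the length, then fill each position
    without looking at the other output tokens (parallelisable in principle)."""
    length = predict_length(src)
    return [token_at(src, i) for i in range(length)]


# Toy stand-ins for trained networks (hypothetical): "translate" by upper-casing.
def toy_next_token(src: Sequence[str], prefix: Sequence[str]) -> str:
    i = len(prefix)
    return src[i].upper() if i < len(src) else EOS


def toy_length(src: Sequence[str]) -> int:
    return len(src)


def toy_token_at(src: Sequence[str], i: int) -> str:
    return src[i].upper()


src = ["a", "b", "c"]
ar = greedy_autoregressive(src, toy_next_token)
nar = parallel_non_autoregressive(src, toy_length, toy_token_at)
```

On this toy task both decoders return the same sequence. With a real model, the autoregressive loop needs one model call per output token, whereas the non-autoregressive variant can compute all positions at once, at the cost of ignoring dependencies between output tokens.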


Project Guidelines

Module Overview

All papers have been uploaded. You can access them via *More Materials* below.


Date | Session | Materials
24 Oct 2019 | Tsz Kin Lam: Introduction, organisation and topic distribution | Intro
31 Oct 2019 | Tsz Kin Lam: Tutorial on Seq2Seq |
7 Nov 2019 | Katharina Korfhage: Gehring, Jonas, et al. "Convolutional Sequence to Sequence Learning" | slides
14 Nov 2019 | Christoph Schneider: Vaswani, Ashish, et al. "Attention Is All You Need" | slides
21 Nov 2019 | Philip Wiesenbach: Chen, Mia Xu, et al. "Quasi-Recurrent Neural Networks" |
28 Nov 2019 | Lisa Kuhn: Xia, Yingce, et al. "Deliberation Networks: Sequence Generation Beyond One-Pass Decoding" | slides
12 Dec 2019 | Benjamin Beilharz: Oord, Aaron van den, et al. "WaveNet: A Generative Model for Raw Audio" | slides
19 Dec 2019 | Carlos: Pham, Ngoc-Quan, et al. "Very Deep Self-Attention Networks for End-to-End Speech Recognition" |
 | Tsz Kin: Summary of module 1 |
9 Jan 2020 | Rebekka: Chorowski, Jan, and Navdeep Jaitly "Towards Better Decoding and Language Model Integration in Sequence to Sequence Models" |
 | Gulcehre, Caglar, et al. "On Using Monolingual Corpora in Neural Machine Translation" |
16 Jan 2020 | Philip Meier: Sriram, Anuroop, et al. "Training Seq2Seq Models Together with Language Models" |
 | Leander Girrbach: He, Di, et al. "Dual Learning for Machine Translation" |
23 Jan 2020 | Ozan Yilmaz: Kim, Yoon, and Alexander M. Rush "Sequence-Level Knowledge Distillation" |
 | Simon Will: Baskar, Murali Karthick, et al. "Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text" |
30 Jan 2020 | Siting Liang: Liu, Alexander H., et al. "Adversarial Training of End-to-End Speech Recognition Using a Criticizing Language Model" |
7 Feb 2020 | Nathan Berger: Gu, Jiatao, et al. "Non-Autoregressive Neural Machine Translation" |
 | Yoalli Rezepka García: Ghazvininejad, Marjan, et al. "Mask-Predict: Parallel Decoding of Conditional Masked Language Models" |


» More Materials
