Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik

New Developments in NN Architectures and Representation Learning for NLU Tasks

Module Description

Course                 Module Abbreviation        Credit Points
BA-2010                AS-FL                      8 LP
BA-2010 [100%|75%]     CS-CL                      6 LP
BA-2010 [50%]          BS-CL                      6 LP
BA-2010 [25%]          BS-AC, BS-FL               4 LP
Master                 SS-CL, SS-TAC, SS-FAL      8 LP

Lecturer               Anette Frank
Module Type            Proseminar / Hauptseminar
Language               English
First Session          25.04.2019
Time and Place         Thursday, 16:15-17:45, INF 329 / SR 26
Commitment Period      tbd.

Prerequisite for Participation

  • good knowledge of statistical methods, incl. neural networks
  • advanced BA students or MA students

Module Overview

The seminar schedule and reading list can be found here (local access).

Assessment

  • regular, active participation; completion of exercises
  • presentation
  • project, seminar paper or equivalent contributions to the seminar

Content

Neural network architectures have by now been well explored for a large variety of NLP tasks, where systems are trained to learn representations of characters, tokens, sentences or sentence pairs as a basis for solving supervised classification problems.

The classical NN architectures and vector-based representations are, however, limited when it comes to encoding content from longer text passages, up to paragraphs or complete stories. Similarly, NN systems have difficulty learning all the required knowledge from a single training set, so we need methods that can transfer knowledge from related tasks or integrate knowledge from external sources. Finally, recent work has shown that neural systems tend to overfit the training data and to exploit surface cues, while showing limited generalization capacity.

Current research indicates that pre-training representations on unsupervised language modeling tasks helps to alleviate some of these problems, in particular when using deep hierarchical language encoding models known as Transformers. Other strands of research investigate learning (latent) structured representations that correspond more closely to compositional structures and relational meaning in language and other modalities, thus facilitating the interfacing with external knowledge and reasoning processes, and offering more transparency and interpretability of what the model learns.
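
As a first illustration of such pre-trained encoders, the minimal sketch below shows how contextualized token representations can be obtained from a pre-trained Transformer. The Hugging Face transformers library and the bert-base-uncased checkpoint are illustrative assumptions, not course requirements.

```python
# Minimal sketch: contextualized token representations from a pre-trained
# Transformer encoder (assumes the Hugging Face `transformers` library and
# the `bert-base-uncased` checkpoint; neither is prescribed by the course).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentence = "The bank raised interest rates."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# One vector per (sub)token, conditioned on the whole sentence:
# shape (1, sequence_length, hidden_size).
contextual_embeddings = outputs.last_hidden_state
print(contextual_embeddings.shape)
```

Unlike static word vectors, the representation of "bank" here depends on the surrounding sentence, which is what makes such encoders useful as a basis for the downstream tasks discussed in the seminar.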

In the seminar we will study important developments in these research directions and how they can be applied in various NLP and multimodal analysis, prediction and generation tasks.

Topics include

  • knowledge-rich contextualized language models -- e.g. ELMo, BERT, GPT
  • transfer learning by fine-tuning pre-trained language models -- e.g. ULMFiT (see the sketch after this list)
  • hierarchical attention models to encode large contexts -- e.g. in QA or summarization tasks
  • inducing latent structured representations, e.g. graphs or trees
  • interfacing structured representations (from different modalities) to perform reasoning and language generation -- e.g. in (visual) QA, Machine Comprehension or Dialogue tasks
  • deeper analysis and interpretability of neural models
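
The fine-tuning approach referenced in the second topic can be summarized in a few lines: a pre-trained encoder is equipped with a fresh task-specific head, and all parameters are updated on the supervised target data. The library, checkpoint and toy sentence pairs below are illustrative assumptions only.

```python
# Minimal sketch of transfer learning by fine-tuning a pre-trained language
# model for sentence-pair classification (library, checkpoint and toy data
# are assumptions for illustration, not part of the course materials).
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # pre-trained encoder + fresh task head
)

# Toy NLI-style sentence pairs with entailment labels (1 = entailed).
premises = ["A man is playing a guitar.", "Two dogs run in a park."]
hypotheses = ["A person makes music.", "The dogs are sleeping."]
labels = torch.tensor([1, 0])
batch = tokenizer(premises, hypotheses, padding=True, return_tensors="pt")

# Fine-tune all parameters with a small learning rate, as in the BERT/ULMFiT
# recipes (ULMFiT additionally uses discriminative, layer-wise learning rates).
optimizer = AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few passes over the toy batch
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```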

Literature

Literature will be provided by the beginning of the term.

» More Materials
