Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik


Is Attention all you need? The Search for a New Architecture

Module Description

Course               Module Abbreviation   Credit Points
BA-2010[100%|75%]    CS-CL                 6 LP
BA-2010[50%]         BS-CL                 6 LP
BA-2010              AS-CL                 8 LP
Master               SS-CL-TAC             8 LP
Lecturer Michael Staniek
Module Type Proseminar / Hauptseminar
Language English
First Session 13.10.2025
Time and Place Mo, 13:15 - 14:45, SR 26 / INF 329
Commitment Period tbd.

Participants

All advanced CL Bachelor students and all CL Master students. Students from the MSc Data and Computer Science or the MSc Scientific Computing with Field of Application Computational Linguistics are welcome after obtaining permission from the lecturer. MSc Scientific Computing students can only take the course as a Hauptseminar (HS) for 8 LP. If the seminar is oversubscribed, CL students will have priority.

Prerequisites for Participation

  • Introduction to Neural Networks

Assessment

  • Presentation
  • Second Presentation OR Project

Content

The Transformer architecture drastically improved neural machine translation results and was quickly adopted across natural language processing. Because of its inherent parallelism, training Transformer models was very efficient, and networks could become very deep, yielding improvements in other tasks such as language modeling and completely replacing RNNs. Inference with Transformer models, however, is less efficient: there is no fixed-size hidden state summarizing all information up to the current position, so attention must revisit the entire preceding sequence at every step. Researchers are therefore actively searching for alternative architectures.
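The inference asymmetry can be made concrete with a minimal NumPy sketch (illustrative only, not from the course materials): an RNN carries a fixed-size state from step to step, while a decoding Transformer must keep a key/value cache that grows with every generated token.

```python
import numpy as np

def rnn_step(h, x, W_h, W_x):
    """One RNN decoding step: the hidden state h has a fixed size,
    so per-token cost and memory are O(1) in the sequence length."""
    return np.tanh(h @ W_h + x @ W_x)

def attn_step(cache_k, cache_v, q, k, v):
    """One Transformer decoding step with a key/value cache.
    The cache gains one row per token, so attending over it costs
    O(sequence length) per generated token."""
    cache_k = np.vstack([cache_k, k])
    cache_v = np.vstack([cache_v, v])
    scores = cache_k @ q / np.sqrt(q.size)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ cache_v, cache_k, cache_v

# Decode 5 tokens with both mechanisms (random toy weights).
rng = np.random.default_rng(0)
W_h, W_x = rng.normal(size=(4, 4)) * 0.1, rng.normal(size=(3, 4)) * 0.1
h = np.zeros(4)
cache_k, cache_v = np.zeros((0, 4)), np.zeros((0, 4))
for t in range(5):
    h = rnn_step(h, np.ones(3), W_h, W_x)
    out, cache_k, cache_v = attn_step(
        cache_k, cache_v, rng.normal(size=4),
        rng.normal(size=4), rng.normal(size=4))
```

After the loop, `h` is still a single 4-vector, while the cache holds one key and one value row per generated token.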

This course focuses on work that looks for new ways of doing things: improvements to the Transformer architecture, improvements or alternatives to RNNs, and other ideas for getting better results (e.g. attention modifications such as RoPE, Rotary Position Embeddings).
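As a taste of one such attention modification, here is a minimal NumPy sketch of RoPE (a sketch of the "rotate half" variant, pairing dimension i with i + dim/2; not an excerpt from any course reading):

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply Rotary Position Embeddings to a sequence of vectors.

    x: array of shape (seq_len, dim), dim even. Each dimension pair
    (i, i + dim/2) is rotated by an angle proportional to the token
    position, so query-key dot products depend only on the *relative*
    distance between positions.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies, geometrically spaced.
    freqs = base ** (-np.arange(half) * 2.0 / dim)         # (half,)
    angles = np.outer(np.arange(seq_len), freqs)           # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2D rotation applied to each (x1, x2) pair.
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

The defining property: for the same query and key vectors placed at positions (m, n) and at (m + s, n + s), the rotated dot products are identical, which is what lets the model encode relative position inside the attention scores.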

Agenda

Date Session Materials

