Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik


Is Attention all you need? The Search for a New Architecture

Module Description

Course               Module Abbreviation   Credit Points
BA-2010[100%|75%]    CS-CL                 6 LP
BA-2010[50%]         BS-CL                 6 LP
BA-2010              AS-CL                 8 LP
Master               SS-CL-TAC             8 LP
Lecturer Michael Staniek
Module Type Proseminar / Hauptseminar
Language English
First Session 13.10.2025
Time and Place Mo, 13:15 - 14:45, SR 26 / INF 329
Commitment Period tbd.

Participants

All advanced CL Bachelor students and all CL Master students. Students from the MSc Data and Computer Science or the MSc Scientific Computing with Field of Application Computational Linguistics are welcome after obtaining permission from the lecturer. MSc Scientific Computing students can only take the course as a Hauptseminar (HS) for 8 LP. If the seminar is oversubscribed, CL students will have priority.

Prerequisites for Participation

  • Introduction to Neural Networks

Assessment

  • Presentation
  • Second Presentation OR Project

Content

The Transformer architecture drastically improved neural machine translation results and was quickly adopted across natural language processing. Because of its inherent parallelism, training Transformer models was very efficient, and networks could become very deep, yielding improvements in other tasks such as language modeling and completely replacing RNNs. Inference with Transformer models, however, is less efficient: there is no fixed-size hidden state summarizing all information up to the current position, so attention must revisit the entire preceding sequence at every step. Researchers are therefore actively searching for alternative architectures.
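The inference asymmetry can be made concrete with a minimal NumPy sketch (illustrative only, not from the course materials): an RNN carries a fixed-size state from step to step, while a decoding Transformer must keep a key/value cache that grows with every generated token.

```python
import numpy as np

def rnn_step(h, x, W_h, W_x):
    """One RNN decoding step: the hidden state h has a fixed size,
    so per-token cost and memory are O(1) in the sequence length."""
    return np.tanh(h @ W_h + x @ W_x)

def attn_step(cache_k, cache_v, q, k, v):
    """One Transformer decoding step with a key/value cache.
    The cache gains one row per token, so attending over it costs
    O(sequence length) per generated token."""
    cache_k = np.vstack([cache_k, k])
    cache_v = np.vstack([cache_v, v])
    scores = cache_k @ q / np.sqrt(q.size)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ cache_v, cache_k, cache_v

# Decode 5 tokens with both mechanisms (random toy weights).
rng = np.random.default_rng(0)
W_h, W_x = rng.normal(size=(4, 4)) * 0.1, rng.normal(size=(3, 4)) * 0.1
h = np.zeros(4)
cache_k, cache_v = np.zeros((0, 4)), np.zeros((0, 4))
for t in range(5):
    h = rnn_step(h, np.ones(3), W_h, W_x)
    out, cache_k, cache_v = attn_step(
        cache_k, cache_v, rng.normal(size=4),
        rng.normal(size=4), rng.normal(size=4))
```

After the loop, `h` is still a single 4-vector, while the cache holds one key and one value row per generated token.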

This course focuses on work that looks for new ways of doing things: improvements to the Transformer architecture, improvements or alternatives to RNNs, and other ideas for getting better results (e.g. attention modifications such as RoPE, Rotary Position Embeddings).
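As a taste of one such attention modification, here is a minimal NumPy sketch of RoPE (a sketch of the "rotate half" variant, pairing dimension i with i + dim/2; not an excerpt from any course reading):

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply Rotary Position Embeddings to a sequence of vectors.

    x: array of shape (seq_len, dim), dim even. Each dimension pair
    (i, i + dim/2) is rotated by an angle proportional to the token
    position, so query-key dot products depend only on the *relative*
    distance between positions.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # Per-pair rotation frequencies, geometrically spaced.
    freqs = base ** (-np.arange(half) * 2.0 / dim)         # (half,)
    angles = np.outer(np.arange(seq_len), freqs)           # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2D rotation applied to each (x1, x2) pair.
    return np.concatenate([x1 * cos - x2 * sin,
                           x1 * sin + x2 * cos], axis=-1)
```

The defining property: for the same query and key vectors placed at positions (m, n) and at (m + s, n + s), the rotated dot products are identical, which is what lets the model encode relative position inside the attention scores.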

Agenda

Date Session Materials

