(Trans|Lin|Long|...)former: Self-Attention Mechanisms
Module Description
| Course | Module Abbreviation | Credit Points |
|---|---|---|
| BA-2010[100%|75%] | CS-CL | 6 LP |
| BA-2010[50%] | BS-CL | 6 LP |
| BA-2010 | AS-CL, AS-FL | 8 LP |
| Master | SS-CL, SS-TAC, SS-FAL | 8 LP |
| Lecturer | Juri Opitz |
| Module Type | |
| Language | English |
| First Session | 28.10.2021 |
| Time and Place | Thursday, 14:15-15:45 |
| Commitment Period | tba |
Prerequisite for Participation
- Statistical methods
- Mathematical foundations
Assessment
Description
Module Overview
Agenda
| Date | Session | Materials |
|---|---|---|
| 28.10. | Intro | slides |
| 04.11. | Paper: Self-Attention is all you need; Speaker(s): Benjamin and Max | manuscript |
| 11.11. | no session (conference) | na |
| 18.11. | Paper: Longformer; Speaker(s): Feisal | slides |
| 25.11. | Paper: Big Bird; Speaker(s): na | na |
| 02.12. | Paper: Reformer; Speaker(s): Ines | slides |
| 09.12. | Paper: Transformers are RNNs; Speaker(s): Marinco and Phan | na |
| 16.12. | Paper: Linformer; Speaker(s): Dang and Laura | na |
| 13.01. | Paper: Performer; Speaker(s): na | na |
| 20.01. | Paper: Survey: Efficient transformers; Speaker(s): Laura | na |
| 27.01. | Paper: Benchmark: long range arena; Speaker(s): Frederick and Hanna | na |
| 03.02. | Paper: Mixing tokens with Fourier transform; Speaker(s): Nadia and Pablo | na |
| 10.02. | Paper: MLP-Mixer; Speaker(s): Frederick | na |
| 17.02. | Wrap-up and discussion | na |
Literature


