
Mechanistically Interpreting Multilingual Language Models
Module Description
Course | Module Abbreviation | Credit Points |
---|---|---|
BA-2010[100%\|75%] | CS-CL | 6 LP |
BA-2010[50%] | BS-CL | 6 LP |
BA-2010[25%] | BS-AC | 4 LP |
BA-2010 | AS-CL | 8 LP |
Master | SS-CL-TAC | 8 LP |
Lecturer | Frederick Riemenschneider |
Module Type | Proseminar / Hauptseminar |
Language | English |
First Session | 17.10.2025 |
Time and Place | Friday, 10:15 - 11:45, INF 329 / SR 26 |
Commitment Period | tbd. |
Participants
All advanced CL Bachelor students and all CL Master students. Students from the MSc Data and Computer Science or the MSc Scientific Computing with Field of Application Computational Linguistics are welcome after obtaining permission from the lecturer. MSc Scientific Computing students can only take the course as a Hauptseminar (HS) for 8 LP. If the seminar is oversubscribed, CL students will have priority.
Prerequisites for Participation
- Completion of Programming I and Introduction to Computational Linguistics, or similar introductory courses
- Programming II, Mathematical Foundations of Computational Linguistics, and Statistics are strongly recommended
Assessment
- Active participation, including exercises
- Presentation
- Implementation project
Content
Multilingual Language Models (MLLMs) can process and connect dozens of languages, but the internal mechanisms that enable this are not well understood. Do they develop a universal "interlingua," or a complex patchwork of language-specific skills? This seminar will address these questions by applying the principles of Mechanistic Interpretability to reverse-engineer the computations within these models.
To address these questions, we will work at the circuit and neuron level, probing whether MLLMs reuse the same components across languages. We will compare these circuits to those found in separate monolingual models, exploring ideas such as the "Platonic representation hypothesis." Our analysis will also examine the limits of multilinguality by looking at cases where knowledge fails to transfer across languages, considering both the pre-training process and the final model.
Beyond the discussion of foundational papers, this seminar includes a practical component: we will apply core interpretability methods, such as activation patching and the logit lens, directly to models. This hands-on work is intended to build a deeper understanding of the techniques themselves and to let us explore the models in search of our own findings.
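To give a concrete sense of the hands-on component, here is a minimal logit-lens sketch. It assumes a Hugging Face `transformers` installation and uses GPT-2 as a stand-in model; the models, prompts, and tooling used in the seminar may differ. The logit lens reads off "intermediate predictions" by pushing each layer's hidden states through the model's final layer norm and unembedding matrix:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")

with torch.no_grad():
    # output_hidden_states=True returns the embedding output
    # plus one hidden-state tensor per transformer layer
    outputs = model(**inputs, output_hidden_states=True)

for layer, hidden in enumerate(outputs.hidden_states):
    # The logit lens: apply the final layer norm and the unembedding
    # matrix to an intermediate representation, as if the model stopped here
    logits = model.lm_head(model.transformer.ln_f(hidden))
    top = logits[0, -1].argmax().item()
    print(f"layer {layer:2d}: {tokenizer.decode([top])!r}")
```

Activation patching is interventional rather than observational: an activation cached from a run on one prompt is substituted into a run on another, and we measure how the output changes. A similarly hedged sketch, with illustrative prompts and an arbitrarily chosen layer:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

clean = tokenizer("The Eiffel Tower is in", return_tensors="pt")
corrupt = tokenizer("The Colosseum is in", return_tensors="pt")
paris = tokenizer(" Paris")["input_ids"][0]

LAYER = 6  # an arbitrary middle layer, chosen for illustration

# Step 1: run the clean prompt and cache the block's output hidden states
cache = {}
def save_hook(module, args, output):
    cache["h"] = output[0]  # GPT2Block returns a tuple; [0] is hidden states

handle = model.transformer.h[LAYER].register_forward_hook(save_hook)
with torch.no_grad():
    model(**clean)
handle.remove()

# Step 2: run the corrupted prompt, patching the clean activation into
# the final token position only (the two prompts tokenize to different
# lengths, so the patch is restricted to the last position)
def patch_hook(module, args, output):
    hidden = output[0].clone()
    hidden[:, -1] = cache["h"][:, -1]
    return (hidden,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(patch_hook)
with torch.no_grad():
    patched = model(**corrupt)
handle.remove()

with torch.no_grad():
    baseline = model(**corrupt)

# If the patched position at this layer carries "Eiffel Tower" information,
# the logit for " Paris" should rise relative to the unpatched baseline
print("Paris logit, corrupted run:", baseline.logits[0, -1, paris].item())
print("Paris logit, patched run:  ", patched.logits[0, -1, paris].item())
```

In the seminar we will run this kind of experiment across languages, asking, for example, whether a patch cached from a prompt in one language steers the model's predictions in another.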