Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik


Dear LLM, we have some questions!

Module Description

Course               Module Abbreviation   Credit Points
BA-2010[100%|75%]    CS-CL                 6 LP
BA-2010[50%]         BS-CL                 6 LP
BA-2010[25%]         BS-FL                 4 LP
BA-2010              AC-FL                 8 LP
Master               SS-CL, SS-FAL         8 LP
Lecturer Anette Frank
Module Type Proseminar / Hauptseminar
Language English
First Session Tuesday, 16.04.2024
Time and Place 10:15 - 11:45, INF 327 / SR 2
Commitment Period tbd.

Participants

The seminar is open to all advanced Bachelor students and all Master students. Students from Computer Science, Mathematics, or Scientific Computing with the application area (Anwendungsgebiet) Computational Linguistics are also welcome.

Prerequisite for Participation

  • Statistical Methods
  • Neural Networks (or comparable)

Assessment

  • Regular attendance (at most 2 excused absences)
  • Handing in assignments (e.g., questions on papers; solving small, well-defined tasks)
  • Paper presentation
  • Written homework or project or second presentation

Content

Large Language Models (LLMs) not only affect our society, they also revolutionize established methods in NLP. There is thus good reason to investigate their impact from two perspectives: How do LLMs change the way we solve NLP tasks, leveraging their strengths while combating their weaknesses? And, from a wider perspective: What are their limitations, and where should we impose limits on their use?

As a guide, we will focus on the questions below:

Dear LLM(s), we have some questions:

1. How to adapt our working style to assess and exploit your capabilities?
  • How best to elicit your knowledge?
  • How to test your capabilities, while controlling what you have already seen?
  • How can we identify what exactly you know?
2. What are you up to and what are your limitations?
  • What kind of world models do you learn?
  • Can you reflect on and self-correct your outputs, or the outputs of other LLMs?
  • Do you understand ambiguity? Do you know how to refer?
  • Can you unlearn what you have learned?
  • Can you modularize your knowledge or team up with others?
  • Can you adapt to structured representations and how do you fuse modalities?
3. Who are you, and if so, how many? And can we trust what you say?
  • How many personas are you and how consistent are your beliefs?
  • Why or when do you hallucinate? And how do we know when you lie to us?
  • How can we study what you do under the hood?
4. Do you process language as we humans do?
  • How different is your way of processing language from the way humans process it?
  • Why do you need so much data? Is the way you learn not efficient enough?
  • Can you systematically generalize? How eager are you to learn?

The seminar starts with introductory lectures that provide relevant background and motivate the questions above. We will then have presentations by participants, based on a reading list covering recent investigations and insights. The topics are diverse, but together they aim to convey a global picture.

We then dive into model types and prompting strategies that offer novel ways to exploit generative models for NLP tasks, aiming to better understand how they work under the hood. In the age of LLMs, the way evaluation is conducted changes radically, as does the way training data is compiled. Working efficiently with LLMs includes methods of model combination, such as distillation or fusion, or having specialized LLM agents collaborate, and it also profits from efficiently combining data from different modalities. We will investigate how models make predictions without fine-tuning (see the illustrative sketch below), and how to get closer to understanding how they represent knowledge. Given the impressive capabilities of LLMs, we finally look at studies that examine similarities and differences between how LLMs and humans process language, with respect to representations, modularization, generalization and learning mechanisms.
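
As a first taste of what "making predictions without fine-tuning" means in practice, the following minimal sketch shows few-shot (in-context) prompting with the Hugging Face transformers library. The model choice, prompt format, and sentiment task are illustrative assumptions, not part of the seminar materials.

    from transformers import pipeline

    # Minimal sketch of few-shot (in-context) prompting: the model is not fine-tuned;
    # instead, a few task demonstrations are placed directly in the prompt.
    # Model choice and prompt format are illustrative assumptions.
    generator = pipeline("text-generation", model="gpt2")

    prompt = (
        "Classify the sentiment of each review as positive or negative.\n"
        "Review: The plot was dull and predictable. Sentiment: negative\n"
        "Review: A moving, beautifully acted film. Sentiment: positive\n"
        "Review: I would happily watch it again. Sentiment:"
    )

    # The continuation generated after "Sentiment:" serves as the model's prediction.
    print(generator(prompt, max_new_tokens=3)[0]["generated_text"])

A model prompted this way performs the task purely through the demonstrations in its context window; no gradient update takes place.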

Literature

  • S. Minaee, T. Mikolov, N. Nikzad, M. Chenaghlu, R. Socher, X. Amatriain, J. Gao (2024): Large Language Models: A Survey, arXiv.
  • H. Naveed, A. U. Khan, S. Qiu, M. Saqib, S. Anwar, M. Usman, N. Akhtar, N. Barnes, A. Mian (2023): A Comprehensive Overview of Large Language Models, arXiv.

A full reading list will be published before the beginning of term. Topics that are suitable for PS participants will be indicated.

