Developments in Natural Language Generation

Course description

Degree program	Module code	Credit points (LP)
BA-2010[100%|75%]	CS-CL	6 LP
BA-2010[50%]	BS-CL	6 LP
BA-2010[25%]	BS-AC, BS-FL	4 LP
BA-2010	AS-CL	8 LP
Master	SS-CL, SS-TAC	8 LP

Instructor	Anette Frank
Course type	Proseminar / Hauptseminar
Language	English
First session	17.04.2023
Time and place	Mondays, 13:15-14:45, INF 328 / SR 25
Commitment deadline	tbd.

Prerequisites

Programming I and II
Mathematical Methods
Statistical Methods
Semantics

Comment

BA Computational Linguistics (advanced students)
MA Computational Linguistics
MA Data and Computer Science
MA Scientific Computing

Course requirements

Active participation including assignments;
Presentation;
Project, second presentation or other agreed forms of homework

Content

Natural Language Generation (NLG) is a key functionality for many NLP applications which may be grounded in purely linguistic or in multimodal, e.g., visual-linguistic contexts. Typically, language generation is conditioned on textual inputs, as in interactive question-answering or dialogue settings, or in text rewriting tasks (e.g., summarization, text simplification or style transfer). In restricted domains, NLG can also be data-driven. Here, the generated language is conditioned on structured inputs, such as database query results, domain knowledge graphs, or internal states of artificial agents that interact with their environment and with humans.

NLG methods have been revolutionized through the use of autoregressive large pre-trained language models (LLMs). Yet, despite their stunning versatility in generating high-quality texts, important research questions remain to be solved:

How can we guarantee the validity, faithfulness and consistency of LLM outputs, to make such systems trustworthy and safe?
How can we make these models interpretable, so that we can ground their predictions in a transparent way? Can we enhance their trustworthiness by making them interact with interpretable, symbolic knowledge sources, and can we use these to guide generation? Or can we use neural representations derived from such knowledge resources for this purpose?
Which capabilities do LLM-based NLG systems really acquire? Are they capable of solving higher-level reasoning problems, and how can we work towards this?
Can we deploy NLG methods for controlling language generation by means of self-rationalization, question generation & answering, or contextualized conclusion generation?

In the seminar we will review different types of language models, exemplified by specific NLP tasks, and how to measure their performance. We will focus on methods that aim to control NLG systems in specific ways, so that their output is valid, faithful and consistent, to enhance their trustworthiness and usefulness. This includes the interaction with symbolic knowledge resources (e.g., by search or joint encoding) and inference methods. We will also investigate ways of using NLG methods as a means to control the validity of system predictions, to deal with complex tasks that go beyond learning statistical regularities of language.

Topics will include:

neural architecture types for NLG (encoder-decoder; decoder-only; instruction-based models; text editing models)
decoding methods (autoregressive vs. non-autoregressive; contrastive search, iterative decoding; reranking)
prompting for instruction tuning and decoding; in-context learning; chain-of-thought prompting
content control (plug-and-play; text generation re-ranking; knowledge-guided pre-training, knowledge retrieval and injection)
applications and tasks: dialogue; question and answer generation; dataset generation
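
Two of the decoding ideas listed above, autoregressive decoding and reranking, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not course material: `TOY_MODEL` is a hypothetical hand-written next-token table standing in for a language model, and the helper names `greedy_decode`, `sequence_prob` and `rerank` are invented here.

```python
# Hypothetical toy "language model": for each previous token, a probability
# distribution over the next token. A real LLM conditions on the full prefix.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "</s>": 0.2},
    "a": {"dog": 0.7, "cat": 0.3},
    "cat": {"sleeps": 0.6, "</s>": 0.4},
    "dog": {"barks": 0.8, "</s>": 0.2},
    "sleeps": {"</s>": 1.0},
    "barks": {"</s>": 1.0},
}


def greedy_decode(model, start="<s>", end="</s>", max_len=10):
    """Autoregressive greedy decoding: pick the most probable next token,
    append it, and condition the next step on it."""
    tokens = [start]
    while len(tokens) < max_len:
        dist = model[tokens[-1]]
        next_tok = max(dist, key=dist.get)
        if next_tok == end:
            break
        tokens.append(next_tok)
    return tokens[1:]  # drop the start symbol


def sequence_prob(model, tokens, start="<s>", end="</s>"):
    """Probability of a complete sequence, including the end symbol."""
    prob = 1.0
    prev = start
    for tok in list(tokens) + [end]:
        prob *= model[prev].get(tok, 0.0)
        prev = tok
    return prob


def rerank(model, candidates):
    """Reranking: score whole candidate sequences and keep the best one."""
    return max(candidates, key=lambda c: sequence_prob(model, c))


greedy = greedy_decode(TOY_MODEL)
best = rerank(TOY_MODEL, [greedy, ["a", "dog", "barks"]])
print(greedy, best)
```

In this toy setting the greedy output is not the most probable complete sequence, which is one motivation for reranking whole candidates instead of committing to each locally best token.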

The early sessions of the seminar will be introductory lectures, to provide an overview and to create common ground. We will then read and discuss selected papers in presentations and discussion rounds.

Seminar schedule

Date	Session	Materials