Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik


Natural Language Generation

Module Description

Course               Module Abbreviation   Credit Points
BA-2010[100%|75%]    CS-CL                 6 LP
BA-2010[50%]         BS-CL                 6 LP
BA-2010[25%]         BS-AC                 4 LP
BA-2010              AS-CL                 8 LP
Master               SS-CL, SS-TAC         8 LP

Lecturer: Anette Frank
Module Type: Proseminar / Hauptseminar
Language: English
First Session: 28.04.2020
Time and Place: Tuesday, 16:15-17:45, INF 327 / SR 2
Commitment Period: tbd.



The seminar has been cancelled.




Important Notice for the Start of the Semester

Unless announced otherwise, the seminar will take place online. Access information for the online forums will be announced soon. For synchronous formats, the published course times remain valid: Tuesdays, 16:15-17:45.

All registered students will be informed via email. Students who have registered but have not yet received an email from me should register again via FramaDate and send me a message.

Prerequisites for Participation

* Statistical Methods
* Foundational Knowledge in Neural Networks (e.g. Neural Networks Class)

Assessment

* Active Participation

* Presentation

* Homework or Project

Content

Natural Language Generation (NLG) is a key component in many NLP applications. Depending on the kind of input for generation, we distinguish data-driven from text-driven language generation. Data-driven NLG aims to verbalize information captured in knowledge bases or linguistic representations (e.g., semantic dependencies or abstract meaning representations), in structured dialogue turns, or in search results in database-driven Question Answering. Text-driven NLG is found in text-to-text transduction tasks such as text compression, simplification, summarization or paraphrasing, and in end-to-end dialogue systems.

Applications involving NLG are frequent in multi-modal settings (e.g. generating descriptions of images or answering questions about them) and are strongly connected to robotics, where intelligent systems need to interact with humans, and where NLG can help make such systems more intelligible and self-explanatory. There are also first attempts to apply NLG techniques to argument synthesis in computational argumentation.

Just as Neural Network methods have radically changed Natural Language Understanding, they have also revolutionized NLG, especially through the framework of autoregressive networks and with many influences from Machine Translation, which embodies NLG on its target side.
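To illustrate what "autoregressive" means here, the following minimal sketch generates a sentence one token at a time, always choosing the most probable next token given what has been generated so far. The table `TOY_MODEL` and the function `greedy_decode` are hypothetical names made up for this illustration. A real autoregressive network conditions on the whole prefix (and on the input), whereas this toy table looks only at the previous token.

```python
# Hypothetical toy "model": for each previous token, a probability
# distribution over possible next tokens. Illustration only; a neural
# autoregressive model would compute these probabilities from the
# full prefix and the input.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"seminar": 0.7, "text": 0.3},
    "a": {"text": 1.0},
    "seminar": {"starts": 0.8, "</s>": 0.2},
    "text": {"</s>": 1.0},
    "starts": {"</s>": 1.0},
}


def greedy_decode(model, max_len=10):
    """Generate tokens left to right, picking the most probable next token."""
    tokens = ["<s>"]  # start-of-sequence marker
    for _ in range(max_len):
        distribution = model[tokens[-1]]
        next_token = max(distribution, key=distribution.get)
        if next_token == "</s>":  # end-of-sequence marker: stop generating
            break
        tokens.append(next_token)
    return tokens[1:]  # drop the start marker


print(" ".join(greedy_decode(TOY_MODEL)))
```

Greedy selection is only one possible decoding strategy; sampling or beam search could be substituted at the point where `next_token` is chosen.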

In the seminar we will review the fundamentals of NLG and study aspects of NLG that are particularly challenging, among others:

* how to ensure faithfulness to the input, e.g., when generating a text from a database entry to produce an advertisement, from a graph-structured meaning representation to produce a story or dialogue turn, or from visual input to generate the description of an image;

* how to ensure coherence when generating longer texts, e.g., how to choose an appropriate linguistic form (a name, definite description or pronoun) when repeatedly referring to the same entity in a narrative, or when to aggregate text by using ellipsis, thereby producing non-redundant texts that sound natural to humans;

* how to arrange content when planning longer texts (so-called "text planning strategies"), e.g., by deciding which portions of content to choose and to link with appropriate discourse markers;

* how to evaluate the generated texts with automatic measures, including novel metrics such as BERTScore, metrics that try to assess both the quality and the diversity of generated texts, or methods that analyze the generated text for semantic coherence and possible contradictions.
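
As a small illustration of the last point, one simple way to quantify the diversity of generated texts is the ratio of distinct n-grams to all n-grams, often called distinct-n. The sketch below is an assumption about what such a measure could look like, not a metric prescribed by the seminar. The function name `distinct_n` and the whitespace tokenization are choices made for this example.

```python
def distinct_n(texts, n=2):
    """Return the share of unique n-grams among all n-grams in `texts`.

    Tokenization is naive lowercase whitespace splitting (an assumption
    for this sketch). Returns 0.0 if no n-gram can be formed.
    """
    total = 0
    unique = set()
    for text in texts:
        tokens = text.lower().split()
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        total += len(ngrams)
        unique.update(ngrams)
    return len(unique) / total if total else 0.0


generated = ["the cat sat", "the cat ran"]
print(distinct_n(generated, n=1))
print(distinct_n(generated, n=2))
```

Values close to 1 indicate that the generated texts rarely repeat the same n-grams, while values close to 0 indicate highly repetitive output. The measure says nothing about quality or faithfulness and would be combined with other metrics.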

Module Overview

Agenda

Date Session Materials

Literature

Literature will be provided by the beginning of the term.
