Ruprecht-Karls-Universität Heidelberg

Discourse Unplugged: From Framework “Fights” to Generalization Goals

Abstract

Despite recent progress in discourse processing, discourse models remain 1) limited in their ability to generalize across domains and genres, and 2) constrained by framework-dependent discourse representations. In this talk, I will first present work that examines the generalizability of discourse parsing in RST. We show that state-of-the-art models trained on the standard English newswire benchmark generalize poorly even within the news domain, and that training on heterogeneous, multi-genre data is essential for improving generalization. To bridge the framework divide, I will present our effort to create a PDTB-style multi-genre benchmark by leveraging existing RST discourse relation annotations.

To address the second point, I will present recent work in which we develop a unified discourse relation label set to facilitate cross-lingual and cross-framework discourse analysis, and probe LLMs to assess whether they encode generalizable discourse abstractions. The talk will conclude with a discussion of challenges and biases in LLM discourse representations, offering insights into the limitations of, and potential avenues for improving, discourse modeling in multilingual and generalization settings.

Bio

Yang Janet Liu is currently a Postdoctoral Researcher at the MaiNLP research lab, led by Prof. Dr. Barbara Plank, at the Center for Information and Language Processing (CIS) at LMU Munich. Janet received her PhD in Computational Linguistics from Georgetown University, where she was advised by Amir Zeldes, PhD. Her research interests involve 1) computational approaches to discourse-level linguistic phenomena across genres and their NLP applications such as summarization, 2) cross-framework discourse understanding and unifying discourse resources (she is a co-organizer of the DISRPT shared task), and 3) tackling variation in NLP. Previously, she did internships at Spotify (2021, 2023) and Alexa AI at Amazon (2020).
Personal website
