Statistical Natural Language Processing Group

Welcome to the Statistical Natural Language Processing Group at the Institute for Computational Linguistics at Heidelberg University. Our research is at the intersection of machine learning, natural language processing, and medical informatics, with a special focus on interactive statistical learning techniques.

We organize the weekly Statistical NLP Colloquium.

Group photo: StatNLP group @ Botanical Gardens (next to our department)

Latest news

New publication at ML4H 2025

New research from the StatNLP group titled Embedding-Space Data Augmentation to Prevent Membership Inference Attacks in Clinical Time Series Forecasting, by our group members Marius Fracarolli, Michael Staniek, and Stefan Riezler, will be presented at ML4H in San Diego in December 2025.

October 27, 2025

publications

Paper accepted at EMNLP 2025

New research from the StatNLP group titled Learning to Translate Ambiguous Terminology by Preference Optimization on Post-Edits has been accepted for the EMNLP 2025 Industry Track. The paper is available in the ACL Anthology.

July 08, 2025

publications

New publication in TMLR

New research from the StatNLP group titled Compositionality in Time Series: A Proof of Concept using Symbolic Dynamics and Compositional Data Augmentation will be published in the journal TMLR. A copy is already available on OpenReview.

March 21, 2025

publications

New publication at WMT 2024

New research from the StatNLP group titled Post-edits Are Preferences Too will be presented at WMT 2024. A pre-print of the paper is available here.

August 09, 2024

publications

New publication at MLHC 2024

New research from the StatNLP group about Early Prediction of Causes (not Effects) in Healthcare by Long-Term Clinical Time Series Forecasting will be presented at MLHC 2024. The paper is available here.

August 09, 2024

publications