Welcome to the Statistical Natural Language Processing Group at the Institute for Computational Linguistics at Heidelberg University. Our research is at the intersection of machine learning, natural language processing, and medical informatics, with a special focus on interactive statistical learning techniques.

We organize the weekly Statistical NLP Colloquium.

group photo
StatNLP group @ Botanical Gardens (next to our department)

Latest news

New publication at ML4H

New research from the StatNLP group titled Embedding-Space Data Augmentation to Prevent Membership Inference Attacks in Clinical Time Series Forecasting, by our group members Marius Fracarolli, Michael Staniek, and Stefan Riezler, will be presented at ML4H in December 2025.

Paper accepted at EMNLP 2025

New research from the StatNLP group titled Learning to Translate Ambiguous Terminology by Preference Optimization on Post-Edits has been accepted for EMNLP 2025, Industry Track. The paper is available as a pre-print on arXiv.

New publication in TMLR

New research from the StatNLP group titled Compositionality in Time Series: A Proof of Concept using Symbolic Dynamics and Compositional Data Augmentation will be published in the journal TMLR. A copy is already available on OpenReview.

New publication at WMT 2024

New research from the StatNLP group titled Post-edits Are Preferences Too will be presented at WMT 2024. A pre-print of the paper is available here.

New publication at MLHC 2024

New research from the StatNLP group titled Early Prediction of Causes (not Effects) in Healthcare by Long-Term Clinical Time Series Forecasting will be presented at MLHC 2024. The paper is available here.