Ruprecht-Karls-Universität Heidelberg

HeiST – Heidelberg Sentiment Treebank

A German dataset for Compositional Sentiment Analysis

HeiST originated in Michael Haas's MA project (Weakly Supervised Learning for Compositional Sentiment Recognition) as a German counterpart to the Stanford Sentiment Treebank and was constructed in a similar fashion. Its textual basis consists of Creative Commons-licensed reviews from a German movie review site, from which we extracted the evaluation summary ("Fazit") sentences.

HeiST comprises 1184 trees where each node has a sentiment label.
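
To illustrate what a tree with a sentiment label on every node looks like in practice, the following is a minimal sketch. It assumes a bracketed notation like the one used by the Stanford Sentiment Treebank, e.g. "(3 (2 Ein) (3 Film))"; the actual HeiST file format may differ. The names Node, parse_tree and all_labels, the example sentence and its labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: int                      # sentiment label attached to this node
    word: Optional[str] = None      # set only for leaf nodes
    children: List["Node"] = field(default_factory=list)


def parse_tree(text: str) -> Node:
    """Parse one bracketed tree such as "(3 (2 Ein) (3 Film))".

    Assumption: tokens themselves contain no parentheses.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse() -> Node:
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        node = Node(label=int(tokens[pos]))  # first item is the label
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.children.append(parse())  # nested subtree
            else:
                node.word = tokens[pos]        # leaf token
                pos += 1
        pos += 1  # consume the closing bracket
        return node

    return parse()


def all_labels(node: Node) -> List[int]:
    """Collect the labels of every node in pre-order."""
    labels = [node.label]
    for child in node.children:
        labels.extend(all_labels(child))
    return labels


# Hypothetical example; labels are made up, not taken from HeiST.
example = "(3 (2 Ein) (3 (3 guter) (2 Film)))"
tree = parse_tree(example)
print(all_labels(tree))
```

Because every node carries its own label, a model can be trained or evaluated on phrases of any size and not only on whole sentences.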

The crowdsourcing of HeiST was supported in part by the Institute of Computational Linguistics and by Yannick Versley's private funds.


HeiST can be downloaded here:

The code for the experiments can be found in Michael Haas's GitHub project.

For additional background, see the following material:
