Ruprecht-Karls-Universität Heidelberg

X-SRL: Parallel Cross-lingual Semantic Role Labeling

The Heidelberg University NLP Group announces a new dataset for multilingual SRL parsing: “X-SRL: Parallel Cross-lingual Semantic Role Labeling”

  • it is based on the English CoNLL-09 dataset;
  • it is parallel across four languages (English, French, German and Spanish), obtained via high-quality machine translation with DeepL;
  • it uniformly applies the PropBank SRL labeling scheme, as developed for English, to all covered languages, using a novel, dense and precise label projection mechanism; and
  • it has been quality-checked automatically for the training sections and manually for the evaluation sections.

The corpus is available through the Linguistic Data Consortium (LDC).

The motivation, development and analysis of this dataset, together with experiments on multilingual and cross-lingual SRL, are described in the following publication:

Daza, A. and Frank, A. (2020): X-SRL: A Parallel Cross-Lingual Semantic Role Labeling Dataset. The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), pp. 3904–3914.

