Ruprecht-Karls-Universität Heidelberg

DeCOCO: COCO English-German Parallel Captions

DeCOCO is a bilingual (English-German) corpus of image descriptions. The English part is extracted from the COCO dataset, and the German part consists of translations by a native German speaker.

Terms of Use

DeCOCO is licensed under a Creative Commons Attribution 4.0 License.

If you use the corpus in your work, please cite:

Julian Hitschler, Shigehiko Schamoni, Stefan Riezler. "Multimodal Pivots for Image Caption Translation". To appear in: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL), Berlin, Germany, 2016
(a BibTeX file is available).

Data

For a detailed description of the corpus and its application, please see the above publication.

For detailed licensing information, please see the enclosed terms_of_use.txt.

Download

Parallel data: ms_coco_parallel.tar.bz2 (31kB, md5)
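
After downloading, it can be worth checking the archive against the published MD5 value before unpacking it. The following is a minimal Python sketch, not an official tool: the helper names md5_of_file and verify_and_extract are hypothetical, the expected digest must be copied from the md5 link above, and nothing is assumed about the files inside the archive.

```python
import hashlib
import tarfile
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_extract(archive_path, expected_md5, target_dir):
    """Check the archive against `expected_md5`, then unpack it into `target_dir`.

    Hypothetical helper: raises ValueError on a checksum mismatch and
    returns the list of member names found in the archive.
    """
    actual = md5_of_file(archive_path)
    if actual.lower() != expected_md5.strip().lower():
        raise ValueError(f"MD5 mismatch: expected {expected_md5}, got {actual}")
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # "r:bz2" opens a bzip2-compressed tar archive for reading.
    with tarfile.open(archive_path, "r:bz2") as archive:
        names = archive.getnames()
        archive.extractall(target)
    return names
```

A call might look like verify_and_extract("ms_coco_parallel.tar.bz2", expected, "decoco"), where expected holds the digest string from the md5 link and "decoco" is an arbitrary output directory. The returned member names show what the archive contains, including the enclosed terms_of_use.txt.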
