Title: Multimodal Information for Statistical Machine Translation

Abstract: This work investigates the feasibility of making non-linguistic multimodal perceptual information available to a statistical machine translation system. We present the task of caption translation, in which an image caption is translated into a target language using both the source-language caption and the image itself. We also present appropriate parallel corpora, both hand-translated and extracted from the web, for system evaluation. A variety of text-based and image-based retrieval methods are discussed with respect to their potential for improving machine translation.