Data release: WikiCLIR – New Corpus for Cross-Language Information Retrieval
The ICL Statistical Natural Language Processing Group has released a large-scale data set for Cross-Language Information Retrieval (CLIR). It contains more than 245k German single-sentence queries with 3.2M automatically extracted relevance judgments over 1.2M English Wikipedia articles, which serve as the documents. For more information and to download the data, see the WikiCLIR webpage.