Resources

  • 2005 NIST Speaker Recognition Evaluation Training Data
    2005 NIST Speaker Recognition Evaluation Training Data consists of 392 hours of conversational telephone speech in English, Arabic, Mandarin Chinese, Russian, and Spanish, with associated English transcripts. It was used as training data in the NIST-sponsored 2005 Speaker Recognition Evaluation (SRE).
  • Reuters Corpus
    A collection of Reuters newswire texts, sorted by month.
  • UN Corpora
    The corpus is a paragraph-aligned six-language collection of resolutions of the General Assembly from Volume I of GA regular sessions 55-62. The corpus is described in an academic paper that will be presented (as a poster) at Machine Translation Summit XII on August 28th, 2009.