I'm a researcher at the Statistical NLP Group, led by Prof. Stefan Riezler. My current research concerns structural machine learning and ranking algorithms with applications to Statistical Machine Translation and Information Retrieval.
I moved to Heidelberg from the Spoken Language Processing Group at the LIMSI laboratory in Orsay (Prof. François Yvon). Previously, at Orange Labs, R&D France Telecom in Lannion (France), I worked as a postdoctoral researcher on learning-to-rank algorithms for search engines with Dr. Tanguy Urvoy.
I obtained a PhD (Computer Science and Artificial Intelligence) at the Neural Information Processing Technologies department of the Int. Research & Training Center for Information Technologies and Systems in Kyiv (Ukraine), on randomized distributed representations and neural-like locality-sensitive (similarity-preserving) symbol sequence embeddings, under the supervision of DSc. Dmitri Rachkovskij.
- I'm co-organizing a shared task on learning machine translation systems from weak feedback at WMT'17. Consider participating!
- I gave a guest lecture introducing online learning under full and partial information in the Machine Learning course at the University of Mannheim on 25 Nov. 2015.
- In the second half of 2015 we will start a new project on learning machine translation from data that are not strictly parallel, but only weakly supervised by relevance indicators such as citations in patents or hyperlinks in Wikipedia pages. The research will be conducted in the DFG-funded research project "Weakly Supervised Learning of Cross-Lingual Systems".
Current and Recent Projects
- "Weakly Supervised Learning of Cross-Lingual Systems" - accepted DFG project on learning machine translation systems from non-parallel data. Starts in the second half of 2015.
- "Cross-language Learning-to-Rank for Patent Retrieval" - DFG project that aims at direct integration of patent search quality metrics and translation systems to improve cross-lingual patent search
- Quaero - European research and industrial project to develop technologies for automatic analysis and classification of multimedia and multilingual documents
- Madspam - ANR project on methods of automatic spamdexing detection in large information networks
- C. Lawrence, A. Sokolov, S. Riezler. Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation, Empirical Methods in Natural Language Processing (EMNLP), Copenhagen, Denmark, 2017 [poster]
- A. Sokolov, J. Kreutzer, K. Sunderland, P. Danchenko, W. Szymaniak, H. Fürstenau, S. Riezler. A Shared Task on Bandit Learning for Machine Translation, Conference on Machine Translation (WMT), Copenhagen, Denmark, 2017 [slides]
- J. Kreutzer, A. Sokolov, S. Riezler. Bandit Structured Prediction for Neural Sequence-to-Sequence Learning, Association for Computational Linguistics (ACL), Vancouver, Canada, 2017 [poster]
- A. Sokolov, J. Kreutzer, C. Lo, S. Riezler. Stochastic Structured Prediction under Bandit Feedback, Neural Information Processing Systems (NIPS), Barcelona, Spain, 2016 [poster]
- A. Sokolov, J. Kreutzer, C. Lo, S. Riezler. Learning Structured Predictors from Bandit Feedback for Interactive NLP, Association for Computational Linguistics (ACL), Berlin, Germany, 2016 [slides] [code 1] [code 2]
- V. Boteva, D. Gholipour, A. Sokolov, S. Riezler. A Full-Text Learning to Rank Dataset for Medical Information Retrieval, European Conference on Information Retrieval (ECIR), Padova, Italy, 2016 [poster] [data] [bib]
- A. Sokolov, S. Riezler, T. Urvoy. Bandit Structured Prediction for Learning from Partial Feedback in Statistical Machine Translation, MT Summit, Miami, FL, USA, 2015
- A. Sokolov, S. Riezler, S. B. Cohen. A Coactive Learning View of Online Structured Prediction in Statistical Machine Translation. In Proc. of the Conference on Computational Natural Language Learning (CoNLL), Beijing, China, 2015. [slides]
- A. Sokolov, F. Hieber, S. Riezler. Learning to Translate Queries for CLIR. In Proc. of the ACM SIGIR Conference (SIGIR), Gold Coast, Australia, 2014. [poster]
- A. Sokolov, G. Wisniewski, F. Yvon. Lattice BLEU Oracles for Machine Translation. Transactions on Speech and Language Processing (TSLP), ACM, 10(4), 18:1-18:29, 2014. [publisher]
Institut für Computerlinguistik,
Im Neuenheimer Feld 325, Room 107