Ruprecht-Karls-Universität Heidelberg
Institut für Computerlinguistik

New research on sequence-to-sequence learning from human bandit feedback at ACL and EAMT.

The Statistical NLP group has published three new papers on this topic, two at ACL and one at EAMT:

Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning.
Julia Kreutzer, Joshua Uyheng, Stefan Riezler. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL). Melbourne, Australia.

Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback.
Carolin Lawrence, Stefan Riezler. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL). Melbourne, Australia.

A Reinforcement Learning Approach to Interactive-Predictive Neural Machine Translation.
Tsz Kin Lam, Julia Kreutzer, Stefan Riezler. In Proceedings of the 21st Annual Conference of the European Association for Machine Translation (EAMT). Alicante, Spain.
