From 2018 to 2023 I worked in the Natural Language Processing Group of Anette Frank at the Department of Computational Linguistics at Heidelberg University: researching, teaching, advising students, and writing my doctoral thesis.
For more recent information, please see my personal webpage.
During teaching and research, a recurring question seems to be: What evaluation metric should I use? Why does paper x use metric y for evaluating their classifier? A summary and overview of evaluation and common classification metrics (Macro F1, Weighted F1, Accuracy, Kappa, MCC, etc.) can be found in this TACL paper. (There are also some old and outdated preliminary notes.) Also of interest may be the analysis of two homonymic metrics: Macro F1 and Macro F1.
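To illustrate why two metrics sharing the name "Macro F1" can be a problem, here is a minimal sketch. It assumes the two variants are (a) the arithmetic mean of the per-class F1 scores and (b) the F1 score (harmonic mean) of macro-averaged precision and macro-averaged recall. The function names and the toy labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for single-label classification."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    stats = {}
    for c in sorted(set(y_true) | set(y_pred)):
        # Convention assumed here: undefined precision/recall counts as 0.
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        stats[c] = (prec, rec)
    return stats


def f1(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def macro_f1_mean_of_f1(y_true, y_pred):
    """Variant (a): average the per-class F1 scores."""
    stats = per_class_precision_recall(y_true, y_pred)
    return sum(f1(p, r) for p, r in stats.values()) / len(stats)


def macro_f1_f1_of_means(y_true, y_pred):
    """Variant (b): F1 of macro-averaged precision and recall."""
    stats = per_class_precision_recall(y_true, y_pred)
    mean_p = sum(p for p, _ in stats.values()) / len(stats)
    mean_r = sum(r for _, r in stats.values()) / len(stats)
    return f1(mean_p, mean_r)


# Toy example: same predictions, two different "Macro F1" values.
y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
print(round(macro_f1_mean_of_f1(y_true, y_pred), 4))   # 0.7333
print(round(macro_f1_f1_of_means(y_true, y_pred), 4))  # 0.7895
```

On this toy input the two variants already differ by more than five points, so it can be worth stating which one a reported "Macro F1" refers to.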