Ruprecht-Karls-Universität Heidelberg

Thesis Topics

My areas of research are discourse, semantics, pragmatics (particularly figurative language and sentiment analysis) and summarization. Please contact me if you are interested in a BA or MA thesis on one of these topics. Current ideas for thesis topics and areas include, but are not limited to, the following:

  1. Integrating Advanced Date Selection into Submodular Algorithms for Timeline Summarization (MA).

    Sebastian Martschat and I developed a submodular framework called TILSE for generating timeline overviews, such as timelines of long-running wars or other events. The date selection algorithm in this framework is rather basic: it only looks at how frequently a date is mentioned to determine its importance. The proposed MA work will integrate more sophisticated date selection methods into TILSE (for example, considering which other important dates refer to a given date, or causal relations between dates).
  2. Modelling the Influence of Information Retrieval on Timeline Summarization (MA).

    Timeline summarization generates dated timelines for long-running social events such as wars, disease outbreaks or financial crises. In contrast to standard single- or multi-document summarization, it summarizes hundreds of documents and therefore depends on good information retrieval (IR) processes before summarization. Current research neglects this aspect; in particular, it is unclear whether performance differences between summarization systems hold up across differing prior IR models. This project will combine different IR models with different timeline summarization models to investigate this question.
  3. Models for comparative anaphora and bridging (MA/BA).

    In both comparative and bridging anaphora, anaphor and antecedent are not coreferent. In comparative anaphora, anaphor and antecedent are mutually exclusive sets (Donald Trump --- other presidents), whereas in bridging the anaphor is related to the antecedent via world knowledge or a lexical relation (a house --- the door). For both problems, automatic resolution results are substantially lower than for coreference resolution. Several thesis topics on the automatic modelling of these problems are possible:
    • Building a minimalistic classifier for comparative/bridging anaphora. Training data for these problems is limited and often domain-specific. It is therefore useful to explore a domain-independent general rule-based classifier for these problems.
    • Word embeddings for comparative/bridging anaphora resolution.
    • An integrated, joint approach to coreference and bridging resolution.
  4. Modelling figurative language (BA/MA).

    I am interested in the resolution of metaphor and metonymy. I have the following topics in mind:
    • Do current metaphor systems actually learn metaphor or do they just learn fossilized metaphoric word senses? (BA/MA)
    • Crowdsourcing for generating metonymy datasets (BA)
    • Integrating world knowledge into metaphor/metonymy resolution
  5. Several projects on offensive language identification.

    In addition, my postdoc Michael Wiegand offers several projects on offensive language identification. The descriptions can be found at https://www.cl.uni-heidelberg.de/~wiegand/thesisTopics.txt
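
For orientation on topic 1, the frequency-based date selection baseline described there can be sketched roughly as follows. This is only an illustrative sketch and not the TILSE implementation: the function name `select_dates_by_frequency`, its parameters and the sample data are hypothetical, and it assumes that date mentions have already been extracted from the documents.

```python
from collections import Counter
from datetime import date


def select_dates_by_frequency(date_mentions, k):
    """Pick the k most frequently mentioned dates (hypothetical helper).

    date_mentions: iterable of datetime.date, one entry per mention.
    Returns the selected dates in chronological order.
    """
    counts = Counter(date_mentions)
    # Rank by descending mention count; break ties by earlier date
    # so that the result is deterministic.
    ranked = sorted(counts, key=lambda d: (-counts[d], d))
    return sorted(ranked[:k])


# Made-up mentions, purely for illustration.
mentions = [
    date(2011, 3, 15), date(2011, 3, 15),
    date(2011, 4, 1),
    date(2012, 7, 18), date(2012, 7, 18), date(2012, 7, 18),
]
selected = select_dates_by_frequency(mentions, 2)
```

A more sophisticated method, as proposed in topic 1, would replace the plain count with a score that also uses relations between dates.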

If you do not find anything you like, but would still like to work on something in the area of summarization, sentiment or discourse, please contact me.
