Chris Biemann
Ubiquitous Knowledge Processing Lab, TU Darmstadt
http://www.ukp.tu-darmstadt.de/people/prof-dr-chris-biemann/

I. Lexical Substitution using Crowdsourcing

When moving from keyword search to semantic search, a desired behavior of information retrieval systems is to match words with the same meaning. Simple synonym lists lead to spurious matches across word senses, which motivates the use of word sense disambiguation. Word sense disambiguation systems usually employ WordNet as a sense inventory. WordNet's fine-grained sense distinctions are widely believed to be the reason for weak system performance. In the first part of this talk, I describe how I used crowdsourcing to obtain the Turk Bootstrap Word Sense Inventory*, a freely available word sense inventory with ample annotated senses in context. This complex resource was produced by an alternating cycle of three simple crowdsourcing tasks. With this data, I succeeded in building a system that produces lexical substitutions in context by using supervised word sense disambiguation techniques. Here, unsupervised features help achieve both state-of-the-art performance on a standard task and very high substitution acceptability.

II. Graph Measures for the Quality of Language Models

In the second part of the talk, I introduce network motif analysis for co-occurrence graphs. Motifs are local connectivity patterns that reflect the functional structure of a network. Differences between the motif profiles of co-occurrence graphs for real language and for language generated by a language model quantify how well the language model takes homonymy and polysemy into account. An outlook is given on how to incorporate these semantic invariants of language into future language models.
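
As a rough illustration of what a motif profile of a co-occurrence graph can look like, the following minimal Python sketch counts the two connected undirected three-node motifs, open chains and triangles. It is only a sketch under stated assumptions: the abstract does not specify how the graphs are built or which motif sizes are used, so the sentence-level co-occurrence window, the restriction to three-node motifs, the toy sentences, and the function names `cooccurrence_graph` and `three_node_motif_profile` are all hypothetical choices made for this example.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(sentences):
    """Build an undirected graph: words are nodes, and an edge joins
    two words that occur in the same sentence (an assumed window)."""
    graph = defaultdict(set)
    for sentence in sentences:
        words = set(sentence.lower().split())
        for a, b in combinations(sorted(words), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def three_node_motif_profile(graph):
    """Count open chains (a - center - b, with a and b not linked)
    and triangles (all three nodes linked)."""
    chains = 0
    triangle_corners = 0  # each triangle is seen once per corner
    for center, neighbours in graph.items():
        for a, b in combinations(sorted(neighbours), 2):
            if b in graph[a]:
                triangle_corners += 1
            else:
                chains += 1
    return {"chain": chains, "triangle": triangle_corners // 3}


if __name__ == "__main__":
    # Toy corpus: "bank" appears with two unrelated groups of words.
    toy_sentences = ["bank river water", "bank money loan"]
    print(three_node_motif_profile(cooccurrence_graph(toy_sentences)))
```

On the toy corpus this yields 4 open chains and 2 triangles. All four chains are centered on the ambiguous word "bank", whose two groups of neighbours are not linked to each other. This gives one intuition for why motif counts might be sensitive to homonymy and polysemy. Comparing such a profile for real text with the profile for text sampled from a language model would follow the idea in the abstract, though the comparison measure itself is not given there.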