The Underlying Logic of Language Models
Abstract
The formal foundations of computation are rooted in the study of languages—subsets of Σ*, the set of all strings over an alphabet Σ. Models of computation can be taxonomized by the classes of languages they decide, that is, by the sets for which they can determine membership. For example, finite-state automata correspond exactly to the regular languages.
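For concreteness, here is a minimal sketch (not part of the original abstract) of such a membership decision: a two-state deterministic finite-state automaton, written in Python, that decides the regular language of strings over {a, b} containing an even number of a's.

```python
# Illustrative sketch: a DFA deciding a regular language over the alphabet {a, b}.
# State 0 = even number of 'a's seen so far, state 1 = odd.

def accepts_even_as(string: str) -> bool:
    state = 0
    for symbol in string:
        if symbol not in {"a", "b"}:
            raise ValueError(f"symbol {symbol!r} is not in the alphabet")
        if symbol == "a":
            state = 1 - state      # toggle parity on every 'a'
    return state == 0              # accept iff we end in the even-parity state

print(accepts_even_as("abba"))     # True  (two a's)
print(accepts_even_as("ab"))       # False (one a)
```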
Language models may be viewed as probabilistic generalizations of languages, where the notion of a discrete set is relaxed into that of a probability distribution over Σ*. Recent years have seen remarkable progress in natural language processing driven by language models parameterized by recurrent neural networks, transformers, and state-space models. In parallel to the classical taxonomy of deterministic computational models, researchers have begun to categorize the expressive power of these architectures in terms of the distributions over strings they can represent.
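To make this relaxation precise, a standard formalization (added here for illustration, not quoted from the abstract) treats a language model as a probability distribution p over Σ*; autoregressive parameterizations such as RNNs, transformers, and state-space models factorize p into next-symbol conditionals, terminated by a designated end-of-string symbol EOS.

```latex
% A language model is a probability distribution over all strings:
p \colon \Sigma^{*} \to [0, 1], \qquad \sum_{\boldsymbol{w} \in \Sigma^{*}} p(\boldsymbol{w}) = 1.

% Autoregressive parameterizations factorize p over next-symbol conditionals,
% closing off the string with an end-of-string symbol:
p(\boldsymbol{w}) \;=\; p(\textsc{eos} \mid \boldsymbol{w}) \prod_{t=1}^{|\boldsymbol{w}|} p\!\left(w_t \mid \boldsymbol{w}_{<t}\right).
```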
This tutorial provides a self-contained overview of the formal methods used to analyze and compare the expressivity of language models. These methods span formal language and automata theory, several branches of formal logic, circuit complexity, and programming frameworks such as RASP. As one example, we show how transformers—under varying assumptions—can be precisely characterized by different fragments of formal logic.
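As a taste of the programming-framework side, the sketch below mimics RASP-style select/aggregate primitives in plain Python. The names follow RASP's vocabulary, but this is an illustrative toy rather than the RASP language itself; it computes, at each position, the fraction of a's in the prefix, an attention-style computation.

```python
# Toy Python rendering of RASP-style primitives (select / aggregate).
# Not the actual RASP language; only meant to show how attention-like
# operations can be written as sequence-to-sequence programs.

def select(keys, queries, predicate):
    """Boolean attention matrix: entry [q][k] is True when predicate(keys[k], queries[q]) holds."""
    return [[predicate(k, q) for k in keys] for q in queries]

def aggregate(selector, values):
    """For each query position, average the selected values (an attention-weighted mean)."""
    out = []
    for row in selector:
        picked = [v for sel, v in zip(row, values) if sel]
        out.append(sum(picked) / len(picked) if picked else 0.0)
    return out

# Example: fraction of 'a's in each prefix of the input string.
tokens = list("abbab")
indices = list(range(len(tokens)))

prefix = select(indices, indices, lambda k, q: k <= q)   # causal (prefix) mask
is_a = [1.0 if t == "a" else 0.0 for t in tokens]
print(aggregate(prefix, is_a))                           # [1.0, 0.5, 0.333..., 0.5, 0.4]
```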
Bio
Ryan Cotterell has been an assistant professor of computer science at ETH Zürich since 2020. Previously, he was a lecturer at the University of Cambridge. He received his PhD from Johns Hopkins University under the supervision of Jason Eisner. His research interests include natural language processing, computational linguistics, and machine learning. He has published in major venues in both natural language processing (ACL, NAACL, EMNLP) and machine learning (NeurIPS, ICML, ICLR). He has also received several paper awards, including the overall Best Paper Award at ACL 2017.

