Valentin Hofmann

Young Investigator (Postdoc)

Allen Institute for AI & University of Washington

About

I am a Young Investigator (Postdoc) at the Allen Institute for AI and the University of Washington, where I work with Noah Smith and Hannaneh Hajishirzi. Previously, I was a DPhil (PhD) student at the University of Oxford and a researcher at LMU Munich, advised by Janet Pierrehumbert and Hinrich Schütze. During my DPhil, I also spent time at DeepMind and at Stanford University.

My work integrates machine learning with insights from linguistics and social science to identify and address limitations in natural language processing systems. Currently, I focus on large language models, particularly in the areas of tokenization, sociolinguistic grounding, and bias.

Interests

Large Language Models
Tokenization
Computational Morphology
Sociolinguistic Grounding
Fair Language Technology

Education

DPhil (PhD) in Linguistics

University of Oxford, 2024
MSc in Computational Linguistics

LMU Munich, 2020
MSt in Linguistics

University of Oxford, 2018

News

05/2025: Our work on linguistic generalization in language models was featured in University of Oxford News
05/2025: Our paper on linguistic generalization in language models was published as an article in PNAS
05/2025: New preprint on long-form benchmarking of audio language models
04/2025: I was interviewed by The Nikkei about the cultural impact of language models
04/2025: Giving a talk on AI dialect prejudice at Carnegie Mellon University
04/2025: Giving a talk on AI dialect prejudice at Ohio State University
04/2025: Our Tokenization Workshop (TokShop) was accepted to ICML 2025
03/2025: Our work on graph-enhanced language models was featured in iX Magazin
03/2025: New preprint on superword tokenization
02/2025: Giving a talk on AI dialect prejudice at the University of Washington
02/2025: New preprint on measuring political bias in language models
12/2024: Giving a talk on AI dialect prejudice at the University of Hamburg
12/2024: Giving a talk on AI dialect prejudice at the University of Southern California
11/2024: New preprint on linguistic generalization in language models
10/2024: Giving a talk on AI dialect prejudice at the SESP Annual Conference
10/2024: New preprint on dialect fairness and robustness in language model reasoning
10/2024: Our papers on multilingual fairness in tokenization and the Paloma benchmark were both accepted to NeurIPS 2024
09/2024: Our work on dialect prejudice was featured in Science News, ScienceNews, and PNAS Front Matter
08/2024: Our paper on dialect prejudice in language models was published as an article in Nature
08/2024: Two awards at ACL 2024: Outstanding Paper for our work on evaluating political values in language models, and Best Resource Paper for Dolma
05/2024: Our papers on evaluating political values in language models and the Dolma corpus were both accepted to ACL 2024
05/2024: Our paper on graph-enhanced language models was accepted to ICML 2024
03/2024: Our work on AI dialect prejudice was featured in The Guardian, Sky News, New Scientist, MIT Tech Review, The Register, and Nature News
03/2024: New preprint on AI dialect prejudice against speakers of African American English
02/2024: New preprint on evaluating political values in language models
02/2024: New preprint on graph-enhanced language models for asynchronous plan reasoning
02/2024: Giving a talk on geographically grounded language models at the University of Cambridge
02/2024: New preprint on the Dolma corpus
01/2024: Our paper on measuring and improving the geolinguistic knowledge of language models was published as an article in TACL

Publications

2025

Aligned but blind: Alignment increases implicit bias by reducing awareness of race
Lihao Sun, Chengzhi Mao, Valentin Hofmann, Xuechunzi Bai
ACL 2025

Assessing dialect fairness and robustness of large language models in reasoning tasks
Fangru Lin, Shaoguang Mao, Emanuele La Malfa, Valentin Hofmann, Adrian de Wynter, Xun Wang, Si-Qing Chen, Michael Wooldridge, Janet Pierrehumbert, and Furu Wei
ACL 2025

Derivational morphology reveals analogical generalization in large language models
Valentin Hofmann, Leonie Weissweiler, David Mortensen, Hinrich Schütze, and Janet Pierrehumbert
PNAS

BLAB: Brutally long audio bench
Orevaoghene Ahia, Martijn Bartelds, Kabir Ahuja, Hila Gonen, Valentin Hofmann, Siddhant Arora, Shuyue Stella Li, Vishal Puttagunta, Mofetoluwa Adeyemi, Charishma Buchireddy, Ben Walls, Noah Bennett, Shinji Watanabe, Noah Smith, Yulia Tsvetkov, and Sachin Kumar
arXiv:2505.03054

SuperBPE: Space travel for language models
Alisa Liu, Jonathan Hayase, Valentin Hofmann, Sewoong Oh, Noah Smith, and Yejin Choi
arXiv:2503.13423

IssueBench: Millions of realistic prompts for measuring issue bias in LLM writing assistance
Paul Röttger, Musashi Hinck, Valentin Hofmann, Kobi Hackenburg, Valentina Pyatkin, Faeze Brahman, and Dirk Hovy
arXiv:2502.08395

2024

AI generates covertly racist decisions about people based on their dialect
Valentin Hofmann, Pratyusha Ria Kalluri, Dan Jurafsky, and Sharese King
Nature

MAGNET: Improving the multilingual fairness of language models with adaptive gradient-based tokenization
Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Valentin Hofmann, Tomasz Limisiewicz, Yulia Tsvetkov, and Noah Smith
NeurIPS 2024

Paloma: A benchmark for evaluating language model fit
Ian Magnusson, Akshita Bhagia, Valentin Hofmann, Luca Soldaini, Ananya Harsh Jha, Oyvind Tafjord, Dustin Schwenk, Evan Pete Walsh, Yanai Elazar, Kyle Lo, Dirk Groeneveld, Iz Beltagy, Hannaneh Hajishirzi, Noah Smith, Kyle Richardson, and Jesse Dodge
NeurIPS 2024

Political compass or spinning arrow? Towards more meaningful evaluations for values and opinions in large language models
Paul Röttger*, Valentin Hofmann*, Valentina Pyatkin, Musashi Hinck, Hannah Kirk, Hinrich Schütze, and Dirk Hovy
ACL 2024
🏆 Outstanding Paper Award

Dolma: An open corpus of three trillion tokens for language model pretraining research
Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Pete Walsh, Luke Zettlemoyer, Noah Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, and Kyle Lo
ACL 2024
🏆 Best Resource Paper Award

Graph-enhanced large language models in asynchronous plan reasoning
Fangru Lin, Emanuele La Malfa, Valentin Hofmann, Elle Michelle Yang, Anthony Cohn, and Janet Pierrehumbert
ICML 2024

Geographic adaptation of pretrained language models
Valentin Hofmann, Goran Glavas, Nikola Ljubesic, Janet Pierrehumbert, and Hinrich Schütze
TACL

2023

Counting the bugs in ChatGPT's wugs: A multilingual investigation into the morphological capabilities of a large language model
Leonie Weissweiler*, Valentin Hofmann*, Anjali Kantharuban, Anna Cai, Ritam Dutt, Amey Hengle, Anubha Kabra, Atharva Kulkarni, Abhishek Vijayakumar, Haofei Yu, Hinrich Schütze, Kemal Oflazer, and David Mortensen
EMNLP 2023

Explaining pretrained language models' understanding of linguistic structures using construction grammar
Leonie Weissweiler, Valentin Hofmann, Abdullatif Köksal, and Hinrich Schütze
Frontiers in Artificial Intelligence

2022

The better your syntax, the better your semantics? Probing pretrained language models for the English comparative correlative
Leonie Weissweiler, Valentin Hofmann, Abdullatif Köksal, and Hinrich Schütze
EMNLP 2022

Unsupervised detection of contextualized embedding bias with application to ideology
Valentin Hofmann, Janet Pierrehumbert, and Hinrich Schütze
ICML 2022

Modeling ideological salience and framing in polarized online groups with graph neural networks and structured sparsity
Valentin Hofmann, Xiaowen Dong, Janet Pierrehumbert, and Hinrich Schütze
NAACL 2022 (Findings)

The Reddit Politosphere: A large-scale text and network resource of online political discourse
Valentin Hofmann, Hinrich Schütze, and Janet Pierrehumbert
ICWSM 2022

An embarrassingly simple method to mitigate undesirable properties of pretrained language model tokenizers
Valentin Hofmann, Hinrich Schütze, and Janet Pierrehumbert
ACL 2022

CaMEL: Case marker extraction without labels
Leonie Weissweiler, Valentin Hofmann, Masoud Jalili Sabet, and Hinrich Schütze
ACL 2022

2021

Superbizarre is not superb: Derivational morphology improves BERT’s interpretation of complex words
Valentin Hofmann, Janet Pierrehumbert, and Hinrich Schütze
ACL 2021

Dynamic contextualized word embeddings
Valentin Hofmann, Janet Pierrehumbert, and Hinrich Schütze
ACL 2021

2020

DagoBERT: Generating derivational morphology with a pretrained language model
Valentin Hofmann, Janet Pierrehumbert, and Hinrich Schütze
EMNLP 2020

Predicting the growth of morphological families from social and linguistic factors
Valentin Hofmann, Janet Pierrehumbert, and Hinrich Schütze
ACL 2020

A graph auto-encoder model of derivational morphology
Valentin Hofmann, Hinrich Schütze, and Janet Pierrehumbert
ACL 2020