Maarten Sap

I am an assistant professor at CMU's LTI department with a courtesy appointment in HCII, and a part-time research scientist and AI safety lead at the Allen Institute for AI (AI2). My research focuses on (1) measuring and improving AI systems' social and interactional intelligence, (2) assessing and combatting social inequality, safety risks, and socio-cultural biases in human- or AI-generated language, and (3) building narrative language technologies for prosocial outcomes. I was named a 2025 Packard Fellow and a recipient of the 2025 Okawa Research Award.

I received my PhD from the University of Washington, where I was advised by Noah Smith and Yejin Choi.
[bio for talks]

Recent updates:

December 2025 🏅📃: Very excited to have our paper Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond) selected for a Best Paper Award at NeurIPS 2025 (Datasets and Benchmarks Track)!! Huge congrats to the first author Liwei Jiang!!!

November 2025 💎🚀: Honored to be a Spring 2025 recipient of the Amazon Research Award for our project on measuring AI agentic safety!

October 2025 🏅⭐: I’m super excited and grateful to announce that I'm part of the 2025 class of Packard Fellows. The Packard Foundation and this fellowship will allow me to explore exciting research directions towards culturally responsible and safe AI 🌍🌈

October 2025 🔍🧑‍🎓: Because my lab is already quite full, I'm not looking for any new students in this upcoming PhD application cycle 😟.

October 2025 🇨🇦🎉: Excited to be attending COLM 2025 in Montreal this October! I'll be giving a talk at the Social Sim Workshop on Unlocking Social Intelligence in AI agents. I'm also thrilled that five papers I co-authored will be presented by my amazing collaborators at COLM: HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions (led by Xuhui Zhou et al.), ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning (co-led by Jimin Mun et al.), PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages, Fluid Language Model Benchmarking, and The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains.

August 2025 🌟: Incredibly honored to be one of 7 US recipients of the 2025 Okawa Research Grant from the Okawa Foundation!

August 2025 🧑‍🎓: Welcoming my first postdoc, Vasudha Varadarajan, to the lab!

[older news]

My research group:

Dan Chechelnitsky

CMU Portugal LTI PhD student
co-advised with Chrysoula Zerva

Joel Mire

LTI PhD student

Karina Halevy

LTI PhD student
co-advised with Mona Diab

Malia Morgan

Pre-doctoral Young Investigator at Ai2

Jimin Mun

LTI PhD student

Jocelyn Shen

MIT PhD student
co-advised with Cynthia Breazeal

Kynnedy Smith

HCII PhD student
co-advised with Motahhare Eslami

Vasudha Varadarajan

LTI Postdoc

Akhila Yerukola

LTI PhD student

Mingqian Zheng

LTI PhD student
co-advised with Carolyn Rosé

Xuhui Zhou

LTI PhD student

Overarching Research Themes

Themes extracted and images generated with the OpenAI API; there may be inconsistencies.

Social Pragmatics and Theory of Mind

My research group explores how to evaluate and improve the social intelligence of AI systems, especially their ability to read context, manage information, and respond appropriately in multi-party interaction. A key strand of work asks whether language agents truly understand social situations or merely imitate them, as seen in [SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents](https://arxiv.org/abs/2310.11667) and [Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models](https://arxiv.org/abs/2305.14763). More recent work such as [Cognitive Chain-of-Thought: Structured Multimodal Reasoning about Social Situations](https://arxiv.org/abs/2507.20409) and [SOTOPIA-ToM: Evaluating Information Management in Multi-Agent Interaction with Theory of Mind](https://arxiv.org/abs/2605.02307) pushes toward richer, more structured assessments of social reasoning and perspective-taking. Across these papers, the field is moving from static benchmarks to interactive and situation-aware evaluations that better capture real pragmatic competence.

Measuring Agentic Safety and Reliance

My research group explores how to measure the safety of increasingly agentic AI systems, while also understanding how people rely on them in ways that can help or harm. Work such as [OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety](https://arxiv.org/abs/2507.06134) and [HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions](http://arxiv.org/abs/2409.16427) reflects a push to test agents in realistic, high-stakes settings rather than only in abstract benchmarks. On the human side, [Rel-A.I.: An Interaction-Centered Approach To Measuring Human-LM Reliance](https://aclanthology.org/2025.naacl-long.556/) and [Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty](https://arxiv.org/abs/2401.06730) examine how model behavior shapes trust, dependence, and decision-making. Together, these papers show growing concern with both direct agent failure modes and subtler safety risks such as overreliance, manipulation, and miscalibrated confidence.

Cultural Adaptation and Fairness

My research group explores how AI systems can become more culturally competent, while avoiding harms that arise when personalization or localization ignores community norms. A central theme is benchmarking adaptation across cultures, with [CCBENCH: Assessing LLM Cultural Competence via Implicitly Signaled Norms using Health Queries](https://arxiv.org/abs/2607.05405) and [NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models](https://aclanthology.org/2025.naacl-long.120/) providing explicit measures of cross-cultural responsiveness. At the same time, fairness work such as [Black LLMirror: User (Self) Perceptions in Black American English Interactions with LLMs](https://dl.acm.org/doi/abs/10.1145/3772318.3791111) highlights how insufficient adaptation can produce alienation, misrecognition, or bias toward minority varieties. This line of research treats cultural competence not as a cosmetic feature, but as a safety and equity requirement for personalized AI.

Narrative Understanding for Human Connection

My research group explores how AI can support human-human connection by understanding stories, personal narratives, and the social meaning embedded in them. [HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs](https://arxiv.org/abs/2405.17633) and [Modeling Empathic Similarity in Personal Narratives](https://arxiv.org/abs/2305.14246) study how narrative style and emotional resonance can be identified and modeled computationally. More recent work like [Social Story Frames: Contextual Reasoning about Narrative Intent and Reception](https://arxiv.org/abs/2512.15925) extends this toward understanding how stories are intended, interpreted, and socially received. Overall, these papers suggest that story understanding is becoming a pathway for better empathy modeling, interpersonal communication, and AI systems that help people connect rather than merely generate text.
