Maarten Sap

I am an assistant professor at CMU's LTI department with a courtesy appointment in HCII, and a part-time research scientist and AI safety lead at the Allen Institute for AI (AI2). My research focuses on (1) measuring and improving AI systems' social and interactional intelligence, (2) assessing and combatting social inequality, safety risks, and socio-cultural biases in human- or AI-generated language, and (3) building narrative language technologies for prosocial outcomes. I was named a 2025 Packard Fellow and a recipient of the 2025 Okawa Research Award.

I received my PhD from the University of Washington where I was advised by Noah Smith and Yejin Choi.

Recent updates:

December 2025 🏅📃: Very excited to have our paper Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond) selected for a Best Paper Award at NeurIPS 2025 (Datasets and Benchmarks Track)!! Huge congrats to the first author Liwei Jiang!!!

November 2025 💎🚀: Honored to be a Spring 2025 recipient of the Amazon Research Award for our project on measuring AI agentic safety!

October 2025 🏅⭐: I'm super excited and grateful to announce that I'm part of the 2025 class of Packard Fellows. The Packard Foundation and this fellowship will allow me to explore exciting research directions towards culturally responsible and safe AI 🌍🌈

October 2025 🔍🧑‍🎓: Because my lab is already quite full, I'm not looking for any new students in this upcoming PhD application cycle 😟.

October 2025 🇨🇦🎉: Excited to be attending COLM 2025 in Montreal this October! I'll be giving a talk at the Social Sim Workshop on Unlocking Social Intelligence in AI agents. I'm also thrilled that five papers I co-authored will be presented by my amazing collaborators at COLM: HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions (led by Xuhui Zhou et al.), ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning (co-led by Jimin Mun et al.), PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages, Fluid Language Model Benchmarking, and The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains.

August 2025 🌟: Incredibly honored to be one of 7 US recipients of the 2025 Okawa Research Grant from the Okawa Foundation!

August 2025 🧑‍🎓: Welcoming my first postdoc, Vasudha Varadarajan, to the lab!



My research group:

Dan Chechelnitsky

CMU Portugal LTI PhD student
co-advised with Chrysoula Zerva

Joel Mire

LTI PhD student

Karina Halevy

LTI PhD student
co-advised with Mona Diab

Jimin Mun

LTI PhD student

Jocelyn Shen

MIT PhD student
co-advised with Cynthia Breazeal

Kynnedy Smith

HCII PhD student
co-advised with Motahhare Eslami

Vasudha Varadarajan

LTI Postdoc

Akhila Yerukola

LTI PhD student

Mingqian Zheng

LTI PhD student
co-advised with Carolyn Rosé

Xuhui Zhou

LTI PhD student


Overarching Research Themes

Themes extracted and images generated with the OpenAI API; there may be inconsistencies.

Ethics and Responsible AI Practices

My research group explores the critical implications of ethics in AI development and deployment. One key area of focus is safety metrics for AI agents, as discussed in the paper [OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety](https://arxiv.org/abs/2507.06134), which provides a framework for assessing the risks associated with AI technologies. Additionally, we investigate the spectrum of human judgments on AI harm through the findings in [PluriHarms: Benchmarking the Full Spectrum of Human Judgments on AI Harm](https://arxiv.org/abs/2601.08951), shedding light on diverse perceptions of AI impacts on society. Lastly, our work also addresses the challenges of user perceptions in AI interactions, as exemplified in [Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences](https://arxiv.org/abs/2506.00195).

Exploring Narrative Intent and Empathy

My research group explores the intersection of narrative analysis and technology, particularly focusing on how narratives shape understanding and empathy. An important contribution in this area is the study [Social Story Frames: Contextual Reasoning about Narrative Intent and Reception](https://arxiv.org/abs/2512.15925), which delves into how contextual frameworks influence narrative reception and interpretation. Another vital work, [HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs](https://arxiv.org/abs/2405.17633), examines how narrative style affects emotional engagement and empathy in stories. Furthermore, our research investigates the variability in perceptions of social media narratives, highlighted in [The Empirical Variability of Narrative Perceptions of Social Media Texts](https://aclanthology.org/2024.emnlp-main.1113/), reflecting the evolving landscape of digital communication.

AI Agents and Social Intelligence Dynamics

My research group explores the development and evaluation of AI agents designed to enhance social intelligence. We focus on understanding the complexities of human-like interactions in AI, as demonstrated in the paper [SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents](https://arxiv.org/abs/2310.11667), which assesses the social reasoning capacities of language agents. Additionally, we investigate tools for better integrating Theory of Mind in AI interactions, as explored in [SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions](https://arxiv.org/abs/2506.23046). This body of work enriches our understanding of social dynamics in AI systems and how they mimic human-like empathy and reasoning.

Leveraging Language in Technical Contexts

My research group explores advances in large language models (LLMs) for technical and practical applications. We analyze the effectiveness of LLMs in software engineering through [Ambig-SWE: Interactive Agents to Overcome Underspecificity in Software Engineering](https://arxiv.org/abs/2502.13069), which shows promise in enhancing problem-solving capabilities in this domain. Other findings highlight bias in LLM performance across different linguistic contexts: [this paper](https://arxiv.org/abs/2504.08231) investigates RAG's sensitivity to linguistic variation, underlining the need for more inclusive models that accommodate diverse linguistic styles. Our research aims not only to improve performance on technical tasks but also to address inherent biases within these models.