A Causal Model of Theory-of-Mind in AI Agents

MATS Fellow:

Jack Foxabbott

Authors:

Jack Foxabbott, Rohan Subramani, James Fox, Francis Rhys Ward

Citations

0 Citations

Abstract:

Agency is a vital concept for understanding and predicting the behaviour of future AI systems. There has been much focus on the goal-directed nature of agency, i.e., the fact that AI agents may capably pursue goals. However, the dynamics of agency become significantly more complex when autonomous agents interact with other agents and humans, which requires theory-of-mind: the ability to reason about the beliefs and intentions of others. In this paper, we extend the framework of multi-agent influence diagrams (MAIDs) to explicitly capture this complex form of reasoning. We also show that our extended framework, MAIDs with incomplete information (II-MAIDs), has a strong theoretical connection to dynamic games with incomplete information and no common prior over types. We prove the existence of important equilibrium concepts in these frameworks, and illustrate the applicability of II-MAIDs using an example from the AI safety literature.
