MATS Fellow:
Pierre Beckmann
Authors:
Pierre Beckmann, Patrick Butlin
Abstract:
The individuation problem for large language models asks which entities associated with them, if any, should be identified as minds. We approach this problem through mechanistic interpretability, engaging in particular with recent empirical work on persona vectors, persona space, and emergent misalignment. We argue that three views are the strongest candidates: the virtual instance view and two new views we introduce, the (virtual) instance-persona view and the model-persona view. First, we argue for the virtual instance view on the grounds that attention streams sustain quasi-psychological connections across token-time. Then we present the persona literature, organised around three hypotheses about the internal structure underlying personas in LLMs, and show that the two persona-based views are promising alternatives.
Where is the Mind? Persona Vectors and LLM Individuation
Authors:
Pierre Beckmann
Date:
April 20, 2026
More Capable, Less Cooperative? When LLMs Fail At Zero-Cost Collaboration
Authors:
Advait Yadav
Date:
April 9, 2026
The MATS Program is an independent research and educational initiative connecting emerging researchers with mentors in AI alignment, governance, and security.
Each MATS cohort runs for 12 weeks in Berkeley, California, followed by an optional 6–12 month extension in London for selected scholars.