MATS Mentors

Research Scientist

—

Teun works at Apollo Research as a research scientist. He is currently working on the science of scheming, specifically on reward-seeking dynamics during reinforcement learning.

Before that, he worked on control, sandbagging, building an AI superforecaster, and more. He took part in MATS 5.0!

Teun is also a board member for ENAIS and SAIN.

Focus:

Empirical

Control, Scheming and Deception, Dangerous Capability Evals, Monitoring

Programs:

Autumn 2026

Luke Drago

Thinking Machines

Member of Technical Staff

—

Luke is a member of technical staff at Thinking Machines. He is also the co-author of The Intelligence Curse, an essay series that examines the potential for mass automation to drive gradual economic disempowerment.

He previously co-founded Workshop Labs, an AI research company building user-aligned models to combat disempowerment, which recently joined Thinking Machines. Prior to Workshop Labs, he was the AI governance and AI economics lead at BlueDot Impact. Before AI safety, he managed winning local election campaigns in North Carolina. He studied History & Politics at Oxford.

Focus:

Strategy and Forecasting

Policy and Governance, Strategy and Forecasting

Programs:

Autumn 2026

Pegah Maham

Google DeepMind

Policy Development Manager

—

Pegah Maham is a Policy Development and Strategy Manager within Google DeepMind’s Frontier Policy Development team, where she works at the intersection of technical AI safety and security and international governance. Her work is focused on frontier AI risks, such as biosecurity and AGI safety. Topics she is thinking about include risk assessments and mitigations, threat modelling, external testing, transparency, system integrity and model weight security.

Focus:

Policy and Governance

Policy and Governance, Strategy and Forecasting

Programs:

Autumn 2026

Geoff Ralston

Safe AI Fund

Founder

—

Geoff was a partner at Y Combinator beginning in 2011 and served as president of the accelerator from early 2019 until the end of 2022. Geoff has worked with hundreds of YC companies, including Stripe, Ginkgo, Clever, Helion and Boom, and has been an angel investor in over 100 companies. Earlier in his career, Geoff was part of the team that built Yahoo! Mail and was Yahoo!’s Chief Product Officer until 2006. Geoff was also CEO of Lala Media, which was purchased by Apple in 2009.

Focus:

Founding and Field-Building

Programs:

Autumn 2026

Charles Petty (Charlie)

Stripe

Head of a soon-to-launch biotech fund at Stripe

—

Focus:

Founding and Field-Building

Programs:

Autumn 2026

Benjamin Chang

Constellation

Resident

—

Ben is in the process of co-founding a new AI strategy research organization, to launch in the fall of 2026. His co-founder is Eli Rose, formerly a Program Director for AI safety grantmaking at Coefficient Giving. See here for more about our new organization's research agenda and what we're looking for in a collaborator.

Ben is currently a Resident at Constellation, where he just finished drafting a book-length report on national security and advanced AI that will ground our org’s intellectual vision. Ben served in the White House Office of Science and Technology Policy during the Biden-Harris administration, got his PhD in Security Studies from MIT, and has also worked at/for RAND, CSET, IARPA, and the Office of Net Assessment, among other misadventures.

Programs:

Autumn 2026

Mirko Bronzi

LawZero

Senior Applied Research Scientist

—

Mirko is a research scientist with over 15 years of experience in Machine Learning and Natural Language Processing, specializing in applied research across diverse industries.

His expertise spans Python, PyTorch, TensorFlow, and NLP frameworks like Hugging Face, enabling him to drive impactful machine learning solutions for businesses.

Mirko is passionate about optimizing deep learning model performance and advancing software engineering best practices in deep learning projects.

Focus:

Agent Foundations, Dangerous Capability Evals, Monitoring, Control, Red-Teaming, Scalable Oversight

Programs:

Autumn 2026

Buck Shlegeris

Redwood Research

CEO

—

Buck is the CEO of Redwood Research.

Focus:

Empirical

Control, Model Organisms, Scheming and Deception, Strategy and Forecasting

Programs:

Member of Technical Staff

—

Ethan Perez is a researcher at Anthropic, where he leads a team working on AI control, adversarial robustness, and other areas of AI safety research. His interests span many areas of LLM safety; he has previously led work on sleeper agents, red-teaming language models with language models, developing AI safety via debate using LLMs, and demonstrating and mitigating unfaithfulness in chain-of-thought reasoning. Read more on his website.

Focus:

Empirical

Control, Model Organisms, Red-Teaming, Scheming and Deception

Programs:

Member of Technical Staff

—

Sam leads the Cognitive Oversight subteam of Anthropic's Alignment Science team. Their goal is to be able to oversee AI systems not based on whether they have good input/output behavior, but based on whether there's anything suspicious about the cognitive processes underlying those behaviors. For example, one in-scope problem is "detecting when language models are lying, including in cases where it's difficult to tell based solely on input/output". His team is interested in both white-box techniques (e.g. interpretability-based techniques) and black-box techniques (e.g. finding good ways to interrogate models about their thought processes and motivations). For more flavor on this research direction, see his post here.

Focus:

Empirical

Control, Model Organisms, Red-Teaming, Scheming and Deception

Programs:

Member of Technical Staff

—

Fabien Roger is an AI safety researcher at Anthropic and previously worked at Redwood Research. Fabien’s research focuses on AI control and dealing with alignment faking.

Focus:

Empirical

Control, Model Organisms, Red-Teaming, Scheming and Deception

Programs:

Research Scientist

—

Nicholas is a research scientist at Google DeepMind researching adversarial machine learning; he likes to break things.

Focus:

Empirical

Control, Model Organisms, Red-Teaming, Scheming and Deception

Programs:

Member of Technical Staff

—

Sam Bowman leads a research group working on AI alignment and welfare at Anthropic, with a particular focus on evaluation. Sam is also on leave from NYU as an Associate Prof. of Computer Science and Data Science. He has been studying neural network language models since 2012.

Focus:

Empirical

Control, Model Organisms, Red-Teaming, Scheming and Deception

Programs:

Member of Technical Staff

—

Joe is a member of the Alignment Science team at Anthropic. He's currently working on scalable oversight and also has interests in control, chain-of-thought monitoring, and alignment evaluations. For some examples of recent projects, including MATS collaborations, see: https://joejbenton.com/research/.

Focus:

Empirical

Control, Model Organisms, Red-Teaming, Scheming and Deception

Programs:

Co-President and Scientific Director (LawZero) / Full Professor (UdeM) / Founder and Scientific Advisor (Mila)

—

Yoshua Bengio is Full Professor of Computer Science at Université de Montréal, Co-President and Scientific Director of LawZero, and the Founder and Scientific Advisor of Mila. He also holds a Canada CIFAR AI Chair. Considered one of the world’s leaders in Artificial Intelligence and Deep Learning, he is the recipient of the 2018 A.M. Turing Award, often called the "Nobel Prize of computing." He is the most-cited computer scientist worldwide, and the most-cited living scientist across all fields (by total citations).

Professor Bengio is a Fellow of both the Royal Society of London and the Royal Society of Canada, an Officer of the Order of Canada, a Knight of the Legion of Honor of France, a member of the UN’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology, and chair of the International AI Safety Report.

Focus:

Empirical

Agent Foundations, Monitoring, Control, Red-Teaming, Scalable Oversight, Dangerous Capability Evals

Programs:

Research Scientist

—

Mary is a research scientist on the AGI Safety and Alignment team at Google DeepMind, where she works on preparedness for loss of control risks (misalignment, ML R&D, model poisoning). Her role involves making sure GDM has sufficient early warning signals and response plans in place for these threats. Previously, she worked on AI control and on dangerous capability evaluations for scheming precursor capabilities (stealth and situational awareness) as well as catastrophic misuse capabilities.

Focus:

Empirical

Control, Scheming and Deception, Model Organisms, Monitoring

Programs:

Research Scientist

—

Alex is a Research Scientist at Google DeepMind. He’s currently working on training invariants into model behavior. In the past, he formulated and proved the power-seeking theorems, co-formulated the shard theory of human value formation, and proposed the Attainable Utility Preservation approach to penalizing negative side effects.

Highlighted outputs from past streams:

- Mechanistic interpretability to understand and control maze-solving agents (MATS 3.0, paper)
  - Introduced the now-staple technique of “steering vectors”
- Steering GPT-2-XL by adding an activation vector
- Steering Llama-2 with contrastive activation additions (MATS 4.0, paper)
- Unsupervised discovery of model behaviors using steering vectors (MATS 5.0)
- Gradient routing (MATS 6.0)
- Unlearn and distill for making robust unlearning a reality

Focus:

Empirical

Interpretability, Agent Foundations

Programs:

Executive Director

—

Daniel is working on forecasting detailed AI scenarios with Eli Lifland, Thomas Larsen, Jonas Vollmer, and Romeo Dean.

Focus:

Strategy and Forecasting

Strategy and Forecasting, Policy and Governance

Programs:

Threat Modeler Lead

—

I am a Senior Research Fellow at the Center for the Governance of AI, leading a work stream that investigates national security threats from advanced AI systems. I am also a collaborator at METR, where I help improve the rigor of system cards and evals, and a senior at the Forecasting Research Institute.

I am interested in mentoring projects that create rigorous threat models of near-term AI misuse, especially within biosecurity. Given that this work can include sensitive topics, the final output might look like writing memos and briefings for decision-makers instead of academic publications.

I am also interested in projects that try to strengthen the science and transparency of dangerous capability evaluation reporting. This includes creating standards and checklists, writing peer reviews of model cards, and designing randomized controlled trials that can push the current frontier.

Focus:

Biosecurity

Biorisk, Security, Safeguards

Programs:

Research Scientist

—

David Lindner is a Research Scientist on Google DeepMind's AGI Safety and Alignment team, where he works on evaluations and mitigations for deceptive alignment and scheming. His recent work includes MONA, a method for reducing multi-turn reward hacking during RL; designing evaluations for stealth and situational awareness; and helping develop GDM's approach to deceptive alignment. Currently, David is interested in studying mitigations for scheming, including CoT monitoring and AI control. You can find more details on his website.

Focus:

Empirical

Control, Monitoring, Safeguards, Scheming and Deception, Dangerous Capability Evals

Programs:

MATS mentors are advancing the frontiers of AI alignment, transparency, and security