Alan Cooney

This stream focuses on empirical AI control research, including defending against AI-driven data poisoning, evaluating and attacking chain-of-thought monitorability, and related monitoring/red-teaming projects. It is well-suited to applicants already interested in AI safety with solid Python skills, and ideally prior research or familiarity with control literature/tools (e.g. Inspect/ControlArena). 

Stream overview

  • Control for data poisoning. Studying scenarios where an AI system covertly data poisons  another AI, to e.g., instill backdoors or secret loyalties.
  • Chain-of-thought monitorability. Evaluating the monitorability of reasoning verbalised by AIs & studying ways this may be compromised.
  • Other Control projects (see https://alignmentproject.aisi.gov.uk/research-area/empirical-investigations-into-ai-monitoring-and-red-teaming for ideas)

Mentors

Alan Cooney
UK AISI
,
Researcher
London
Control, Monitoring

Alan Cooney leads the Autonomous Systems workstream within the UK's AI Safety Institute. His team is responsible for assessing the capabilities and risks of Frontier AI systems released by AI labs such as OpenAI, Google and Anthropic. Prior to working in AI safety, he was an investment consultant and start-up founder, with his company Skyhook being acquired in 2023. He also completed Stanford’s Machine Learning and Alignment Theory Scholars Programme, where he was supervised by Google DeepMind researcher Neel Nanda.

Mentorship style

1-hour weekly meetings for going through your research log & high level guidance. Daily updates on slack are also very useful and I typically reply within 2 days to any questions.

Scholars we are looking for

Essential:

  • Existing interested in AI safety
  • Programming experience with Python/similar

You may be a good fit if you also have some of:

  • Research experience: prior AI research experience (on any topic)
  • Strong Python skills (essential for CoT monitorability eval projects)
  • Familiarity with the Control literature - see here
  • Familiarity with Inspect/ControlArena

Not a good fit:

  • Scholars primarily interested in conceptual rather than empirical research

Collaborating with other MATS scholars.

Project selection

By default I'll propose several projects for you to choose from, but you can also pitch ideas that you're interested in.

Community at MATS

MATS Research phase provides scholars with a community of peers.

During the Research phase, scholars work out of a shared office, have shared housing, and are supported by a full-time Community Manager.

Working in a community of independent researchers gives scholars easy access to future collaborators, a deeper understanding of other alignment agendas, and a social network in the alignment community.

Previous MATS cohorts included regular lightning talks, scholar-led study groups on mechanistic interpretability and linear algebra, and hackathons. Other impromptu office events included group-jailbreaking Bing chat and exchanging hundreds of anonymous compliment notes.  Scholars organized social activities outside of work, including road trips to Yosemite, visits to San Francisco, and joining ACX meetups.