
Anthropic
John Hughes is an AI control researcher at Anthropic and previously worked as a machine learning engineer at Speechmatics (Cambridge, UK). He has been mentored by Ethan Perez since Summer 2023, when he took part in MATS4. During his time at MATS, he worked on a paper on AI safety via debate, which was awarded best paper at ICML 2024. After MATS, he worked with Anthropic as a contractor on jailbreak robustness in a narrow domain, Best-of-N Jailbreaking, and why models fake alignment.
The Summer 2023 cohort supported 60 scholars with 15 mentors, working across 12 different research areas. The program consisted of a remote 4-week training phase, an 8-week research phase in Berkeley, and a 4-month extension phase. MATS leadership co-founded the London Initiative for Safe AI (LISA) in September 2023 to provide a dedicated research space for AI safety researchers and organizations in London, and for MATS scholars to continue their research projects. Research projects were distributed across multiple areas, with approximately one-third focused on evaluations and capability demonstrations and one-fifth on mechanistic interpretability, alongside work on agent foundations, activation engineering, and cooperative AI.
John Hughes