
UK AISI
—
Head of Science of Evaluation
I currently lead the science of evaluation team at the AI Security Institute in London. I joined AISI early in its life, and have worked in several roles, including co-leading the team responsible for our pre-deployment testing programme.
I'm broadly interested in dangerous capability evals, and in understanding agent behaviours and their implications for policy. In particular, I'm interested in:
- Developing predictive statistical models that capture the relationships between LLM performance, task characteristics and LLM characteristics.
- Developing methods to predict when models might cross a capability threshold.
- Developing hinting-based methods for measuring partial progress and estimating gaps in the knowledge/skills required to cross thresholds (and tracking these over time).
Before AISI, I was chief scientist at a startup in Cambridge, where I led a team of 25 researchers with a mission to optimize decision-making in electricity grids, improving economic efficiency and reducing emissions.
I have a PhD in physics from the University of Waterloo and Perimeter Institute for Theoretical Physics. My focus was on reconstructing quantum theory from simple first principles, so we can all stop worrying about the reality of the wave-function.