Robert Kirk

UK AISI

Research Scientist

Focus

Monitoring, Adversarial Robustness, Control, Model Organisms, Red-Teaming, Dangerous Capability Evals, Safeguards

H-index

12

Robert is a research scientist and the acting lead of the alignment red-teaming sub-team at UK AISI. The team stress-tests model alignment to detect and understand model propensities relevant to loss-of-control risks. Before that, he worked on misuse research, focusing on evaluations of safeguards against misuse and on mitigations for misuse risk, particularly in open-weight systems. He completed his PhD at University College London, on generalisation in LLM fine-tuning and RL agents, in January 2025.