Evan Hubinger

Anthropic

—

Head of Alignment Stress-Testing

Evan Hubinger is a research scientist at Anthropic, where he leads the Alignment Stress-Testing team. Before joining Anthropic, Evan was a research fellow at the Machine Intelligence Research Institute. His alignment research spans both empirical work, such as “Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training,” and theoretical work, such as “Risks from Learned Optimization in Advanced Machine Learning Systems.”