
Anthropic
—
Member of Technical Staff
Ethan Perez is a researcher at Anthropic, where he leads a team working on AI control, adversarial robustness, and other areas of AI safety research. His interests span many areas of LLM safety; he has previously led work on sleeper agents, red-teaming language models with language models, developing AI safety via debate using LLMs, and demonstrating unfaithfulness in chain-of-thought reasoning and improving its faithfulness. Read more on his website.