
OpenAI
—
AI Alignment Research Engineer
Focus
Scalable Oversight, Control, Monitoring, Interpretability, Adversarial Robustness, Red-Teaming, Alignment Training, Security, Scheming and Deception, Multi-Agent Safety
Stream
OpenAI Safety Team
Juan is a researcher on OpenAI’s Safety Systems team. He is broadly interested in mitigating catastrophic risks, and he works on adversarial robustness training and automated red-teaming (recent work: https://openai.com/index/instruction-hierarchy-challenge/).