Christopher Choquette Choo

OpenAI

Research Scientist

Focus

Scalable Oversight, Control, Monitoring, Interpretability, Adversarial Robustness, Red-Teaming, Alignment Training, Security, Scheming and Deception, Multi-Agent Safety

My focus these days is on adversarial machine learning: the safety, security, and alignment of frontier models. I am particularly interested in alignment/safety RL and evaluations. In the past, I studied memorization, privacy, and security harms in language modelling, including auditing for these risks and mitigating them. I've also worked on differentially private (DP) training algorithms, unlearning, collaborative learning approaches, and methods for ownership verification.