
OpenAI
—
Research Scientist (Interpretability)
Focus
Scalable Oversight, Control, Monitoring, Interpretability, Adversarial Robustness, Red-Teaming, Alignment Training, Security, Scheming and Deception, Multi-Agent Safety
Stream
OpenAI Safety Team
Tom is a research scientist at OpenAI, working on the interpretability of language models for AI safety. He was also a core developer of scikit-learn from 2015 to 2022.