Tom Dupre la Tour

OpenAI

Research Scientist (Interpretability)

Focus

Scalable Oversight, Control, Monitoring, Interpretability, Adversarial Robustness, Red-Teaming, Alignment Training, Security, Scheming and Deception, Multi-Agent Safety

Tom is a research scientist at OpenAI, where he works on the interpretability of language models for AI safety. He was also a core developer of scikit-learn from 2015 to 2022.