
Google DeepMind — Research Scientist
Focus: Control, Monitoring, Safeguards, Dangerous Capability Evals, Scheming and Deception
H-index: 16
David Lindner
David Lindner is a Research Scientist on Google DeepMind's AGI Safety and Alignment team, where he works on evaluations and mitigations for deceptive alignment and scheming. His recent work includes MONA, a method for reducing multi-turn reward hacking during RL; designing evaluations for stealth and situational awareness; and helping develop GDM's approach to deceptive alignment. Currently, David is interested in studying mitigations for scheming, including chain-of-thought (CoT) monitoring and AI control. You can find more details on his website.