Constantin Weisser

Deepgram

As part of MATS 6.0, supervised by CHAI’s Micah Carroll, Constantin and his collaborators demonstrated that targeted manipulation and deception emerge in LLMs trained on user feedback rather than annotator feedback. His MATS stream’s paper was accepted as an oral contribution at the SATA workshop and as a spotlight at the SoLaR workshop, both at NeurIPS.

Constantin's experience at MATS led to his role as the first technical staff member at Haize Labs. There, he focuses on automated LLM red teaming and jailbreak protection, supporting labs such as Anthropic, OpenAI, and AI21.

Prior to MATS, Constantin completed a PhD applying machine learning to particle physics and worked as a machine learning consultant at McKinsey.

The Summer 2024 cohort marked a significant expansion, supporting approximately 90 scholars with 40 mentors, the broadest mentor selection in MATS history. During this cohort, MATS was incorporated as a 501(c)(3) nonprofit organization, formalizing its institutional structure. The program expanded its research portfolio to include at least four governance mentors alongside technical research streams, reflecting growing interest in AI policy and technical governance work. The 10-week research phase continued in Berkeley, with scholars conducting work across mechanistic interpretability, evaluations, scalable oversight, and governance research.

Notable outputs from this cohort include research on targeted manipulation and deception in LLMs trained on user feedback, which was accepted to NeurIPS workshops, and contributions to an AI safety via debate paper that won best paper at ICML 2024. One scholar co-founded Decode Research, a new AI safety organization focused on building interpretability tools.