Stephen Casper (Cas)

I (Cas) work on a range of projects, from technical safeguards to technical governance. This stream follows an academic collaboration model, and work will likely focus on technical topics in AI governance.

Stream overview

Technical work: Making safeguards 'run deep', including safeguards and risk management for open-weight models. 

Governance work: Critical review of industry self-governance, critical review of national AI governance institutes, open-weight model governance, predicting and mitigating future AI incidents. 

Mentors

Stephen Casper (Cas)
PhD Candidate, MIT CSAIL
Boston
Adversarial Robustness, Policy & Governance, Red-Teaming, Safeguards

Stephen (“Cas”) Casper is a final-year Ph.D. student at MIT in the Algorithmic Alignment Group, advised by Dylan Hadfield-Menell. His work focuses on AI safeguards and technical governance. His research has been featured at venues including NeurIPS, AAAI, Nature, FAccT, EMNLP, SaTML, TMLR, and IRAIS, as well as in several course curricula, a number of workshops, and over 20 news articles and newsletters. He is also a writer for the International AI Safety Report and the Singapore Consensus. In addition to MATS, he mentors for ERA and GovAI. He has worked closely with over 30 mentees on various safety-related research projects.

Mentorship style

2-3 meetings per week plus regular messaging and collaborative writing. 

Scholars we are looking for

Green flags include:

  • Research experience and taste
  • Demonstrated initiative in ideating and leading prior projects
  • Demonstrated ability to 'succeed even when not set up to succeed' in past research/projects
  • A critical mindset about projects and impact
  • Interest in academia

This stream will follow an academic collaboration model. Scholars will be free to discuss and collaborate externally, but they should also expect to collaborate with others in the stream.

Project selection

Mentor(s) will talk through project ideas with scholars.

Community at MATS

The MATS Research phase provides scholars with a community of peers.

During the Research phase, scholars work out of a shared office, have shared housing, and are supported by a full-time Community Manager.

Working in a community of independent researchers gives scholars easy access to future collaborators, a deeper understanding of other alignment agendas, and a social network in the alignment community.

Previous MATS cohorts included regular lightning talks, scholar-led study groups on mechanistic interpretability and linear algebra, and hackathons. Other impromptu office events included group-jailbreaking Bing chat and exchanging hundreds of anonymous compliment notes. Scholars organized social activities outside of work, including road trips to Yosemite, visits to San Francisco, and joining ACX meetups.