Technical work: Making safeguards 'run deep', including safeguards and risk management for open-weight models.
Governance work: Critical review of industry self-governance, critical review of national AI governance institutes, open-weight model governance, predicting and mitigating future AI incidents.
Stephen (“Cas”) Casper is a final-year Ph.D. student at MIT in the Algorithmic Alignment Group, advised by Dylan Hadfield-Menell. His work focuses on AI safeguards and technical governance. His research has been featured at NeurIPS, AAAI, Nature, FAccT, EMNLP, SaTML, TMLR, IRAIS, several course curricula, a number of workshops, and over 20 news articles and newsletters. He is also a writer for the International AI Safety Report and the Singapore Consensus. In addition to MATS, he mentors for ERA and GovAI. In the past, he has worked closely with over 30 mentees on various safety-related research projects.
2-3 meetings per week plus regular messaging and collaborative writing.
Here are some examples of papers related to safeguards and technical AI governance. If you are interested in any of them, you might be interested in this stream:
Green flags include:
This stream will follow an academic collaboration model. Scholars will be free to discuss and collaborate externally. However, scholars should also expect to work in collaboration with others in the stream.
Mentor(s) will talk through project ideas with the scholar.
The MATS Research phase provides scholars with a community of peers.
During the Research phase, scholars work out of a shared office, have shared housing, and are supported by a full-time Community Manager.
Working in a community of independent researchers gives scholars easy access to future collaborators, a deeper understanding of other alignment agendas, and a social network in the alignment community.
Previous MATS cohorts included regular lightning talks, scholar-led study groups on mechanistic interpretability and linear algebra, and hackathons. Other impromptu office events included group-jailbreaking Bing Chat and exchanging hundreds of anonymous compliment notes. Scholars also organized social activities outside of work, including road trips to Yosemite, visits to San Francisco, and joining ACX meetups.