The stream will focus on conceptual, empirical, and theoretical work on scalable oversight and control. This includes but is not limited to creating model organisms for specific failure modes, designing training procedures against them, and making progress on subproblems involved in safety cases.
Shi Feng leads a research group working on oversight and control. He is an assistant professor at George Washington University. Prior to that, he was a postdoc in the NYU Alignment Research Group under Sam Bowman. He currently focuses on deception and collusion, with an emphasis on propensity and evaluation realism.
Scholars will collaborate with members of the group but are also free to find new collaborators.
A research agenda document with a short list of project ideas will be shared ahead of time. Scholars can also brainstorm and pitch their own ideas, provided they align with the research agenda. We will decide on assignments in week 2.
The MATS Research phase provides scholars with a community of peers.
During the Research phase, scholars work out of a shared office, have shared housing, and are supported by a full-time Community Manager.
Working in a community of independent researchers gives scholars easy access to future collaborators, a deeper understanding of other alignment agendas, and a social network in the alignment community.
Previous MATS cohorts included regular lightning talks, scholar-led study groups on mechanistic interpretability and linear algebra, and hackathons. Other impromptu office events included group-jailbreaking Bing Chat and exchanging hundreds of anonymous compliment notes. Scholars also organized social activities outside of work, including road trips to Yosemite, visits to San Francisco, and outings to ACX meetups.