Some examples of projects I'd be excited about are:
These project ideas are less well-scoped by me but I'd be interested and excited if scholars had clear ideas of what to work on here:
I like to make computers do interesting things, deeply understand concepts and build interesting, useful tools. I’m currently thinking about AI alignment, control, and evaluations, and work with frontier models at METR.
Recent work I've done includes MALT, RE-Bench, and training models to fool monitors in QA settings.
I've previously worked at Stripe and CSM, and did a concurrent BSc/MSc in Computer Science at Brown.
Time commitments: I don't expect to be able to spend more than 5 hours in any given week.
Meetings: I expect to have project meetings weekly for about an hour, where we chat about your results from last week, the planned next steps, and any blockers or uncertainties. We'll also have a monthly project check-in about broader progress towards overall goals.
Help outside of meetings: I am available to provide some help most weeks outside of the meeting, but by and large I expect mentees to be self-directed and self-sufficient in solving problems.
An ideal mentee has a strong AI research background (software engineering is a plus). It's important that they are self-motivated and can make weekly progress with little intervention. If you are interested in working on the less concretely scoped projects, I would expect you to be able to write well-scoped project proposals with realistic planned milestones and deliverables. Evidence of successful past projects of this kind would be very helpful in evaluating this.
A mentee can be a PhD student and they can work on a paper that will be part of their thesis.
Can independently find collaborators, but this is not required
I will talk through project ideas with the scholar
The MATS Research phase provides scholars with a community of peers.
During the Research phase, scholars work out of a shared office, have shared housing, and are supported by a full-time Community Manager.
Working in a community of independent researchers gives scholars easy access to future collaborators, a deeper understanding of other alignment agendas, and a social network in the alignment community.
Previous MATS cohorts included regular lightning talks, scholar-led study groups on mechanistic interpretability and linear algebra, and hackathons. Other impromptu office events included group-jailbreaking Bing chat and exchanging hundreds of anonymous compliment notes. Scholars organized social activities outside of work, including road trips to Yosemite, visits to San Francisco, and joining ACX meetups.