
Model Psych and Biology
Wes participated in the MATS 3.0 cohort under the supervision of Neel Nanda, where he worked on methods for extracting concepts out of superposition. After MATS, he finished his PhD at MIT, working on a broad range of interpretability topics with Dimitris Bertsimas, Max Tegmark, and MATS scholars from subsequent cohorts. After grad school, he joined the interpretability team at Anthropic, where he currently works on methods to extract circuits from frontier LLMs and researches various aspects of model biology.
The Winter 2022-23 cohort supported 58 scholars with 17 mentors, including researchers from Anthropic, MIRI, ARC, Redwood Research, and other leading organizations. This cohort introduced the Scholar Support team to provide research coaching and unblocking assistance to scholars throughout the program. The program ran for 6 weeks online, followed by 2 months in person in Berkeley, and featured scholar-led activities including study groups on mechanistic interpretability and linear algebra, weekly lightning talks, and workshops on research tools and technical writing.
Notable alumni from this cohort include Marius Hobbhahn, who founded Apollo Research and published work on mechanistic interpretability; Asa, who co-authored papers on measuring situational awareness and the "reversal curse" in large language models; and Jesse Hoogland, who founded Timaeus and developed the developmental interpretability research agenda.
Wes Gurnee