We are excited to supervise projects that fall within the following two categories:
For 1., we are particularly interested in:
For 2., we are especially interested in:
There is growing evidence that language models exhibit degrees of situational awareness [1](https://arxiv.org/abs/2309.00667), [2], i.e., they encode or verbalize information consistent with their actual operating context, ranging from (i) distinguishing training vs. testing vs. deployment (see, e.g., [3, 4] and frontier models' system cards) to (ii) recognizing their own outputs [5] or exhibiting introspective access to their activations [6, 7].
We are especially interested in supervising projects that empirically investigate the science behind these phenomena.
Examples of projects about evaluation awareness (for representative projects see, e.g., [8, 9, 10]):
Examples of projects about introspection (for representative projects see, e.g., [5, 6, 7, 11]), including investigating whether introspection can be helpful or harmful for safety:
LawZero is building the Scientist AI (SAI), a system based on the intuitions that it is possible to disentangle understanding from agency [12], and that oracles can be used as guardrails for agents [13].
Two components of the SAI will be (i) a "truthifier", which decomposes a corpus of text into statements with attribution sources [14], and (ii) an estimator of a predictor's uncertainty.
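To make component (i) more concrete, here is a minimal, purely illustrative sketch of what truthified output could look like; the class, field names, and function signature are our own assumptions for illustration, not a specification of the actual pipeline in [14]:

```python
from dataclasses import dataclass, field

@dataclass
class AttributedStatement:
    """One atomic statement extracted from a corpus, plus its attribution.
    Hypothetical schema, for illustration only."""
    text: str                                             # e.g. "Water boils at 100 °C at sea level."
    source_ids: list[str] = field(default_factory=list)   # documents/spans the statement is attributed to
    attribution_confidence: float = 0.0                    # pipeline's confidence in the attribution

def truthify(corpus: list[str]) -> list[AttributedStatement]:
    """Placeholder for a truthifier: decompose raw text into attributed
    statements. How to do this well is part of the research question."""
    raise NotImplementedError
```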
We are especially interested in projects that use amortized inference methods (e.g., GFlowNets [15, 16, 17]) to approximate posteriors over latent variables, such as (i) the sources or (ii) the predictors behind an autoregressive model, so that predictive uncertainty can be estimated from learned distributions rather than from single point estimates.
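As a rough sketch of what estimating uncertainty from a learned distribution (rather than a point estimate) buys: given an approximate posterior over latent predictors, the predictive entropy decomposes into an aleatoric and an epistemic part. The toy arrays below stand in for an amortized posterior (e.g., one trained with a GFlowNet) and per-latent predictive distributions; they are our own placeholders, not part of the Scientist AI codebase.

```python
import numpy as np

rng = np.random.default_rng(0)
n_latents, n_classes = 8, 4

# q(z | x): amortized posterior weights over latent predictors (toy values).
posterior = rng.dirichlet(np.ones(n_latents))

# p(y | x, z): predictive distribution of each latent predictor (toy values).
per_latent_preds = rng.dirichlet(np.ones(n_classes), size=n_latents)

# Posterior predictive: p(y | x) = sum_z q(z | x) p(y | x, z).
predictive = posterior @ per_latent_preds

def entropy(p, axis=-1):
    """Shannon entropy in nats."""
    return -np.sum(p * np.log(np.clip(p, 1e-12, None)), axis=axis)

total = entropy(predictive)                               # total uncertainty
aleatoric = np.sum(posterior * entropy(per_latent_preds))  # expected per-latent entropy
epistemic = total - aleatoric                              # mutual information between y and z

print(f"total={total:.3f}  aleatoric={aleatoric:.3f}  epistemic={epistemic:.3f}")
```

The epistemic term is exactly what a single point-estimate predictor cannot expose: it is nonzero only when the latent predictors disagree.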
For projects related to truthification, we would like to investigate whether the truthification pipeline makes it possible to learn better world models (of the form of, e.g., [18]) in the presence of unreliable agent-generated data.
Damiano is a research scientist at LawZero, where he works on (i) the maths behind the Scientist AI, (ii) model organisms to study elicitation, and (iii) interpretability and evaluation techniques for situational awareness and introspection.
Jean-Pierre is a machine learning research scientist at LawZero, focused on designing model-based AI systems with quantitative safety guarantees. His primary interests are in probabilistic inference in graphical models, and he draws inspiration from his multidisciplinary background in neurology and neuroscience, which informs his understanding of human cognition. Jean-Pierre studied at McGill University, obtaining a medical degree in 2017, completing a neurology residency in 2022, and earning a master's degree in neuroscience in 2023. During his master’s, he developed causal machine learning methods for precision medicine. Concurrently with his work at LawZero, Jean-Pierre is completing a PhD in computer science at Mila and Université de Montréal, supervised by Yoshua Bengio. In addition to contributing to the foundations of guaranteed-safe AI, Jean-Pierre is passionate about translating advances in AI into clinically meaningful, safety-critical applications.
Marc-Antoine is a Research Scientist at LawZero. His main areas of expertise are NLP and applied ML, which he is currently applying to AI safety projects.
His research areas include interpretability and evaluation.
Oli(ver) is a computer scientist (a staff member at LawZero and a postdoc under Yoshua Bengio) with unusually broad scientific and mathematical expertise.
He is a sucker for pretty demos and grand unifying theories, sometimes at the cost of losing sight of what is practical. Over the last few years (i.e., during his PhD at Cornell), Oli has discovered a beautiful theory describing how a great deal of artificial intelligence, classical and modern, can be fruitfully understood as resolving a natural information-theoretic measure of epistemic inconsistency. Many questions remain unanswered, but the hope is that this clearer view can lead to powerful generalist AI systems that are safer because they do not meaningfully have goals or desires.
Pierre-Luc St-Charles is a researcher and developer specializing in applied machine learning, with over a decade of experience across different non-profit institutes. He has held research roles at the Computer Research Institute of Montréal and senior research roles at Mila, collaborating with industrial partners and multidisciplinary academic teams on innovative projects in natural resources, transportation, digital media, document intelligence, and earth observation. Pierre-Luc earned his PhD in Computer Vision from Polytechnique Montréal in 2018, receiving the departmental Best Thesis Award. Since 2024, he has been part of LawZero, a Mila-incubated organization focused on developing safe AI technologies. He is currently focused on building benchmarks and evaluation methodologies for frontier AI systems.
Yoshua Bengio is Full Professor of Computer Science at Université de Montréal, Co-President and Scientific Director of LawZero, as well as the Founder and Scientific Advisor of Mila. He also holds a Canada CIFAR AI Chair. Considered one of the world's leaders in Artificial Intelligence and Deep Learning, he is the recipient of the 2018 A.M. Turing Award, often called the "Nobel Prize of computing." He is the most-cited computer scientist worldwide, and the most-cited living scientist across all fields (by total citations).
Professor Bengio is a Fellow of both the Royal Society of London and the Royal Society of Canada, an Officer of the Order of Canada, a Knight of the Legion of Honor of France, a member of the UN's Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology, and the Chair of the International AI Safety Report.
Mentors' vacation plans will be communicated as soon as possible, both to MATS and to the mentees. Should the primary mentor go on vacation during MATS (e.g., two weeks at the end of August), it will be the responsibility of the mentor to (i) provide the minimal amount of mentorship required by the guidelines, (ii) ensure that the secondary mentor can act as primary for the duration of the vacation, and (iii) ensure that the tasks for the project are well scoped and doable.
See the papers linked earlier.
Essential knowledge:
Essential experience:
Desired experience:
Bonus:
The MATS Research phase provides scholars with a community of peers.
During the Research phase, scholars work out of a shared office, have shared housing, and are supported by a full-time Community Manager.
Working in a community of independent researchers gives scholars easy access to future collaborators, a deeper understanding of other alignment agendas, and a social network in the alignment community.
Previous MATS cohorts included regular lightning talks, scholar-led study groups on mechanistic interpretability and linear algebra, and hackathons. Other impromptu office events included group-jailbreaking Bing chat and exchanging hundreds of anonymous compliment notes. Scholars organized social activities outside of work, including road trips to Yosemite, visits to San Francisco, and joining ACX meetups.