Oliver Sourbut

Making society safe from AI doesn't just mean making safe AI: we're figuring out how to uplift human collective intelligence, manage a highly multiagent world, and improve foresight and institutional competence, ideally learning how to make the best positive use of frontier AI systems as we go. FLF has a small, sharp team of researchers with a wide network, and we're looking to nurture new and missing approaches to minimising large-scale risks while steering towards a flourishing future.

Stream overview

I'll work with scholars to determine a suitable project fit. Here are some directions:

Collective intelligence in line with FLF's work on AI for Human Reasoning: 'Human' as in individuals, small groups, societies, humanity; 'Reasoning' as in the whole decision cycle - observing, understanding, learning, deciding, acting (especially acting together in cooperative ways). The design space is huge and largely untapped - and the need is timely!

Some specific possibilities:

  • collective epistemics: what foundational protocols can we build for more reliable and repeatable information sharing and discourse? Information-focused, communicator-focused, and platform-focused interventions all show promise here. We might connect with the Community Notes team at X/Twitter, or with Wikipedians. LLMs offer new opportunities for scaled clerical labour.
  • coordination: enriching high-stakes negotiation settings, improving group deliberation at scale, connecting prosocial interests across public, tech-insider, and representative contexts, and red-teaming and improving institution and constitution design. Coordination failures are increasingly high-stakes, and we see room for ambitious improvement from the right tech. Here we might be in touch with the Cooperative AI Foundation, Collective Intelligence Project, Computational Democracy Project, or others.

Foresight, risk modelling, institutional scenario planning: adjacent to collective intelligence, improving the resolution and lead time for planners involved in x-risk reduction, national security, and other foresight activities. Here AI (and LMs in particular) offers promise for scaled, fluent foresight - but applied naively it will at best reinforce existing assumptions or give non-insights. We might connect with philanthropic orgs in our network or with UK AISI, RAND, OECD, and similar governance orgs. Can we develop robust, legible leading indicators for concerning tech or societal developments, and discover ways to intervene? How can these insights be applied directly to the governance of AI and related tech?

Legal and accountable multiagent society: are there protocols, laws, or structures we can put in place to preserve some of the collective steering mechanisms modern society rests on, through an influx of diverse autonomous AI? I don't expect such a world to be stable for long, but it might be coming soon, and it may be the staging ground for the very highest-stakes decisions facing humanity about its future. Ideally that period would be scrutable, orderly, somewhat trusting, and contained, rather than chaotic and anarchic. (Here I have the fewest concrete ideas, but some gestures and an eagerness to explore.)

Mentors

Oliver Sourbut (Oly)
Future of Life Foundation, Researcher, AI
London
Dangerous Capability Evals, Compute Infrastructure, Policy & Governance, Strategy & Forecasting

Oly works at the Future of Life Foundation on sourcing and developing ambitious ideas to build a flourishing future, grounded in realistic scenarios for AI and other technological development. Priorities include human collective intelligence uplift, gentle and manageable multiagent transitions, and defensive tech.

Oly previously worked on loss-of-control risk modelling and evaluation at the UK AI Safety/Security Institute, and he continues to engage with the OECD, UK FCDO, DSIT, and parliamentarians on AI governance.

He researched (LM) agent oversight and multiagent safety at Oxford and was one of the first beneficiaries of the MATS program in 2021-22. Before his AI safety work, he was a senior data scientist and software engineer.

Mentorship style

Willing to devote a few hours per week to this - I'll keep a 30-minute or 1-hour slot available weekly and interact on Slack roughly daily. Some closer projects might be much more interactive.

Scholars we are looking for

Depends a lot on direction. Ideally you can make proposals and dig into things somewhat independently. Be good at explaining your thinking, and be able and willing to teach me things!

For collective intelligence/human reasoning, I'd usually want someone very familiar with software production, at least skilled in software development or in product management and prototyping. Other candidates with great vision can succeed here if they're able to work with complementary talent to get things going.

For foresight, any of: polymathic/multi-STEM/futurism background, deep expertise in bio and/or AI, natsec experience or connections, unusual writer/game-dev talent, safety engineering background, or another background you think I might want to hear about.

For multiagent accountability: law, economics, politics, history, or a combination, plus some familiarity with AI and agents.

Depends a lot on focus. I've listed some groups above: FLF, CAIF, CIP, X, Wikimedia, UK AISI, RAND, ...

I absolutely welcome scholars working with other collaborators, and with other MATS participants.

Project selection

I'll ask for your interests and (if you have them) a proposal or two right away. We'll spend the first week or two iterating on that, discussing other options, and maybe trying out little experiments. We'll likely pick a direction then, but it's also fine if we pivot later.

Community at MATS

MATS Research phase provides scholars with a community of peers.

During the Research phase, scholars work out of a shared office, have shared housing, and are supported by a full-time Community Manager.

Working in a community of independent researchers gives scholars easy access to future collaborators, a deeper understanding of other alignment agendas, and a social network in the alignment community.

Previous MATS cohorts included regular lightning talks, scholar-led study groups on mechanistic interpretability and linear algebra, and hackathons. Other impromptu office events included group-jailbreaking Bing Chat and exchanging hundreds of anonymous compliment notes. Scholars organized social activities outside of work, including road trips to Yosemite, visits to San Francisco, and joining ACX meetups.