This stream focuses on building realistic defensive cybersecurity benchmarks using data from Asymmetric Security's work on real-world incidents.
Existing cybersecurity benchmarks rarely test how models behave in realistic security scenarios. Building realistic benchmarks is especially difficult in cybersecurity because most of the relevant data is private.
Asymmetric Security responds to real cyber incidents and therefore holds data that is not publicly available. We would like to work with MATS scholars to build realistic benchmarks grounded in these incidents.
One-hour weekly meetings by default for high-level guidance. We will respond to async communication within a day.
Essential:
Preferred:
Scholars can collaborate with other MATS scholars and can find collaborators on their own. Asymmetric Security staff may also engage deeply.
We will assign the project direction; scholars will have significant tactical freedom.