Agentic Property-Based Testing: Finding Bugs Across the Python Ecosystem

MATS Fellow:

Muhammad Maaz

Authors:

Muhammad Maaz, Liam DeVoe, Zac Hatfield-Dodds, Nicholas Carlini

Citations

Abstract:

We developed an agent that efficiently identifies bugs in large software projects. The agent infers general properties that the code should satisfy and then checks them with property-based testing, a technique similar to fuzz testing. Using this approach, we discovered bugs in top Python packages such as NumPy, SciPy, and Pandas. After extensive manual validation, we are reporting these bugs to the developers, and several have already been patched.

For more information, read the full paper, take a look at the GitHub repository, or browse the bugs we found at our site.
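
To make the idea of property-based testing concrete, here is a minimal sketch using only the Python standard library. It is not the agent's implementation: the function `buggy_dedupe`, its planted bug, and the helper `find_counterexample` are hypothetical names invented for illustration. In practice a library such as Hypothesis would handle input generation and shrinking of failing examples.

```python
import random


def buggy_dedupe(items):
    """Hypothetical function under test: remove duplicates, keeping order.

    Planted bug (for illustration): only *adjacent* duplicates are removed.
    """
    out = []
    for x in items:
        if not out or out[-1] != x:
            out.append(x)
    return out


def prop_no_duplicates(items):
    """Property: the result of deduplication contains no repeated element."""
    result = buggy_dedupe(items)
    return len(result) == len(set(result))


def find_counterexample(prop, trials=1000, seed=0):
    """Check `prop` on many random small integer lists.

    Returns the first input that violates the property, or None if all pass.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(0, 3) for _ in range(rng.randint(0, 6))]
        if not prop(items):
            return items
    return None


counterexample = find_counterexample(prop_no_duplicates)
print("counterexample:", counterexample)
```

Rather than asserting the output for one hand-picked input, the test states a rule that must hold for every input and lets random generation search for a violation. Any list with a non-adjacent repeat, such as `[0, 1, 0]`, exposes the planted bug.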
