Gabriel Kulp

In this project, we will explore GPU side-channel attacks that extract information about model usage. A simple example is to observe (via radio emissions, power fluctuations, acoustics, etc.) which experts were used in each forward pass of a mixture-of-experts (MoE) model, then use those observations to guess which tokens were produced.
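To make the expert-observation example concrete, here is a minimal sketch under strong simplifying assumptions. The routing table, token set, and expert indices are all hypothetical, and the sketch pretends that each token always activates the same pair of experts in one layer. In a real MoE model, routing depends on the hidden state and therefore on context, so an actual attack would need a statistical model rather than a lookup table.

```python
from collections import defaultdict

# Hypothetical routing table: token -> experts activated in one MoE layer.
# Assumption for illustration only: real routing depends on the hidden
# state (and so on context), not just on the token itself.
ROUTING = {
    "the": (0, 3),
    "cat": (1, 5),
    "dog": (1, 5),
    "sat": (2, 7),
    "mat": (4, 6),
}

def build_inverse(routing):
    """Map each observed expert set to the tokens that could have caused it."""
    inverse = defaultdict(list)
    for token, experts in routing.items():
        inverse[frozenset(experts)].append(token)
    return {experts: sorted(tokens) for experts, tokens in inverse.items()}

def guess_tokens(observations, inverse):
    """Return the candidate tokens for each observed set of active experts."""
    return [inverse.get(frozenset(obs), []) for obs in observations]

INVERSE = build_inverse(ROUTING)

# Pretend the side channel revealed these expert sets, one per forward pass.
observed = [(3, 0), (5, 1), (7, 2)]
candidates = guess_tokens(observed, INVERSE)
print(candidates)  # [['the'], ['cat', 'dog'], ['sat']]
```

Note that "cat" and "dog" share an expert set in this toy table, so the observation narrows the guess without determining it. Observing several layers, or combining the candidates with a language-model prior, is one plausible way to narrow it further.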

Stream overview

As a team, we will decide which projects to pursue based on individual interest and skills. Broadly, we want to demonstrate information leaving a GPU in unexpected or surprising ways, especially to steal prompt tokens, response tokens, or model weights. Additionally, we are interested in training a model to induce leakage, thus turning the side-channel into a covert channel. We are interested in standard tech stacks and in hardened tech stacks, and scholars more interested in defense will have the freedom to research hardware and software countermeasures.

The experimental setup is a frontier data center GPU with various sensors attached, including an oscilloscope on the power supply and an electromagnetic probe near the GPU die. These and other sensors are software-accessible from the same Jupyter notebook that runs inference and training on the GPU, making it easy to correlate sensor readings with code execution, performance counters, and built-in sensors (such as those reported by nvidia-smi).
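As a rough illustration of correlating sensor readings with code execution, the sketch below splits a timestamped trace into samples taken while a piece of code ran and samples taken while the GPU was idle, then compares the means. The trace is synthetic and every number in it is made up. The function name and the millisecond timestamp format are assumptions for this example, not the testbench's actual interface.

```python
from statistics import mean

def split_by_window(samples, start_ms, end_ms):
    """Split (timestamp_ms, value) samples into those inside and outside a window."""
    inside = [v for t, v in samples if start_ms <= t < end_ms]
    outside = [v for t, v in samples if not (start_ms <= t < end_ms)]
    return inside, outside

# Synthetic stand-in for a sensor trace: one power reading (watts) per millisecond.
# Made-up values: 60 W when idle, 250 W while a hypothetical kernel runs.
KERNEL_START_MS, KERNEL_END_MS = 30, 70
trace = [
    (t, 250.0 if KERNEL_START_MS <= t < KERNEL_END_MS else 60.0)
    for t in range(100)
]

# In a real notebook, the window would come from timestamps recorded just
# before and after the GPU work, on a clock shared with the sensor.
busy, idle = split_by_window(trace, KERNEL_START_MS, KERNEL_END_MS)
delta_w = mean(busy) - mean(idle)
print(f"busy={mean(busy):.1f} W idle={mean(idle):.1f} W delta={delta_w:.1f} W")
# busy=250.0 W idle=60.0 W delta=190.0 W
```

The same windowing idea extends to comparing two different inputs to the same kernel, which is closer to the input-dependent leakage this stream is interested in.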

Mentors

Gabriel Kulp

RAND

Adjunct Staff

Washington, D.C.

Compute and Hardware

Security

Gabriel works with RAND on hands-on projects to build and test prototypes of secure compute infrastructure. He focuses on how to secure the most sensitive AI data centers against the most sophisticated current and future threats. Gabriel has also worked on hardware-enabled governance mechanisms (HEMs, at the intersection of GPU export control and hardware security) and on technical verification of agreements on the development and use of AI systems. He holds a master's degree in computer science and is pursuing a PhD in AI.

My two scholars will work together and with non-scholars on the team, including direct hires and mentees from other programs. I don't yet know exactly who will be on the team when MATS begins.

This project is supported by a new spin-out nonprofit that works closely with RAND on more hands-on, physical projects for which RAND procurement is not a good fit. From the scholars' perspective, I don't expect this detail to matter.

Mentorship style

Co-working 2-4 hours per week, including detailed guidance. Flexible. Weekly one-hour check-ins. You can schedule ad hoc calls if you are stuck or want to brainstorm.

Representative papers

Gregersen et al., Input-Dependent Power Usage in GPUs
A blog post exploring the sensitivity of GPU power draw to the content of matrices during multiplication
A rundown of common methods in hardware security
A paper on leaking model weights from GPUs
A paper on leaking information from existing on-chip sensors
Another paper on leaking model weights
A paper on recovering tokens from LLM side channels

Scholars we are looking for

Please note: experience with hardware is not a requirement for this stream, as long as you are willing to work hard and learn fast, and can show other evidence of exceptional ability. If in doubt: we encourage you to apply!

We will provide you with a lot of autonomy and plug-and-play access to a rare combination of tools and equipment. In exchange, we expect you to have strong self-direction, intellectual ambition, and a lot of curiosity. This stream requires you to maintain a tight experiment loop so you can form and test hypotheses on the fly.

Example skill profiles:

Machine learning: familiarity with LLM architecture, especially mixture-of-experts. Building classification, regression, and generative models for unusual data types.
Math: statistical tests of correlation and mutual information.
Electrical engineering: signal processing, transmission lines, antennas and radio, switching power supplies.
Hardware: GPU architecture (memory and compute), PCIe traffic, matrix multiplication circuits, dynamic frequency scaling.

Must have: Has trained or fine-tuned a transformer language model in PyTorch (toy models and following guides are fine). Is familiar with basic electronics concepts (voltage, current, transistors). Has experience writing research papers, even as a class assignment.

Nice to have: Familiarity with LaTeX, PyTorch internals, CUDA/OpenCL, GPU architecture, chip design, oscilloscopes, signal processing, electrical engineering.

You will collaborate with other scholars and researchers pursuing similar topics. You may find other collaborators, but they will need to be individually approved for access to the remote research testbenches.

Project selection

There is a cluster of potential projects to choose from. As a team, we will decide which to pursue based on individual interest and skills. Mentors will pitch example projects and scholars can then modify and re-pitch them. Once the research problem, hypothesis, and testing plan are written and agreed on, scholars begin object-level work. We encourage failing fast and jumping to a fallback project.