
ARC
—
Researcher
Focus: Interpretability
H-index: 4
Wilson Wu is a researcher at the Alignment Research Center (ARC), which is developing a systematic, theoretically grounded approach to mechanistic interpretability. He previously worked on alternative approaches to interpretability, including compact proofs and applications of singular learning theory.