My name is George Ma (Jiangyan Ma). I am an EECS PhD student at UC Berkeley, advised by Prof. Somayeh Sojoudi. Previously, I was an undergraduate student at Peking University, where I did research on graph learning in Prof. Yisen Wang’s lab. Email: george_ma@berkeley.edu. Homepage: George Ma’s Homepage. I also blog actively on Zhihu: George M’s Zhihu Homepage.
Dec 2025 – Jan 2026; Evaluating Reasoning Features in Sparse Autoencoders
Studied whether sparse autoencoders (SAEs) isolate genuine reasoning features in large language models. Developed a falsification-based evaluation framework combining causal token injection, LLM-guided counterexample generation, and steering experiments. Conducted large-scale analysis across multiple models, layers, and reasoning datasets, finding that features identified by contrastive methods are predominantly explained by linguistic confounds rather than reasoning computations.
May 2025 – Sep 2025; SpecAgent for Code Completion
Interned at Amazon and proposed SpecAgent, an indexing-time retrieval agent that anticipates future edits in code repositories to reduce inference-time latency. Designed a benchmark that prevents future context from leaking into evaluation, giving a more realistic measure of completion quality. Experiments demonstrated 9–11% absolute improvements in code completion performance. The work was accepted to the ACL 2026 main conference.
Nov 2024 – May 2025; Revising SAE Feature Explanations
Investigated mechanistic interpretability of LLMs using sparse autoencoders (SAEs), which disentangle hidden representations into interpretable features. Proposed structured explanations, a tree-based explainer, and hard-negative sampling to address biases in current methods. The resulting paper was published at NeurIPS 2025.
Dec 2024 – Feb 2025; Normalization Layers and Side-Channel Communication
Studied the role of normalization layers in CNNs and discovered that they enable long-range spatial communication beyond local receptive fields. Analyzed this effect in a toy localization task, showing normalization layers act as iterative message-passing mechanisms. The work highlights risks in applications requiring spatial locality.
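The mechanism can be made concrete with a minimal NumPy sketch (an illustration under simplifying assumptions, not the paper's experiments): normalizing a feature map by its spatial mean and variance lets a perturbation at one location shift the output at every other location, even with no convolutional path connecting them.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # Normalize a single-channel feature map by its spatial statistics,
    # as normalization layers do over the (H, W) dimensions.
    return (x - x.mean()) / np.sqrt(x.var() + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8))
y = instance_norm(x)

# Perturb one "pixel" far away from the location we observe.
x2 = x.copy()
x2[7, 7] += 10.0
y2 = instance_norm(x2)

# The output at (0, 0) changes even though nothing local connects it to
# (7, 7): the shared mean/variance act as a global side channel.
assert abs(y2[0, 0] - y[0, 0]) > 1e-3
```

Any architecture that assumes strictly local receptive fields inherits this hidden global coupling, which is exactly the locality risk the project highlights.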
Jun 2023 – May 2024; Canonicalization for Invariant & Equivariant Learning
Collaborated with MIT researchers to introduce a canonicalization framework that unifies invariant and equivariant learning. This framework resolved an open problem on the expressiveness of invariant networks with equivariance constraints. We also designed new canonicalization algorithms for eigenvectors, leading to a publication at NeurIPS 2024.
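The core idea of canonicalization can be sketched in a few lines of NumPy (a toy illustration of the general principle, not the paper's construction): composing any function with a map that sends every symmetric variant of an input to one canonical representative makes the composition invariant, here for row permutations of a point set.

```python
import numpy as np

def canonicalize(points):
    # Sort rows lexicographically: every permutation of the rows maps to
    # the same canonical ordering.
    order = np.lexsort(points.T[::-1])
    return points[order]

def f(points):
    # An arbitrary order-sensitive stand-in for a network: a weighted sum
    # whose weights depend on row position.
    w = np.arange(1, len(points) + 1)[:, None]
    return float((w * points).sum())

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
perm = np.array([2, 0, 4, 1, 3])  # a fixed non-identity permutation

# f alone is order-sensitive, but f ∘ canonicalize is permutation-invariant.
assert not np.isclose(f(x), f(x[perm]))
assert np.isclose(f(canonicalize(x)), f(canonicalize(x[perm])))
```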
Jun 2023 – Sep 2023; Baking Symmetry into GFlowNets
Interned in Prof. Yoshua Bengio’s lab (Mila) and applied my background in invariant networks to handle symmetric actions in GFlowNets. Developed methods to incorporate these symmetries into the generation process, improving both sample diversity and reward. The paper was presented as an oral at the NeurIPS 2023 AI4Science workshop.
Oct 2022 – Aug 2023; Laplacian Canonization for GNNs
Explored Laplacian eigenvectors as universal graph positional encodings, which suffer from sign and basis ambiguity. Proposed Laplacian Canonization, a preprocessing algorithm that resolves these ambiguities. The work was published as a poster at NeurIPS 2023.
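The sign ambiguity alone can be illustrated with a short NumPy sketch (a simplified stand-in for the paper's algorithm, which also treats basis ambiguity): an eigensolver may freely return v or -v, so a deterministic sign rule is needed before eigenvectors can serve as positional encodings.

```python
import numpy as np

def sign_canonicalize(eigvecs):
    # Fix each eigenvector's sign so that its largest-magnitude entry
    # (first one, on ties) is positive; v and -v then map to the same
    # canonical vector.
    canon = eigvecs.copy()
    for i in range(canon.shape[1]):
        v = canon[:, i]
        j = np.argmax(np.abs(v))
        if v[j] < 0:
            canon[:, i] = -v
    return canon

# Laplacian of a small path graph (simple spectrum, so only signs vary).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
_, V = np.linalg.eigh(L)

# Flipping columns simulates the solver's arbitrary choice; canonization
# undoes it.
flipped = V * np.array([1.0, -1.0, 1.0, -1.0])
assert np.allclose(sign_canonicalize(V), sign_canonicalize(flipped))
```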