Falsifying Sparse Autoencoder Reasoning Features in Language Models
Published in ICML, 2026
Sparsity-biased SAEs tend to latch onto low-dimensional cue tokens that co-occur with reasoning; we find that most contrastive “reasoning features” are largely explained by such cues rather than by robust reasoning signals.
Recommended citation: George Ma, Zhongyuan Liang, Irene Y. Chen, Somayeh Sojoudi (2026). Falsifying Sparse Autoencoder Reasoning Features in Language Models. In Forty-Third International Conference on Machine Learning. https://openreview.net/forum?id=TCFtA9CI3U
