Ke Zhang

PhD Student in Mathematics • University of California, Riverside

GitHub | Google Scholar | LinkedIn | CV (PDF)

I am a PhD student in Mathematics at UC Riverside. My research centers on large language models, evaluation, and AI for science: I design benchmarks and tool-augmented agents that measure what LLMs can actually do, from faithful Lean 4 autoformalization to driving scientific simulators and repairing optimization models.

Research

Beyond Compilation: Evaluating Faithful Natural-Language-to-Lean Statement Formalization
Ke Zhang, Patricio Gallardo, Sudhir Murthy, Yi Xie, Wang Zhi, Maziar Raissi.
Under Review. Tool-augmented agent that translates natural-language mathematics into Lean 4 code, with symbol retrieval, expert drafting, and REPL feedback in a closed loop; a factorial analysis of how each tool causally affects compilation and semantic faithfulness on a 400-theorem graduate-level benchmark. Previous work: [2604.16538]
PHREEQC-MCQ-200: A Diagnostic Benchmark for Tool-Augmented Scientific Simulator Agents
Ke Zhang, Sahchit Chundur, Mohammad Javad Abdolhosseini Qomi, Maziar Raissi.
Under Review. LLM agents equipped with the PHREEQC geochemistry simulator as an external tool, evaluated on a new 200-question benchmark; tool access lifts top-model accuracy by ~40 points but is not purely additive. [Poster]
Learning Parameterized Nonlinear Elasticity on Curved Surfaces
Yankang Liu, Ke Zhang, Maziar Raissi, Roya Zandi.
AI&PDE Workshop @ ICLR 2026 — Poster. A parameter-conditioned physics-informed neural network that represents a continuous family of nonlinear elastic equilibria on curved manifolds from a single trained model. I led the training pipeline and ablation studies. [arXiv]
Agentic Repair of Gurobi Optimization Models via Tool Use: A Fractional-Factorial Study of Knowledge, Diagnostics, and Execution
Ke Zhang, Maziar Raissi.
AAAI 2025 Workshop (AI4Research) — Accepted. Benchmark of 26 Gurobi problems with 260 unit tests, an LLM-based bug generator, and a RAG-grounded repair agent over the Gurobi Python API, analyzed via fractional factorial design. [Project page]
Agentic Lean Auformalization (ALA) v1: An LLM collaborative approach to autoformalization in LEAN
Patricio Gallardo, Maziar Raissi, Ke Zhang, Sudhir Murthy.
NeurIPS 2025 LLM Evaluation Workshop — Poster. Earlier version of the Lean autoformalization system. [Poster]

About Me

Outside research, I climb mountains — Mt. Siguniang Erfeng (5,276 m / 17,309 ft) in Sichuan, and Mt. San Jacinto via Devil's Slide Trail (16 mi) — and I do my own car maintenance, down to the oil, brake pads, and rotors.

Contact

Email: kzhan153@ucr.edu