Qi Cao

PhD Student

University of California, San Diego

Research Interests

LLMs and LLM agents
Machine Learning
Reinforcement Learning

About

I am Qi Cao, a 2nd-year Ph.D. student in the Department of Electrical and Computer Engineering at the University of California, San Diego, advised by Prof. Pengtao Xie.

Previously, I received my B.S. from the Mathematics and Physics Class of the Yingcai Honors School at the University of Electronic Science and Technology of China, where I was fortunate to be advised by Prof. Liang-jian Deng.

I am currently a research scientist intern at Meta (Summer 2026). My research focuses on automating the AI development lifecycle: enabling LLMs to build, evaluate, and control other LLMs. I work on autonomous machine learning engineering systems, reward modeling, routing, and test-time scaling and control for reasoning models and agents.

News

2026-06

Started my research scientist internship at Meta!

2026-05

AIBuildAI v2 is out!

2026-05

SCOPE is accepted by ICML 2026!

2026-03

DeepTech, ASI, and AIxiv cover our work AIBuildAI!

2026-03

AIBuildAI ranks No. 1 on MLE-bench!

2026-02

UCSD Today News covers our work DreamPRM!

2026-02

We built a project page for SCOPE.

2025-12

I will join Meta as a research scientist intern in Summer 2026!

2024-09

Started my Ph.D. at UCSD.

Selected Publications

LLMs Know When They Know, but Do Not Act on It: A Metacognitive Harness for Test-time Scaling

Qi Cao, Yufan Wang, Peijia Qin, Shuhao Zhang, Pengtao Xie

arXiv Preprint

A training-free metacognitive harness that turns LLMs' pre-solve feeling-of-knowing and post-solve judgment-of-learning signals into an explicit test-time control interface, boosting a fixed Claude Sonnet-4.6 across text, code, and multimodal benchmarks.

AIBuildAI: An AI agent that automatically builds AI models

Ruiyi Zhang, Peijia Qin, Qi Cao, Li Zhang, Pengtao Xie

arXiv Preprint

We present AIBuildAI, an AI agent that automatically builds AI models, with the goal of solving general AI tasks in an end-to-end manner.

Models Under SCOPE: Scalable and Controllable Routing via Pre-hoc Reasoning

Qi Cao, Shuhao Zhang, Ruizhe Zhou, Ruiyi Zhang, Peijia Qin, Pengtao Xie

The Forty-Third International Conference on Machine Learning (ICML)

SCOPE is a model routing framework that predicts how accurate and how expensive each model will be before running it, allowing users to control cost-accuracy trade-offs and naturally handle new models.

DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning

Qi Cao, Ruiyi Wang, Ruiyi Zhang, Sai Ashish Somayajula, Pengtao Xie

The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS)

Spotlight @ Multimodal Algorithmic Reasoning Workshop

A multimodal Process Reward Model (PRM) trained with domain reweighting. Ranked first on MathVista, MMMU, and R-Bench-V.

Bidomain Modeling Paradigm for Pansharpening

Junming Hou, Qi Cao, Ran Ran, Che Liu, Junling Li, Liang-jian Deng

Proceedings of the 31st ACM International Conference on Multimedia (ACM MM)

Oral

We propose BiPan, a bidomain pansharpening framework that models band-specific local spectral features and global spatial details in the Fourier domain, achieving state-of-the-art performance by better handling spectral diversity and MS image degradation.