Qi Cao

PhD Student

University of California, San Diego

Research Interests

LLM Reasoning
Machine Learning
Reinforcement Learning

About

I am Qi Cao, a 2nd-year Ph.D. student in the Department of Electrical and Computer Engineering at the University of California, San Diego, advised by Prof. Pengtao Xie.

Previously, I received my B.S. from the Mathematics and Physics Class of the Yingcai Honors School at the University of Electronic Science and Technology of China, where I was fortunate to be advised by Prof. Liangjian Deng.

I will join Meta as a research scientist intern in Summer 2026. My current research focuses on large language model (LLM) reasoning and building LLM-based reasoning systems.

News

2026-02

We built a project page for SCOPE.

2025-12

I will join Meta as a research scientist intern in Summer 2026!

2024-09

Starting my PhD at UCSD.

Selected Publications

Models Under SCOPE: Scalable and Controllable Routing via Pre-hoc Reasoning

Qi Cao, Shuhao Zhang, Ruizhe Zhou, Ruiyi Zhang, Peijia Qin, Pengtao Xie

arXiv preprint

SCOPE is a model routing framework that predicts how accurate and how expensive each model will be before running it, letting users control the cost-accuracy trade-off and handle newly added models naturally.

DreamPRM-1.5: Unlocking the Potential of Each Instance for Multimodal Process Reward Model Training

Qi Cao, Pengtao Xie

arXiv preprint

An updated version of DreamPRM that uses instance reweighting to achieve higher accuracy and greater robustness.

DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning

Qi Cao, Ruiyi Wang, Ruiyi Zhang, Sai Ashish Somayajula, Pengtao Xie

The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS)

Spotlight @ Multimodal Algorithmic Reasoning Workshop

A multimodal Process Reward Model (PRM) trained with domain reweighting. Top-ranked method on MathVista, MMMU, and R-Bench-V.

Bidomain Modeling Paradigm for Pansharpening

Junming Hou, Qi Cao, Ran Ran, Che Liu, Junling Li, Liang-Jian Deng

Proceedings of the 31st ACM International Conference on Multimedia (ACM MM)

Oral

We propose BiPan, a bidomain pansharpening framework that models band-specific local spectral features and global spatial details in the Fourier domain, achieving state-of-the-art performance by better handling spectral diversity and MS image degradation.

Zero-shot Semi-supervised Learning for Pansharpening

Qi Cao, Liang-Jian Deng, Wu Wang, Junming Hou, Gemine Vivone

Information Fusion

Zero-shot pansharpening (ZS-Pan) requires only a single pair of PAN/LRMS images. Any pansharpening network can use ZS-Pan as a plug-and-play module. We design a two-phase, three-component semi-supervised model for ZS-Pan.