About me
I am a research scientist at Luma AI, working on multimodal reasoning and generation. Previously, I was a research scientist at NVIDIA Research, where I was a core contributor to NVIDIA Cosmos. I obtained my Ph.D. in 2024 from the University of Illinois Urbana-Champaign (UIUC), and my B.S. in 2019 from UIUC with a double major in Physics and Statistics & Computer Science.
My research focuses on post-training for multimodal models and LLMs, including reasoning, RLHF, reward modeling, and synthetic data curation.
Selected Projects
Luma Uni-1 — A frontier unified understanding and generation model.
- Core contributor to Uni-1
- #1 Elo rating in human preference evaluation across leading image generation models
NVIDIA Cosmos — Open models for physical AI.
- Core contributor to Cosmos-Reason1 (2M+ total model downloads) and Cosmos-Reason2, reasoning vision-language models for Physical AI
- Core contributor to Cosmos World Foundation Models (8K+ GitHub stars)
RLHFlow — Open-source RLHF for LLM post-training.
- Developed ArmoRM, a multi-objective reward model that achieved #1 on RewardBench (May 2024), surpassing GPT-4 judges; 400K+ downloads on Hugging Face
- Built the first open-source online iterative DPO framework
Selected Publications
See the full list on Google Scholar.
Foundation Model Development
[2026] Luma Uni-1: A Frontier Unified Understanding and Generation Model. Luma AI — Core Contributor.
[2025] Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning. NVIDIA — Core Contributor.
[2025] Cosmos World Foundation Model Platform for Physical AI. NVIDIA — Core Contributor.
LLM Post-Training
[ICLR 2026] NFT: Bridging Supervised Learning and Reinforcement Learning in Math Reasoning.
[EMNLP 2024] Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts (ArmoRM).
[TMLR 2024] RLHF Workflow: From Reward Modeling to Online RLHF.
Multimodal Generation
[ICLR 2026] DiffusionNFT: Online Diffusion Reinforcement with Forward Process. (Oral)
[ICLR 2026] InfoTok: Adaptive Discrete Video Tokenizer via Information-Theoretic Compression. (Oral)
News
- [Mar 2026] Luma Uni-1 is released!
- [Jun 2025] NFT (Negative-aware Fine-Tuning) is released! [Code]
- [Mar 2025] Cosmos-Reason1 is released!
- [Jan 2025] Cosmos World Foundation Model is released!
- [Sep 2024] ArmoRM accepted to EMNLP 2024!
- [Sep 2024] RLHF Workflow accepted to TMLR!
