Xinchen Zhang

I am a second-year master's student at IIGroup, Tsinghua University, supervised by Prof. Yujiu Yang. I received my Bachelor's degree from the School of Artificial Intelligence, Xidian University.

I am currently a research intern at ByteDance Seed, focusing on reinforcement learning in multimodal large language models. I work closely with Dr. Ling Yang from the AI Lab at Princeton University.

Email  /  WeChat  /  GitHub  /  Google Scholar

I will graduate in 2027. Feel free to contact me via email or WeChat if you are recruiting!

Research

My current research focuses on post-training and reinforcement learning for Multimodal Large Language Models (MLLMs), with two primary directions:

  • Agentic Visual Coding: I develop post-training methods that let multimodal agents iteratively inspect, debug, and refine visual code through multi-turn self-improvement, contributing to Seed2.0 and Seed2.1.
  • Universal Visual Verifier: I work on building universal visual verifiers that provide scalable reward signals for multimodal reasoning and generation, while exploring the reward paradigms and upper limits of reinforcement learning. This direction contributes to Seed1.8 and Seed2.0, with representative works including OmniVerifier (ICLR'26 Oral) and OmniVerifier-M1 (ICML'26).

Previously, I focused on post-training of generative models, including joint understanding-generation reinforcement learning for unified models, such as HermesFlow (NeurIPS'25) and MMaDA (NeurIPS'25), as well as post-training of text-to-image models for complex compositional generation, including IterComp (ICLR'25) and RealCompo (NeurIPS'24).

Research overview
News
  • [Jun. 2026] We release Seed2.1, where I worked on post-training for agentic visual coding.
  • [May 2026] OmniVerifier-M1 is accepted by ICML 2026.
  • [May 2026] I propose OmniVerifier-M1, advancing multimodal verifiers with symbolic meta-verification.
  • [Feb. 2026] We release Seed2.0, a generalist agentic model for real-world complexity.
  • [Jan. 2026] OmniVerifier is accepted by ICLR 2026 (Oral Paper, Top 1%).
  • [Dec. 2025] We release Seed1.8, a generalized agentic model for real-world scenarios.
  • [Oct. 2025] I propose OmniVerifier, a universal verifier for generalist multimodal foundation models.
  • [Sep. 2025] Three papers about reinforcement learning and MLLMs are accepted by NeurIPS 2025, including HermesFlow, MMaDA, and PeRL.
  • [Aug. 2025] RPF-Net is accepted by Pattern Recognition.
  • [May 2025] We release Seed1.5-VL, a series of state-of-the-art vision-language models.
  • [Feb. 2025] I started as a research intern at ByteDance Seed, focusing on MLLM post-training.
  • [Jan. 2025] IterComp is accepted by ICLR 2025.
  • [Nov. 2024] I gave a talk at TechBeat about compositional text-to-image generation.
  • [Oct. 2024] I propose IterComp, leveraging iterative RLHF to achieve fast and realistic text-to-image generation.
  • [Sep. 2024] RealCompo is accepted by NeurIPS 2024.
  • [Feb. 2024] I propose RealCompo, achieving the balance of compositionality and realism in controllable text-to-image generation.
  • [Sep. 2023] Admitted to Tsinghua University for postgraduate studies through exam-exempt recommendation.
  • [May 2023] Check out our recent work, RPF-Net.
Technical Reports
Seed2.1 model card thumbnail Seed2.1 Model Card: Agentic Intelligence for Productivity
Technical Report
Project Page / Preprint

Seed2.0 model card thumbnail Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity
Technical Report
Project Page / Preprint

Seed1.8 model card thumbnail Seed1.8 Model Card: Towards Generalized Real-World Agency
Technical Report
Project Page / Preprint

Seed1.5-VL technical report thumbnail Seed1.5-VL Technical Report
Technical Report
Project Page / Preprint

Publications

(* denotes equal contribution.)

OmniVerifier-M1 paper thumbnail OmniVerifier-M1: Multimodal Meta-Verifier with Explicit Structured Recalibration
Xinchen Zhang, Bowei Liu, Jiale Liu, Chufan Shi, Yizhen Zhang, Junhong Liu, Youliang Zhang, Zhiheng Li, Yujiu Yang, Ling Yang
ICML, 2026
Preprint / Code

OmniVerifier paper thumbnail Generative Universal Verifier as Multimodal Meta-Reasoner
Xinchen Zhang, Xiaoying Zhang, Youbin Wu, Yanbin Cao, Renrui Zhang, Ruihang Chu, Ling Yang, Yujiu Yang, Guang Shi
ICLR, 2026 (Oral Paper, Top 1%)
Preprint / Code / Checkpoint / Benchmark

HermesFlow paper thumbnail HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Ling Yang*, Xinchen Zhang*, Ye Tian, Chenming Shang, Minghao Xu, Wentao Zhang, Bin Cui
NeurIPS, 2025
Preprint / Code / Checkpoints

IterComp paper thumbnail IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Xinchen Zhang, Ling Yang, Guohao Li, Yaqi Cai, Jiake Xie, Yong Tang, Yujiu Yang, Mengdi Wang, Bin Cui
ICLR, 2025
Preprint / Code / Checkpoints (over 33,000 downloads)

RealCompo paper thumbnail RealCompo: Balancing Realism and Compositionality Improves Text-to-Image Diffusion Models
Xinchen Zhang, Ling Yang, Yaqi Cai, Zhaochen Yu, Kaini Wang, Jiake Xie, Ye Tian, Minkai Xu, Yong Tang, Yujiu Yang, Bin Cui
NeurIPS, 2024
Project Page / Preprint / Code

MMaDA paper thumbnail MMaDA: Multimodal Large Diffusion Language Models
Ling Yang, Ye Tian, Bowen Li, Xinchen Zhang, Ke Shen, Yunhai Tong, Mengdi Wang
NeurIPS, 2025
Preprint / Code / Checkpoints

HEAR paper thumbnail HEAR: High-frequency Enhanced Autoregressive Modeling for Identity-Preserving Image Generation
Shiyi Zhang*, Xinchen Zhang*, Youliang Zhang, Yongxin Xiao, Xiu Li, Jian Song, Yujiu Yang
Under Review

SparseAR paper thumbnail SparseAR: Not All Visual Tokens Are Crucial in Autoregressive Image Model Training
Ling Yang*, Zhaochen Yu*, Xinchen Zhang*, Peng Cao, Yujiu Yang, Bin Cui, Shuicheng Yan
Under Review

PeRL paper thumbnail PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning
Yizhen Zhang, Yang Ding, Shuoshuo Zhang, Xinchen Zhang, Haoling Li, Zhong-zhi Li, Peijie Wang, Jie Wu, Lei Ji, Yelong Shen, Yujiu Yang, Yeyun Gong
NeurIPS, 2025
Preprint / Code

Diffusion-Sharpening paper thumbnail Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
Ye Tian, Ling Yang, Xinchen Zhang, Yunhai Tong, Mengdi Wang, Bin Cui
arXiv, 2025
Preprint / Code

Compositional Generalization through Brain-inspired Geometric Constraints on Representation Structure
Chenming Shang, Shiji Zhou, Hengyuan Zhang, Xinchen Zhang, Lei Ke, Yuwang Wang, Yujiu Yang
Under Review

RPF-Net paper thumbnail Recurrent Progressive Fusion-based Learning for Multi-source Remote Sensing Image Classification
Xinchen Zhang, Hao Zhu, Xiaotong Li, Biao Hou, Wenhao Zhao, Xiaoyu Yi, Wenping Ma, Licheng Jiao
Pattern Recognition
Paper / Code
Education
Tsinghua University logo Tsinghua University
M.Eng. in Big Data Technology and Engineering (2024 - Present)
Advisor: Prof. Yujiu Yang
Xidian University logo Xidian University
B.Eng. in Artificial Intelligence (2020 - 2024)
Advisors: Prof. Hao Zhu, Prof. Licheng Jiao
Experience
ByteDance Seed logo ByteDance Seed
Multimodal Interaction and World Model Team

Research Intern (Feb. 2025 - Present)
Topic: VLM Foundation Model Post-training
Advisors: Xiaoying Zhang, Youbin Wu, Guang Shi
Services
  • Conference Reviewer:
    • International Conference on Computer Vision (ICCV) 2025
    • International Conference on Machine Learning (ICML) 2025, 2026
    • IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025, 2026
    • International Conference on Learning Representations (ICLR) 2025, 2026
    • Conference on Neural Information Processing Systems (NeurIPS) 2025
  • Journal Reviewer:
    • International Journal of Computer Vision (IJCV)
Talks
  • IterComp, RealCompo: Towards Compositional Text-to-Image Generation, TechBeat, 2024
Honors & Awards
  • Special Prize Scholarship, 2022
  • First Prize Scholarship, 2021
  • First Prize, The Chinese Mathematics Competitions (CMC), 2021
  • First Prize, China Undergraduate Mathematical Contest in Modeling (CUMCM), 2021
  • First Prize (Meritorious Winner), International Mathematical Contest in Modeling (MCM/ICM), 2022