Yiqun Chen

Gaoling School of AI

Renmin University of China

Beijing, China

chenyiqun990321@{ruc.edu.cn, gmail.com}

13853687820

My name is Yiqun Chen (陈逸群). Currently, I am pursuing my Ph.D. at the Gaoling School of Artificial Intelligence, Renmin University of China (RUC), under the guidance of Prof. Jiaxin Mao.

🔬 Research Interests

My research interests primarily lie in Multi-Agent Reinforcement Learning and Agentic Search:

LLM Agent & Reinforcement Learning:
- General LLM-based Multi-Agent Optimization Framework (UnityMAS-O)
- Data Synthesis & Agent Memory & Evaluation/Reward
- Multi-Agent Reinforcement Learning (MARL)
AI Search:
- Retrieval-Augmented Generation (RAG)
- Agentic Search & Deep Search/Research
Information Retrieval (IR):
- Large Language Models for Ranking (LLM4Ranking)
- Application of Reinforcement Learning for IR (e.g., RL for Diversified Search)

🏢 Industry Collaboration & Leadership

🚀 Recent Focus: Multi-Agent/Agent-Swarm Joint Optimization (RL)

Recently, I have been collaborating closely with leading tech companies on LLM-based multi-agent RL and leading the development of UnityMAS-O, a multi-agent reinforcement learning framework built on Ray and veRL. It supports customizable agent workflows, flexible agent-to-model mapping, and scalable distributed PPO optimization across shared, partially shared, or independent models.

🌟 Previous Internships

My internship experiences include:

- XiaoHongShu (Dots Agent & AI Search) (✨Ace Top Intern Program): End-to-end Multi-Agent RL optimization and full-pipeline Agent research.
- Baidu (Search Dept. & Intelligent Cloud): Agentic Search, Dumate Agent research.
- ByteDance (Feishu/Lark): Memory-augmented AI search.
- Huawei (Noah’s Ark Decision Making & Reasoning Lab): Multi-Agent Reinforcement Learning (MARL).
- DiDi Chuxing (Ride-hailing Dept.): Pick-up/Drop-off location recommendation.

👨‍🎓 Job Market: Fall 2026 Internship

As a prospective Ph.D. graduate (Class of 2027), I am actively seeking a Fall 2026 Internship (targeting the 2027 campus recruitment season).

🤝 Why me? My mission is to build robust, scalable Multi-Agent paradigms and efficient infrastructure and training frameworks. I prioritize practical utility over theoretical narratives (rejecting mere “storytelling”). I am dedicated to bringing tangible performance gains and genuine, deployable innovation to industrial scenarios.

If you are looking for a researcher who focuses on what actually works, please contact me!

🎓 Education

Ph.D. Candidate in Artificial Intelligence, Gaoling School of Artificial Intelligence (GSAI), Renmin University of China (RUC), 2023 - 2027 (Expected)
M.Sc. in Pattern Recognition and Intelligent Systems, Institute of Automation, Chinese Academy of Sciences (CASIA), 2020 - 2023
B.Sc. in Automation, Shandong University (SDU), 2016 - 2020

📰 News

2026.5: 🎉 One paper was accepted to ICML 2026.
2025.12: 🔥 We released a comprehensive survey: Deep Research: A Systematic Survey.
2025.9: 🎉🎉 Two papers were accepted to NeurIPS 2025.
2025.8: 🎉 One paper was accepted to CIKM 2025.
2025.7: 🎉 One paper was accepted to MM 2025.
2025.6: 🔥 Our AI Search Paradigm paper is publicly available.
2025.1: 🎉🎉 Two first-author papers were accepted to WWW 2025.
2024.4: 🎉 One first-author paper was accepted to IJCAI 2024.
2023.9: I joined Renmin University of China to pursue my Ph.D.
2023.4: I joined the Search Department of Baidu Inc. as an algorithm intern.

Selected Publications

arXiv

UnityMAS-O: A General RL Optimization Framework for LLM-Based Multi-Agent Systems

Yiqun Chen, Wei Yang, Erhan Zhang, and 14 more authors

arXiv preprint arXiv:2605.26646, 2026

@article{chen2026unitymasogeneralrloptimization,
  title = {UnityMAS-O: A General RL Optimization Framework for LLM-Based Multi-Agent Systems},
  author = {Chen, Yiqun and Yang, Wei and Zhang, Erhan and Wang, Shijie and Liu, Qi and Niu, Zechun and Zhang, Bin and Li, Haitao and Li, Rui and Yan, Lingyong and Feng, Jinyuan and Qi, Biqing and Wei, Xiaochi and Gao, Yan and Wu, Yi and Hu, Yao and Mao, Jiaxin},
  journal = {arXiv preprint arXiv:2605.26646},
  year = {2026},
  url = {https://arxiv.org/abs/2605.26646},
}

arXiv

Tournament-GRPO: Group-Wise Tournament Rewards for Reinforcement Learning in Open-Ended Long-Form Generation

Zixuan Yang*, Yiqun Chen*, Wei Yang, and 7 more authors

arXiv preprint arXiv:2605.26958, 2026

@article{yang2026tournamentgrpogroupwise,
  title = {Tournament-GRPO: Group-Wise Tournament Rewards for Reinforcement Learning in Open-Ended Long-Form Generation},
  author = {Yang, Zixuan and Chen, Yiqun and Yang, Wei and Zhang, Erhan and Shen, Zihan and Wei, Xiaochi and Gao, Yan and Wu, Yi and Hu, Yao and Mao, Jiaxin},
  journal = {arXiv preprint arXiv:2605.26958},
  year = {2026},
  url = {https://arxiv.org/abs/2605.26958},
}

arXiv

PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training

Erhan Zhang*, Yiqun Chen*, Zechun Niu, and 6 more authors

arXiv preprint arXiv:2604.03675, 2026

@article{PRAISE,
  title = {PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training},
  author = {Zhang, Erhan and Chen, Yiqun and Niu, Zechun and Yang, Wei and Wei, Xiaochi and Gao, Yan and Wu, Yi and Hu, Yao and Mao, Jiaxin},
  journal = {arXiv preprint arXiv:2604.03675},
  year = {2026},
  url = {https://arxiv.org/pdf/2604.03675},
}

ICML 2026

JADE: Bridging the Strategic-Operational Gap in Dynamic Agentic RAG

Yiqun Chen, Erhan Zhang, Tianyi Hu, and 8 more authors

arXiv preprint arXiv:2601.21916, 2026

@article{chen2026jadebridgingstrategicoperationalgap,
  title = {JADE: Bridging the Strategic-Operational Gap in Dynamic Agentic RAG},
  author = {Chen, Yiqun and Zhang, Erhan and Hu, Tianyi and Wang, Shijie and Yang, Zixuan and Zhong, Meizhi and Wei, Xiaochi and Gao, Yan and Wu, Yi and Hu, Yao and Mao, Jiaxin},
  journal = {arXiv preprint arXiv:2601.21916},
  year = {2026},
  url = {https://arxiv.org/abs/2601.21916},
}

arXiv

Self-Compression of Chain-of-Thought via Multi-Agent Reinforcement Learning

Yiqun Chen, Jinyuan Feng, Wei Yang, and 9 more authors

arXiv preprint arXiv:2601.21919, 2026

@article{chen2026selfcompressionchainofthoughtmultiagentreinforcement,
  title = {Self-Compression of Chain-of-Thought via Multi-Agent Reinforcement Learning},
  author = {Chen, Yiqun and Feng, Jinyuan and Yang, Wei and Zhong, Meizhi and Shi, Zhengliang and Li, Rui and Wei, Xiaochi and Gao, Yan and Wu, Yi and Hu, Yao and Pu, Zhiqiang and Mao, Jiaxin},
  journal = {arXiv preprint arXiv:2601.21919},
  year = {2026},
  url = {https://arxiv.org/abs/2601.21919},
}

arXiv

Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search

Yiqun Chen, Lingyong Yan, Zixuan Yang, and 5 more authors

arXiv preprint arXiv:2601.04703, 2026

@article{chen2026monolithicarchitecturesmultiagentsearch,
  title = {Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search},
  author = {Chen, Yiqun and Yan, Lingyong and Yang, Zixuan and Zhang, Erhan and Zhao, Jiashu and Wang, Shuaiqiang and Yin, Dawei and Mao, Jiaxin},
  journal = {arXiv preprint arXiv:2601.04703},
  year = {2026},
  url = {https://arxiv.org/abs/2601.04703},
}

arXiv

Deep Research: A Systematic Survey

Zhengliang Shi#, Yiqun Chen#, Haitao Li, and 23 more authors

arXiv preprint arXiv:2512.02038, 2025

@article{shi2025deep,
  title = {Deep Research: A Systematic Survey},
  author = {Shi, Zhengliang and Chen, Yiqun and Li, Haitao and Sun, Weiwei and Ni, Shiyu and Lyu, Yougang and Fan, Run-Ze and Jin, Bowen and Weng, Yixuan and Zhu, Minjun and Xie, Qiujie and Guo, Xinyu and Yang, Qu and Wu, Jiayi and Zhao, Jujia and Tang, Xiaqiang and Ma, Xinbei and Wang, Cunxiang and Mao, Jiaxin and Ai, Qingyao and Huang, Jen-Tse and Wang, Wenxuan and Zhang, Yue and Yang, Yiming and Tu, Zhaopeng and Ren, Zhaochun},
  journal = {arXiv preprint arXiv:2512.02038},
  year = {2025},
  url = {https://arxiv.org/abs/2512.02038},
}

NeurIPS 2025

Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning

Yiqun Chen, Lingyong Yan, Weiwei Sun, and 6 more authors

In Advances in Neural Information Processing Systems (NeurIPS), Dec 2025

@inproceedings{chen2025improving,
  title = {Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning},
  author = {Chen, Yiqun and Yan, Lingyong and Sun, Weiwei and Ma, Xinyu and Zhang, Yi and Wang, Shuaiqiang and Yin, Dawei and Yang, Yiming and Mao, Jiaxin},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year = {2025},
  month = dec,
  url = {https://arxiv.org/abs/2501.15228},
}

NeurIPS 2025

Structured Spectral Reasoning for Frequency-Adaptive Multimodal Recommendation

Wei Yang*, Rui Zhong*, Yiqun Chen*, and 2 more authors

In Advances in Neural Information Processing Systems (NeurIPS), Dec 2025

@inproceedings{yang2025structured,
  title = {Structured Spectral Reasoning for Frequency-Adaptive Multimodal Recommendation},
  author = {Yang, Wei and Zhong, Rui and Chen, Yiqun and Lu, Chi and Jiang, Peng},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year = {2025},
  month = dec,
  url = {https://arxiv.org/abs/2512.01372},
}

arXiv

MAO-ARAG: Multi-Agent Orchestration for Adaptive Retrieval-Augmented Generation

Yiqun Chen, Erhan Zhang, Lingyong Yan, and 4 more authors

arXiv preprint arXiv:2508.01005, 2025

@article{chen2025mao,
  title = {MAO-ARAG: Multi-Agent Orchestration for Adaptive Retrieval-Augmented Generation},
  author = {Chen, Yiqun and Zhang, Erhan and Yan, Lingyong and Wang, Shuaiqiang and Huang, Jizhou and Yin, Dawei and Mao, Jiaxin},
  journal = {arXiv preprint arXiv:2508.01005},
  year = {2025},
  url = {https://arxiv.org/abs/2508.01005},
}

Baidu

Towards AI Search Paradigm (Technical Report of Baidu AI Search)

Yuchen Li, Hengyi Cai, Rui Kong, and 18 more authors

arXiv preprint arXiv:2506.17188, 2025

@article{li2025towards,
  title = {Towards AI Search Paradigm (Technical Report of Baidu AI Search)},
  author = {Li, Yuchen and Cai, Hengyi and Kong, Rui and Chen, Xinran and Chen, Jiamin and Yang, Jun and Zhang, Haojie and Li, Jiayi and Wu, Jiayi and Chen, Yiqun and Qu, Changle and Kong, Keyi and Ye, Wenwen and Su, Lixin and Ma, Xinyu and Xia, Long and Shi, Daiting and Zhao, Jiashu and Xiong, Haoyi and Wang, Shuaiqiang and Yin, Dawei},
  journal = {arXiv preprint arXiv:2506.17188},
  year = {2025},
  url = {https://arxiv.org/abs/2506.17188},
}

WWW 2025

MA4DIV: Multi-Agent Reinforcement Learning for Search Result Diversification (Oral Presentation, Rate: ~6%)

Yiqun Chen, Jiaxin Mao, Yi Zhang, and 7 more authors

In Proceedings of the ACM on Web Conference (WWW), 2025

@inproceedings{chen2025ma4div,
  title = {MA4DIV: Multi-Agent Reinforcement Learning for Search Result Diversification},
  note = {Oral presentation (rate: ~6\%)},
  author = {Chen, Yiqun and Mao, Jiaxin and Zhang, Yi and Ma, Dehong and Xia, Long and Fan, Jun and Shi, Daiting and Cheng, Zhicong and Gu, Simiu and Yin, Dawei},
  booktitle = {Proceedings of the ACM on Web Conference (WWW)},
  year = {2025},
  pages = {1703--1715},
  url = {https://dl.acm.org/doi/pdf/10.1145/3696410.3714862},
}

WWW 2025

TourRank: Utilizing Large Language Models for Documents Ranking with a Tournament-Inspired Strategy (Oral Presentation, Rate: ~6%)

Yiqun Chen, Qi Liu, Yi Zhang, and 6 more authors

In Proceedings of the ACM on Web Conference (WWW), 2025

@inproceedings{chen2025tourrank,
  title = {TourRank: Utilizing Large Language Models for Documents Ranking with a Tournament-Inspired Strategy},
  note = {Oral presentation (rate: ~6\%)},
  author = {Chen, Yiqun and Liu, Qi and Zhang, Yi and Sun, Weiwei and Ma, Xinyu and Yang, Wei and Shi, Daiting and Mao, Jiaxin and Yin, Dawei},
  booktitle = {Proceedings of the ACM on Web Conference (WWW)},
  year = {2025},
  pages = {1638--1652},
  url = {https://dl.acm.org/doi/pdf/10.1145/3696410.3714863},
}

IJCAI 2024

PTDE: Personalized Training with Distilled Execution for Multi-Agent Reinforcement Learning

Yiqun Chen, Hangyu Mao, Jiaxin Mao, and 5 more authors

In International Joint Conference on Artificial Intelligence (IJCAI), 2024

@inproceedings{chen2024ptde,
  title = {PTDE: Personalized Training with Distilled Execution for Multi-Agent Reinforcement Learning},
  author = {Chen, Yiqun and Mao, Hangyu and Mao, Jiaxin and Wu, Shiguang and Zhang, Tianle and Zhang, Bin and Yang, Wei and Chang, Hongxing},
  booktitle = {International Joint Conference on Artificial Intelligence (IJCAI)},
  year = {2024},
  url = {https://arxiv.org/abs/2210.08872},
}

IJCNN 2022

Commander-Soldiers Reinforcement Learning for Cooperative Multi-Agent Systems

Yiqun Chen, Wei Yang, Tianle Zhang, and 2 more authors

In International Joint Conference on Neural Networks (IJCNN), 2022

@inproceedings{chen2022commander,
  title = {Commander-Soldiers Reinforcement Learning for Cooperative Multi-Agent Systems},
  author = {Chen, Yiqun and Yang, Wei and Zhang, Tianle and Wu, Shiguang and Chang, Hongxing},
  booktitle = {International Joint Conference on Neural Networks (IJCNN)},
  year = {2022},
  pages = {1--7},
}

ICONIP 2022

Multi-Agent Hyper-Attention Policy Optimization

Bin Zhang*, Zhiwei Xu*, Yiqun Chen*, and 4 more authors

In International Conference on Neural Information Processing (ICONIP), 2022

@inproceedings{zhang2022multi,
  title = {Multi-Agent Hyper-Attention Policy Optimization},
  author = {Zhang, Bin and Xu, Zhiwei and Chen, Yiqun and Li, Dapeng and Bai, Yunpeng and Fan, Guoliang and Li, Lijuan},
  booktitle = {International Conference on Neural Information Processing (ICONIP)},
  pages = {76--87},
  year = {2022},
  organization = {Springer},
}