*Taken in the KAUST AI Initiative.

Mingchen Zhuge

I am a PhD candidate and recently started my third year at the KAUST AI Initiative. I am fortunate to have Jürgen Schmidhuber as my advisor.

My current research interests and skills include:

  • Multimodal: Vision-Language Pre-training; currently focused on video-based LLMs.
  • LLMs: Post-Training.
  • Meta Learning: Recursive Self-Improvement.
  • Agents: LLM-based Multi-Agent Systems, Code Generation.
  • AI Judges: Agent-as-a-Judge, LLM-as-a-Judge.

😀 Joined Meta as a Research Scientist Intern in Summer 2024.

😀 Feel free to contact me about future collaboration, whether that is a 2025 summer internship, co-founding a venture, or a full-time position (industry or faculty). I appreciate your guidance.

⭐ Open-source enthusiast. I am active in open-source communities such as MetaGPT, GPTSwarm, and OpenDevin.

Before joining KAUST, I worked as an engineer, researcher, or intern at NSFocus, Alibaba Group, IIAI (G42), SUSTech VIP Lab, and Microsoft.

Selected Publications (Full list)


2024

Agent-as-a-Judge: Evaluate Agents with Agents
Mingchen Zhuge, Changsheng Zhao, Dylan Ashley, Wenyi Wang, Dmitrii Khizbullin, Yunyang Xiong, Zechun Liu, Ernie Chang, Raghuraman Krishnamoorthi, Yuandong Tian, Yangyang Shi, Vikas Chandra, Jürgen Schmidhuber
Preprint
Innovative evaluation framework using agent-based systems.
[Paper] [Code] [BibTex] [Tweet1] [Tweet2] [机器之心] [新智元]
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
OpenHands Open-Source Community
Preprint
30k+ stars on GitHub.
[Paper] [Code] [BibTex] [机器之心]
GPTSwarm: Language Agents as Optimizable Graphs
Mingchen Zhuge, Wenyi Wang, Louis Kirsch, Francesco Faccio, Dmitrii Khizbullin and Jürgen Schmidhuber
ICML 2024 (Oral)
Oral Presentation (top 1.5% of 9,473 submissions); the first framework to emphasize the importance of graphs in LLM-based agents.
[Paper] [Code] [BibTex] [MIT Technology Review interview] [将门创投]

2023

MetaGPT: Meta Programming For A Multi-Agent Collaborative Framework
Sirui Hong∗, Mingchen Zhuge∗, Jonathan Chen, Xiawu Zheng, Yuheng Cheng, Ceyao Zhang, Jinlin Wang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, Chenyu Ran, Lingfeng Xiao, Chenglin Wu†, Jürgen Schmidhuber
ICLR 2024 (Oral)
Oral Presentation (top 1.2% of 7,262 submissions); 40k+ stars on GitHub.
[Paper] [Code] [BibTex] [量子位 (QbitAI) coverage]
Mindstorms in Natural Language-Based Societies of Mind
Mingchen Zhuge*, Haozhe Liu*, Francesco Faccio*, Dylan R. Ashley*, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, Jürgen Schmidhuber
NeurIPS Ro-FoMo 2023 (Oral, Best Paper Award)
Best Paper Award at the NeurIPS Ro-FoMo Workshop; position paper.
[Paper] [Code] [BibTex] [Poster] [Award]

2022

Salient Object Detection via Integrity Learning
Mingchen Zhuge*, Deng-Ping Fan*, Nian Liu, Dingwen Zhang, Dong Xu, Ling Shao
TPAMI 2022
ESI Highly Cited Paper
[Paper] [Code] [BibTex]

2021

Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
Mingchen Zhuge*, Dehong Gao*, Deng-Ping Fan†, Linbo Jin, Ben Chen, Haoming Zhou, Minghui Qiu, Ling Shao
CVPR 2021
Industry application at alibaba.com
[Paper] [Code] [BibTex] [Talk]

Experience

Jun 2024 - Present

Research Scientist Intern, Meta
Advisors: Changsheng Zhao, Yangyang Shi
Topic: Multimodal LLMs, Agents
Location: Burlingame, United States

Aug 2022 - Present

PhD Student, KAUST AI Initiative
Advisor: Jürgen Schmidhuber
Topic: Multimodal, LLMs, Agent Society, Meta Learning
Location: Thuwal, KSA

May 2022 - Aug 2022

Research Scientist Intern, Microsoft (WizardLM)
Host: Chongyang Tao
Topic: NLP, Multimodal LLM
Location: Beijing, China

Jul 2021 - Jan 2022

Visiting Scholar, SUSTech
Host: Feng Zheng
Topic: Multimodal (Audio-Visual)
Location: Shenzhen, China

Mar 2021 - Jun 2021

Research Intern, IIAI
Hosts: Deng-Ping Fan, Ling Shao
Topic: Computer Vision
Location: Abu Dhabi, UAE

May 2020 - Feb 2021

Algorithm Intern, Alibaba Group
Host: Dehong Gao
Topic: Multimodal (Vision-Language Pre-training)
Location: Hangzhou, China

Mar 2018 - Jun 2018

Research Intern, NSFocus
Host: Wenmao Liu
Topic: Blockchain, Network Security
Location: Beijing, China

Invited Talks


2024

Agent Swarms
Host: Shanghai AI Laboratory

2024

Exploring Multimodal Agents
Host: WAIC AI Lite Think Talk  

2024

A Brief Introduction to AI Agents
Host: Institute of Automation, Chinese Academy of Sciences

2024

NLSOM, MetaGPT, GPTSwarm: Exploring AI Agents from a Different Perspective
Host: ByteDance  

2024

Multimodal AI Agents (Societies)
Host: Huawei Tengman (藤蔓) Technology Forum 2024

2021

机器之心: "Inside the World's Top Labs, Episode 1: IIAI"
Host: 机器之心 (Synced)