This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling
Citations: 0
Authors: 55
Year: 2025
Abstract
We present MiroThinker v1.0, an open-source research agent designed to advance tool-augmented reasoning and information-seeking capabilities. Unlike previous agents that only scale up model size or context length, MiroThinker explores interaction scaling at the model level, systematically training the model to handle deeper and more frequent agent-environment interactions as a third dimension of performance improvement. Unlike LLM test-time scaling, which operates in isolation and risks degradation with longer reasoning chains, interactive scaling leverages environment feedback and external information acquisition to correct errors and refine trajectories. Through reinforcement learning, the model achieves efficient interaction scaling: with a 256K context window, it can perform up to 600 tool calls per task, enabling sustained multi-turn reasoning and complex real-world research workflows. Across four representative benchmarks (GAIA, HLE, BrowseComp, and BrowseComp-ZH), the 72B variant achieves up to 81.9%, 37.7%, 47.1%, and 55.6% accuracy, respectively, surpassing previous open-source agents and approaching commercial counterparts such as GPT-5-high. Our analysis reveals that MiroThinker benefits from interactive scaling consistently: research performance improves predictably as the model engages in deeper and more frequent agent-environment interactions, demonstrating that interaction depth exhibits scaling behaviors analogous to model size and context length. These findings establish interaction scaling as a third critical dimension for building next-generation open research agents, complementing model capacity and context windows.
Authors
- M. Team
- Song Bai
- Lidong Bing
- Carson Chen
- Guanzheng Chen
- Yuntao Chen
- Z. Chen
- Ziyi Chen
- Yifeng Dai
- Xuan Dong
- Wenbin Dou
- Yue Deng
- Jie Wang
- Junjie Ge
- Chenxia Han
- Tammy Huang
- Zhenhang Huang
- Jingsi Jiao
- Steve Jiang
- Tianyu Jiao
- Xian Jian
- Lei Lei
- Ruilin Li
- Ruiqi Luo
- Tingan Li
- Xiang Lin
- Ziyuan Liu
- Zhiqi Li
- Jie Ni
- Qiang Ren
- Peiyuan Sun
- Shu-Jem Su
- Chin‐Wang Tao
- Bin Wang
- H. Wang
- Haonan Wang
- James Z. Wang
- Jinfeng Wang
- J. Wang
- Letian Wang
- S Wang
- Weizhi Wang
- Zixuan Wang
- J. Xu
- Sen Xing
- Chenyu Yang
- Hai Ye
- Jun Yu
- Yue Yu
- Minlin Zhong
- Tianchen Zhao
- Xiaolong Zhu
- Yanpeng Zhou
- Yifan Zhang
- Zhi Zhu