Context: 262K tokens
Pretraining: ~15T multimodal tokens
Agent Swarm: up to 100 sub-agents
Tool Calls: up to 1,500
Why Kimi K2.5 is a frontier model
Frontier models are defined by their ability to solve real-world tasks end-to-end: multimodal understanding, tool orchestration, and long-horizon reasoning. K2.5 meets that bar with top-tier results on agent, coding, and vision benchmarks, while maintaining a massive context window and autonomous orchestration.
Native multimodal reasoning
Kimi K2.5 fuses vision and language natively, enabling visual coding, UI parsing, and multimodal reasoning without separate adapters.
Long-horizon agentic depth
A 262K-token context window plus self-directed agent swarms lets it sustain complex plans and multi-step workflows across massive tool chains.
Frontier-grade coding
Top-tier performance on SWE-bench and multilingual coding benchmarks makes K2.5 a serious frontier contender for real software work.
Self-directed agent swarms at scale
For complex tasks, Kimi K2.5 can self-direct an agent swarm of up to 100 sub-agents, executing parallel workflows across up to 1,500 tool calls. The swarm is created and orchestrated automatically by K2.5, without any predefined sub-agents or workflows (a conceptual sketch of this pattern appears after the points below).
Self-directed agent swarm
Kimi K2.5 can automatically spawn and orchestrate up to 100 sub-agents for complex tasks. No predefined sub-agents. No manual workflow design.
1,500 tool calls in parallel
The swarm can execute parallel workflows across up to 1,500 tool calls, compressing research, coding, and data synthesis into a single run.
Up to 4.5x faster execution
Compared with a single-agent setup, K2.5 completes tasks up to 4.5x faster by coordinating parallel sub-tasks.
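To make that pattern concrete, here is a minimal, hypothetical sketch of a coordinator fanning work out to parallel sub-agents and merging their results. The names (SubTask, run_sub_agent, run_swarm) and the asyncio plumbing are illustrative assumptions, not Kimi K2.5's actual API; in the real system the model itself decides how to decompose the goal.

```python
# Hypothetical sketch of self-directed fan-out: not Kimi K2.5's actual API.
import asyncio
from dataclasses import dataclass


@dataclass
class SubTask:
    description: str


async def run_sub_agent(task: SubTask) -> str:
    """Stand-in for one sub-agent run (its own context, tools, and tool calls)."""
    await asyncio.sleep(0.1)  # placeholder for model inference and tool-call latency
    return f"result for: {task.description}"


async def run_swarm(goal: str, max_agents: int = 100) -> list[str]:
    """Coordinator: split a goal into sub-tasks, run them concurrently, gather results."""
    # Fake decomposition step; the real model plans this itself.
    sub_tasks = [SubTask(f"{goal} - part {i}") for i in range(min(8, max_agents))]
    return await asyncio.gather(*(run_sub_agent(t) for t in sub_tasks))


if __name__ == "__main__":
    results = asyncio.run(run_swarm("survey recent work on long-context evaluation"))
    print(len(results), "sub-agent results merged")
```

The design point this sketch illustrates is that sub-tasks run concurrently and only their merged results return to the coordinator, which is what compresses wall-clock time relative to a single-agent loop.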
Recreated benchmark highlights
The comparison below recreates the published benchmark snapshot, pitting Kimi K2.5 against GPT-5.2 (xhigh), Claude Opus 4.5, and Gemini 3 Pro on the following benchmarks:
Agents: Humanity's Last Exam (Full), BrowseComp, DeepSearchQA
Coding: SWE-bench Verified, SWE-bench Multilingual
Image: MMMU Pro, MathVision, OmniDocBench 1.5*
Video: VideoMMMU, LongVideoBench
* OmniDocBench score is computed as (1 − normalized Levenshtein distance) × 100, where a higher score denotes superior accuracy.
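To make the footnote's arithmetic concrete, here is a small self-contained sketch of one common way to compute such a score. The exact string normalization OmniDocBench uses may differ, so treat this as illustrative rather than the benchmark's reference implementation.

```python
# Illustrative only: score = (1 - normalized Levenshtein distance) * 100, higher is better.
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def omnidoc_style_score(prediction: str, reference: str) -> float:
    """Normalize the edit distance by the longer string, then map to a 0-100 score."""
    if not prediction and not reference:
        return 100.0
    dist = levenshtein(prediction, reference)
    normalized = dist / max(len(prediction), len(reference))
    return (1.0 - normalized) * 100.0


print(omnidoc_style_score("Kimi K2.5", "Kimi K2.5"))  # 100.0 for an exact match
```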
Ready to build with Kimi K2.5?
The most powerful multimodal Kimi model is now live and free to try on LeemerChat for every user.