Total Parameters
745B
Active Parameters
40-50B
Training Tokens
28.5T
Context Length
128K+
Why GLM-5 matters
We believe scaling remains one of the strongest levers for improving intelligence on the path to AGI. But raw scale alone is not enough: what makes GLM-5 compelling is that pre-training scale, systems efficiency, and reinforcement learning infrastructure were advanced together.
GLM-5 represents a major step forward in open-weight AI systems. With approximately 745 billion total parameters and a Mixture-of-Experts (MoE) architecture that activates only 40-50 billion parameters per token, it delivers frontier-level performance with manageable computational costs.
The result is a model that pushes beyond GLM-4.7 on reasoning, coding, and agentic execution benchmarks, while narrowing the gap to top closed frontier systems like Claude Opus 4.5, Gemini 3 Pro, and GPT-5.2. Practically, this means better reliability in long tool chains, fewer collapses on multi-step objectives, and stronger completion quality under real production constraints.
Core Capabilities
GLM-5 is designed for complex systems engineering and long-horizon agentic tasks. It excels across reasoning, coding, and autonomous execution domains.
Complex Reasoning
Advanced mathematical reasoning, scientific analysis, and logical problem-solving at frontier level.
Code Intelligence
State-of-the-art software engineering, debugging, and multi-language code generation capabilities.
Agentic Execution
Long-horizon task planning, tool orchestration, and autonomous workflow execution.
Systems Engineering
End-to-end system design, architecture planning, and complex engineering workflows.
Terminal Operations
Advanced command-line operations, shell scripting, and infrastructure automation.
Long Context
Process massive documents, codebases, and multi-step reasoning chains efficiently.
Architecture
Technical Architecture
Mixture-of-Experts (MoE)
745B total parameters with only 40-50B active per token. Smart routing selects specialized experts for each input, maximizing capability while minimizing compute.
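To make the routing idea concrete, here is a minimal sketch of top-k expert routing, assuming a simple softmax router. All names and sizes are illustrative; GLM-5's actual router design, expert count, and top-k value are not specified here.

```python
import torch
import torch.nn.functional as F

def moe_forward(x, router_w, experts, k=8):
    """Route each token to its top-k experts and mix their outputs.

    x:        (tokens, d_model) token activations
    router_w: (d_model, n_experts) router projection
    experts:  list of callables, one per expert FFN
    k:        experts activated per token (illustrative value)
    """
    logits = x @ router_w                         # (tokens, n_experts)
    weights, idx = torch.topk(logits, k, dim=-1)  # pick k experts per token
    weights = F.softmax(weights, dim=-1)          # normalize over selected experts
    out = torch.zeros_like(x)
    for slot in range(k):
        for e in range(len(experts)):
            mask = idx[:, slot] == e              # tokens whose slot-th pick is expert e
            if mask.any():
                out[mask] += weights[mask, slot, None] * experts[e](x[mask])
    return out
```

Because each token runs through only k experts, forward compute tracks the active parameter count (40-50B) rather than the 745B total.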
DeepSeek Sparse Attention
Advanced sparse attention mechanisms enable efficient long-context processing without proportional compute overhead. Handle 128K+ context windows with ease.
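A rough illustration of the top-k selection idea behind sparse attention follows. This is an assumption-laden toy, not DSA itself: DSA as published selects keys with a lightweight learned indexer, and a real kernel avoids materializing the dense score matrix that this version computes.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, keep=512):
    """Attend only to the highest-scoring keys per query (single head).

    Toy version: dense scores are still computed here, so this shows the
    selection semantics, not the speedup; real implementations never
    materialize the full (seq, seq) matrix.
    """
    scores = q @ k.T / k.shape[-1] ** 0.5            # (seq, seq) full scores
    keep = min(keep, k.shape[0])
    top_vals, top_idx = torch.topk(scores, keep, dim=-1)
    sparse = torch.full_like(scores, float("-inf"))  # mask out everything...
    sparse.scatter_(-1, top_idx, top_vals)           # ...except the top-k keys
    probs = F.softmax(sparse, dim=-1)                # attention over kept keys only
    return probs @ v
```

With a learned indexer doing the selection, per-query cost scales with the number of kept keys rather than the full sequence length, which is what keeps 128K+ contexts tractable.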
Async RL Infrastructure
New "slime" reinforcement learning system increases training throughput for faster iteration and improved post-training quality.
Scale-Up Training
Pre-training corpus expanded from 23T to 28.5T tokens. Total parameters scaled from 355B to 745B for broader coverage and stronger generalization.
GLM-5 vs GLM-4.7 Improvements
Significant gains across agentic, reasoning, and coding benchmarks demonstrate the impact of increased scale and improved post-training.
Benchmarks compared: Humanity's Last Exam, Terminal-Bench 2.0, MCP-Atlas, CyberGym, and Tool-Decathlon (full scores in the tables below).
Academic & Reasoning Performance
GLM-5 demonstrates strong performance across mathematical reasoning, scientific knowledge, and academic benchmarks — competitive with leading frontier models.
| Benchmark | Description | GLM-5 | GLM-4.7 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 | DeepSeek-V3.2 | Kimi K2.5 |
|---|---|---|---|---|---|---|---|---|
| Humanity's Last Exam (no tools) | Academic reasoning benchmark | 30.5% | 24.8% | 28.4% | 37.2% | 35.4% | — | — |
| Humanity's Last Exam (with tools) | With tool access enabled | 50.4% | 42.8% | 43.4% | 45.8% | 45.5% | — | — |
| AIME 2026 I | Mathematics competition | 92.7% | 92.9% | 93.3% | 90.6% | — | 92.7% | 92.5% |
| HMMT Nov. 2025 | Harvard-MIT Math Tournament | 96.9% | 93.5% | 91.7% | 93.0% | 97.1% | 90.2% | 91.1% |
| IMO AnswerBench | International Math Olympiad | 82.5% | 82.0% | 78.5% | 83.3% | 86.3% | 78.3% | 81.8% |
| GPQA-Diamond | Graduate-level science Q&A | 86.0% | 85.7% | 87.0% | 91.9% | 92.4% | 82.4% | 87.6% |
Software Engineering Excellence
Top-tier performance on real-world coding tasks, from repository-level changes to terminal-based workflows and cybersecurity challenges.
| Benchmark | Description | GLM-5 | GLM-4.7 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 | DeepSeek-V3.2 | Kimi K2.5 |
|---|---|---|---|---|---|---|---|---|
| SWE-bench Verified | Real-world software engineering | 77.8% | 73.8% | 80.9% | 76.2% | 80.0% | — | 76.8% |
| SWE-bench Multilingual | Cross-language code tasks | 73.3% | 66.7% | 77.5% | 65.0% | 72.0% | — | 73.0% |
| Terminal-Bench 2.0 | Agentic terminal workflows | 56.2% | 41.0% | 59.3% | 54.2% | 54.0% | — | — |
| CyberGym | Cybersecurity challenges | 43.2% | 23.5% | 50.6% | 39.9% | — | 17.3% | 41.3% |
Autonomous Agent Performance
Exceptional capability in multi-step planning, tool orchestration, and long-horizon task execution — key for production AI agents.
| Benchmark | Description | GLM-5 | GLM-4.7 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 | DeepSeek-V3.2 | Kimi K2.5 |
|---|---|---|---|---|---|---|---|---|
| BrowseComp | Web browsing & research | 75.9% | 67.5% | 67.8% | 59.2% | 65.8% | — | 74.9% |
| MCP-Atlas | Multi-step MCP workflows | 67.8% | 52.0% | 65.2% | 66.6% | 68.0% | — | — |
| τ²-Bench | Agentic tool use & planning | 89.7% | 87.4% | 91.6% | 90.7% | 85.5% | — | — |
| Tool-Decathlon | Long-horizon real-world tasks | 38.0% | 23.8% | 43.5% | 36.4% | 46.3% | 35.2% | 27.8% |
Key Benchmark Highlights
Comparison of GLM-5 against leading frontier models across critical benchmarks (scores in the tables above):
- Reasoning: Humanity's Last Exam (with tools)
- Coding: SWE-bench Verified
- Agents: MCP-Atlas
- Terminal: Terminal-Bench 2.0
Designed for Real-World Impact
GLM-5 excels in scenarios requiring deep reasoning, complex coding, and autonomous execution.
Software Engineering
Build, debug, and refactor complex codebases. Excel at SWE-bench tasks, multilingual coding, and long-horizon development workflows.
AI Agents & Automation
Deploy autonomous agents for research, data processing, and multi-step business workflows with reliable long-horizon execution.
Systems Architecture
Design distributed systems, cloud infrastructure, and complex technical architectures with deep reasoning capabilities.
Research & Analysis
Process massive documents, perform literature reviews, and synthesize insights across long-form content.
Technical Specifications
| Specification | Value |
|---|---|
| Architecture | Mixture-of-Experts (MoE) |
| Total Parameters | ~745 billion |
| Active Parameters | 40-50 billion per token |
| Pre-training Data | 28.5 trillion tokens |
| Context Window | 128,000+ tokens |
| Attention Mechanism | DeepSeek Sparse Attention (DSA) |
| License | MIT (Open Source) |
| Inference Framework | vLLM compatible |
| Developer | Zhipu AI (Z.ai) |
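Since the spec sheet lists vLLM compatibility, a minimal offline-inference sketch might look like the following. The Hugging Face model ID and parallelism degree are assumptions; check the official release for the actual repo name and recommended deployment settings.

```python
from vllm import LLM, SamplingParams

# Model ID and tensor-parallel degree below are assumptions for
# illustration, not confirmed release details.
llm = LLM(model="zai-org/GLM-5", tensor_parallel_size=8)

params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(
    ["Summarize the trade-offs of Mixture-of-Experts routing."], params
)
print(outputs[0].outputs[0].text)
```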
Advantages
Why Choose GLM-5?
MIT Licensed — Full Freedom
GLM-5 is released under the MIT license, enabling commercial use, fine-tuning, and research deployment without restrictive licensing barriers. Build with confidence.
What this unlocks in LeemerChat
- Stronger model reliability for long-horizon engineering prompts
- Better planning + execution in multi-tool, multi-turn agent loops (see the sketch after this list)
- Higher quality code reasoning under multilingual and terminal-heavy tasks
- A more capable open-source control brain for orchestrated workflows
- Complex systems engineering with deep reasoning capabilities
- Autonomous agent workflows with reliable long-horizon execution
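To ground the multi-tool, multi-turn loop mentioned above, here is a hedged sketch of such a loop against any OpenAI-compatible GLM-5 endpoint (for example, a local vLLM server). The endpoint URL, model ID, tool schema, and the run_shell/execute helpers are all illustrative assumptions, not a documented GLM-5 interface.

```python
import json
from openai import OpenAI

# Endpoint, model ID, and tool schema are illustrative assumptions:
# any OpenAI-compatible GLM-5 deployment should slot in here.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

TOOLS = [{
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Run a shell command and return its output",
        "parameters": {
            "type": "object",
            "properties": {"cmd": {"type": "string"}},
            "required": ["cmd"],
        },
    },
}]

def agent_loop(task: str, execute, max_turns: int = 10):
    """Let the model plan, call tools, observe results, and iterate."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        reply = client.chat.completions.create(
            model="glm-5", messages=messages, tools=TOOLS,
        ).choices[0].message
        messages.append(reply)
        if not reply.tool_calls:
            return reply.content  # no more tool calls: final answer
        for call in reply.tool_calls:
            # Single-tool demo; a real loop dispatches on call.function.name.
            result = execute(**json.loads(call.function.arguments))
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": str(result),
            })
    return None  # turn budget exhausted
```

Long-horizon agent benchmarks like MCP-Atlas and Tool-Decathlon roughly measure how many such turns a model can sustain without losing the objective.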
Experience the Frontier of Open Source
GLM-5 represents a new standard for open-weight AI. Best-in-class performance, MIT licensed, and ready for your most demanding engineering and agentic workflows.