Model Launch · February 2026
Z.AI Frontier Model

GLM-5 is here

A 745B parameter open-source frontier model for engineering-grade agent work

Today we are launching GLM-5 on LeemerChat, the newest frontier addition to our Z.AI lineup. With 745B total parameters, an advanced Mixture-of-Experts architecture, and best-in-class performance among open-source models on reasoning, coding, and agentic tasks, GLM-5 narrows the gap with proprietary frontier systems.

Total Parameters

745B

Active Parameters

40-50B

Training Tokens

28.5T

Context Length

128K+

Frontier Open Source

Why GLM-5 matters

We believe scaling remains one of the strongest levers for improving intelligence on the path to AGI. But raw scale alone is not enough. What makes GLM-5 compelling is that pre-training scale, systems efficiency, and reinforcement-learning infrastructure were advanced together.

GLM-5 represents a major step forward in open-weight AI systems. With approximately 745 billion total parameters and a Mixture-of-Experts (MoE) architecture that activates only 40-50 billion parameters per token, it delivers frontier-level performance with manageable computational costs.
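As a back-of-the-envelope illustration of what sparse activation buys: the parameter counts below are from this post, but the 2 × active-parameters FLOPs-per-token rule and the equal-size dense baseline are standard approximations, not GLM-5 measurements.

```python
# Back-of-the-envelope compute comparison. Parameter counts are from this
# post; the FLOPs rule of thumb and dense baseline are approximations.
TOTAL_PARAMS = 745e9    # total MoE parameters
ACTIVE_PARAMS = 45e9    # midpoint of the stated 40-50B active per token

flops_moe = 2 * ACTIVE_PARAMS     # approx. forward FLOPs per token
flops_dense = 2 * TOTAL_PARAMS    # hypothetical dense model of equal size

print(f"active fraction: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
print(f"per-token compute vs equal-size dense: ~{flops_dense / flops_moe:.0f}x cheaper")
```

Under these assumptions, roughly 6% of the weights do the work on any given token, which is where "frontier-level performance with manageable computational costs" comes from.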

The result is a model that pushes beyond GLM-4.7 on reasoning, coding, and agentic execution benchmarks, while narrowing the gap to top closed frontier systems like Claude Opus 4.5, Gemini 3 Pro, and GPT-5.2. Practically, this means better reliability in long tool chains, fewer collapses on multi-step objectives, and stronger completion quality under real production constraints.

Native Multimodal

Core Capabilities

GLM-5 is designed for complex systems engineering and long-horizon agentic tasks. It excels across reasoning, coding, and autonomous execution domains.

Complex Reasoning

Advanced mathematical reasoning, scientific analysis, and logical problem-solving at frontier level.

Code Intelligence

State-of-the-art software engineering, debugging, and multi-language code generation capabilities.

Agentic Execution

Long-horizon task planning, tool orchestration, and autonomous workflow execution.

Systems Engineering

End-to-end system design, architecture planning, and complex engineering workflows.

Terminal Operations

Advanced command-line operations, shell scripting, and infrastructure automation.

Long Context

Process massive documents, codebases, and multi-step reasoning chains efficiently.
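The agentic-execution capability above boils down to a bounded plan-act-observe loop. A minimal sketch follows, with the model call and the tool mocked; none of these function names come from GLM-5's actual API.

```python
# Minimal plan-act-observe loop of the kind the agentic capabilities above
# describe. The model call is mocked and the tool is illustrative; nothing
# here is GLM-5's actual API.
def run_shell(cmd: str) -> str:
    return f"(pretend output of: {cmd})"   # stand-in for a real tool

TOOLS = {"run_shell": run_shell}

def mock_model(history):
    """Stand-in for a model call: one tool call, then a final answer."""
    if not any(m["role"] == "tool" for m in history):
        return {"tool": "run_shell", "args": {"cmd": "ls"}}
    return {"final": "done"}

def agent_loop(task: str, max_steps: int = 8) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):           # bounded long-horizon execution
        action = mock_model(history)
        if "final" in action:            # model decided the task is complete
            return action["final"]
        result = TOOLS[action["tool"]](**action["args"])
        history.append({"role": "tool", "content": result})
    return "step budget exhausted"

print(agent_loop("list the repo"))  # → done
```

The step budget and the tool-result history are the two pieces that matter at production scale: long-horizon reliability is largely about how well the model keeps using that accumulated history without collapsing.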

Architecture

Technical Architecture

Mixture-of-Experts (MoE)

745B total parameters with only 40-50B active per token. Smart routing selects specialized experts for each input, maximizing capability while minimizing compute.
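A toy sketch of that routing step is below. The expert count, dimensions, and softmax-over-top-k gating are illustrative; GLM-5's actual router configuration is not described in this post.

```python
# Toy version of MoE routing: each token is sent to only k of E experts,
# which is what keeps 40-50B of 745B parameters active per token. All
# dimensions and the gating scheme here are illustrative.
import numpy as np

def route(tokens: np.ndarray, gate_w: np.ndarray, k: int = 2):
    """tokens: (n, d); gate_w: (d, E) learned gate. Returns ids, weights."""
    logits = tokens @ gate_w                       # (n, E) routing scores
    topk = np.argsort(logits, axis=-1)[:, -k:]     # k best experts per token
    picked = np.take_along_axis(logits, topk, axis=-1)
    weights = np.exp(picked) / np.exp(picked).sum(-1, keepdims=True)
    return topk, weights                           # softmax over chosen only

rng = np.random.default_rng(0)
ids, weights = route(rng.normal(size=(4, 8)), rng.normal(size=(8, 16)))
print(ids.shape, weights.shape)   # (4, 2) (4, 2)
```

Each token's output is then a weighted sum of only its chosen experts' outputs, so compute scales with k, not with the total expert count.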

DeepSeek Sparse Attention

Advanced sparse attention mechanisms enable efficient long-context processing without proportional compute overhead. Handle 128K+ context windows with ease.
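To see why sparsity matters at 128K context, here is a generic sliding-window mask. This illustrates sub-quadratic attention in general; the actual DeepSeek Sparse Attention selection rule is more sophisticated than a fixed window.

```python
# A generic sliding-window sparse mask, shown only to illustrate why sparse
# attention keeps long-context cost sub-quadratic. The real DeepSeek Sparse
# Attention selection rule is more sophisticated than this.
import numpy as np

def sliding_window_mask(n: int, window: int) -> np.ndarray:
    """Causal mask where token i attends to at most the last `window` tokens."""
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    return (j <= i) & (j > i - window)

n, window = 4096, 512
dense = n * (n + 1) // 2                        # full causal attention pairs
sparse = int(sliding_window_mask(n, window).sum())
print(f"dense={dense:,} windowed={sparse:,} ({sparse / dense:.1%})")
# dense=8,390,656 windowed=1,966,336 (23.4%)
```

Full causal attention grows with the square of sequence length, while a fixed window grows linearly, so the gap widens sharply as contexts approach 128K.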

Async RL Infrastructure

New "slime" reinforcement learning system increases training throughput for faster iteration and improved post-training quality.

Scale-Up Training

Pre-training corpus expanded from 23T to 28.5T tokens. Total parameters scaled from 355B to 745B for broader coverage and stronger generalization.

Generation-over-Generation

GLM-5 vs GLM-4.7 Improvements

Significant gains across agentic, reasoning, and coding benchmarks demonstrate the impact of increased scale and improved post-training.

| Benchmark | GLM-4.7 | GLM-5 | Relative improvement |
| --- | --- | --- | --- |
| Humanity's Last Exam (with tools) | 42.8% | 50.4% | +17.8% |
| Terminal-Bench 2.0 | 41.0% | 56.2% | +37.1% |
| MCP-Atlas | 52.0% | 67.8% | +30.4% |
| CyberGym | 23.5% | 43.2% | +83.8% |
| Tool-Decathlon | 23.8% | 38.0% | +59.7% |

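The improvement percentages above are relative gains, (new − old) / old, which can be checked directly:

```python
# The "improvement" numbers above are relative gains: (new - old) / old.
def relative_gain(old: float, new: float) -> float:
    return (new - old) / old

scores = {                       # (GLM-4.7, GLM-5) from the figures above
    "Humanity's Last Exam": (42.8, 50.4),
    "Terminal-Bench 2.0": (41.0, 56.2),
    "MCP-Atlas": (52.0, 67.8),
    "CyberGym": (23.5, 43.2),
    "Tool-Decathlon": (23.8, 38.0),
}
for name, (old, new) in scores.items():
    print(f"{name}: +{relative_gain(old, new):.1%}")
```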
Reasoning Benchmarks

Academic & Reasoning Performance

GLM-5 demonstrates strong performance across mathematical reasoning, scientific knowledge, and academic benchmarks — competitive with leading frontier models.

| Benchmark | Description | GLM-5 | GLM-4.7 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 | DeepSeek-V3.2 | Kimi K2.5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Humanity's Last Exam (no tools) | Academic reasoning benchmark | 30.5% | 24.8% | 28.4% | 37.2% | 35.4% | — | — |
| Humanity's Last Exam (with tools) | With tool access enabled | 50.4% | 42.8% | 43.4% | 45.8% | 45.5% | — | — |
| AIME 2026 I | Mathematics competition | 92.7% | 92.9% | 93.3% | 90.6% | 92.7% | 92.5% | — |
| HMMT Nov. 2025 | Harvard-MIT Math Tournament | 96.9% | 93.5% | 91.7% | 93.0% | 97.1% | 90.2% | 91.1% |
| IMO AnswerBench | International Math Olympiad | 82.5% | 82.0% | 78.5% | 83.3% | 86.3% | 78.3% | 81.8% |
| GPQA-Diamond | Graduate-level science Q&A | 86.0% | 85.7% | 87.0% | 91.9% | 92.4% | 82.4% | 87.6% |

— = not reported.

Coding Benchmarks

Software Engineering Excellence

Top-tier performance on real-world coding tasks, from repository-level changes to terminal-based workflows and cybersecurity challenges.

| Benchmark | Description | GLM-5 | GLM-4.7 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 | DeepSeek-V3.2 | Kimi K2.5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SWE-bench Verified | Real-world software engineering | 77.8% | 73.8% | 80.9% | 76.2% | 80.0% | 76.8% | — |
| SWE-bench Multilingual | Cross-language code tasks | 73.3% | 66.7% | 77.5% | 65.0% | 72.0% | 73.0% | — |
| Terminal-Bench 2.0 | Agentic terminal workflows | 56.2% | 41.0% | 59.3% | 54.2% | 54.0% | — | — |
| CyberGym | Cybersecurity challenges | 43.2% | 23.5% | 50.6% | 39.9% | 17.3% | 41.3% | — |

— = not reported.

Agentic Benchmarks

Autonomous Agent Performance

Exceptional capability in multi-step planning, tool orchestration, and long-horizon task execution — key for production AI agents.

| Benchmark | Description | GLM-5 | GLM-4.7 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 | DeepSeek-V3.2 | Kimi K2.5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BrowseComp | Web browsing & research | 75.9% | 67.5% | 67.8% | 59.2% | 65.8% | 74.9% | — |
| MCP-Atlas | Multi-step MCP workflows | 67.8% | 52.0% | 65.2% | 66.6% | 68.0% | — | — |
| τ²-Bench | Agentic tool use & planning | 89.7% | 87.4% | 91.6% | 90.7% | 85.5% | — | — |
| Tool-Decathlon | Long-horizon real-world tasks | 38.0% | 23.8% | 43.5% | 36.4% | 46.3% | 35.2% | 27.8% |

— = not reported.

Head-to-Head Comparison

Key Benchmark Highlights

Visual comparison of GLM-5 against leading frontier models across critical benchmarks.

Reasoning

Humanity's Last Exam (with tools)

  • GLM-5: 50.4%
  • Gemini 3 Pro: 45.8%
  • GPT-5.2: 45.5%
  • Claude Opus 4.5: 43.4%
  • GLM-4.7: 42.8%

Coding

SWE-bench Verified

  • Claude Opus 4.5: 80.9%
  • GPT-5.2: 80.0%
  • GLM-5: 77.8%
  • Gemini 3 Pro: 76.2%
  • GLM-4.7: 73.8%

Agents

MCP-Atlas

  • GPT-5.2: 68.0%
  • GLM-5: 67.8%
  • Gemini 3 Pro: 66.6%
  • Claude Opus 4.5: 65.2%
  • GLM-4.7: 52.0%

Terminal

Terminal-Bench 2.0

  • Claude Opus 4.5: 59.3%
  • GLM-5: 56.2%
  • Gemini 3 Pro: 54.2%
  • GPT-5.2: 54.0%
  • GLM-4.7: 41.0%

Applications

Designed for Real-World Impact

GLM-5 excels in scenarios requiring deep reasoning, complex coding, and autonomous execution.

Software Engineering

Build, debug, and refactor complex codebases. Excel at SWE-bench tasks, multilingual coding, and long-horizon development workflows.

AI Agents & Automation

Deploy autonomous agents for research, data processing, and multi-step business workflows with reliable long-horizon execution.

Systems Architecture

Design distributed systems, cloud infrastructure, and complex technical architectures with deep reasoning capabilities.

Research & Analysis

Process massive documents, perform literature reviews, and synthesize insights across long-form content.

Specifications

Technical Specifications

Architecture

Mixture-of-Experts (MoE)

Total Parameters

~745 billion

Active Parameters

40-50 billion per token

Pre-training Data

28.5 trillion tokens

Context Window

128,000+ tokens

Attention Mechanism

DeepSeek Sparse Attention (DSA)

License

MIT (Open Source)

Inference Framework

vLLM compatible

Developer

Zhipu AI (Z.ai)
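Given the vLLM-compatible spec, serving would typically look like the sketch below. The model identifier, GPU count, and flag values are assumptions for illustration, not taken from this post; consult the released model card for real values.

```shell
# Hypothetical serving sketch: the model id "zai-org/GLM-5" and all flag
# values are assumptions, not taken from this post.
pip install vllm

# Shard the MoE weights across 8 GPUs and enable the full 128K window.
vllm serve zai-org/GLM-5 \
  --tensor-parallel-size 8 \
  --max-model-len 131072

# vLLM exposes an OpenAI-compatible API (default port 8000):
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "zai-org/GLM-5", "messages": [{"role": "user", "content": "Hello"}]}'
```

Note the memory arithmetic: 745B parameters in 16-bit weights is on the order of 1.5 TB, so real deployments would need more GPUs than shown here, quantized weights, or both.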

Advantages

Why Choose GLM-5?

Best-in-class performance among open-source models on reasoning benchmarks
Top-tier coding performance with 77.8% on SWE-bench Verified
Exceptional agentic capabilities with 67.8% on MCP-Atlas
Massive 745B parameter scale with efficient MoE architecture
MIT licensed — full commercial use, fine-tuning, and deployment freedom
Cost-efficient inference with only 40-50B active parameters per token
Native long-context support for complex document and codebase analysis
Strong multilingual capabilities across coding and reasoning tasks
Open Source

MIT Licensed — Full Freedom

GLM-5 is released under the MIT license, enabling commercial use, fine-tuning, and research deployment without restrictive licensing barriers. Build with confidence.

MIT

License

100%

Open Weight

What this unlocks in LeemerChat

  • Stronger model reliability for long-horizon engineering prompts
  • Better planning + execution in multi-tool, multi-turn agent loops
  • Higher quality code reasoning under multilingual and terminal-heavy tasks
  • A more capable open-source control brain for orchestrated workflows
  • Complex systems engineering with deep reasoning capabilities
  • Autonomous agent workflows with reliable long-horizon execution
Available now on LeemerChat

Experience the Frontier of Open Source

GLM-5 represents a new standard for open-weight AI. Best-in-class performance, MIT licensed, and ready for your most demanding engineering and agentic workflows.

Related Posts

February 12, 2026

MiniMax M2.5 Is Live: SOTA Productivity Model for Real-World Office Work

MiniMax M2.5 launches on LeemerChat with breakthrough performance in Word, Excel, and PowerPoint generation. Scoring 80.2% on SWE-Bench Verified and 76.3% on BrowseComp, M2.5 extends M2.1's coding expertise into general office productivity.

January 30, 2026

Kimi K2.5: Moonshot AI’s Frontier Multimodal Model, Now Live on LeemerChat

Kimi K2.5 brings state-of-the-art visual coding, 262K context, and self-directed agent swarms. We’re Ireland’s first AI platform to launch it — and it’s live free on LeemerChat.

March 2, 2026

Get Ready for Mission Control: The Next Evolution of Agentic Execution

Mission Control is our next-generation agentic research and execution platform. It represents a fundamental shift in how we interact with AI—moving away from rigid pipelines and chat interfaces, and stepping into the era of autonomous, goal-oriented swarms.

February 22, 2026

The Foundry Report: Why Fine-Tuned Models Are Still the Sharpest Weapon in Enterprise AI

Tinker is now generally available. Vision input, Kimi K2 Thinking, and LoRA Without Regret are reshaping what custom model training looks like in 2026. Here's why fine-tuning is more strategically important than ever — and how LeemerLabs Model Foundry is building the infrastructure to prove it.
