Model Refresh · April 4, 2026 · 9 min read
Frontier Model Refresh

We just moved a chunk of the frontier into the free tier.

This launch is about more than adding model IDs. We re-cut the lineup around what users actually do — long-context coding, screenshot-heavy debugging, multimodal planning, and agent loops that keep running after the first response. Three frontier models go free. Three premium slots get sharper. The old overlap gets cleaned out.

LeemerChat Team

leemerchat.com

Free frontier (3): MiMo V2 Pro, GLM-5V Turbo, and Gemma 4 31B IT now ship as free partner models.

Premium refresh (3): GPT-5.4, GLM-5, and MiniMax M2.7 now anchor the paid frontier tier.

Biggest window (1M): GPT-5.4 and MiMo V2 Pro push the active catalog into million-token territory.

Why we did this

Frontier is only useful if you can actually use it.

We've always believed the gap between "what free users get" and "what the best AI looks like" is a gap worth closing. This refresh is the biggest single step we've taken toward that.

Frontier access shouldn't gate-keep building

We built LeemerChat so builders, students, and researchers in Ireland and beyond could work with the best models — not just the ones they can afford. Moving MiMo V2 Pro, GLM-5V Turbo, and Gemma 4 31B IT to the free tier is the most direct way we can act on that.

Partner economics changed what's possible

Our partner model program means we can absorb inference costs on select models without passing them to free users. These three models have the capability and cost profile that makes that math work — which is why they're going free instead of staying behind a paywall.

Agentic work needs large contexts to be real

Short context windows force users into artificial workarounds. Giving free users access to a 1M-token agent model (MiMo V2 Pro) and 200K+ multimodal options means they can run real workflows — not toy demos.

Free Partner Models

Xiaomi MiMo-V2-Pro

Free Partner
xiaomi/mimo-v2-pro
1M context

Xiaomi's global top-tier agent model. MiMo-V2-Pro is the result of Xiaomi's full-stack AI research — trained on the principle that intelligence is about prediction and compression. With a 1M-token window, it is purpose-built for long-horizon agent loops: repository-spanning coding sessions, multi-document planning, and autonomous workflows that sustain context across hundreds of tool calls.

Why it matters for free users

Free users now have access to the same million-token window that was previously exclusive to premium tiers. This is the biggest context jump in LeemerChat's free catalog history.

1M-token context for true long-horizon work
Optimized for autonomous agent loops
Top-tier coding and planning on par with flagship models
Multimodal omni variant available (MiMo-V2-Omni)
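The long-horizon agent loops MiMo V2 Pro targets can be sketched in a few lines. The snippet below is a minimal illustration, not LeemerChat's implementation: `call_model` is a stubbed stand-in for whatever chat-completions client you would actually use, and the tool names are hypothetical.

```python
import json

# Hypothetical tool registry -- in a real workflow these would read files,
# run test suites, call APIs, etc.
TOOLS = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_tests": lambda: "2 passed, 0 failed",
}

def call_model(messages):
    """Stand-in for a chat-completions call to a model like
    xiaomi/mimo-v2-pro. Stubbed here: request a file once, then finish."""
    saw_tool_result = any(m["role"] == "tool" for m in messages)
    if not saw_tool_result:
        return {"tool": "read_file", "args": {"path": "app.py"}}
    return {"answer": "Patched the bug in app.py."}

def agent_loop(task, max_steps=10):
    """Plan -> act -> observe until the model returns a final answer."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(messages)
        if "answer" in reply:            # model signals it is done
            return reply["answer"]
        tool = TOOLS[reply["tool"]]      # execute the requested tool
        result = tool(**reply.get("args", {}))
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "step budget exhausted"
```

Note that `messages` grows on every step; that accumulation is exactly why a loop sustaining hundreds of tool calls needs a window measured in hundreds of thousands of tokens.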

Z.AI GLM-5V-Turbo

Free Partner
z-ai/glm-5v-turbo
202K context

Z.AI's first native multimodal agent foundation model. GLM-5V-Turbo fuses vision and language at the architecture level — not as an adapter bolt-on. It handles image, video, and text inputs in a single model, making it the right tool when your workflow starts from a screenshot, a UI bug, a diagram, or a scanned document rather than a clean text prompt.

Why it matters for free users

Most multimodal models treat vision as secondary. GLM-5V-Turbo is designed around the perceive → plan → execute loop, which means it doesn't just describe images — it acts on them.

Native multimodal: image, video, and text in one model
202K context for long-form visual workflows
Built for vision-grounded coding and debugging
Seamless agent integration with full tool loop support

Google Gemma 4 31B IT

Free Partner
google/gemma-4-31b-it
256K context

Google DeepMind's dense 31B flagship in the Gemma 4 family. Unlike the MoE variants in the same family, the 31B Dense model runs all 30.7B parameters on every token — providing deep, coherent reasoning for complex problems. With a 256K context window, it handles full codebases, long research documents, and multilingual work across 140+ languages.

Why it matters for free users

Gemma 4 31B scores 89.2% on AIME 2026 (no tools), 84.3% on GPQA Diamond, and 80% on LiveCodeBench v6 — placing it firmly at the frontier for open-weight models. This is the pragmatic benchmark-grounded choice for structured coding and reasoning work.

89.2% AIME 2026 (no tools) — top open-weight math score
84.3% GPQA Diamond — deep scientific reasoning
Native function calling and agentic workflow support
Built-in thinking mode for step-by-step reasoning
256K context with variable image resolution support
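Native function calling generally means the model emits a structured call against a declared schema, which your code then dispatches. The sketch below uses a generic OpenAI-style tool declaration, not Gemma's official format, and `get_weather` is a made-up tool for illustration.

```python
import json

# Hypothetical tool, declared as a JSON-schema style definition.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def dispatch(tool_call, handlers):
    """Route a structured tool call emitted by the model to local code."""
    fn = handlers[tool_call["name"]]
    return fn(**json.loads(tool_call["arguments"]))

handlers = {"get_weather": lambda city: f"14°C and raining in {city}"}

# The model would emit something shaped like this:
call = {"name": "get_weather", "arguments": '{"city": "Dublin"}'}
dispatch(call, handlers)  # "14°C and raining in Dublin"
```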

Premium Frontier Models

OpenAI GPT-5.4

Premium
openai/gpt-5.4
1M context

The new flagship premium OpenAI slot in LeemerChat. OpenAI positions GPT-5.4 as their primary recommendation for complex reasoning, advanced coding, and multi-step problem solving. The 1M-token context window fundamentally changes what counts as a single-session problem — full codebases, extended research arcs, and enterprise workflows all fit inside one context.

Why we made this the recommendation

GPT-5.4 replaces the older GPT-5 chat variants that fragmented the OpenAI experience. One model, one recommendation, clear upgrade path.

1M-token context — full codebase in a single session
OpenAI flagship for complex reasoning and coding
Multi-step agentic planning and execution
Strongest general-purpose premium pick

Z.AI GLM-5

Premium
z-ai/glm-5
80K context

Z.AI's frontier open-source language model, built for complex engineering and long-horizon agentic systems. GLM-5 scores 50.4% on Humanity's Last Exam with tools — outperforming GPT-5.4 (45.5%) and Claude Opus 4.5 (43.4%) on that benchmark. It brings serious RL-infrastructure improvements over GLM-4.7, with major gains in coding reliability and synthesis-heavy work.

Why we made this the recommendation

GLM-5 directly replaces the GLM-4.7 line. For users doing complex engineering, agent planning, or multi-model synthesis work, this is a material capability upgrade — not just a version bump.

50.4% Humanity's Last Exam (with tools) — beats GPT and Claude on this benchmark
Frontier open-source scale with commercial viability
Strong coding reliability improvements over GLM-4.7
Built for agent planning and complex synthesis work

MiniMax M2.7

Premium
minimax/minimax-m2.7
200K context

MiniMax's next-generation productivity model, designed for autonomous real-world workflows and continuous self-improvement through multi-agent collaboration. M2.7 scores 56.2% on SWE-Pro, 57.0% on Terminal Bench 2, and achieves 1495 ELO on GDPval-AA — setting a new standard for multi-agent systems. It consolidates the M2.1/M2.5 split into one clear recommendation.

Why we made this the recommendation

Where M2.5 was strong on office-style document work, M2.7 extends that into live debugging, root cause analysis, and financial modeling workflows. The single-model consolidation removes the M2.1 vs. M2.5 confusion entirely.

56.2% SWE-Pro — strong production engineering performance
1495 ELO on GDPval-AA for multi-agent systems
57.0% Terminal Bench 2 — genuine CLI/ops-grade execution
Live debugging, financial modeling, full document generation
200K context across the full workflow
Benchmark Snapshot

These aren't marketing claims. The numbers hold up.

Selected benchmarks for the new free and premium models. Where numbers aren't available from official model cards, we leave the cell blank rather than interpolate.

Benchmark               Category      Gemma 4 31B   GLM-5    MiniMax M2.7
AIME 2026 (no tools)¹   Math          89.2%         92.7%    —
GPQA Diamond            Science       84.3%         —        —
LiveCodeBench v6        Coding        80.0%         —        —
SWE-Pro                 Engineering   —             —        56.2%
HLE (with tools)²       Research      26.5%         50.4%    —
Terminal Bench 2        Ops/CLI       —             —        57.0%
MMLU Pro                Knowledge     85.2%         —        —

¹ Gemma 4 31B and GLM-5 both clear SOTA thresholds.
² GLM-5 beats GPT-5.4 (45.5%) and Claude (43.4%) here.

Benchmarks sourced from official model cards. "—" indicates score not available from public model card at time of writing. Results marked with notes indicate peer-model comparisons from the same source.

What we cleaned up

Fewer tiers. Less selector noise. Clearer picks.

Adding new models without removing old ones creates confusion. We retired overlapping entries and collapsed model families to ensure every slot in the catalog has a clear reason to exist.

Retired the GPT-5 chat variants

We removed the older GPT-5 chat-focused variants from active surfaces. GPT-5.4 is a strict upgrade — better reasoning, 1M context, and OpenAI's own recommendation for complex coding and planning work. One model, clearer pick.

GLM-4.7 → GLM-5 and GLM-5V Turbo

The GLM-4.7 line is retired. GLM-5 (premium) and GLM-5V Turbo (free) replace it entirely. Z.AI's benchmark improvements are real: GLM-5 clears 50% on Humanity's Last Exam with tools, beating Claude Opus 4.5 on that benchmark. The multimodal Turbo variant goes free.

MiniMax M2.1 + M2.5 → M2.7

We had two MiniMax slots creating unnecessary choice confusion. M2.7 consolidates them. It scores 56.2% on SWE-Pro, 57.0% on Terminal Bench 2, and 1495 ELO on GDPval-AA. One premium MiniMax recommendation, not two.

What frontier means now

Frontier is about operating range, not benchmark bragging.

Agentic work is the organizing principle

These models are here because they're good at execution loops: repo reading, long planning arcs, multimodal debugging, and document-grounded reasoning. That's what real users actually run.

Frontier no longer means premium-only

The biggest change is economic, not cosmetic. Free users now get serious frontier-grade options — not fallback models that only make sense for lightweight chat.

Context size is infrastructure

202K, 256K, and 1M contexts aren't features — they're the difference between fitting a real project in one session or not. Every model in this refresh qualifies on context range.
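What these windows buy can be made concrete with a rough token budget. The 4-characters-per-token heuristic below is a common rule of thumb, not an exact tokenizer; the window sizes are the ones listed above, and the 20K-token reply reserve is an illustrative assumption.

```python
# Rough rule of thumb: ~4 characters per token for English text and code.
CHARS_PER_TOKEN = 4

WINDOWS = {
    "z-ai/glm-5v-turbo": 202_000,
    "google/gemma-4-31b-it": 256_000,
    "xiaomi/mimo-v2-pro": 1_000_000,
}

def estimated_tokens(total_chars):
    return total_chars // CHARS_PER_TOKEN

def models_that_fit(total_chars, reserve=20_000):
    """Which models can hold the input plus a token reserve for the reply?"""
    need = estimated_tokens(total_chars) + reserve
    return [m for m, w in WINDOWS.items() if w >= need]

# A ~2 MB repository (~500K estimated tokens) only fits the 1M window:
models_that_fit(2_000_000)  # ["xiaomi/mimo-v2-pro"]
```

By this estimate, a project under roughly 700 KB of text fits every model in the refresh, while repository-scale inputs are exactly where the 1M window stops being a spec-sheet number and starts being infrastructure.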

Launch Offer

Claim your first month for $1

We start from a base price of $1 USD and localize the displayed price with a live exchange-rate conversion. If you are in Ireland, you will see euros; if you are in India, you will see rupees.


Exchange rate data via ExchangeRate-API. Offer display adapts at runtime based on detected location.
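The display logic amounts to a rate lookup with a USD fallback. The sketch below assumes the rates map has already been fetched at runtime (ExchangeRate-API style feeds return a JSON map of USD-based rates); the rates and currency symbols shown are illustrative, not live.

```python
def localize_price(base_usd, currency, rates):
    """Convert a USD base price for display, falling back to plain USD
    when no rate is available for the detected region's currency."""
    rate = rates.get(currency)
    if rate is None:
        return f"${base_usd:.2f} USD"
    symbol = {"EUR": "€", "INR": "₹"}.get(currency, currency + " ")
    return f"{symbol}{base_usd * rate:.2f}"

# Illustrative rates -- at runtime these would come from a live
# USD-based exchange-rate feed.
rates = {"EUR": 0.92, "INR": 83.10}

localize_price(1.00, "EUR", rates)  # "€0.92"
localize_price(1.00, "XYZ", rates)  # falls back to "$1.00 USD"
```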

Source notes

This post is based on provider docs, official model cards, and public model pages. Benchmark figures are sourced directly from the relevant model card or provider documentation. Where a provider makes high-level positioning claims without a public benchmark sheet, we phrase those cautiously or omit them. Cells marked "—" indicate data not publicly available at time of writing.

Related Posts

April 9, 2026

GLM-5.1 Is Now Free on LeemerChat — The Model That Works for 8+ Hours Without Stopping

Z.AI's GLM-5.1 is the first model built for long-horizon autonomous coding — running independently for 8+ hours, planning, executing, and self-improving without human input. It beats or matches GPT-5.4 and Claude Opus 4.6 on several benchmarks. Here's why we made it free, how it compares, and why Pro still matters.

February 11, 2026

GLM-5 Is Live: Frontier Open-Source Scale for Complex Engineering and Agentic Work

GLM-5 launches on LeemerChat with major upgrades in scale, training data, and RL infrastructure. Built for long-horizon agentic systems, coding reliability, and complex reasoning under production constraints.

March 2, 2026

Get Ready for Mission Control: The Next Evolution of Agentic Execution

Mission Control is our next-generation agentic research and execution platform. It represents a fundamental shift in how we interact with AI—moving away from rigid pipelines and chat interfaces, and stepping into the era of autonomous, goal-oriented swarms.

February 22, 2026

The Foundry Report: Why Fine-Tuned Models Are Still the Sharpest Weapon in Enterprise AI

Tinker is now generally available. Vision input, Kimi K2 Thinking, and LoRA Without Regret are reshaping what custom model training looks like in 2026. Here's why fine-tuning is more strategically important than ever — and how LeemerLabs Model Foundry is building the infrastructure to prove it.
