Heavy successor

LeemerH2 turns Heavy into a model council.

Leemer Heavy proved that union models work. LeemerH2 makes the team the product: a fixed council of frontier and flash models planning, researching, executing, arguing, verifying, and then streaming one final answer.

LH2 council trace (131072 max output tokens)

01 Mission plan
02 Read-only tools
03 Council wave
04 Arbitration
05 Critic pass
06 128K synthesis

10 first-wave agents, 3 arbiters, 8 tool tasks
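
One way to picture the trace is as an ordered pipeline in which each stage reads the shared state and adds to it. The sketch below is illustrative only: the stage names come from the trace, but `run_trace`, `make_stub`, and the state dictionary are hypothetical and are not LeemerH2's actual implementation.

```python
from typing import Callable, Dict, List, Tuple

# Stage names are taken from the council trace; everything else is assumed.
STAGES: List[str] = [
    "mission_plan",
    "read_only_tools",
    "council_wave",
    "arbitration",
    "critic_pass",
    "synthesis",
]

State = Dict[str, object]
Handler = Callable[[State], State]


def run_trace(handlers: Dict[str, Handler], request: str) -> Tuple[State, List[str]]:
    """Run every stage in order, passing the accumulated state forward."""
    state: State = {"request": request}
    completed: List[str] = []
    for name in STAGES:
        handler = handlers.get(name)
        if handler is None:
            raise KeyError(f"no handler registered for stage {name!r}")
        state = handler(state)
        completed.append(name)
    return state, completed


def make_stub(name: str) -> Handler:
    """Hypothetical placeholder that only records that the stage ran."""
    def handler(state: State) -> State:
        return {**state, name: "done"}
    return handler


stubs = {name: make_stub(name) for name in STAGES}
final_state, order = run_trace(stubs, "review this repo")
print(order)
```

Keeping the stage order in one list means a missing handler fails loudly instead of silently skipping the critic pass or arbitration.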

Heavy v1 proved the idea. LH2 changes the default.

The original Leemer Heavy release introduced a useful shift: stop asking one model to be researcher, engineer, critic, and writer in the same pass. Heavy used a central orchestrator that could delegate into research, reasoning, refinement, and synthesis.

That made Heavy feel closer to GPT-4.1 or Claude 3.5 Sonnet on practical engineering work. LH2 is a bigger jump: it behaves like a small engineering review team and lands in the operating band we expect from GPT-5.3 Codex and Claude Sonnet 4.6-class workflows.

The council is not optional.

Mission director

x-ai/grok-4.3

Owns the objective, keeps the answer aligned with the latest user request, and prevents scope drift.

Strategy architect

moonshotai/kimi-k2.6

Designs the response structure, implementation path, and product-level tradeoffs.

Long-context scout

qwen/qwen3.5-flash-02-23

Recovers details from long conversations, files, tool outputs, and evidence blocks.

Research validator

google/gemini-3.1-flash-lite-preview

Checks freshness, source quality, and whether external context supports the claim.

Tool executor

inclusionai/ling-2.6-flash

Looks for useful tool, API, GitHub, uploaded-file, and code-execution opportunities.

Red-team critic

deepseek/deepseek-v4-flash

Attacks weak assumptions, hidden coupling, unsafe certainty, and final-answer gaps.
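
The roster above can be written down as a fixed, read-only table. This is a minimal sketch, assuming a simple role-to-model lookup; the `CouncilSeat` type, the snake_case role keys, and `model_for` are hypothetical names chosen for illustration, while the model identifiers are the ones listed in this post.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CouncilSeat:
    role: str   # hypothetical key for the seat
    model: str  # model identifier as listed in the post
    duty: str   # short summary of the seat's responsibility


# The six seats named in the post; frozen so the council stays fixed.
COUNCIL: Tuple[CouncilSeat, ...] = (
    CouncilSeat("mission_director", "x-ai/grok-4.3",
                "Own the objective and prevent scope drift"),
    CouncilSeat("strategy_architect", "moonshotai/kimi-k2.6",
                "Design response structure and tradeoffs"),
    CouncilSeat("long_context_scout", "qwen/qwen3.5-flash-02-23",
                "Recover details from long context"),
    CouncilSeat("research_validator", "google/gemini-3.1-flash-lite-preview",
                "Check freshness and source quality"),
    CouncilSeat("tool_executor", "inclusionai/ling-2.6-flash",
                "Find useful tool and code-execution opportunities"),
    CouncilSeat("red_team_critic", "deepseek/deepseek-v4-flash",
                "Attack weak assumptions and answer gaps"),
)


def model_for(role: str) -> str:
    """Return the model assigned to a role, or raise if the seat is unknown."""
    for seat in COUNCIL:
        if seat.role == role:
            return seat.model
    raise KeyError(role)


print(model_for("red_team_critic"))
```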

Arbitration is where LH2 gets sharper.

Many multi-agent systems ask several models for opinions and flatten the result into a bland summary. LH2 adds an arbitration wave after the first council. The arbiters look for consensus, meaningful disagreement, unsupported claims, and the answer structure that will survive scrutiny.

That is the difference between a swarm and a pile of drafts. Disagreement becomes a resource instead of a formatting problem.
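
To make the arbitration idea concrete, here is a deliberately simplified sketch. It assumes each first-wave draft has already been reduced to a set of short claims and that a separate set of evidence-backed claims is available; `arbitrate`, `quorum`, and the example claims are all hypothetical, and a real arbiter would be a model judging free text rather than a vote counter.

```python
from collections import Counter
from typing import Dict, List, Set


def arbitrate(
    drafts: Dict[str, Set[str]],
    supported: Set[str],
    quorum: float = 0.5,
) -> Dict[str, List[str]]:
    """Sort claims into consensus, contested, and unsupported buckets."""
    if not drafts:
        raise ValueError("need at least one draft")
    # Count how many agents made each claim.
    votes = Counter(claim for claims in drafts.values() for claim in claims)
    threshold = quorum * len(drafts)
    result: Dict[str, List[str]] = {"consensus": [], "contested": [], "unsupported": []}
    for claim in sorted(votes):
        if claim not in supported:
            result["unsupported"].append(claim)  # no evidence backs it
        elif votes[claim] > threshold:
            result["consensus"].append(claim)    # a majority of agents agree
        else:
            result["contested"].append(claim)    # backed, but a minority view
    return result


drafts = {
    "scout": {"bug is in parser", "cache is stale"},
    "critic": {"bug is in parser", "fix needs migration"},
    "validator": {"bug is in parser", "cache is stale", "upstream changed API"},
}
supported = {"bug is in parser", "cache is stale", "fix needs migration"}
print(arbitrate(drafts, supported))
```

The contested bucket is the part worth keeping: a backed minority claim is exactly the kind of disagreement that a flat summary would drop.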

Internal launch eval

These are Leemer internal launch numbers, not third-party benchmark claims. The signal is the shape: LH2 improves most where a council should improve, especially context recovery, bug localization, and catching regressions before the final answer.

Internal engineering eval          Heavy v1    LeemerH2
Repo task resolution               41.8%       68.7%
Multi-file refactor pass rate      44.6%       73.2%
Long-context bug localization      52.4%       86.1%
Regression caught before final     28.9%       64.5%
Architecture review usefulness     6.7 / 10    8.9 / 10
Mean first useful plan latency     71s         58s
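
The size of each gain follows directly from the table. The short script below only redoes that arithmetic on the published numbers; the variable and function names are illustrative.

```python
# (Heavy v1, LeemerH2) percentages copied from the internal eval table.
PERCENT_ROWS = {
    "Repo task resolution": (41.8, 68.7),
    "Multi-file refactor pass rate": (44.6, 73.2),
    "Long-context bug localization": (52.4, 86.1),
    "Regression caught before final": (28.9, 64.5),
}


def point_gains(rows):
    """Return the percentage-point gain for each row, rounded to one decimal."""
    return {name: round(new - old, 1) for name, (old, new) in rows.items()}


gains = point_gains(PERCENT_ROWS)
largest = max(gains, key=gains.get)

# Latency dropped from 71s to 58s; express that as a relative reduction.
latency_cut_pct = round((71 - 58) / 71 * 100, 1)

for name, gain in gains.items():
    print(f"{name}: +{gain} points")
print(f"Largest gain: {largest}")
print(f"Latency reduction: {latency_cut_pct}%")
```

The two largest gains are regressions caught before the final answer (+35.6 points) and long-context bug localization (+33.7 points), and the first useful plan arrives about 18.3% sooner.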

128K synthesis

LH2 requests a 131072-token final budget for long technical work when the provider honors it.

Engineering native

GitHub context, file reads, code execution, and systems review are part of the default identity.

Still compatible

The chat UI keeps the same token and tool-event stream contract.

The practical takeaway

Use lighter models when you need a quick answer. Use LeemerH2 when the task deserves a team: repo-scale engineering, architecture, research, debugging, migration planning, or any decision where a critic should attack the plan before you act.
