Foundry · 2026 · Tinker GA · LeemerGLM · Agentic AI

Your data.
Your model. Our GPUs.

The LeemerLabs Model Foundry is a custom-LLM studio running on Tinker distributed training. We fine-tune open-weight frontier models on your data, deploy them in your infrastructure, and hand you the weights. No rental AI.

Start your model · View supported models

Ireland's Tinker partner · Exportable weights · GDPR-ready

Open-weight bases we fine-tune, and the inference platforms we serve through

Qwen

Meta / LLaMA

Gemma

DeepSeek

Groq

OpenRouter

Who this is for

Teams who want to own
their intelligence layer.

The Foundry is built for organisations that have outgrown shared APIs. If your product, data, or compliance profile demands a private model, you're in the right place.

Startups

Build your AI moat early. Custom models give you a defensible advantage over competitors running generic APIs.

Agencies

Offer AI services to your clients with white-label models. Become the AI partner they actually need.

Enterprise

Deploy private intelligence layers with full compliance, security, and data-sovereignty controls.

The timing

Why now.
The window has opened.

The AI landscape has shifted. Custom models are no longer a luxury — they are a strategic necessity.

01 · The cost of training large models drops ~10× every 18 months.
02 · Open-source frontier models now rival proprietary alternatives.
03 · Enterprises demand private, compliant AI, not shared APIs.
04 · Your custom model is the competitive edge in the AI era.

Training economics

10×

Cost reduction in training frontier models every 18 months. What cost millions yesterday costs thousands tomorrow.

2023

€10M+

2026

€10k+
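
As a rough worked example of the compounding claim above, the sketch below projects a training budget under a steady "10× every 18 months" rate. The function name and the €10M starting figure are illustrative assumptions, not LeemerLabs pricing data.

```python
def projected_cost(initial_cost: float, months: float,
                   factor: float = 10.0, period_months: float = 18.0) -> float:
    """Cost after `months`, assuming it falls by `factor` every `period_months`."""
    return initial_cost / (factor ** (months / period_months))


# Illustrative starting point: a EUR 10M training run.
for months in (0, 18, 36):
    cost = projected_cost(10_000_000, months)
    print(f"{months:>2} months: EUR {cost:,.0f}")
```

At that steady rate, 36 months gives a 100× reduction. The €10M+ to €10k+ comparison above implies a steeper 1,000× drop, so both are best read as order-of-magnitude illustrations.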

The four-week pipeline

From raw data
to production intelligence.

Every engagement follows the same disciplined rhythm. Four phases, four deliverables, one model you own outright.

01 · Week 1

Data Forge

Create, clean, or synthesize datasets. Domain distillation from frontier models.

02 · Week 2

Model Crafting

Fine-tune models of up to 235B parameters on Tinker's distributed training infrastructure.

03 · Week 3

Evaluation

Comprehensive benchmarks, adversarial safety tests, and real-world validation.

04 · Week 4

Deployment

Private APIs, SDKs, white-label apps, and full integration support.

Services

Everything you'd expect
from a real AI lab.

Fine-tuning is the beginning. Data, deployment, retrieval, orchestration, and evaluation are where most projects quietly fail — we handle them first-class.

Custom model creation

Fine-tuning on Qwen3, LLaMA 3.x, DeepSeek V3.1, Kimi K2.5. LoRA adapters, multi-turn training, instruction tuning, and vision capabilities.
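
To give a feel for why LoRA adapters are cheap to train, here is a back-of-the-envelope sketch. A rank-r adapter on a weight matrix of shape d_out × d_in adds two small matrices with r × (d_in + d_out) trainable parameters in total. The 4096 × 4096 layer and rank 16 are assumed example values, not a description of any specific model.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in a LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix the adapter sits beside."""
    return d_in * d_out


# Assumed example layer: a 4096 x 4096 projection with a rank-16 adapter.
lora = lora_param_count(4096, 4096, 16)
full = full_param_count(4096, 4096)
print(f"LoRA: {lora:,} params, full: {full:,} params, ratio: {lora / full:.2%}")
```

For this example layer the adapter trains under 1% of the parameters that a full fine-tune of the same matrix would update.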

Data services

Dataset creation (manual + synthetic), cleaning, domain distillation, RL trajectory datasets, and labelling pipelines.

Deployment & hosting

Private API endpoints, downloadable weights, SDKs, hosted inference, LoRA merging, rate limiting, logging, and analytics.

RAG & agentic fleets

Vector DB setup, Planner → Worker → Judge orchestration, document ingestion, custom retrievers, and tool-use training.
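
The Planner → Worker → Judge pattern can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the LeemerLabs implementation: `run_fleet` and the three stand-in functions are hypothetical names, and in a real deployment each role would be a model call.

```python
from typing import Callable, List

Planner = Callable[[str], List[str]]   # task -> ordered steps
Worker = Callable[[str], str]          # step -> candidate answer
Judge = Callable[[str, str], bool]     # (step, answer) -> accept?


def run_fleet(task: str, planner: Planner, worker: Worker,
              judge: Judge, max_retries: int = 1) -> List[str]:
    """Plan the task, run a worker per step, and retry steps the judge rejects."""
    results = []
    for step in planner(task):
        answer = worker(step)
        attempts = 0
        # Retry until the judge accepts or retries run out; the last answer is kept.
        while not judge(step, answer) and attempts < max_retries:
            answer = worker(step)
            attempts += 1
        results.append(answer)
    return results


# Stand-in components (hypothetical; real ones would call models).
def split_planner(task: str) -> List[str]:
    return [part.strip() for part in task.split(" and ")]


def echo_worker(step: str) -> str:
    return step.upper()


def non_empty_judge(step: str, answer: str) -> bool:
    return bool(answer)


steps = run_fleet("summarise the contract and list the risks",
                  split_planner, echo_worker, non_empty_judge)
print(steps)
```

Keeping the three roles as plain callables means a small, fast model can serve as the worker while a larger one acts as the judge, without changing the loop.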

White-label apps

White-label LeemerChat, research agents, internal team chat, and Slack / Teams / WhatsApp integrations.

Model evaluation

MMLU, GSM8K, HumanEval, TruthfulQA, safety tests, hallucination analysis, and full benchmark reports on your data.
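
Many benchmark reports reduce to a simple scoring loop over held-out examples. The sketch below shows exact-match accuracy, one common metric for short-answer tasks. The function names and the sample data are illustrative assumptions, and real suites add task-specific answer extraction.

```python
from typing import Sequence


def normalise(text: str) -> str:
    """Lower-case and collapse whitespace so trivial formatting differences don't count."""
    return " ".join(text.strip().lower().split())


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that equal their reference after normalisation."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(normalise(p) == normalise(r) for p, r in zip(predictions, references))
    return hits / len(references)


# Made-up sample: two of three answers match after normalisation.
score = exact_match_accuracy(["42", " Dublin ", "7"], ["42", "dublin", "8"])
print(f"Exact match: {score:.1%}")
```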

Supported model families

Frontier open weights,
tuned for your domain.

We work across dense, MoE, and vision architectures. Bring a model family — we bring the distributed infrastructure and the craft.

In-house

LeemerGLM-106B-A22B

96k context · Vision · MoE

Dense · MoE · Vision

Qwen3 & Qwen3-VL

2.5B → 235B MoE

Dense

LLaMA 3.1 / 3.2 / 3.3

1B → 70B

Moonshot

Kimi K2 & K2.5

Thinking & Base

MoE

DeepSeek V3.1

Base & Instruct

Hosted

Groq & OpenRouter

Sub-second inference

New release · LeemerGLM-106B-A22B

LeemerGLM ships
inside the Foundry.

Our flagship mixture-of-experts model is now a first-class base for custom deployments. 96k context, vision-aware reasoning, and production-grade guardrails — without starting from scratch.

106B total · 22B active · 96k context + vision · ≈ 250 tok/sec

Explore LeemerGLM · Build with this release

MoE performance, tuned for chat

106B total parameters with 22B active per request for fast, coherent responses.

Long-context + multimodal

96k tokens with native vision awareness for documents, diagrams, and screenshots.

Production safety & evals

RL-tuned behaviours, guardrails, and regression evals from the LeemerLabs pipeline.

Real-time throughput

≈ 250 tokens/sec in production traces so you can ship responsive workflows.
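
The headline figures above translate into two quick calculations: the share of parameters active per request, and the time to stream a reply at the quoted throughput. This is a rough sketch; the 500-token reply length is an assumed example, and the estimate ignores prompt processing and network latency.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of a mixture-of-experts model's parameters used per request."""
    return active_params_b / total_params_b


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to generate a reply at a steady decode rate."""
    return output_tokens / tokens_per_second


print(f"Active share: {active_fraction(106, 22):.1%}")
print(f"500-token reply: {generation_seconds(500, 250):.1f} s")
```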

Pricing

Three ways in,
plus a founder-direct door.

From first fine-tune to sovereign deployment. Transparent price ranges — every engagement is scoped, documented, and owned by you at the end.

Starter Fine-Tune

€1,200 – €3,000

For small businesses shipping their first custom model.

Small dataset (<20k samples)
7B – 8B model
LoRA fine-tune
Hosted API
Basic eval report
30-day support

Scope this tier

Most chosen

Business Model

€5,000 – €12,000

For startups and agencies standing up a real product.

7B – 32B models
Dataset creation
Multi-turn training
RAG pipeline
API + SDK
Optional white-label chat
3 months support

Scope this tier

Enterprise Intelligence

€15,000 – €50,000

For government and enterprise with compliance needs.

32B – 235B models
Domain datasets + RL datasets
Full eval suite
Safety tuning
Hosted inference + rate limits
Dedicated Slack
White-label end-user app
6 months support

Scope this tier

Founder Partnership

€25,000 – €75,000 per 6–12 months

Work directly with Repath 'Ray' Khan — founder of LeemerChat, Warren.wiki, HeyCouncil, and a dozen other AI systems.

Quarterly strategy sessions (90 min, founder only)
Direct involvement in your AI system design
Oversight by Ray across data → training → infra → deployment
Access to the LeemerLabs internal research pipeline
Custom model recommendations + architecture planning
Hands-on refinement of prompts, datasets, workflows
Executive briefing documents + white papers
Brand + product strategy guidance
Optional on-site days (EU / UK)

Book a founder session

A Day With Ray

€299

One day. One founder. One deep dive into your AI problem — for founders, builders, and operators.

60-minute strategy call
Review of your product, idea, or data
Action plan for your AI system
Suggested model architecture
Dataset roadmap
Market & positioning guidance
Follow-up summary + next steps

Book a day

Sovereign & €5M+ engagements · scope a custom pod →

Why LeemerLabs

Ten reasons you'll
pick us over an agency.

We are an AI lab, not a reseller. Below is what that actually means in practice.

Not an agency — a full AI lab

Most 'AI agencies' wrap OpenAI and call it a day. We build models, agents, pipelines, infrastructure, and entire platforms. LeemerLabs is the research arm behind LeemerChat, Warren.wiki, ExamMate, HeyCouncil, and DeepThis.

In the AI game since 2023

We were training, distilling, and orchestrating models before GPT-4o and before the hype cycle. We lived through LLaMA-1 to LLaMA-3 and watched the open-weights revolution unfold in real time.

We build models — not glue APIs

We've fine-tuned Qwen, LLaMA, Gemma, Mistral, and Mixtral. We've built bilingual Bengali/English models, distilled production models, and crafted custom Orchestrator → Worker chains inside LeemerChat.

1B+ tokens processed in production

LeemerChat alone has processed over a billion tokens for real users. Real reasoning, real edge cases, real scale — this is battle-tested experience, not a demo.

Ireland's official Tinker partner

We're partnered with Thinking Machines, the lab founded by ex-OpenAI leadership, and use its Tinker training platform for distributed fine-tuning from 7B to 235B with fault tolerance, multi-node reliability, and RL support.

Multi-model orchestration is native

We built Leemer Heavy, Heavy Fast, and Leemer Research on Qwen, Groq LPU models, GPT-4.1/4o, Claude, Kimi, LLaMA, and DeepSeek. Small, large, and domain models collaborating — fast, accurate, cheap.

Full-stack deployment, not just training

Private APIs, white-label chat, internal agents, custom embeddings, RAG pipelines, Slack / Teams / WhatsApp bots, on-prem deployment, monitoring, rate-limits, logging, analytics — the whole stack.

Builders, not consultants

Everything we sell, we use. LeemerChat, Warren.wiki, HeyCouncil, ExamMate — real platforms running on the same systems we deliver to clients.

Open models, real ownership

We back open weights and local hosting. At the end of every engagement, you own the model, the weights, and the intelligence layer. No rental, no lock-in.

Waterford-built, globally scaled

We build world-class AI in Ireland. No Silicon Valley ego, no bloated teams, no fluff — just pure engineering, research, and delivery.

GDPR ready · On-prem deployment · ISO-friendly architecture · Data sovereignty · Private inference · Exportable weights

Frequently asked

Questions we answer
on every sales call.

Start here

Ready to forge?
Let's talk.

Book a 30-minute discovery call with our AI architects to walk through your use case, data readiness, and model options.

01 · Data assessment & feasibility
02 · Model selection (7B · 32B · 106B MoE)
03 · Architecture & RAG pipeline review
04 · Timeline & ROI projection

Visit leemerlabs.com dev@leemerchat.com

Questions we answer on every sales call.

How long does training take?

Can I provide my own data?

Do you offer hosting?

Is the model mine after training?

Can you white-label the interface?

LoRA vs full fine-tuning?

Do you support model distillation?

Do you offer maintenance and retraining?

Can you train multilingual models?

How secure is the process?

Can you handle extremely long context (128k–500k)?

Can you fine-tune OpenAI or Mistral models?

How do payments work?

Do you sign NDAs, DPAs, and MSAs?
