Your data.
Your model. Our GPUs.
The LeemerLabs Model Foundry is a custom-LLM studio running on Tinker distributed training. We fine-tune open-weight frontier models on your data, deploy them in your infrastructure, and hand you the weights. No rental AI.
Ireland's Tinker partner · Exportable weights · GDPR-ready
Open-weight bases we fine-tune
Who this is for
Teams who want to own
their intelligence layer.
The Foundry is built for organisations that have outgrown shared APIs. If your product, data, or compliance profile demands a private model — you're in the right place.
Startups
Build your AI moat early. Custom models give you a defensible advantage over competitors running generic APIs.
Agencies
Offer AI services to your clients with white-label models. Become the AI partner they actually need.
Enterprise
Deploy private intelligence layers with full compliance, security, and data-sovereignty controls.
The timing
Why now.
The window has opened.
The AI landscape has shifted. Custom models are no longer a luxury — they are a strategic necessity.
- 01 · The cost of training large models drops ~10× every 18 months.
- 02 · Open-source frontier models now rival proprietary alternatives.
- 03 · Enterprises demand private, compliant AI, not shared APIs.
- 04 · Your custom model is the competitive edge in the AI era.
Training economics
10×
Cost reduction in training frontier models every 18 months. What cost millions yesterday costs thousands tomorrow.
2023: €10M+
2026: €10k+
The four-week pipeline
From raw data
to production intelligence.
Every engagement follows the same disciplined rhythm. Four phases, four deliverables, one model you own outright.
Data Forge
Create, clean, or synthesize datasets. Domain distillation from frontier models.
Model Crafting
Fine-tune models of up to 235B parameters on Tinker's distributed training infrastructure.
Evaluation
Comprehensive benchmarks, adversarial safety tests, and real-world validation.
Deployment
Private APIs, SDKs, white-label apps, and full integration support.
Services
Everything you'd expect
from a real AI lab.
Fine-tuning is the beginning. Data, deployment, retrieval, orchestration, and evaluation are where most projects quietly fail, so we treat each of them as first-class work.
Custom model creation
Fine-tuning on Qwen3, LLaMA 3.x, DeepSeek V3.1, Kimi K2.5. LoRA adapters, multi-turn training, instruction tuning, and vision capabilities.
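For readers new to LoRA adapters, here is a minimal sketch of the underlying arithmetic in plain Python. It is illustrative only and is not Tinker's API or the Foundry's training code. The names `lora_forward`, `merge_lora`, `down`, `up`, and `alpha` are assumptions chosen for this example. The idea is that the base weight matrix stays frozen while two small low-rank matrices are trained, and those matrices can later be folded into the base weights.

```python
from typing import List

Matrix = List[List[float]]

def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a: Matrix, s: float) -> Matrix:
    """Multiply every entry of a matrix by a scalar."""
    return [[x * s for x in row] for row in a]

def lora_forward(x: Matrix, w: Matrix, down: Matrix, up: Matrix, alpha: float) -> Matrix:
    """y = x W + (alpha / r) * (x down) up, with W frozen.

    Shapes (hypothetical, row-vector convention):
    x is (1 x d_in), W is (d_in x d_out), down is (d_in x r), up is (r x d_out).
    """
    r = len(up)  # adapter rank
    return add(matmul(x, w), scale(matmul(matmul(x, down), up), alpha / r))

def merge_lora(w: Matrix, down: Matrix, up: Matrix, alpha: float) -> Matrix:
    """Fold the adapter into the base weights: W' = W + (alpha / r) * down up."""
    r = len(up)
    return add(w, scale(matmul(down, up), alpha / r))

# Toy example: 2-dimensional layer with a rank-1 adapter.
x = [[1.0, 2.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
down = [[1.0], [1.0]]
up = [[0.5, -0.5]]

y_adapter = lora_forward(x, w, down, up, alpha=2.0)
y_merged = matmul(x, merge_lora(w, down, up, alpha=2.0))
```

Both paths produce the same output, which is why a merged model can be served without any adapter logic at inference time.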
Data services
Dataset creation (manual + synthetic), cleaning, domain distillation, RL trajectory datasets, and labelling pipelines.
Deployment & hosting
Private API endpoints, downloadable weights, SDKs, hosted inference, LoRA merging, rate limiting, logging, and analytics.
RAG & agentic fleets
Vector DB setup, Planner → Worker → Judge orchestration, document ingestion, custom retrievers, and tool-use training.
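The Planner → Worker → Judge pattern can be sketched in a few lines of Python. This is a hedged illustration of the general control flow, not LeemerLabs' production orchestrator. The function names (`run_pipeline`, `toy_planner`, `toy_worker`, `toy_judge`) and the retry policy are assumptions, and the toy functions stand in for what would be model calls in a real system.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Result:
    task: str
    output: str
    accepted: bool

def run_pipeline(
    goal: str,
    planner: Callable[[str], List[str]],
    worker: Callable[[str], str],
    judge: Callable[[str, str], bool],
    max_retries: int = 1,
) -> List[Result]:
    """Plan sub-tasks, run each one, and retry any output the judge rejects."""
    results: List[Result] = []
    for task in planner(goal):
        output, accepted = "", False
        for _ in range(max_retries + 1):
            output = worker(task)
            accepted = judge(task, output)
            if accepted:
                break
        results.append(Result(task, output, accepted))
    return results

# Hypothetical stand-ins for model-backed components.
def toy_planner(goal: str) -> List[str]:
    return [part.strip() for part in goal.split(" and ")]

def toy_worker(task: str) -> str:
    return f"done: {task}"

def toy_judge(task: str, output: str) -> bool:
    return task in output

results = run_pipeline("ingest docs and answer questions", toy_planner, toy_worker, toy_judge)
```

Keeping the three roles behind plain callables makes it straightforward to swap a small model in as worker and a larger one in as judge.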
White-label apps
White-label LeemerChat, research agents, internal team chat, and Slack / Teams / WhatsApp integrations.
Model evaluation
MMLU, GSM8K, HumanEval, TruthfulQA, safety tests, hallucination analysis, and full benchmark reports on your data.
Supported model families
Frontier open weights,
tuned for your domain.
We work across dense, MoE, and vision architectures. Bring a model family — we bring the distributed infrastructure and the craft.
LeemerGLM-106B-A22B
96k context · Vision · MoE
Qwen3 & Qwen3-VL
2.5B → 235B MoE
LLaMA 3.1 / 3.2 / 3.3
1B → 70B
Kimi K2 & K2.5
Thinking & Base
DeepSeek V3.1
Base & Instruct
Groq & OpenRouter
Sub-second inference
LeemerGLM ships
inside the Foundry.
Our flagship mixture-of-experts model is now a first-class base for custom deployments. 96k context, vision-aware reasoning, and production-grade guardrails — without starting from scratch.
MoE performance, tuned for chat
106B total parameters with 22B active per request for fast, coherent responses.
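The gap between total and active parameters (22B of 106B, roughly a fifth) comes from expert routing: each token is sent to only a few experts. The sketch below shows generic top-k routing in plain Python. LeemerGLM's actual router is not described here, so the function names, the scalar toy experts, and the renormalisation step are all assumptions for illustration.

```python
import math
from typing import Callable, List, Tuple

def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over router scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits: List[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and renormalise their weights to sum to 1."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(
    x: float,
    router_logits: List[float],
    experts: List[Callable[[float], float]],
    k: int,
) -> float:
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    return sum(weight * experts[idx](x) for idx, weight in route_top_k(router_logits, k))

# Toy setup: four scalar "experts", two active per input.
experts = [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: x * 3.0, lambda x: -x]
routed = route_top_k([0.0, 2.0, 1.0, -1.0], k=2)
output = moe_forward(1.0, [0.0, 2.0, 1.0, -1.0], experts, k=2)
```

In this toy case only experts 1 and 2 are evaluated, so the compute per input scales with `k` rather than with the total number of experts.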
Long-context + multimodal
96k tokens with native vision awareness for documents, diagrams, and screenshots.
Production safety & evals
RL-tuned behaviours, guardrails, and regression evals from the LeemerLabs pipeline.
Real-time throughput
≈ 250 tokens/sec in production traces so you can ship responsive workflows.
Pricing
Three ways in,
plus a founder-direct door.
From first fine-tune to sovereign deployment. Transparent price ranges — every engagement is scoped, documented, and owned by you at the end.
Starter Fine-Tune
€1,200 – €3,000
For small businesses shipping their first custom model.
- Small dataset (<20k samples)
- 7B – 8B model
- LoRA fine-tune
- Hosted API
- Basic eval report
- 30-day support
Business Model
€5,000 – €12,000
For startups and agencies standing up a real product.
- 7B – 32B models
- Dataset creation
- Multi-turn training
- RAG pipeline
- API + SDK
- Optional white-label chat
- 3 months support
Enterprise Intelligence
€15,000 – €50,000
For government and enterprise with compliance needs.
- 32B – 235B models
- Domain datasets + RL datasets
- Full eval suite
- Safety tuning
- Hosted inference + rate limits
- Dedicated Slack
- White-label end-user app
- 6 months support
Founder Partnership
€25,000 – €75,000 per 6–12 months
Work directly with Repath 'Ray' Khan — founder of LeemerChat, Warren.wiki, HeyCouncil, and a dozen other AI systems.
- Quarterly strategy sessions (90 min, founder only)
- Direct involvement in your AI system design
- Oversight by Ray across data → training → infra → deployment
- Access to the LeemerLabs internal research pipeline
- Custom model recommendations + architecture planning
- Hands-on refinement of prompts, datasets, workflows
- Executive briefing documents + white papers
- Brand + product strategy guidance
- Optional on-site days (EU / UK)
A Day With Ray
€299
One day. One founder. One deep dive into your AI problem — for founders, builders, and operators.
- 60-minute strategy call
- Review of your product, idea, or data
- Action plan for your AI system
- Suggested model architecture
- Dataset roadmap
- Market & positioning guidance
- Follow-up summary + next steps
Sovereign & €5M+ engagements · scope a custom pod →
Why LeemerLabs
Ten reasons you'll
pick us over an agency.
We are an AI lab, not a reseller. Below is what that actually means in practice.
Not an agency — a full AI lab
Most 'AI agencies' wrap OpenAI and call it a day. We build models, agents, pipelines, infrastructure, and entire platforms. LeemerLabs is the research arm behind LeemerChat, Warren.wiki, ExamMate, HeyCouncil, and DeepThis.
In the AI game since 2023
We were training, distilling, and orchestrating models before GPT-4o and before the hype cycle. We lived through LLaMA-1 to LLaMA-3 and watched the open-weights revolution unfold in real time.
We build models — not glue APIs
We've fine-tuned Qwen, LLaMA, Gemma, Mistral, and Mixtral. We've built bilingual Bengali/English models, distilled production models, and crafted custom Orchestrator → Worker chains inside LeemerChat.
1B+ tokens processed in production
LeemerChat alone has processed over a billion tokens for real users. Real reasoning, real edge cases, real scale — this is battle-tested experience, not a demo.
Ireland's official Tinker partner
We're partnered with Thinking Machines, the lab behind the Tinker training platform and founded by ex-OpenAI leadership, for distributed fine-tuning from 7B to 235B with fault tolerance, multi-node reliability, and RL support.
Multi-model orchestration is native
We built Leemer Heavy, Heavy Fast, and Leemer Research on Qwen, Groq LPU models, GPT-4.1/4o, Claude, Kimi, LLaMA, and DeepSeek. Small, large, and domain models collaborating — fast, accurate, cheap.
Full-stack deployment, not just training
Private APIs, white-label chat, internal agents, custom embeddings, RAG pipelines, Slack / Teams / WhatsApp bots, on-prem deployment, monitoring, rate-limits, logging, analytics — the whole stack.
Builders, not consultants
Everything we sell, we use. LeemerChat, Warren.wiki, HeyCouncil, ExamMate — real platforms running on the same systems we deliver to clients.
Open models, real ownership
We back open weights and local hosting. At the end of every engagement, you own the model, the weights, and the intelligence layer. No rental, no lock-in.
Waterford-built, globally scaled
We build world-class AI in Ireland. No Silicon Valley ego, no bloated teams, no fluff — just pure engineering, research, and delivery.
Frequently asked
Questions we answer
on every sales call.
Start here
Ready to forge?
Let's talk.
Book a 30-minute discovery call with our AI architects to walk through your use case, data readiness, and model options.
- 01 · Data assessment & feasibility
- 02 · Model selection (7B · 32B · 106B MoE)
- 03 · Architecture & RAG pipeline review
- 04 · Timeline & ROI projection