Meet RIN (凛) — our free, unlimited reasoning model. Built on Mistral AI's Ministral 3B foundation and fine-tuned for precision. 350-400 tokens per second, 256K context window. The precision instrument for builders who value speed over hand-holding.
RIN (凛) in Japanese means dignified, cold, and severe. It commands respect without seeking approval. One syllable, impossible to mispronounce, easy to remember. Like “Grok” or “Claude” — it just sticks.
This isn’t a friendly chatbot name. It’s the psychology of the sharp knife in a chef’s hand, the tuned sports car engine, the perfectly balanced mechanical keyboard. Tools that respect the craftsman.
Built on Mistral AI's foundation. We didn't build Ministral 3B from scratch—Mistral AI did. What we did was take their exceptional 3B parameter model and fine-tune it for our users. Think of it as taking a high-performance engine and tuning it for your specific track.
| Metric | Value | Comparison |
|---|---|---|
| Inference Speed | 350-400 tok/s | 3.5x faster than GPT-4 |
| Context Window | 256K tokens | 2x larger than GPT-4 Turbo |
| Model Size | 3B parameters | 50x smaller, same quality |
| Latency (TTFT) | <100ms | Near-instant responses |
RIN (Ministral 3B 2512) punches above its weight class, competing with models 10x its size across reasoning, comprehension, and multilingual tasks.
Large-scale reading comprehension dataset with 650K+ question-answer-evidence triples
Human-centric benchmark evaluating foundation models on standardized exams (GaoKao, SAT, LSAT, etc.)
12,500 challenging competition mathematics problems requiring step-by-step reasoning
Comprehensive multilingual benchmark covering 29 diverse languages with 11,829 questions
Built on Mistral AI's Ministral 3B (2512) foundation. We've fine-tuned it for precision, speed, and reasoning—delivering flagship-class performance in a compact 3B parameter model.
Blazing fast inference. No waiting. No buffering. No thinking delays. RIN responds at the speed of thought, keeping you in flow state instead of watching a spinner.
Rate-limited per hour, but unlimited overall. No credits, no quotas, no upgrade prompts. RIN is our gift to builders who need a sharp, fast tool without the billing anxiety.
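Per-hour throttling means well-behaved clients should back off when the limit is hit. Here's a minimal retry sketch with exponential backoff and jitter; the `RateLimitError` type and the retry parameters are hypothetical, since this post doesn't specify LeemerChat's actual error semantics:

```python
import random
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for whatever error the API raises on throttling."""

def with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on RateLimitError, wait exponentially longer (with jitter)
    and retry, up to max_retries attempts."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # 1s, 2s, 4s, ... plus up to 1s of jitter to avoid thundering herds
            sleep(base_delay * 2 ** attempt + random.random())
```

Injecting `sleep` keeps the helper testable; in production the default `time.sleep` applies the real delay.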
RIN isn't trying to be your friend. It's a precision instrument. No warm greetings, no apologies, no disclaimers. Just fast, accurate responses.
Like a chef's blade or a tuned engine, RIN is about surgical precision. It cuts through complexity without the bloat of models 10x its size.
Every delay is a micro-frustration. At 350-400 tok/s, RIN keeps up with your thoughts. You're driving, not waiting.
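To make the throughput claim concrete, here's a back-of-the-envelope estimate (illustrative arithmetic only, not a published latency model):

```python
def generation_time(tokens: int, tok_per_s: float, ttft_s: float = 0.1) -> float:
    """Estimated wall-clock seconds to stream `tokens` of output:
    time-to-first-token plus decode time at a steady rate."""
    return ttft_s + tokens / tok_per_s

# A 500-token answer at the quoted 350-400 tok/s and <100ms TTFT:
slow = generation_time(500, 350)  # ~1.53 s
fast = generation_time(500, 400)  # ~1.35 s
```

At those rates a full-length answer streams in well under two seconds, which is the practical meaning of "keeping up with your thoughts."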
Every AI has a personality. RIN occupies the “performance tool” quadrant — not a chatbot, a system.
| Model | Psychology | Feeling |
|---|---|---|
| ChatGPT | The helpful assistant | Safe, friendly |
| Claude | The thoughtful colleague | Warm, ethical |
| Gemini | The Google ecosystem | Integrated, familiar |
| RIN | The precision instrument | Sharp, fast, uncompromising |
Use cases:

- **Code review:** Paste a function, get instant feedback. RIN's speed means you can iterate on code without breaking your flow.
- **Brainstorming:** Rapid-fire idea generation. Ask, get, refine, repeat. No waiting means more iterations per minute.
- **Learning:** Ask questions freely without worrying about credits. Perfect for students, hobbyists, and curious minds.
- **Writing:** First drafts, outlines, summaries. RIN's thinking capability produces structured, reasoned output fast.
| Spec | Value |
|---|---|
| Base Model | mistralai/ministral-3b-2512 |
| Architecture | Dense fine-tune (not MoE) |
| Context Window | 256,000 tokens |
| Inference Speed | 350-400 tokens/second |
| Parameters | 3 billion (all active) |
| Capabilities | Reasoning, Thinking (no web/files) |
| Training | Fine-tuned on curated datasets |
| Rate Limit | Per-hour throttling, unlimited overall |
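This post doesn't include an official API reference, so purely as a sketch: assuming an OpenAI-compatible chat-completions endpoint, a request might look like the following. The URL, the model id `"rin"`, and the auth header are all placeholders, not documented values:

```python
import json
import urllib.request

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder URL

payload = {
    "model": "rin",      # placeholder model id
    "max_tokens": 512,   # well under the 256K context window
    "messages": [
        {"role": "user", "content": "Review this function for bugs: ..."},
    ],
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",  # placeholder
    },
)
# with urllib.request.urlopen(req) as resp:   # uncomment with a real endpoint
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The request is built but not sent; swap in the real endpoint and key before uncommenting the call.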
RIN doesn’t waste your time, energy, or tokens. It’s the AI for people who don’t want to think about the AI. Sharp. Fast. Precise. That’s it.