Mistral AI · 32K to 256K context · open weights

Mistral Token Counter: estimate Mistral AI tokens fast

Free online Mistral token counter for Mistral Large 2, Medium 3, Small 3, and Codestral. It uses the ~4 characters/token approximation, which tracks Mistral's tokenizer closely for English prose, and covers context windows from 32K to 256K. Runs entirely in your browser.

Use cases

What you'll use this for

A token counter is a cheap pre-flight check: it tells you whether your prompt fits, how much context you have left for output, and roughly what each call will cost.

Pre-flight checks

Verify your prompt fits Mistral's context before sending — Small 3's 32K window fills fast with code or long docs.

Codestral planning

Estimate how much of a codebase you can fit in Codestral's 256K window for repo-wide refactors.
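
A rough way to size a repository against that window is to sum file sizes in bytes and divide by four. The sketch below assumes the ~4 bytes/token heuristic from this page and reads "256K" as 256,000 tokens. The helper names, the extension filter, and the output reserve are hypothetical and not part of any Mistral tooling.

```python
import os

CHARS_PER_TOKEN = 4            # heuristic from this page, not an exact tokenizer
CODESTRAL_CONTEXT = 256_000    # assumption: "256K" read as 256,000 tokens


def estimate_repo_tokens(root, extensions=(".py", ".js", ".ts")):
    """Sum the byte size of matching source files and convert to a token estimate."""
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                total_bytes += os.path.getsize(os.path.join(dirpath, name))
    # Ceiling division, so the estimate errs on the side of "too big".
    return -(-total_bytes // CHARS_PER_TOKEN)


def fits_in_codestral(root, reserve_for_output=8_000):
    """Return (estimated_tokens, fits), keeping a hypothetical reserve for the reply."""
    tokens = estimate_repo_tokens(root)
    return tokens, tokens + reserve_for_output <= CODESTRAL_CONTEXT
```

Because code tends to tokenize less efficiently than prose, a result close to the limit is better treated as "probably too big" than as a pass.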

Prompt iteration

See how each edit changes the token count as you compress system prompts and few-shot examples.

Cost forecasting

Pair with the Mistral cost calculator to estimate spend per message at any volume.
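
The arithmetic behind a spend forecast is short enough to sketch. The function name and the per-million-token prices below are placeholders for illustration, not Mistral's actual rates; substitute the current figures from your provider's price list.

```python
def estimate_cost(prompt_tokens, output_tokens,
                  input_price_per_m, output_price_per_m, calls=1):
    """Rough spend for `calls` identical requests.

    Prices are per one million tokens, in whatever currency you pass in.
    """
    token_cost = (prompt_tokens * input_price_per_m
                  + output_tokens * output_price_per_m)
    return token_cost * calls / 1_000_000


# Placeholder prices: 2.00 per million input tokens, 6.00 per million output tokens.
monthly = estimate_cost(2_000, 500, 2.0, 6.0, calls=1_000)
print(f"{monthly:.2f}")
```

With those placeholder prices, 1,000 calls of 2,000 prompt tokens and 500 output tokens come to 7.00.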

Step by step

How to estimate Mistral tokens

Paste your text

Drop your prompt, system message, or document into the left editor. Unicode is fine: text is measured by its UTF-8 byte length, so accented, Arabic, and CJK characters each count for more than one byte.

Pick a model

Each Mistral variant has its own context window. Switch to see how full it gets.

Read the count

The right panel shows the token estimate, a context fill bar, character and word counts, and remaining headroom.
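
The figures in that panel can be approximated in a few lines. This is a minimal sketch, not the counter's actual source: it assumes the ~4 bytes/token heuristic, takes "K" as 1,000 (the conservative reading), and uses made-up model keys that are not API model IDs.

```python
import math

# Context windows as listed on this page; assumption: K = 1,000 tokens.
CONTEXT_WINDOWS = {
    "mistral-large-2": 128_000,
    "mistral-medium-3": 128_000,
    "mistral-small-3": 32_000,
    "codestral": 256_000,
}


def panel_stats(text, model):
    """Approximate the counter's read-outs for `text` against `model`'s window."""
    limit = CONTEXT_WINDOWS[model]
    # Text is measured as UTF-8 bytes, then divided by ~4 bytes per token.
    tokens = math.ceil(len(text.encode("utf-8")) / 4)
    return {
        "tokens": tokens,
        "characters": len(text),
        "words": len(text.split()),
        "fill_pct": round(100 * tokens / limit, 2),
        "headroom": max(limit - tokens, 0),
    }


print(panel_stats("hello world!", "mistral-small-3"))
```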

FAQ

Frequently asked questions

How accurate is the Mistral token estimate?

Mistral's original tokenizer is a SentencePiece-style BPE with a ~32K vocabulary; some newer models, Small 3 among them, use the larger Tekken vocabulary (~131K). For English text the ~4 chars/token heuristic is typically within ±10% of the real count. For code, the typical Codestral input, expect somewhat more tokens per character than for prose, because long identifiers and operators split into more tokens than ordinary words do.
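
One way to act on the ±10% figure is to carry a low and a high bound alongside the estimate. The sketch below assumes the byte-based ~4 chars/token heuristic; the function name and the integer-percentage parameter are illustrative choices, and the band is only as good as the heuristic itself.

```python
import math


def estimate_with_band(text, tolerance_pct=10):
    """Return (low, estimate, high) token counts for English-like text.

    Integer arithmetic keeps the bounds free of floating-point rounding surprises.
    """
    estimate = math.ceil(len(text.encode("utf-8")) / 4)
    low = estimate * (100 - tolerance_pct) // 100            # rounded down
    high = -(-estimate * (100 + tolerance_pct) // 100)       # rounded up
    return low, estimate, high


low, estimate, high = estimate_with_band("a" * 400)
print(low, estimate, high)
```

If even the high bound fits the context window, the prompt is very likely safe; if the window falls between the bounds, an exact count from the API is worth the extra call.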

What's the difference between Mistral Large 2, Medium 3, and Small 3?

Large 2 is the flagship for complex reasoning and multilingual work. Medium 3 is the new sweet-spot model — frontier-class capability at a fraction of Large's cost. Small 3 is a 24B-parameter latency-optimised model with strong instruction following. Codestral is the code specialist with a 256K context built for whole-repo workflows.

Why does Codestral have a 256K context but Small only 32K?

Codestral was rebuilt in 2025 specifically for codebase-scale tasks — repo-wide refactoring, multi-file PR reviews, monorepo navigation. Small 3 stays at 32K because its target workloads (chat, classification, RAG with short retrieved passages) rarely benefit from longer windows and the smaller window keeps inference cheap.

Does this work for la Plateforme or self-hosted Mistral?

Yes. The count is the same regardless of where you run the model, because a given model's tokenizer is identical across la Plateforme, Azure AI, Bedrock, and self-hosted weights. Only pricing and latency differ, not the token count.

About

About Mistral tokenization

Mistral AI's original tokenizer is a SentencePiece byte-pair encoding tuned on a multilingual corpus, with a base vocabulary of ~32K tokens plus extensions for code and special chat tokens. Some newer models, Small 3 among them, use the Tekken tokenizer with a vocabulary of roughly 131K. For day-to-day estimates, ~4 characters per token remains a workable rule of thumb.

Mistral 2026 line-up

Mistral Large 2 — 128K context. The flagship for hard reasoning, multilingual work, and tool use. Competitive with Sonnet 4.6 on many tasks.
Mistral Medium 3 — 128K context. Mistral's new mid-tier — frontier-class quality at a fraction of Large 2's cost, often the right default.
Mistral Small 3 — 32K context. 24B-parameter latency-optimised model. Best for chat, classification, and RAG.
Codestral — 256K context. Code specialist with whole-repo workflows, fill-in-the-middle, and 80+ languages.

What changes per language

French, Spanish, Italian, German — Mistral's tokenizer is well-trained on these. ~4 chars/token applies.
Code — Codestral's vocabulary adds programming-specific tokens. For Python/JS, expect slightly fewer tokens per character than the same code would produce with other Mistral models.
CJK & Arabic — Less efficient than Latin scripts. Budget 2–3× more tokens per character.
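
A byte-based estimate gives one intuition for the 2–3× figure: in UTF-8, basic Latin letters take one byte each, Arabic letters two, and most CJK characters three. The sketch below only measures bytes per character for a few illustrative sample strings. It is a proxy, and real tokenizer output will deviate from it.

```python
def bytes_per_char(text):
    """Average UTF-8 bytes per character; 0.0 for empty input."""
    if not text:
        return 0.0
    return len(text.encode("utf-8")) / len(text)


# Sample strings chosen for illustration only.
samples = {
    "English": "hello",
    "Arabic": "مرحبا",
    "Chinese": "你好世界",
}

for name, sample in samples.items():
    print(name, bytes_per_char(sample))
```

The three samples come out at 1.0, 2.0, and 3.0 bytes per character respectively, which is why a bytes-divided-by-four estimate already budgets more tokens for these scripts.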

For exact counts from the API, parse the usage field returned by la Plateforme's chat completion endpoint.
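
Reading that field might look like the sketch below. It assumes the response body follows the common `prompt_tokens` / `completion_tokens` / `total_tokens` shape; the sample payload is invented for illustration, and the helper name is hypothetical. Check the API reference for the authoritative schema.

```python
import json


def read_usage(response_body):
    """Extract token counts from a chat completion response (JSON string or dict)."""
    data = json.loads(response_body) if isinstance(response_body, str) else response_body
    usage = data.get("usage", {})
    return {
        "prompt_tokens": usage.get("prompt_tokens"),
        "completion_tokens": usage.get("completion_tokens"),
        "total_tokens": usage.get("total_tokens"),
    }


# Invented sample payload, trimmed to the part that matters here.
sample = '{"usage": {"prompt_tokens": 120, "completion_tokens": 45, "total_tokens": 165}}'
print(read_usage(sample))
```

Comparing `prompt_tokens` against the estimate for the same text is a quick way to calibrate the ~4 chars/token heuristic for your own content.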