Mistral Token Counter: estimate Mistral AI tokens fast
Free online Mistral token counter for Mistral Large 2, Medium 3, Small 3, and Codestral. It uses the ~4 chars/token approximation, which stays within about ±10% of Mistral's SentencePiece tokenizer on English text. Covers context windows from 32K to 256K. Runs entirely in your browser.
What you'll use this for
A token counter is a cheap pre-flight check: it tells you whether your prompt fits, how much context is left for output, and roughly what each call will cost.
Pre-flight checks
Verify your prompt fits Mistral's context before sending — Small 3's 32K window fills fast with code or long docs.
Codestral planning
Estimate how much of a codebase you can fit in Codestral's 256K window for repo-wide refactors.
Prompt iteration
See how edits affect token count — compress system prompts and few-shot examples.
Cost forecasting
Pair with the Mistral cost calculator to estimate spend per message at any volume.
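As a rough illustration of the cost-forecasting use case, the arithmetic can be sketched in a few lines of Python. The function name and the per-million-token prices below are placeholders for illustration only, not Mistral's actual rates; substitute the current figures from the pricing page.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_m, output_price_per_m, calls=1):
    """Estimate spend in the currency of the prices passed in.

    Prices are per one million tokens. All values here are assumptions
    supplied by the caller; nothing is looked up from Mistral.
    """
    total_input = input_tokens * calls
    total_output = output_tokens * calls
    return (total_input * input_price_per_m
            + total_output * output_price_per_m) / 1_000_000


# Hypothetical prices: 2.0 per 1M input tokens, 6.0 per 1M output tokens.
monthly_spend = estimate_cost(2_000, 500, 2.0, 6.0, calls=10_000)
print(f"Estimated spend: {monthly_spend:.2f}")
```

Input and output tokens are priced separately because most hosted models charge different rates for each.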
How to estimate Mistral tokens
Paste your text
Drop your prompt, system message, or document into the left editor. Unicode is fine: non-ASCII text is measured by its UTF-8 byte length.
Pick a model
Each Mistral variant has its own context window. Switch to see how full it gets.
Read the count
The right panel shows token estimate, context fill bar, characters, words, and remaining headroom.
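The three steps above can be approximated offline with a short Python sketch. It assumes the estimate is the UTF-8 byte length divided by 4 (rounded up) and that a "32K" window means 32,000 tokens; both are assumptions about how the tool works, and `panel_stats` is a hypothetical name.

```python
import math

BYTES_PER_TOKEN = 4  # heuristic from this page, not a real tokenizer


def panel_stats(text: str, context_window: int) -> dict:
    """Return the same figures the right panel shows, as rough estimates."""
    utf8_bytes = len(text.encode("utf-8"))
    tokens = math.ceil(utf8_bytes / BYTES_PER_TOKEN)
    return {
        "tokens": tokens,
        "characters": len(text),
        "words": len(text.split()),
        "fill": tokens / context_window,                # 0.0 to 1.0+ of the window
        "headroom": max(context_window - tokens, 0),    # tokens left for output
    }


stats = panel_stats("Summarise the attached report in three bullet points.", 32_000)
print(stats)
```

A fill above 1.0 means the prompt alone would overflow the selected window.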
Frequently asked questions
Mistral models use a SentencePiece-style BPE tokenizer with a ~32K vocabulary. For English text the ~4 chars/token heuristic is within ±10% of the real count. For code (especially in Codestral), expect slightly more tokens per character than prose, because long identifiers and operators split into more pieces.
Large 2 is the flagship for complex reasoning and multilingual work. Medium 3 is the new sweet-spot model — frontier-class capability at a fraction of Large's cost. Small 3 is a 24B-parameter latency-optimised model with strong instruction following. Codestral is the code specialist with a 256K context built for whole-repo workflows.
Codestral was rebuilt in 2025 specifically for codebase-scale tasks — repo-wide refactoring, multi-file PR reviews, monorepo navigation. Small 3 stays at 32K because its target workloads (chat, classification, RAG with short retrieved passages) rarely benefit from longer windows and the smaller window keeps inference cheap.
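To get a feel for whether a repository is in range of the 256K window, a minimal sketch can sum source file sizes and apply the 4-bytes-per-token heuristic. The function names, the suffix list, and the 8,000-token output reserve are all illustrative assumptions, and code tends to need somewhat more tokens than this heuristic predicts, so treat the result as a lower bound.

```python
from pathlib import Path

CODESTRAL_WINDOW = 256_000   # context size quoted on this page
BYTES_PER_TOKEN = 4          # prose heuristic; code is usually a bit denser


def estimate_repo_tokens(root, suffixes=(".py", ".js", ".ts")):
    """Sum the sizes of matching files under root and convert to tokens."""
    total_bytes = sum(
        path.stat().st_size
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix in suffixes
    )
    return -(-total_bytes // BYTES_PER_TOKEN)  # integer ceiling division


def fits_in_codestral(root, reserved_output=8_000):
    """True if the estimated repo size plus an output reserve fits the window."""
    return estimate_repo_tokens(root) + reserved_output <= CODESTRAL_WINDOW
```

Calling `fits_in_codestral("path/to/repo")` gives a quick yes/no before you decide whether to send the whole tree or only selected files.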
The token count is the same regardless of where you run the model. The Mistral tokenizer is identical across la Plateforme, Azure AI, Bedrock, and self-hosted weights. Only pricing and latency differ between hosts, not the token count.
About Mistral tokenization
Mistral AI's tokenizer is a SentencePiece byte-pair-encoding (BPE) model tuned on a multilingual corpus. The base vocabulary is ~32K tokens with extensions for code and special chat tokens. For day-to-day estimates on English and other Latin-script text, ~4 characters per token is a reliable rule of thumb.
Mistral 2026 line-up
- Mistral Large 2 — 128K context. The flagship for hard reasoning, multilingual work, and tool use. Competitive with Sonnet 4.6 on many tasks.
- Mistral Medium 3 — 128K context. Mistral's new mid-tier — frontier-class quality at a fraction of Large 2's cost, often the right default.
- Mistral Small 3 — 32K context. 24B-parameter latency-optimised model. Best for chat, classification, and RAG.
- Codestral — 256K context. Code specialist with whole-repo workflows, fill-in-the-middle, and 80+ languages.
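The line-up above maps naturally onto a lookup table. The sketch below lists which models could hold a prompt of a given estimated size while leaving room for the reply; it assumes "K" means 1,000 tokens, and the helper name is hypothetical.

```python
# Context windows as listed on this page, assuming K = 1,000 tokens.
CONTEXT_WINDOWS = {
    "Mistral Large 2": 128_000,
    "Mistral Medium 3": 128_000,
    "Mistral Small 3": 32_000,
    "Codestral": 256_000,
}


def models_that_fit(prompt_tokens, reserved_output=0):
    """Return models whose window holds the prompt plus the reserved output."""
    needed = prompt_tokens + reserved_output
    return [name for name, window in CONTEXT_WINDOWS.items() if needed <= window]


# A 30K-token prompt with 4K reserved for the answer no longer fits Small 3.
print(models_that_fit(30_000, reserved_output=4_000))
```

Reserving output tokens up front matters because the prompt and the completion share the same window.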
What changes per language
- French, Spanish, Italian, German — Mistral's tokenizer is well-trained on these. ~4 chars/token applies.
- Code — Codestral's vocabulary adds programming-specific tokens. For Python/JS, expect slightly fewer tokens per character than the same code would need on other Mistral models.
- CJK & Arabic — Less efficient than Latin scripts. Budget 2–3× as many tokens per character.
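One way to see where a 2–3× figure can come from: if the estimate is based on UTF-8 bytes (as the paste step suggests), Arabic letters take 2 bytes and most CJK characters take 3, so a bytes-divided-by-4 heuristic scales up on its own. The sketch below shows that arithmetic; it illustrates the heuristic only and says nothing about how the real tokenizer splits these scripts.

```python
def tokens_per_char(text: str) -> float:
    """Estimated tokens per character under a UTF-8 bytes / 4 heuristic."""
    if not text:
        return 0.0
    estimated_tokens = len(text.encode("utf-8")) / 4
    return estimated_tokens / len(text)


samples = {
    "English": "hello world",   # 1 byte per character
    "Arabic": "مرحبا",           # 2 bytes per character
    "Chinese": "你好世界",        # 3 bytes per character
}

baseline = tokens_per_char(samples["English"])
for language, text in samples.items():
    ratio = tokens_per_char(text) / baseline
    print(f"{language}: {ratio:.0f}x the English tokens per character")
```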
For exact counts from the API, read the usage field (prompt_tokens, completion_tokens, total_tokens) returned by la Plateforme's chat completion endpoint.
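A minimal sketch of reading that field and comparing it with an estimate is shown below. The response body is a trimmed, made-up sample with illustrative numbers rather than real API output, and the helper names are hypothetical; only the usage keys (prompt_tokens, completion_tokens, total_tokens) follow the chat completion response shape.

```python
import json

# Made-up sample body; a real response carries more fields (choices, etc.).
SAMPLE_RESPONSE = """
{
  "id": "example-id",
  "model": "mistral-small-latest",
  "usage": {"prompt_tokens": 128, "completion_tokens": 64, "total_tokens": 192}
}
"""


def read_usage(response_body: str) -> dict:
    """Extract the three token counts from a chat completion response body."""
    usage = json.loads(response_body)["usage"]
    return {
        key: usage[key]
        for key in ("prompt_tokens", "completion_tokens", "total_tokens")
    }


def relative_error(estimated: int, actual: int) -> float:
    """Signed error of an estimate against the API's exact count."""
    return (estimated - actual) / actual


usage = read_usage(SAMPLE_RESPONSE)
print(usage, f"{relative_error(140, usage['prompt_tokens']):+.1%}")
```

Logging the relative error over a batch of real calls is a simple way to check how well the ~4 chars/token heuristic holds for your own prompts.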