Prompt Token Diff
Compare two LLM prompts side by side: a line-based diff (with + / − markers), estimated token counts for both versions, and the cost delta for the model you choose.
Two prompts, one verdict
Identical lines pass through unchanged. Lines unique to Prompt A are marked removed (−); lines unique to Prompt B are marked added (+). Token counts and the cost delta are shown for each side.
You are a helpful assistant. Answer concisely. If unsure, say so.
- You are a helpful assistant.
+ You are an expert assistant.
- Answer concisely.
+ Answer thoroughly but concisely.
- If unsure, say so.
+ If unsure, ask a clarifying question.
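A line diff with this output shape can be sketched with Python's difflib — an illustration only, not the tool's actual implementation:

```python
import difflib

prompt_a = """You are a helpful assistant.
Answer concisely.
If unsure, say so."""

prompt_b = """You are an expert assistant.
Answer thoroughly but concisely.
If unsure, ask a clarifying question."""

# ndiff emits lines prefixed "  " (unchanged), "- " (only in A),
# and "+ " (only in B); "? " lines are intra-line hints, dropped here.
for line in difflib.ndiff(prompt_a.splitlines(), prompt_b.splitlines()):
    if not line.startswith("?"):
        print(line)
```

Every line in this example changes, so the output is three removed lines and three added lines, with no unchanged context.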
What you'll use this for
Prompt engineering is iterative — every revision should be measured for both wording and cost impact.
Prompt iteration
See exactly what changed between two drafts before shipping the new system prompt.
A/B testing prompts
Pair this with offline eval — keep the variants whose token cost stays in budget.
Cost optimization
Trim verbose system prompts and instantly see the per-call savings.
Audit changes
Review prompt diffs in PRs without firing up the terminal or another tool.
How to diff two prompts
Paste prompt A
Drop the original prompt into the left editor. Auto-compare runs after a short debounce.
Paste prompt B
The revised version goes on the right. Swap with the arrow button to flip +/-.
Set your rate
Default is $0.003/1K — typical mid-tier input price. Override per your provider's rate card.
Read the result
Diff + token deltas + cost delta. Copy or download the unified-style output.
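The four steps above boil down to a few lines of arithmetic. This sketch assumes the chars/4 heuristic with integer division and the $0.003/1K default rate; the tool's exact rounding may differ:

```python
def estimate_tokens(text: str) -> int:
    # chars/4 heuristic (integer division assumed; minimum of 1)
    return max(1, len(text) // 4)

RATE_PER_1K = 0.003  # the default rate; override per your provider

prompt_a = "You are a helpful assistant.\nAnswer concisely.\nIf unsure, say so."
prompt_b = ("You are an expert assistant.\nAnswer thoroughly but concisely.\n"
            "If unsure, ask a clarifying question.")

tok_a, tok_b = estimate_tokens(prompt_a), estimate_tokens(prompt_b)
delta = tok_b - tok_a                      # token delta, B minus A
cost_delta = delta / 1000 * RATE_PER_1K    # per-call input-cost delta
print(f"A ~{tok_a} tok, B ~{tok_b} tok, delta {delta:+} tok, ${cost_delta:+.6f}/call")
```

Here the revision adds roughly 8 tokens, so the per-call input cost rises by about $0.000024 — tiny per call, but it compounds at scale.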
Frequently asked questions
How accurate are the token counts?
It's a heuristic: chars / 4. That's usually within ±20% of the real tokenizer for English prose. For exact counts, use the LLM token counter.
Why do counts differ between models?
Every provider uses a different tokenizer (BPE, SentencePiece, tiktoken). Code, JSON and non-English text tokenize very differently, so the chars/4 heuristic skews for those.
Is it free to use?
Yes. No signup, no limits. Both prompts stay in your browser.
Are the cost estimates exact?
No. They're simplified per-1K-token estimates for a single call. Real bills include output tokens, cached reads, batch discounts, and per-tier pricing. Use them as a directional signal.
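To make that gap concrete, a fuller per-call bill also prices output tokens at their own rate. The output rate and token counts below are hypothetical placeholders for illustration; check your provider's rate card:

```python
input_rate = 0.003 / 1000   # $/input token (this tool's default rate)
output_rate = 0.015 / 1000  # hypothetical output rate, illustration only

input_tokens, output_tokens = 500, 300  # made-up per-call usage

simple_estimate = input_tokens * input_rate  # roughly what this tool shows
fuller_estimate = simple_estimate + output_tokens * output_rate
print(f"simple ${simple_estimate:.4f} vs fuller ${fuller_estimate:.4f}")
```

With these placeholder numbers the fuller bill is four times the input-only estimate, which is why the tool's figure is a directional signal rather than an invoice.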
How do I swap Prompt A and Prompt B?
Click the swap arrows in the Prompt B header. It flips both editors and re-runs the comparison.
About prompt diffs
Prompts are code now — versioned, reviewed, A/B tested. A line diff is the cheapest tool for catching unintended drift between revisions, and pairing it with a token estimate makes the cost impact of every edit visible.
Output format
"  line": unchanged context line.
"- line": removed (present only in Prompt A).
"+ line": added (present only in Prompt B).
Why chars/4?
- It's a universal rough estimate that works across providers without bundling a 1 MB tokenizer.
- For exact-to-the-token counts use the per-model token counters (linked below).
- The diff itself is exact — only the token tally is approximate.