
Prompt Cleaner

Clean LLM prompts before sending. Strips zero-width characters, smart quotes, non-breaking spaces, BOMs, and redundant whitespace — the invisible junk that wastes tokens and confuses tokenizers.

Example

Noisy in, clean out

A pasted prompt usually carries invisible junk: zero-width joiners, smart quotes, NBSPs, double-spaces. Strip them and the same prompt becomes shorter and more predictable.

Raw prompt
   Hello​  world
This is   "smart" quote 'test'.
 Extra space.
Cleaned
Hello world
This is "smart" quote 'test'.
Extra space.
Use cases

What you'll use this for

Anywhere a prompt leaves a Google Doc, Notion page, Slack message, or PDF and lands in your LLM — clean it first.

Trim noisy paste

Strip the artifacts that come along when you paste into GPT, Claude, or any LLM playground.

Fix smart-quote chaos

Curly quotes break code snippets in prompts. Normalize them to ASCII in one click.

Reduce token cost

Trimmed whitespace and stripped zero-width characters mean fewer billed tokens per call.

Debug weird tokenization

If your tokenizer produces more tokens than expected, hidden characters are often the cause.
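To see why a prompt tokenizes strangely, you can dump every invisible or non-ASCII character it contains. A minimal sketch in Python (the tool itself runs in the browser, so this mirrors the idea rather than its actual code):

```python
import unicodedata

# Unicode categories that often hide in pasted text:
# Cf = format characters (zero-width space/joiners, BOM), Zs = space separators
SUSPECTS = {"Cf", "Zs"}

def reveal_hidden(text: str):
    """Return (index, codepoint, name) for characters likely to skew tokenization."""
    hits = []
    for i, ch in enumerate(text):
        if ch != " " and (ord(ch) > 0x7E or unicodedata.category(ch) in SUSPECTS):
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")))
    return hits
```

Running it on a pasted prompt surfaces exactly which codepoints the tokenizer sees but you don't.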

Step by step

How to clean a prompt

1

Paste the prompt

Drop it into the left editor. The cleaner runs entirely locally — nothing is uploaded.

2

Pick options

Defaults strip zero-width, normalize quotes, replace NBSP, and collapse whitespace. Loosen any toggle you don't want.

3

Click Clean

Or leave auto-clean on for live updates.

4

Copy and send

Paste the cleaned prompt into your LLM. Check the estimated tokens saved in the stats bar.

FAQ

Frequently asked questions

Why do zero-width characters matter?

Codepoints like U+200B (zero-width space), U+200C/U+200D (zero-width joiners), and U+FEFF (BOM) render as nothing but still consume tokens. They sneak in from PDFs, Word docs, and copy-paste across apps.

Why normalize smart quotes?

Curly quotes (U+2018, U+2019, U+201C, U+201D) often tokenize differently from straight ASCII quotes. Normalizing them gives more predictable token counts and avoids breaking code-style prompts.
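The normalization itself is a straight character mapping. A minimal Python sketch of the idea (not the tool's actual browser code):

```python
# Map the four curly-quote codepoints to their ASCII equivalents.
SMART_QUOTES = {
    ord("\u2018"): "'",   # left single quotation mark
    ord("\u2019"): "'",   # right single quotation mark
    ord("\u201c"): '"',   # left double quotation mark
    ord("\u201d"): '"',   # right double quotation mark
}

def normalize_quotes(text: str) -> str:
    """Replace curly quotes with straight ASCII quotes."""
    return text.translate(SMART_QUOTES)
```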

Is it free?

Yes. No signup, no limits, no ads. Runs entirely in your browser.

Does it remove a UTF-8 BOM?

Yes. If you paste content saved as UTF-8 with BOM, the leading U+FEFF is removed when the Strip BOM toggle is on.

How is "tokens saved" estimated?

As a rough approximation: removed characters divided by four. Real savings depend on the tokenizer, but this gives a useful order-of-magnitude estimate. For exact counts, use the LLM Token Counter.
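The estimate above amounts to one line of arithmetic, sketched here in Python under the same ~4-characters-per-token assumption:

```python
def tokens_saved(raw: str, cleaned: str) -> int:
    """Rough estimate of tokens saved: removed characters / 4
    (average English text runs about four characters per token)."""
    return max(0, len(raw) - len(cleaned)) // 4
```

Because real tokenizers vary, treat the result as an order of magnitude, not a billing figure.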

About

About prompt cleaning

LLM tokenizers see every character, including the ones you can't. A pasted prompt is rarely just text — it's text plus zero-width spaces, smart quotes, non-breaking spaces, BOMs, and stray double-spaces left over from rich-text formatting. All of these consume tokens and can confuse the model.

What this tool removes

  • Zero-width characters — U+200B, U+200C, U+200D, U+FEFF.
  • Smart quotes — normalized to ASCII ' and ".
  • Non-breaking spaces — U+00A0 collapsed to regular space.
  • Redundant whitespace — repeated spaces, trailing line whitespace, excessive blank lines.
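The four removal passes listed above can be chained into one small pipeline. A hedged Python sketch of the logic (the tool itself runs in the browser, so names and order here are illustrative):

```python
import re

# Zero-width characters and BOM: delete outright.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\ufeff"))
# Smart quotes: normalize to ASCII.
QUOTES = {ord("\u2018"): "'", ord("\u2019"): "'",
          ord("\u201c"): '"', ord("\u201d"): '"'}

def clean(text: str) -> str:
    text = text.translate(ZERO_WIDTH)            # strip zero-width chars + BOM
    text = text.translate(QUOTES)                # normalize smart quotes
    text = text.replace("\u00a0", " ")           # NBSP -> regular space
    text = re.sub(r"[ \t]+", " ", text)          # collapse repeated spaces/tabs
    text = re.sub(r" +$", "", text, flags=re.M)  # trailing line whitespace
    text = re.sub(r"\n{3,}", "\n\n", text)       # cap runs of blank lines
    return text.strip()
```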

When not to clean

  • Code samples with deliberate indentation — turn off Trim each line.
  • ASCII art or fixed-width tables — turn off Collapse spaces.