Text to Unicode Escape

Convert text to Unicode escape sequences: \uXXXX, or the ES2015 \u{XXXX} form for astral code points. Useful for embedding non-ASCII characters in JS, JSON, Java, and Python string literals (the \u{XXXX} form is JavaScript-only).

Example

Text in, escapes out

Each non-ASCII character is replaced by an ASCII-only escape sequence (fixed-length in the \uXXXX style). The result is predictable, copy-paste-safe, and independent of the file's encoding.

Plain text: Café
Escaped: Caf\u00e9
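
The conversion in this example can be sketched in a few lines of JavaScript. This is only an illustration: `escapeNonAscii` is a hypothetical name, not the tool's actual source, and the lowercase hex output simply mirrors the example above.

```javascript
// Hypothetical helper, not the tool's actual source.
// Without the `u` flag the regex matches UTF-16 code units, so each
// non-ASCII unit becomes one four-digit \uXXXX escape.
function escapeNonAscii(text) {
  return text.replace(/[^\x00-\x7f]/g, (unit) =>
    "\\u" + unit.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

// "Café" written with a precomposed é (U+00E9).
console.log(escapeNonAscii("Caf\u00e9")); // Caf\u00e9
```

Because it works on code units, an astral character such as an emoji would come out as two escapes (a surrogate pair).
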
Use cases

What you'll use this for

Unicode escape notation keeps non-ASCII content portable across editors, terminals, build systems, and source-control diffs.

JS / JSON literals

Embed accented or CJK characters in string literals without worrying about file encoding.

Java string literals

Java .properties files and source code accept \uXXXX escapes natively.

Python string literals

Python accepts \uXXXX and \U00XXXXXX (eight hex digits) escapes in any str literal except raw strings.

Debugging Unicode

See exactly which code points lurk in a string — invisible spaces, BOMs, homoglyphs, you name it.
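
For the debugging case, a small sketch like the following lists the code points in a string. `listCodePoints` is a hypothetical helper name, and the U+XXXX output format is an assumption chosen for readability.

```javascript
// Hypothetical helper: list every code point in a string as U+XXXX.
function listCodePoints(text) {
  // Array.from iterates by code point, so astral characters stay whole.
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// A BOM (U+FEFF) and a no-break space (U+00A0) are invisible on screen.
console.log(listCodePoints("\uFEFFa\u00A0b").join(" "));
// U+FEFF U+0061 U+00A0 U+0062
```
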

Step by step

How to escape text to Unicode

1. Paste your text

Drop it into the left editor. Any Unicode is fine; characters are processed by code point.

2. Pick scope and style

Non-ASCII only is the typical pick. Use \u{X} (ES2015) if you want astral code points in a single escape rather than a surrogate pair.

3. Click Escape

Or leave auto-escape on for live updates. Everything runs locally, with no upload.

4. Copy or download

Copy to clipboard or save as .txt. Round-trip with the reverse tool to confirm fidelity.
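
A round-trip check can also be done in code. The sketch below is an assumption about how a reverse step might look, not the reverse tool's implementation: `unescapeUnicode` is a hypothetical name, and it does not handle escaped backslashes (a literal `\\u0041` in source text).

```javascript
// Hypothetical reverse helper: decodes \uXXXX and \u{X...} escapes.
// Simplification: a backslash that is itself escaped is not treated specially.
function unescapeUnicode(escaped) {
  return escaped.replace(
    /\\u\{([0-9a-fA-F]+)\}|\\u([0-9a-fA-F]{4})/g,
    (match, braced, fixed) =>
      braced !== undefined
        ? String.fromCodePoint(parseInt(braced, 16)) // throws RangeError above 10FFFF
        : String.fromCharCode(parseInt(fixed, 16))   // one UTF-16 code unit
  );
}

console.log(unescapeUnicode("Caf\\u00e9") === "Caf\u00e9");          // true
console.log(unescapeUnicode("\\ud83d\\ude80") === "\u{1F680}");      // true
console.log(unescapeUnicode("\\u{1f680}") === "\u{1F680}");          // true
```

Two adjacent \uXXXX surrogate escapes decode to two adjacent code units, which together form the original astral character.
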

FAQ

Frequently asked questions

The \uXXXX form takes exactly four hex digits and needs a surrogate pair for astral characters (above U+FFFF). The \u{X} form is ES2015 syntax and accepts any code point in one escape.

Astral characters such as emoji are written as UTF-16 surrogate pairs, so each one takes two \uXXXX escapes.

No signup, no limits, no ads. The tool runs entirely in your browser.

For JSON, escape non-ASCII only: JSON parsers handle ASCII as-is, and escaping everything bloats the output.
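
For JSON specifically, the behaviour can be checked with the built-in `JSON.parse`. The sketch below uses a hypothetical `isValidJson` helper. It shows that \uXXXX escapes parse as expected, while the braced \u{X} form is JavaScript syntax and is rejected by JSON parsers.

```javascript
// Hypothetical helper: true if the text parses as JSON, false otherwise.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (err) {
    return false; // JSON.parse throws a SyntaxError on invalid input
  }
}

console.log(JSON.parse('"Caf\\u00e9"') === "Caf\u00e9"); // true
console.log(isValidJson('"\\ud83d\\ude80"'));            // true  (surrogate pair)
console.log(isValidJson('"\\u{1f680}"'));                // false (braced form is not JSON)
```
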

About

About Unicode escape sequences

Unicode assigns every character a numeric code point from U+0000 to U+10FFFF. Many programming languages let you express a character by its code point using an escape sequence — useful when the actual character is hard to type, looks identical to another character, or isn't legal in your file's encoding.

The two main forms

  • \uXXXX: exactly four hex digits, representing a UTF-16 code unit. To express a code point above U+FFFF (the astral planes: emoji, rarer CJK, math symbols, etc.) you need a surrogate pair: two \uXXXX escapes whose values are in the ranges D800–DBFF (high) and DC00–DFFF (low).
  • \u{X}: introduced in JavaScript with ES2015. Accepts one or more hex digits (any value up to 10FFFF) and takes any code point in a single escape, including the astral planes. Other languages differ: Java has \uXXXX only, and Python uses \uXXXX for the BMP and \U00XXXXXX (eight hex digits) for astral code points.
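
The surrogate-pair ranges above follow from the standard UTF-16 arithmetic, which can be sketched as below. `toSurrogatePair` is a hypothetical name used for illustration.

```javascript
// Hypothetical helper: split an astral code point into its UTF-16 surrogates.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("expected a code point in U+10000..U+10FFFF");
  }
  const offset = codePoint - 0x10000;            // 20-bit value
  const highUnit = 0xd800 + (offset >> 10);      // top 10 bits    -> D800..DBFF
  const lowUnit = 0xdc00 + (offset & 0x3ff);     // bottom 10 bits -> DC00..DFFF
  return [highUnit, lowUnit];
}

// U+1F680 (rocket emoji)
console.log(toSurrogatePair(0x1f680).map((u) => u.toString(16)).join(" "));
// d83d de80
```

So U+1F680 is written as \ud83d\ude80 in the four-digit form and as \u{1f680} in the braced form.
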

Surrogate pairs in practice

JavaScript strings are UTF-16 internally, so '🚀'.length is 2, not 1. When you select \uXXXX style here, each surrogate code unit is emitted separately — exactly what you'd hand-type. With \u{X} style we iterate by code point (via Array.from / codePointAt) and emit one escape per character.
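
The difference between the two styles can be sketched as follows. The function names `escapeFixed` and `escapeBraced` are hypothetical and the tool's real source may differ. Both leave ASCII untouched, which is an assumption matching the "non-ASCII only" scope.

```javascript
const rocket = "\u{1F680}"; // the rocket emoji, U+1F680
console.log(rocket.length);             // 2 (UTF-16 code units)
console.log(Array.from(rocket).length); // 1 (code points)

// \uXXXX style: walk the string one UTF-16 code unit at a time.
function escapeFixed(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    out += unit < 0x80 ? text[i] : "\\u" + unit.toString(16).padStart(4, "0");
  }
  return out;
}

// \u{X} style: walk the string one code point at a time.
function escapeBraced(text) {
  return Array.from(text, (ch) => {
    const cp = ch.codePointAt(0);
    return cp < 0x80 ? ch : "\\u{" + cp.toString(16) + "}";
  }).join("");
}

console.log(escapeFixed("Go " + rocket));  // Go \ud83d\ude80
console.log(escapeBraced("Go " + rocket)); // Go \u{1f680}
```
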

When to escape

  • Non-ASCII only: minimal, readable output. Best for JSON, source files, and most config formats.
  • Every character: maximum obfuscation; sometimes used as a poor man's protection against scrapers or to ensure ASCII-only output.
  • Non-printable only: leave readable text alone, and escape control characters and DEL (0x7F) so they don't break terminals or diffs.
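
The three scopes can be modelled as predicates over code units, as in the sketch below. The scope names and `escapeWithScope` are hypothetical, and "non-printable" is assumed here to mean the C0 controls (0x00–0x1F) plus DEL; the tool may draw that line differently.

```javascript
// Hypothetical scope names; the tool's option labels may differ.
const shouldEscape = {
  nonAscii: (unit) => unit > 0x7f,
  all: () => true,
  // Assumption: C0 control characters plus DEL (0x7F).
  nonPrintable: (unit) => unit < 0x20 || unit === 0x7f,
};

function escapeWithScope(text, scope) {
  const matches = shouldEscape[scope];
  if (!matches) throw new Error("unknown scope: " + scope);
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    out += matches(unit) ? "\\u" + unit.toString(16).padStart(4, "0") : text[i];
  }
  return out;
}

const sample = "A\t\u00e9"; // "A", a tab, and a precomposed "é"
console.log(escapeWithScope(sample, "all"));          // \u0041\u0009\u00e9
console.log(escapeWithScope(sample, "nonPrintable")); // A\u0009é
```

With the "nonAscii" scope the same sample keeps its literal tab and only the é is escaped.
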