Local-only OCR · 100+ languages · No upload

Image to Text (OCR)

Extract text from any image — drag a rectangle to OCR only the part you need, rotate, fine-tune brightness/contrast/threshold, filter low-confidence lines, search the output, and export as plain text, Markdown, TSV, or JSON. Powered by Tesseract.js — 100+ languages, Apache 2.0, all in your browser. Paste a screenshot with Ctrl+V; your image never leaves your device.

Supported formats: JPG, PNG, WebP, and BMP. You can also paste a screenshot with Ctrl+V. For best results, use clean text that is at least 30 px tall.

Use cases

What you'll use this for

Receipts & invoices

Pull text from receipt or invoice photos without retyping.

Screenshots

Convert a screenshot (terminal, error message, slide) into copyable text.

Document scans

Digitize a scanned ID, certificate, or contract page.

Foreign-language text

Extract Chinese, Japanese, Arabic, or Hindi text from a photo and paste into a translator.

Step by step

How to extract text from an image

1. Load an image

Drag & drop a JPG/PNG/WebP/BMP file, click to choose, or just paste a screenshot with Ctrl+V. The image appears in a canvas-based editor with a toolbar.

2. Crop / rotate (optional)

If the photo is sideways, use the rotate buttons. If you only need text from one part of the image, click Select area and drag a rectangle; only that region will be OCR'd.

3. Tune the options

Pick a language and a Page segmentation mode. Adjust brightness / contrast / threshold if the image is dim or noisy. Set a confidence filter to drop garbage lines.

4. Click Extract text

Tesseract.js runs as a WebAssembly worker. First-time language packs take 5–15 s to download (cached after); recognition itself takes 1–10 s depending on image size.

5. Copy, search, or export

Output as Plain text, Lines with confidence, Markdown, TSV (positions), or JSON (full structure with bounding boxes). Use the search bar to find specific terms in the result.
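The confidence filter and the result search mentioned in the steps above can be sketched as two small pure functions. This is only an illustration: the `OcrLine` shape, the function names, and the 0–100 confidence scale are assumptions, not the tool's actual code.

```typescript
// Hypothetical shape for one recognized line (assumed, not the tool's real type).
interface OcrLine {
  text: string;
  confidence: number; // assumed to be a percentage, 0-100
}

// Keep only the lines whose confidence is at or above the minimum.
function filterByConfidence(lines: OcrLine[], minConfidence: number): OcrLine[] {
  return lines.filter((line) => line.confidence >= minConfidence);
}

// Case-insensitive search: returns the indexes of lines containing the term.
function searchLines(lines: OcrLine[], term: string): number[] {
  const needle = term.trim().toLowerCase();
  if (needle === "") return [];
  const hits: number[] = [];
  lines.forEach((line, index) => {
    if (line.text.toLowerCase().indexOf(needle) !== -1) hits.push(index);
  });
  return hits;
}
```

Filtering before searching keeps low-confidence noise out of the search hits; a filter of 0 keeps every line.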

FAQ

Frequently asked questions

To OCR only part of an image, drop your image, click Select area in the toolbar above the preview, then drag a rectangle on the image. The rest of the image dims and the next Extract text run only processes the selected region. Great for receipts (just the total), forms (one field), or screenshots (one paragraph). Click Clear or pick a new area at any time.
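A dragged selection can start and end anywhere, so it has to be normalized before it is used as an OCR region. The sketch below shows one way to do that; `selectionToRect` and its types are hypothetical names. Tesseract.js's `recognize` accepts a `rectangle` option with `left`, `top`, `width`, and `height`, but whether this tool passes such a rectangle or crops the canvas itself is an assumption.

```typescript
interface Point { x: number; y: number }
interface Rect { left: number; top: number; width: number; height: number }

// Convert a drag (start -> end, in image pixels) into a rectangle clamped
// to the image bounds. Returns null when the selection has no area.
function selectionToRect(
  start: Point,
  end: Point,
  imageWidth: number,
  imageHeight: number
): Rect | null {
  const clamp = (value: number, max: number): number =>
    Math.min(Math.max(value, 0), max);

  // The user may drag in any direction, so order the corners first.
  const left = Math.round(clamp(Math.min(start.x, end.x), imageWidth));
  const right = Math.round(clamp(Math.max(start.x, end.x), imageWidth));
  const top = Math.round(clamp(Math.min(start.y, end.y), imageHeight));
  const bottom = Math.round(clamp(Math.max(start.y, end.y), imageHeight));

  const width = right - left;
  const height = bottom - top;
  return width > 0 && height > 0 ? { left, top, width, height } : null;
}
```

Returning `null` for an empty selection lets the caller fall back to processing the whole image.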

You can paste a screenshot directly. Copy any image to your clipboard (PrintScreen on Windows, Cmd+Ctrl+Shift+4 on Mac, snipping tools, etc.) and press Ctrl+V / Cmd+V anywhere on the page. The image loads instantly, with no need to save a file first.

Use the rotate buttons in the toolbar above the image (90° left / 90° right) to straighten phone photos. Tesseract works best on upright text. Auto with OSD in the Page segmentation dropdown can also detect rotation but is much slower.

The Page segmentation mode tells Tesseract what kind of layout to expect. Use Auto (default) for most images, Single line for one row of text, Single word for one word, Single uniform block for a paragraph or document body, and Sparse text for scattered text with no structure (logos, screenshots). Picking the mode that matches the layout can noticeably improve accuracy.
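Each of these modes corresponds to a numeric Tesseract page segmentation mode (the `tessedit_pageseg_mode` parameter). The mapping below is a sketch: the numbers are Tesseract's standard PSM values, while the label strings and the `toPsm` helper are assumptions about how a dropdown might be wired up.

```typescript
// Dropdown labels (assumed) mapped to Tesseract's standard PSM numbers.
const PAGE_SEG_MODES = {
  "Auto": 3,                 // fully automatic segmentation, no OSD
  "Auto with OSD": 1,        // automatic, plus orientation/script detection
  "Single uniform block": 6, // one uniform block of text
  "Single line": 7,          // a single text line
  "Single word": 8,          // a single word
  "Sparse text": 11,         // find as much text as possible, in no order
} as const;

type ModeLabel = keyof typeof PAGE_SEG_MODES;

// Look up the PSM number for a label, falling back to Auto for unknown input.
function toPsm(label: string): number {
  const known = Object.prototype.hasOwnProperty.call(PAGE_SEG_MODES, label);
  return known ? PAGE_SEG_MODES[label as ModeLabel] : PAGE_SEG_MODES["Auto"];
}
```

Falling back to Auto keeps an unexpected label from breaking recognition.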

The brightness, contrast, and threshold sliders preprocess the image before OCR. Brightness / Contrast help when the image is washed out, dim, or has faint text. Threshold converts the image to pure black-and-white, which is useful for printed documents or screenshots where Tesseract gets confused by colour. Set the Mode dropdown to Manual to use the sliders, Binarize to also apply the threshold, or Auto to let the tool do grayscale + contrast normalisation automatically.
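As a rough illustration of the Manual and Binarize modes, the sketch below applies brightness, contrast, and an optional threshold to raw RGBA pixel data (the layout a canvas `getImageData` call returns). The option names, the slider ranges, and the exact formulas are assumptions; the tool's real preprocessing, including its Auto mode, may differ.

```typescript
// Hypothetical options; the ranges are assumed, not taken from the tool.
interface PreprocessOptions {
  brightness: number; // additive offset in gray levels, e.g. -100..100
  contrast: number;   // percent change around mid-gray, e.g. -100..100
  threshold?: number; // 0..255; when set, output is pure black or white
}

// Convert RGBA pixels to adjusted grayscale, optionally binarized.
function preprocess(rgba: Uint8ClampedArray, opts: PreprocessOptions): Uint8ClampedArray {
  const out = new Uint8ClampedArray(rgba.length);
  const factor = 1 + opts.contrast / 100; // 0 means "no change"

  for (let i = 0; i + 3 < rgba.length; i += 4) {
    // Perceptual grayscale using the common BT.601 luma weights.
    const luma = 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
    // Stretch around mid-gray (128), then shift by the brightness offset.
    let value = (luma - 128) * factor + 128 + opts.brightness;
    if (opts.threshold !== undefined) {
      value = value >= opts.threshold ? 255 : 0; // binarize
    }
    // Uint8ClampedArray clamps to 0..255 and rounds on assignment.
    out[i] = value;
    out[i + 1] = value;
    out[i + 2] = value;
    out[i + 3] = rgba[i + 3]; // keep alpha untouched
  }
  return out;
}
```

Binarizing after the brightness and contrast adjustment means the sliders shift which pixels end up on each side of the threshold.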

The available output formats are Plain text (default), Lines with confidence percentages, Markdown (paragraphs separated by blank lines), TSV (line / confidence / bounding box / text, ready to paste into Excel or a database), and JSON (full structured output with bounding boxes per line). The download button picks the right file extension automatically.
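To make two of these formats concrete, here is a sketch of how recognized lines might be serialized as TSV and as lines with confidence. The `RecognizedLine` type, the column names, and the rounding are assumptions; only the column order (line, confidence, bounding box, text) follows the description above.

```typescript
// Hypothetical line shape; x0/y0/x1/y1 are the box corners in pixels.
interface RecognizedLine {
  text: string;
  confidence: number; // assumed percentage, 0-100
  bbox: { x0: number; y0: number; x1: number; y1: number };
}

// TSV: one header row, then line number, confidence, box, and text.
function toTsv(lines: RecognizedLine[]): string {
  const header = ["line", "confidence", "x0", "y0", "x1", "y1", "text"].join("\t");
  const rows = lines.map((line, index) => {
    // Tabs or newlines inside the text would break the column layout.
    const safeText = line.text.replace(/[\t\r\n]+/g, " ");
    const { x0, y0, x1, y1 } = line.bbox;
    return [index + 1, line.confidence.toFixed(1), x0, y0, x1, y1, safeText].join("\t");
  });
  return [header, ...rows].join("\n");
}

// "Lines with confidence": the text followed by a rounded percentage.
function toLinesWithConfidence(lines: RecognizedLine[]): string {
  return lines
    .map((line) => `${line.text} (${Math.round(line.confidence)}%)`)
    .join("\n");
}
```

Putting the text in the last TSV column keeps the numeric columns aligned even when a line contains unusual characters.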

Your image is not uploaded anywhere. Tesseract.js runs as a WebAssembly worker entirely inside your browser. Image bytes and recognized text never leave your device.

The language picker offers 30+ languages, although Tesseract supports 100+. We surface the most-requested languages: English, Spanish, French, German, Italian, Portuguese, Russian, Chinese (simplified & traditional), Japanese, Korean, Arabic, Hebrew, Hindi, Vietnamese, and many more. Each pack is ~3–10 MB and downloads on demand the first time you pick it.

Tesseract is most accurate on clean, high-contrast, well-aligned text. Screenshots and scanned documents typically achieve 95–99% accuracy. Photos of street signs or handwriting are much harder and may need manual cleanup. The Image preprocessing card can usually clean up a marginal image enough to make it readable.

Tesseract is trained primarily on typeset text and struggles with cursive handwriting. Print handwriting works moderately well; cursive usually does not. For handwriting-specific OCR, consider a specialised tool.

About

About browser-based OCR

This tool runs entirely in your browser: no upload, no server-side processing, no privacy compromise. It uses Tesseract.js, the JavaScript / WebAssembly port of the Tesseract OCR engine, which was originally developed at Hewlett-Packard starting in the 1980s, open-sourced in 2005, and then sponsored by Google for many years.

How it works

  1. Your image is decoded by the browser and (optionally) preprocessed to grayscale with contrast normalization.
  2. Tesseract.js spins up a Web Worker, loads the WASM core (~5 MB, cached) and the language data file (~3–10 MB per language, cached).
  3. The image is analyzed for text regions, segmented into lines and words, then run through a recurrent neural net trained on each language.
  4. The result is returned with per-character, per-word, and per-line confidence scores.
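The worker lifecycle in the steps above (create a worker, recognize, read the result, clean up) can be sketched as follows. To keep the example self-contained, the worker is passed in through a small interface. `OcrWorkerLike`, `WorkerFactory`, and `extractText` are hypothetical names that mirror only the parts of a Tesseract.js worker used here (`recognize` and `terminate`), and this is not the tool's actual code.

```typescript
// Minimal structural types mirroring the worker calls this sketch relies on.
interface OcrResult {
  data: { text: string; confidence: number };
}
interface OcrWorkerLike {
  recognize(image: string): Promise<OcrResult>;
  terminate(): Promise<unknown>;
}
type WorkerFactory = (lang: string) => Promise<OcrWorkerLike>;

// Run one recognition and always release the worker afterwards.
async function extractText(
  createWorker: WorkerFactory,
  image: string, // e.g. a data URL or object URL for the decoded image
  lang: string   // language code such as "eng"
): Promise<{ text: string; confidence: number }> {
  const worker = await createWorker(lang); // loads the core and language data
  try {
    const { data } = await worker.recognize(image);
    return { text: data.text.trim(), confidence: data.confidence };
  } finally {
    await worker.terminate(); // free the Web Worker even if recognition fails
  }
}

// In a browser with Tesseract.js v5 or later (an assumption), the factory
// could be as small as: (lang) => createWorker(lang)
```

A real page would more likely keep one worker alive across runs so the cached language data is not reloaded; terminating after each call simply keeps the sketch short.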

Accuracy reality check

  • Clean digital text (screenshots, exported PDFs): 99%+ usually.
  • Scanned documents: 95–99%, depending on scan quality.
  • Photos of printed text (signs, menus): 85–95% on a clean shot, lower at angles.
  • Handwriting: print = 60–85%, cursive = often unusable.
  • Stylized fonts / logos: hit-or-miss.

Privacy

Everything happens in your browser. The image is read with FileReader, processed inside a Web Worker, and the recognized text appears in your DOM. The only network requests are the initial library and language pack downloads, which carry none of your image data and no information about you.
