Image to Text (OCR)

Extract text from screenshots, scans, and photos in 10 languages. Powered by tesseract.js, running entirely in your browser.

  • [FREE]
  • [NO_SIGNUP]
  • [NO_UPLOAD]

An image-to-text (OCR) tool extracts the written words from screenshots, scans, and photos of printed text. This one runs tesseract.js in your browser — no upload, no signup, ten languages.

When OCR works well

  • Clean screenshots of code, terminals, error dialogs — almost always >95% accurate.
  • Scanned documents at 300 DPI or better — book pages, contracts, receipts.
  • Phone photos of printed text, shot straight-on with good lighting and tight crop.
  • PDFs converted to PNG — extract one page at a time.

When OCR struggles

  • Handwriting — tesseract is trained on printed fonts; handwriting needs a different model.
  • Low-resolution photos — anything under ~150 DPI degrades fast.
  • Text on busy backgrounds — magazine ads, product packaging, decorated invitations.
  • Skewed or rotated text — straighten the image first (any photo app).
  • Stylized fonts — script, blackletter, distressed/grunge fonts confuse the model.
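
The DPI thresholds in these lists are easier to apply with a quick estimate. The sketch below assumes you know the physical width of the scanned or photographed page. `effectiveDpi` and `dpiVerdict` are hypothetical helper names, not part of this tool, and the "borderline" label for the 150–300 DPI band is an assumption.

```javascript
// Hypothetical helpers, not part of the tool: estimate scan resolution.
// DPI = pixels across the page / physical width in inches.
function effectiveDpi(pixelWidth, physicalWidthInches) {
  if (!(pixelWidth > 0) || !(physicalWidthInches > 0)) {
    throw new RangeError("both widths must be positive numbers");
  }
  return pixelWidth / physicalWidthInches;
}

// Thresholds from the lists above: 300 DPI or better works well,
// under ~150 DPI degrades fast. The middle label is an assumption.
function dpiVerdict(dpi) {
  if (dpi >= 300) return "good";
  if (dpi < 150) return "too low";
  return "borderline";
}

// A US Letter page (8.5 in wide) captured 2550 px wide is 300 DPI.
console.log(dpiVerdict(effectiveDpi(2550, 8.5))); // "good"
```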

When in doubt, look at the confidence percentage in the result. Above 90% usually matches character-for-character. Below 70% means you should proofread.
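
As a rough illustration, the confidence rule above could be written as a small function. `confidenceAdvice` is a hypothetical name, and the "spot-check" label for the 70–90% band is an assumption, since the text only describes the two ends of the range.

```javascript
// Hypothetical helper: map tesseract's confidence (0-100) to an action.
function confidenceAdvice(confidence) {
  if (confidence > 90) return "likely exact"; // usually character-for-character
  if (confidence < 70) return "proofread";    // expect errors
  return "spot-check";                        // assumed label for 70-90
}

console.log(confidenceAdvice(95)); // "likely exact"
console.log(confidenceAdvice(62)); // "proofread"
```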

Languages

Code      Language                Model size
eng       English                 ~11 MB
spa       Spanish                 ~7 MB
fra       French                  ~6 MB
deu       German                  ~8 MB
por       Portuguese              ~5 MB
ita       Italian                 ~6 MB
nld       Dutch                   ~5 MB
jpn       Japanese                ~13 MB
chi_sim   Chinese (Simplified)    ~14 MB
chi_tra   Chinese (Traditional)   ~14 MB

Each model is the LSTM-trained Tesseract variant from the official tessdata repository. A model is downloaded once per language and then cached by your browser.

How the recognition works

  1. The browser decodes your image into RGBA pixels via createImageBitmap.
  2. tesseract.js spawns a Web Worker for off-main-thread processing.
  3. The worker loads the WebAssembly Tesseract engine (~4 MB, cached after first load).
  4. The worker loads the trained model for your chosen language (5–14 MB, cached after first load per language).
  5. Tesseract runs page-segmentation → line detection → word recognition → LSTM character classification.
  6. The worker returns the full plain-text result + a confidence score.

Total time: 1–3 seconds for a typical screenshot after the first download. The first call takes 10–30 seconds because of the model download (network-bound, not CPU-bound).
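
Steps 2 to 6 above can be sketched in a few lines. This is a minimal sketch, not this tool's actual source: it assumes a tesseract.js release whose `createWorker(lang)` returns a promise for a worker with `recognize` and `terminate` methods (the v5-style API). `recognizeText` is a hypothetical name, and `createWorker` is passed in as a parameter so the function does not depend on how the library is loaded.

```javascript
// Sketch of steps 2-6. `createWorker` is injected; in a page it would be
// the `createWorker` export of tesseract.js (v5-style signature assumed).
async function recognizeText(createWorker, image, lang = "eng") {
  // Steps 2-4: spawn the Web Worker, load the WASM core and language model.
  const worker = await createWorker(lang);
  try {
    // Step 5: run recognition on the image.
    const { data } = await worker.recognize(image);
    // Step 6: plain text plus the overall confidence score.
    return { text: data.text, confidence: data.confidence };
  } finally {
    // Free the worker even if recognition throws.
    await worker.terminate();
  }
}
```

In a page that has loaded tesseract.js, a call might look like `recognizeText(createWorker, file, "spa")`, where `file` is the image the user selected. Creating and terminating a worker per call keeps the sketch simple; reusing one worker across calls would avoid repeating the engine start-up.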

Tips for better results

  • Crop tight to the text region. Background noise hurts.
  • Increase contrast before OCR if the image is washed out. Most photo apps have a one-click “auto enhance”.
  • Convert to grayscale for clean printed text. Color rarely helps; busy color backgrounds hurt.
  • De-skew the image if it was photographed at an angle. Even 5° of rotation reduces accuracy.
  • Use the right language. The English model can read Spanish text, but less accurately than the spa model, and it cannot output accented characters that fall outside the eng character set.
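
The grayscale and contrast tips above can also be applied in code before recognition. The following is a sketch of one simple approach (Rec. 601 grayscale followed by a linear contrast stretch), not something this tool is known to do internally. `toStretchedGrayscale` is a hypothetical name, and the input is assumed to be a flat RGBA byte array such as `ImageData.data` from a canvas.

```javascript
// Hypothetical preprocessing sketch: grayscale plus linear contrast stretch.
// `rgba` is a flat array of R,G,B,A bytes, as in ImageData.data.
function toStretchedGrayscale(rgba) {
  const pixelCount = Math.floor(rgba.length / 4);
  const luma = new Float64Array(pixelCount);
  let min = 255;
  let max = 0;
  for (let i = 0; i < pixelCount; i++) {
    // Rec. 601 luma weights.
    const y = 0.299 * rgba[4 * i] + 0.587 * rgba[4 * i + 1] + 0.114 * rgba[4 * i + 2];
    luma[i] = y;
    if (y < min) min = y;
    if (y > max) max = y;
  }
  const range = max - min;
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i < pixelCount; i++) {
    // Stretch so the darkest pixel maps to 0 and the brightest to 255.
    // A flat image (range 0) keeps its gray level unchanged.
    const v = range === 0
      ? Math.round(luma[i])
      : Math.round(((luma[i] - min) / range) * 255);
    out[4 * i] = v;
    out[4 * i + 1] = v;
    out[4 * i + 2] = v;
    out[4 * i + 3] = rgba[4 * i + 3]; // keep alpha unchanged
  }
  return out;
}
```

In a browser, the input would typically come from `ctx.getImageData(...)` on a 2D canvas, and the result can be written back with `ctx.putImageData(...)` before handing the canvas to the OCR step.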

Privacy

Static HTML page → small JavaScript bundle → tesseract.js worker downloaded from jsdelivr → trained model downloaded from jsdelivr → all OCR runs inside your browser tab. The Network tab in DevTools shows what gets fetched: the tesseract.js worker.min.js, tesseract-core-simd.wasm, and the .traineddata file for your chosen language. None of these requests uploads your image; your image bytes never leave your device.

How it compares

                     bytefork.tools   onlineocr.net    i2ocr.com
Runs in browser      ✓                ✗ (uploads)      ✗ (uploads)
Multiple languages   ✓ 10             ✓ 46             ✓ 100+
Sign-in required     ✗                for >15 docs
Free tier limit      unlimited        15/hr            4 MB file size
Ad-free              ✓
Output as .txt

Frequently asked questions

How accurate is browser OCR?

For clean printed text (screenshots, scanned books, receipts shot straight-on), tesseract.js reaches 85–98% character accuracy at ~300 DPI. For low-resolution photos, skewed angles, handwriting, or text on patterned backgrounds, accuracy drops fast, sometimes below 50%. The confidence percentage in the result is tesseract's own estimate. Below 70%, expect to proofread the output by hand.

Which languages are supported?

The ten preset languages cover seven widely used European languages plus Japanese and both Chinese variants: English, Spanish, French, German, Portuguese, Italian, Dutch, Japanese, Chinese Simplified, Chinese Traditional. The Tesseract project ships ~120 trained language models in total; this tool surfaces the ten with the broadest demand. Need a different language? Open an issue.

Does my image get uploaded?

No. Tesseract runs as a Web Worker inside your browser tab. The image bytes are passed to the worker via structured-clone (in-memory). The Network tab in DevTools shows zero requests with your image data. The trained-model download (5–14 MB depending on language) is fetched once from the jsdelivr CDN — that download contains the OCR neural network weights, not your image.

Why is the first recognition slow?

The very first recognition triggers three downloads: the tesseract.js worker JavaScript (~200 KB), the WebAssembly core (~4 MB), and the language model for your chosen language (5–14 MB). These are cached by your browser, so subsequent recognitions in the same language take 1–3 seconds depending on image size and complexity. Switching to a new language re-triggers only the language-model download.
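
The download arithmetic can be made concrete with the approximate sizes quoted in this document. This is a back-of-the-envelope sketch: `downloadMb` is a hypothetical helper, and real transfer sizes depend on compression and on the library version.

```javascript
// Approximate sizes quoted in this document, in MB.
const WORKER_JS_MB = 0.2; // ~200 KB worker script
const WASM_CORE_MB = 4;   // ~4 MB WebAssembly core
const MODEL_MB = {
  eng: 11, spa: 7, fra: 6, deu: 8, por: 5,
  ita: 6, nld: 5, jpn: 13, chi_sim: 14, chi_tra: 14,
};

// Hypothetical helper: MB still to download, given what is already cached.
function downloadMb(lang, cachedLangs = [], engineCached = false) {
  if (!Object.prototype.hasOwnProperty.call(MODEL_MB, lang)) {
    throw new RangeError(`unknown language: ${lang}`);
  }
  const engine = engineCached ? 0 : WORKER_JS_MB + WASM_CORE_MB;
  const model = cachedLangs.includes(lang) ? 0 : MODEL_MB[lang];
  return engine + model;
}

// First ever run in English: worker + core + eng model, roughly 15 MB.
console.log(downloadMb("eng").toFixed(1));
// Switching to Spanish afterwards: only the spa model is fetched.
console.log(downloadMb("spa", ["eng"], true)); // 7
```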

Why is my recognized text wrong / garbled?

Common causes:

  1. Image resolution too low (~150 DPI or worse): re-scan at 300 DPI.
  2. Wrong language selected (English-trained tesseract reads Spanish badly).
  3. Photo taken at an angle: straighten it.
  4. Text on a busy background: crop tighter to the text.
  5. Handwriting: tesseract is for printed text; handwriting needs a specialized model not bundled here.
  6. Stylized fonts (script, blackletter) confuse the model.

Can I get the bounding boxes of recognized words?

This UI reports only block-level confidence. Word-level bounding boxes are available in the raw tesseract.js result object; if you need them, fork the source. For now, this tool focuses on plain-text extraction.
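
For anyone who does fork the source, word boxes might be pulled out of the raw result along these lines. The sketch assumes the result exposes `data.words` as an array of `{ text, confidence, bbox: { x0, y0, x1, y1 } }`. That shape is an assumption to check against the tesseract.js version in use, since some releases may nest words differently or require opting in to this output. `wordBoxes` is a hypothetical name.

```javascript
// Hypothetical helper: convert word entries into x/y/width/height boxes.
// Assumes data.words = [{ text, confidence, bbox: { x0, y0, x1, y1 } }, ...].
function wordBoxes(data, minConfidence = 0) {
  return (data.words || [])
    .filter((w) => w.confidence >= minConfidence)
    .map((w) => ({
      text: w.text,
      x: w.bbox.x0,
      y: w.bbox.y0,
      width: w.bbox.x1 - w.bbox.x0,
      height: w.bbox.y1 - w.bbox.y0,
    }));
}
```

Passing a `minConfidence` of 70 would drop the words that the confidence guidance elsewhere on this page suggests proofreading.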

Will it preserve layout (paragraphs, columns)?

Newlines are preserved at the line level. Multi-column layouts are read left-to-right, which usually produces useful text but does not reconstruct the column boundaries. For magazine-style layouts, screenshot one column at a time for cleaner output.

What image formats are accepted?

PNG, JPG, WebP, GIF, BMP. Decoded by the browser via createImageBitmap before tesseract sees them.

Is this tool really free?

Yes. No signup, no usage limit, no ads. Both tesseract.js and the trained models are licensed under Apache 2.0.