OCR — Extract text from scanned PDFs

Run OCR on image-based PDFs to make them searchable and copyable. Supports 13 languages.

Drop a scanned PDF here to OCR, or click to select one (up to 100 MB).

Note: OCR runs entirely in your browser (Tesseract.js via WebAssembly). The first use downloads the OCR engine and language data; after that, both are cached. Higher-resolution scans give noticeably better accuracy.
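As a rough illustration of the in-browser flow, the sketch below runs recognition over page images one at a time. The `Recognize` type and the `ocrPages` function are hypothetical names, and processing pages sequentially is an assumption rather than something this page documents. In a real browser build, the injected recognizer could wrap a Tesseract.js worker.

```typescript
// Hypothetical recognizer shape: takes one rendered page image and
// resolves to the text found on it. In the browser this could be backed
// by a Tesseract.js worker; here it is injected so the sketch stays
// self-contained.
type Recognize = (pageImage: Uint8Array) => Promise<string>;

// Run OCR over the pages one at a time and collect the text per page.
// Sequential processing is an assumption made to keep peak memory low.
async function ocrPages(
  pages: Uint8Array[],
  recognize: Recognize
): Promise<string[]> {
  const texts: string[] = [];
  for (const page of pages) {
    const text = await recognize(page);
    texts.push(text.trim()); // drop leading/trailing whitespace per page
  }
  return texts;
}
```

Injecting the recognizer also makes the loop easy to exercise with a stub, without downloading the engine or language data.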

Frequently asked questions

How accurate is the OCR?

Accuracy is typically above 90% on clean scans and lower on low-resolution, skewed, or noisy ones. For best results, scan at 300 DPI and keep pages straight.
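One way to check whether a scan meets the 300 DPI guideline is to compare the scanned image's pixel width with the page width. The sketch below assumes the page width is given in PDF points (72 per inch); the function names are hypothetical and not part of this tool.

```typescript
// PDF page sizes are expressed in points; 72 points make one inch.
const POINTS_PER_INCH = 72;

// Effective resolution of a scanned image that fills a page of the
// given width. Both arguments must be positive.
function effectiveDpi(imageWidthPx: number, pageWidthPt: number): number {
  if (!(imageWidthPx > 0) || !(pageWidthPt > 0)) {
    throw new RangeError("image width and page width must be positive");
  }
  const pageWidthInches = pageWidthPt / POINTS_PER_INCH;
  return imageWidthPx / pageWidthInches;
}

// True when the scan reaches the recommended resolution (300 DPI by default).
function meetsDpiGuideline(
  imageWidthPx: number,
  pageWidthPt: number,
  targetDpi: number = 300
): boolean {
  return effectiveDpi(imageWidthPx, pageWidthPt) >= targetDpi;
}
```

For example, a US Letter page is 612 points (8.5 inches) wide, so a scan needs to be at least 2550 pixels wide to reach 300 DPI.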

Can I use multiple languages?

Yes. Pick a combined option such as '繁中 + English' (Traditional Chinese plus English) from the language list. This is useful for bilingual documents.
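Tesseract accepts several languages at once when their codes are joined with "+", for example "chi_tra+eng". The sketch below shows how a menu label might map to such a string. The `LANGUAGE_OPTIONS` table and the function name are hypothetical, and the tool's actual mapping is not documented here.

```typescript
// Hypothetical mapping from menu labels to Tesseract language codes.
// Only the codes themselves (eng, chi_tra) are standard Tesseract names.
const LANGUAGE_OPTIONS: Record<string, string[]> = {
  "English": ["eng"],
  "繁中": ["chi_tra"],
  "繁中 + English": ["chi_tra", "eng"],
};

// Build the "+"-joined language string for a given menu label.
function tesseractLangString(label: string): string {
  const codes = LANGUAGE_OPTIONS[label];
  if (!codes) {
    throw new Error(`Unknown language option: ${label}`);
  }
  return codes.join("+");
}
```

Each language in a combined option needs its own language data, so a bilingual option downloads more on first use than a single language does.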

Why is the first run slow?

The OCR engine (~5 MB) and language data (~2-10 MB per language) are downloaded on first use. Subsequent runs use the browser cache and are much faster.
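The figures above give a quick estimate of the first-run download. The sketch below only does that arithmetic, using the approximate sizes quoted in this answer; the constant and function names are hypothetical.

```typescript
// Approximate sizes quoted in the FAQ, in megabytes.
const ENGINE_MB = 5;
const LANGUAGE_MB_MIN = 2;
const LANGUAGE_MB_MAX = 10;

// Estimated first-run download range for a given number of languages.
function firstRunDownloadMb(languageCount: number): { min: number; max: number } {
  if (!Number.isInteger(languageCount) || languageCount < 1) {
    throw new RangeError("languageCount must be a positive integer");
  }
  return {
    min: ENGINE_MB + LANGUAGE_MB_MIN * languageCount,
    max: ENGINE_MB + LANGUAGE_MB_MAX * languageCount,
  };
}
```

With these numbers, a two-language option would download roughly 9 to 25 MB on first use, and nothing on later runs while the cache is intact.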