What Is OCR and Why Does Your PDF Need It?
When you scan a paper document, your scanner takes a photograph of each page. The resulting PDF contains images — not text. That means you can't search for a word, select a sentence, or copy a phone number out of it.
OCR (Optical Character Recognition) analyzes those images, identifies each character, and embeds real text into the PDF. After OCR, the document behaves like a PDF that was created digitally: you can search it, copy from it, and read it with a screen reader, as far as the characters were recognized correctly.
Common situations where you need OCR: scanned contracts, photographed receipts, old faxes saved as PDF, and any document that was printed and scanned.
How to OCR a PDF (Step by Step)
Upload your scanned PDF
Go to PDF.it's OCR Scanner tool. Drag your file into the upload area or click to browse. Files up to 25MB are free — Pro users can upload up to 200MB.
Select your language
Choose the language your document is written in. Matching the language improves character recognition accuracy, especially for accented characters and non-Latin scripts.
Click 'Run OCR'
PDF.it processes every page and embeds searchable text into your document. Processing time depends on page count — most PDFs finish in under 30 seconds.
Download your searchable PDF
The output looks identical to your original but now has a real text layer. Open it and press Ctrl+F (or Cmd+F on Mac) to confirm you can search for words.
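If you handle a lot of files, you can check the upload limits from step 1 before opening the tool. The sketch below is only an illustration: the function names are made up for this example, and it assumes 1 MB means 1024 × 1024 bytes, which may not be exactly how PDF.it measures file size.

```python
import os

FREE_LIMIT_MB = 25    # free limit quoted in the upload step
PRO_LIMIT_MB = 200    # Pro limit quoted in the upload step
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def upload_tier(size_bytes: int) -> str:
    """Return 'free', 'pro', or 'too_large' for a file of the given size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    size_mb = size_bytes / BYTES_PER_MB
    if size_mb <= FREE_LIMIT_MB:
        return "free"
    if size_mb <= PRO_LIMIT_MB:
        return "pro"
    return "too_large"


def tier_for_file(path: str) -> str:
    """Look up the size of a file on disk and classify it (hypothetical helper)."""
    return upload_tier(os.path.getsize(path))
```

For example, `upload_tier(40 * 1024 * 1024)` returns `"pro"`, because a 40MB scan is above the free limit but within the Pro limit.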
When OCR Works Best
OCR accuracy depends heavily on the quality of your scan. Here's what to expect:
| Document Type | Expected Accuracy |
|---|---|
| Clean typewritten text, 300+ DPI | 98–99% |
| Laser-printed document, standard fonts | 95–98% |
| Low-resolution scan (under 150 DPI) | 70–85% |
| Handwritten text | 50–80% |
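To see what these percentages mean in practice, it helps to turn them into a number of wrong characters per page. The sketch below assumes the figures are character-level accuracy and uses a page of about 2,000 characters, which is an illustrative figure rather than a measurement.

```python
def expected_errors(char_count: int, accuracy: float) -> int:
    """Estimate how many characters OCR gets wrong at a given accuracy.

    accuracy is a fraction between 0 and 1, for example 0.98 for 98%.
    """
    if char_count < 0:
        raise ValueError("char_count must not be negative")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    return round(char_count * (1.0 - accuracy))


PAGE_CHARS = 2000  # assumption: a rough size for one page of text

print(expected_errors(PAGE_CHARS, 0.98))  # clean 300 DPI scan, low end: 40
print(expected_errors(PAGE_CHARS, 0.70))  # low-resolution scan, low end: 600
```

Under that assumption, a clean scan leaves a few dozen characters to fix per page, while a low-resolution scan can leave several hundred, which is usually enough to break searches for names and numbers.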
Tips to Get Better OCR Results
- ✓ Scan at 300 DPI minimum. Most scanner apps default to 150 or 200 DPI, so change the setting before scanning.
- ✓ Keep pages straight. Tilted pages confuse OCR engines. If your scan is crooked, use Rotate PDF to correct it first.
- ✓ Use Phone Scan Cleanup first. If you photographed the document with your phone, run it through Phone Scan Cleanup to remove shadows and improve contrast before OCR.
- ✓ Select the correct language. OCR engines use language models to guess ambiguous characters, so the right language setting improves accuracy significantly.
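If you are not sure what resolution a scan was made at, you can estimate it from the image's pixel width and the paper size. The sketch below is a rough check with made-up function names. It assumes the image covers the full width of the page, and the 8.5 inch width is US Letter (use about 8.27 for A4).

```python
def effective_dpi(pixel_width: int, page_width_inches: float) -> float:
    """Estimate scan resolution as pixels per inch of paper width."""
    if page_width_inches <= 0:
        raise ValueError("page_width_inches must be positive")
    return pixel_width / page_width_inches


def meets_ocr_minimum(pixel_width: int,
                      page_width_inches: float,
                      minimum_dpi: float = 300.0) -> bool:
    """Check a scan against the 300 DPI minimum recommended in the tips."""
    return effective_dpi(pixel_width, page_width_inches) >= minimum_dpi


LETTER_WIDTH_IN = 8.5  # US Letter paper width in inches

print(effective_dpi(2550, LETTER_WIDTH_IN))      # 300.0
print(meets_ocr_minimum(1275, LETTER_WIDTH_IN))  # False (about 150 DPI)
```

A page image that is only 1,275 pixels wide works out to about 150 DPI on Letter paper, so it is worth rescanning at a higher setting before running OCR.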
OCR vs. PDF to TXT — What's the Difference?
Both tools extract text, but they work differently:
- OCR Scanner: For scanned PDFs (images). Analyzes pixel patterns to identify text. Output is a searchable PDF with the original formatting intact.
- PDF to TXT: For digital PDFs with existing text layers. Extracts the text directly from the file, which is faster and avoids recognition errors because no image recognition is needed.
Not sure which you have? Try selecting and copying text from your PDF. If you can't highlight any words, it's a scanned image and you need OCR. See When to Use OCR vs. PDF to TXT for a full breakdown.
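The same copy test can be automated when you have many files to sort. The sketch below is a heuristic, not how PDF.it decides: it takes the text already extracted from each page and flags the file as scanned when most pages come back nearly empty. The function name and the 20-character threshold are assumptions for illustration. The page texts can come from any PDF library; with pypdf, for example, you would call `extract_text()` on each page of a `PdfReader`.

```python
from typing import Sequence


def looks_scanned(page_texts: Sequence[str], min_chars_per_page: int = 20) -> bool:
    """Return True when more than half of the pages have almost no text.

    page_texts holds the text extracted from each page, in order.
    min_chars_per_page is an illustrative threshold, not a standard value.
    """
    if not page_texts:
        raise ValueError("page_texts must contain at least one page")
    nearly_empty = sum(
        1 for text in page_texts if len(text.strip()) < min_chars_per_page
    )
    return nearly_empty > len(page_texts) / 2


# A scanned file: the extractor finds nothing on either page.
print(looks_scanned(["", "   "]))  # True

# A digital file: the page already has a real text layer.
print(looks_scanned(["Invoice 1042, due within 30 days of receipt."]))  # False
```

Files flagged `True` are candidates for OCR Scanner, and the rest can go straight to PDF to TXT.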