OCR Receipts and Invoices — Extract Text for Accounting

Scanned receipts and invoices are just images — their numbers and vendor names are not copyable. Here is how to run OCR and turn them into text you can use in any accounting software.

OCR your scanned receipts and invoices in seconds — Pro feature, 30-day free trial.

What Is OCR for Receipts and Invoices?

When you scan a paper receipt or invoice, the resulting PDF is an image — your computer sees pixels, not text. That means you cannot search it, copy an amount from it, or import it into QuickBooks, Xero, or any other accounting software without retyping everything by hand.

OCR (Optical Character Recognition) reads the image and converts each printed character into real, selectable text. After running OCR, the PDF looks identical but now contains a hidden text layer — every vendor name, date, line item, and total becomes copyable and searchable. This is the first step in any paperless accounting workflow.

  1. Expense reports. Copy receipt amounts directly into your expense report instead of squinting at a faded thermal printout and typing numbers manually.
  2. Accounts payable. OCR makes the invoice numbers, vendor names, amounts, and due dates on scanned supplier invoices copyable, which cuts manual data entry and the errors that come with it.
  3. Tax preparation. Accountants and bookkeepers scan boxes of receipts at year-end. OCR makes every document searchable by vendor, date, or amount, so finding the Home Depot receipt from March takes seconds, not 20 minutes.
  4. Audit trails. Financial auditors need to reference source documents quickly. Searchable PDFs make those documents fast to locate and save hours of manual retrieval.
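
Once a text layer exists, pulling individual fields out of it is ordinary text processing. The following is a minimal sketch in Python, assuming the OCR text has already been copied out of the PDF. The sample text, the field labels, and the `extract_fields` helper are hypothetical; real invoices label these fields in many ways, so the patterns would need tuning per vendor.

```python
import re
from decimal import Decimal

# Hypothetical patterns for illustration; adjust them to the invoices you receive.
PATTERNS = {
    "invoice_number": re.compile(r"Invoice\s*(?:Number|No\.?|#)\s*:?\s*([A-Z0-9-]+)", re.I),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2}|\d{1,2}/\d{1,2}/\d{2,4})\b"),
    # Anchored to the start of a line so that "Subtotal" is not mistaken for "Total".
    "total": re.compile(r"^\s*(?:Grand\s+)?Total\s*:?\s*\$?\s*([\d,]+\.\d{2})\s*$", re.I | re.M),
}

def extract_fields(ocr_text: str) -> dict:
    """Return the first match for each field, or None when a field is not found."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    if fields["total"] is not None:
        # Strip thousands separators and keep the amount as an exact decimal.
        fields["total"] = Decimal(fields["total"].replace(",", ""))
    return fields

SAMPLE = """ACME Supplies Ltd.
Invoice No: INV-2041
Date: 2024-03-14
Subtotal: $1,180.00
Tax: $94.40
Total: $1,274.40
"""

if __name__ == "__main__":
    print(extract_fields(SAMPLE))
```

For the sample text, the helper returns the invoice number INV-2041, the date 2024-03-14, and a total of 1274.40. Amounts are kept as `Decimal` rather than `float` so that later sums do not pick up rounding noise.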

For a broader introduction to how OCR works, see our guide on What Is OCR.

How to OCR Receipts and Invoices (Step by Step)

1. Scan or photograph the receipt or invoice

Use a flatbed scanner at 300 DPI, or photograph the document with your phone. Save it as a PDF. For phone scans, run the file through Phone Scan Cleanup first to improve contrast and remove shadows.

2. Upload to OCR Scanner and run OCR

Open PDF.it's OCR Scanner tool, upload your scanned PDF, select the document language, and click the OCR button to start text recognition.

3. Copy or export the extracted text

Download your searchable PDF. Open it and use Ctrl+F or Cmd+F to search for amounts, vendor names, or dates. Convert to Excel or Word for direct import into your accounting software.
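
If your accounting software accepts CSV imports, the copied values can be written out in that form. This is a rough sketch only: the column names and the `to_csv` helper are assumptions made for illustration, and each accounting package defines its own import template, so check that template before relying on any particular layout.

```python
import csv
import io

# Hypothetical column layout; match it to your accounting software's import template.
COLUMNS = ["date", "vendor", "invoice_number", "total"]

def to_csv(rows: list[dict]) -> str:
    """Serialize receipt and invoice fields to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Example values as they might be copied from two OCR'd documents.
rows = [
    {"date": "2024-03-14", "vendor": "ACME Supplies Ltd.", "invoice_number": "INV-2041", "total": "1274.40"},
    {"date": "2024-03-02", "vendor": "Home Depot", "invoice_number": "", "total": "58.17"},
]

if __name__ == "__main__":
    print(to_csv(rows))
```

Receipts often have no invoice number, so that column is left empty rather than filled with a placeholder.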

Manual Entry vs. OCR vs. Native Digital Invoice

Method             | Time per Document       | Error Risk                   | Searchable
Manual data entry  | 3–10 minutes            | High (typos, transpositions) | No
OCR (scanned PDF)  | Under 30 seconds        | Low (verify totals)          | Yes
Native digital PDF | Instant (no OCR needed) | None                         | Yes

If a supplier sends you a PDF invoice by email that was generated by their software (not scanned), it already has selectable text. Run OCR only on documents that started as paper or were photographed.
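
One way to decide automatically whether a file needs OCR is to look at how much text each page already yields. The sketch below assumes you have extracted per-page text with whatever PDF library you use; the `needs_ocr` helper and its 20-character threshold are illustrative choices, not a standard.

```python
def needs_ocr(page_texts: list[str], min_chars: int = 20) -> bool:
    """Return True if any page yields too little text to count as a text layer.

    page_texts: text extracted from each page by a PDF library of your choice.
    min_chars: arbitrary threshold chosen for illustration.
    """
    return any(len(text.strip()) < min_chars for text in page_texts)

if __name__ == "__main__":
    digital = ["Invoice INV-2041\nTotal: $1,274.40"]
    scanned = ["", "   "]
    print(needs_ocr(digital), needs_ocr(scanned))
```

A genuinely blank page would also trip this check, so treat a positive result as a prompt to look at the file rather than a verdict.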

Getting the Best Scan Quality for Receipts

Thermal receipt paper — the shiny paper most cash register receipts are printed on — fades within months and is notoriously difficult to photograph cleanly. These tips make a significant difference:

  • Scan thermal receipts within a few weeks of purchase while the print is still dark. Faded receipts reduce OCR accuracy significantly.
  • Use a flatbed scanner at 300 DPI for the most consistent results. Phone cameras introduce perspective distortion and uneven lighting, especially on curled receipts.
  • Place the receipt flat. Press the curled edges down or place a light book on top for 30 seconds before scanning. Shadows from curled edges cause OCR misreads.
  • Run phone-scanned receipts through Phone Scan Cleanup before OCR. This tool automatically improves contrast, removes background shadows, and straightens the image.

For deeper guidance on scan quality, see our OCR Accuracy Tips guide.

Troubleshooting Common OCR Problems

Numbers are being misread (8 becomes 0, 1 becomes I)

This is caused by low scan resolution or a faded original. Rescan at 300 DPI or higher. If you are working from a phone photo, run the file through Phone Scan Cleanup before re-running OCR. Always verify totals against the original before entering them in your accounting software.
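
Verifying totals can be partly automated with simple arithmetic: if the line items plus tax do not add up to the stated total, at least one number was misread. The sketch below is one possible approach, not a feature of any particular tool. The `parse_amount` and `totals_agree` helpers and the look-alike table are assumptions for illustration. Letter-for-digit swaps such as I for 1 can be mapped back, while digit-for-digit swaps such as 8 for 0 can only be flagged by the sum check.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional

# Letters that OCR may confuse with digits inside numeric fields (illustrative list).
DIGIT_LOOKALIKES = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "B": "8"})

def parse_amount(raw: str) -> Optional[Decimal]:
    """Parse an OCR'd amount, mapping look-alike letters to digits first."""
    cleaned = raw.translate(DIGIT_LOOKALIKES).replace("$", "").replace(",", "").strip()
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None

def totals_agree(line_items: list[str], tax: str, total: str) -> bool:
    """True only if every amount parses and line items plus tax equal the total."""
    amounts = [parse_amount(value) for value in [*line_items, tax, total]]
    if any(amount is None for amount in amounts):
        return False
    *items, tax_amount, total_amount = amounts
    return sum(items, Decimal("0")) + tax_amount == total_amount

if __name__ == "__main__":
    print(parse_amount("$1,I80.00"))                          # letter I mapped back to 1
    print(totals_agree(["1,180.00"], "94.40", "1,274.40"))    # amounts are consistent
    print(totals_agree(["1,100.00"], "94.40", "1,274.40"))    # 8 misread as 0 is flagged
```

A passing check does not prove every digit is right, since two errors can cancel out, so it complements the comparison against the original rather than replacing it.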

OCR produced garbled text on part of the page

Garbled output usually means that section of the scan had a shadow, fold, or stain obscuring the text. Check the paper original: if you can read the problem area by eye, the scan was the issue. Rescan with better lighting, or use your phone's built-in document scanner (Notes on iPhone, Google Drive on Android), which applies automatic perspective correction.

The PDF already looks correct but text is still not selectable

Some PDFs are locked with restrictions that prevent text selection even after OCR. Use Unlock PDF to remove the restriction, then re-run OCR Scanner. If the file has no password or restrictions and the text still cannot be selected, the file is simply image-based, and running OCR will add the missing text layer.

Stop Retyping Receipt Data by Hand

Upload any scanned receipt or invoice and get searchable, copyable text in under 30 seconds. Pro plan — 30-day free trial included.

Frequently Asked Questions

What data can OCR extract from a receipt or invoice?

OCR can extract vendor name, date, invoice number, line items, subtotals, tax amounts, and grand totals from scanned receipts and invoices. The text becomes copyable and searchable, so you can paste values directly into accounting software like QuickBooks, Xero, or Excel.

Does OCR work on crumpled or faded receipts?

OCR works best on clean, flat receipts with clear print. Crumpled or faded receipts reduce accuracy. To improve results, flatten the receipt, photograph it in good lighting, and use PDF.it's Phone Scan Cleanup tool before running OCR to improve contrast and remove shadows.

Can I OCR a batch of invoices at once?

Yes. PDF.it Pro includes batch processing, so you can upload multiple scanned invoice PDFs and run OCR on all of them in one session. This is significantly faster than processing each file one by one.

Is OCR accurate enough to trust for accounting?

Modern OCR on clean, high-resolution scans typically achieves 95–99% accuracy on printed text. You should always verify totals and amounts before posting entries to your accounting software. OCR eliminates most manual typing — a final review takes seconds rather than minutes.
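
A short calculation shows why the final review still matters at these accuracy levels. The sketch below assumes the percentage is a per-character accuracy and that errors are independent of each other; both are simplifications, and the function names are made up for this example.

```python
def prob_field_correct(char_accuracy: float, field_length: int) -> float:
    """Chance that every character in a field is read correctly,
    assuming independent per-character errors (a simplification)."""
    return char_accuracy ** field_length

def expected_char_errors(char_accuracy: float, total_chars: int) -> float:
    """Expected number of misread characters in a document of total_chars characters."""
    return (1 - char_accuracy) * total_chars

if __name__ == "__main__":
    # "1274.40" is a 7-character amount.
    print(round(prob_field_correct(0.99, 7), 3))   # about 0.932
    print(round(prob_field_correct(0.95, 7), 3))   # about 0.698
    print(round(expected_char_errors(0.99, 300), 1))  # about 3 errors in 300 characters
```

Under these assumptions, even at 99% a 7-character total comes out fully correct only about 93% of the time, and at 95% only about 70% of the time.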

What resolution should I scan receipts at for best OCR results?

Scan receipts and invoices at 300 DPI or higher. Below 200 DPI, small fonts on thermal receipt paper become difficult for OCR to recognize accurately. Many flatbed scanners offer 300 DPI as a standard setting, which is ideal.
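
For phone photos, where no DPI setting exists, the same guideline can be expressed in pixels. The sketch below converts a paper width to the pixel width needed at a given DPI, and back. The 80 mm receipt width is only an example value, the helper names are hypothetical, and the calculation assumes the document spans the full measured pixel width.

```python
MM_PER_INCH = 25.4

def pixels_needed(width_mm: float, dpi: int = 300) -> int:
    """Pixel width required to capture paper of the given width at the given DPI."""
    return round(width_mm / MM_PER_INCH * dpi)

def effective_dpi(pixel_width: int, width_mm: float) -> float:
    """Resolution achieved when pixel_width pixels span width_mm of paper."""
    return pixel_width / (width_mm / MM_PER_INCH)

if __name__ == "__main__":
    print(pixels_needed(80))        # 80 mm wide receipt at 300 DPI
    print(pixels_needed(210))       # A4 page width at 300 DPI
    print(effective_dpi(600, 80))   # receipt cropped to 600 px wide
```

An 80 mm receipt needs about 945 pixels across to reach 300 DPI. The same receipt cropped to 600 pixels works out to 190.5 DPI, which is under the 200 DPI floor mentioned above.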

Can I OCR an invoice PDF that was emailed to me?

If the PDF was sent as a digital invoice (created directly from software), it likely already contains selectable text and does not need OCR. If it was scanned and attached as an image-based PDF, then yes — run it through OCR Scanner to add a searchable text layer.