What Is OCR? How to Extract Text from Scans and PDFs

A scanned contract, a photo of a whiteboard, a PDF that won't let you select anything — to a computer, these are just grids of pixels. OCR (Optical Character Recognition) is the technology that looks at those pixels and recovers the actual text, so you can search it, copy it, edit it, or feed it to other software.

This guide explains how modern OCR works, what determines whether you get clean text or garbage, which OCR tools are already built into devices you own, and when an online OCR tool is the right call.

OCR in one minute

OCR converts images of text — scans, photos, screenshots, image-only PDFs — into machine-readable characters. The distinction matters most with PDFs, which come in two kinds that look identical: digitally created PDFs (exported from Word, for instance) already contain real text you can select, while scanned PDFs contain only a photograph of each page. The quick test: try to select a word. If you can't, and searching finds nothing, you need OCR.
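The "try to select a word" test can also be run programmatically. The sketch below is one possible approach, not a standard method: it assumes the third-party pypdf package for reading the PDF, and the helper names and the 20-character threshold are illustrative choices.

```python
from typing import Iterable, List


def page_texts_from_pdf(path: str) -> List[str]:
    """Return the extractable text of each page (empty string if none).

    Assumes the third-party pypdf package is installed. It is imported
    lazily so the helper below can be used without it.
    """
    from pypdf import PdfReader  # assumption: installed via `pip install pypdf`

    reader = PdfReader(path)
    return [(page.extract_text() or "") for page in reader.pages]


def pages_needing_ocr(page_texts: Iterable[str], min_chars: int = 20) -> List[int]:
    """Return 1-based page numbers whose text layer looks empty.

    min_chars is an arbitrary illustrative threshold, not a standard value.
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len("".join(text.split())) < min_chars  # ignore whitespace
    ]


if __name__ == "__main__":
    sample = ["This page has a real, selectable text layer.", "", "  \n "]
    print(pages_needing_ocr(sample))
```

In practice you would pass `page_texts_from_pdf("scan.pdf")` (a hypothetical file name) to `pages_needing_ocr`. A PDF where every page is flagged is almost certainly an image-only scan.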

OCR output typically takes one of two forms: plain extracted text you can paste anywhere, or a 'searchable PDF' where an invisible text layer is placed behind the original scan — the page still looks like the scan, but search, select, and copy all work.

How modern OCR actually works

Classic OCR matched character shapes against templates; modern engines are neural networks. Tesseract — the open-source engine that powers a large share of OCR tools, originally developed at HP and later maintained by Google — switched to an LSTM neural network in version 4, reading whole lines in context rather than isolated characters. The pipeline looks roughly like this:

  1. Preprocessing

     The image is converted toward clean black-on-white: deskewed (rotation corrected), denoised, and binarized. Most recognition failures are really preprocessing failures.

  2. Layout analysis

     The engine segments the page into blocks (columns, paragraphs, tables, images) and determines reading order.

  3. Recognition

     A neural network reads each text line and predicts character sequences, using language models to resolve ambiguity ('cl' vs 'd', 'rn' vs 'm').

  4. Output assembly

     Recognized text is assembled with position data, producing plain text, structured formats, or a searchable text layer for a PDF.
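To make the binarization part of the preprocessing step concrete, here is a minimal sketch of Otsu's method, a common way to pick a black/white threshold from the image's own histogram. It is a pure-Python illustration on a tiny grayscale grid, not a description of what any particular engine does internally. Real pipelines work on full images with optimized libraries.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Pick the 0-255 threshold that best separates dark from light pixels."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for level in range(256):
        weight_dark += histogram[level]
        if weight_dark == 0:
            continue  # no pixels at or below this level yet
        weight_light = total - weight_dark
        if weight_light == 0:
            break  # every pixel is already in the dark class
        sum_dark += level * histogram[level]
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance: large when the two groups are far apart.
        variance = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


def binarize(image: List[List[int]]) -> List[List[int]]:
    """Map each grayscale pixel to 0 (ink) or 255 (paper)."""
    flat = [value for row in image for value in row]
    threshold = otsu_threshold(flat)
    return [[255 if value > threshold else 0 for value in row] for row in image]


if __name__ == "__main__":
    tiny_scan = [[30, 40, 200], [35, 210, 220]]  # dark ink on light paper
    print(binarize(tiny_scan))
```

A single global threshold like this struggles when a shadow falls across the page, which is one reason uneven lighting hurts OCR so much.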

What determines accuracy

On a clean, well-lit scan of printed text, modern OCR routinely exceeds 99% character accuracy. On a skewed phone photo of a crumpled receipt, it can fall apart. The factors that matter, in rough order:

  • Resolution: ~300 DPI is the standard recommendation for scans. Below ~200 DPI, characters lose the detail engines need; far above 400 DPI adds little.
  • Contrast and lighting: dark, even text on a light background. Shadows across a phone photo are a top cause of garbage output.
  • Skew and perspective: straight-on scans beat angled photos. Many scanning apps auto-correct perspective — let them.
  • Fonts and layout: standard printed fonts recognize nearly perfectly; dense tables, multi-column layouts, and stylized type are harder.
  • Language settings: telling the engine the document's language lets its language model fix ambiguous characters. Good tools support dozens — FileMorf's OCR supports 14, including English, Spanish, French, German, Japanese, and Chinese.
  • Handwriting: a different problem, often called ICR (intelligent character recognition). Neat print handwriting partially works; cursive generally needs specialized services and still disappoints.
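The resolution guideline above is easy to check with arithmetic: effective DPI is just the pixel count along one edge divided by the physical length of that edge in inches. The sketch below applies the article's rough 200 and 400 DPI boundaries. The function names and verdict wording are illustrative.

```python
def effective_dpi(pixels: int, inches: float) -> float:
    """Pixels along one edge divided by the paper length of that edge."""
    return pixels / inches


def dpi_verdict(dpi: float) -> str:
    """Classify a resolution using the rough 200/400 DPI guideline."""
    if dpi < 200:
        return "low: expect recognition errors"
    if dpi <= 400:
        return "ok"
    return "ok, but the extra resolution adds little"


if __name__ == "__main__":
    # A US Letter page is 8.5 inches wide.
    for width_px in (1200, 2550, 5100):
        dpi = effective_dpi(width_px, 8.5)
        print(f"{width_px}px wide -> {dpi:.0f} DPI: {dpi_verdict(dpi)}")
```

For phone photos the same arithmetic applies, but only to the part of the frame the page actually fills, which is why filling the frame matters.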

Always proofread the critical bits

Even 99% accuracy means roughly one error per hundred characters, or about 20 errors on a 2,000-character page, and OCR errors love exactly the places that hurt: digits in amounts, dates, reference numbers, names. Extract first, then verify the numbers by eye before anything important depends on them.
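As a rough illustration of that arithmetic, and of one way to build a proofreading checklist, the sketch below estimates the expected error count and pulls out every token that contains a digit. The helper names and the sample sentence are made up for the example, and the regular expression is a simple heuristic rather than a complete number parser.

```python
import re
from typing import List


def expected_errors(num_chars: int, accuracy: float) -> int:
    """Expected number of wrong characters at a given character accuracy."""
    return round(num_chars * (1.0 - accuracy))


def tokens_to_verify(text: str) -> List[str]:
    """Return whitespace-separated tokens containing a digit.

    These are the amounts, dates, and reference numbers worth checking
    against the original scan by eye.
    """
    raw_tokens = re.findall(r"\S*\d\S*", text)
    return [token.strip(".,;:()") for token in raw_tokens]


if __name__ == "__main__":
    print(expected_errors(2000, 0.99))
    ocr_output = "Invoice 10482 due 2024-03-15, total $1,250.00 payable to Acme"
    for token in tokens_to_verify(ocr_output):
        print("check:", token)
```

The list does not tell you which tokens are wrong. It only narrows the proofreading to the places where a single misread character is most costly.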

OCR you already own

Before reaching for any tool, know what's built into your devices — for quick one-off grabs, these are excellent:

  • iPhone/iPad and macOS — Live Text: select and copy text directly inside photos, in Safari images, and in Preview. Point the camera at a document and copy from the viewfinder.
  • Windows — PowerToys Text Extractor: press a shortcut, drag a rectangle over anything on screen, and the text lands on your clipboard. The Snipping Tool in Windows 11 has a similar 'text actions' feature.
  • Android — Google Lens: built into the camera and Google Photos; select, copy, or translate text in any image.
  • Google Drive: upload an image or scanned PDF, open it with Google Docs, and Drive runs OCR automatically — a capable free option if uploading the document to Google is acceptable.

Their common limits: they're built for a screenful of text at a time, they don't batch-process multi-page PDFs well, most don't produce searchable PDFs, and layout (columns, tables) is frequently scrambled.

Running OCR on multi-page documents

For a 30-page scanned PDF, you want a proper document OCR flow rather than screenshot-grabbing. FileMorf includes one built on the Tesseract engine:

  1. Open the OCR tool and sign in

     OCR is a workspace feature: free accounts include 20 OCR pages per month. Unlike FileMorf's image and PDF tools, which run entirely in your browser, OCR jobs run in secure cloud processing, with outputs retained in your workspace and downloadable via rotating links.

  2. Upload your scan, image, or PDF

     Pick the document language from the 14 supported. This meaningfully improves accuracy on non-English text.

  3. Review and export the text

     Results open in your workspace, where you can review the extracted text, copy it out, or keep it alongside the source for later comparison.

Being straight about where processing happens

OCR is computationally heavy, so like most document OCR services, FileMorf runs it server-side rather than in the browser. If your document is too sensitive to leave your machine at all, use a fully offline route instead: install Tesseract locally, or use macOS Live Text / PowerToys for shorter documents.
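For the fully offline route, a minimal sketch of driving a locally installed Tesseract from Python is shown below. It assumes the `tesseract` command-line program is on your PATH and that the input is an image file (the command-line tool reads images, so PDF pages would need to be converted to images first). The function names and file names are illustrative.

```python
import shutil
import subprocess
from typing import List


def build_tesseract_command(
    image_path: str,
    output_base: str,
    language: str = "eng",
    searchable_pdf: bool = False,
) -> List[str]:
    """Build the argument list: tesseract IMAGE OUTPUTBASE -l LANG [pdf]."""
    command = ["tesseract", image_path, output_base, "-l", language]
    if searchable_pdf:
        # The "pdf" config writes OUTPUTBASE.pdf with a text layer
        # instead of the default OUTPUTBASE.txt.
        command.append("pdf")
    return command


def run_local_ocr(image_path: str, output_base: str, language: str = "eng") -> None:
    """Run Tesseract locally; nothing leaves the machine."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    subprocess.run(
        build_tesseract_command(image_path, output_base, language),
        check=True,  # raise if Tesseract reports a failure
    )


if __name__ == "__main__":
    # Hypothetical file names; "deu" is Tesseract's code for German.
    print(build_tesseract_command("scan.png", "out", "deu", searchable_pdf=True))
```

Language codes follow Tesseract's own naming (for example `eng`, `spa`, `fra`, `deu`, `jpn`, `chi_sim`), and the matching language data must be installed alongside the engine.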

What people actually use OCR for

  • Making scanned archives searchable: a filing cabinet's worth of scanned PDFs becomes a searchable library.
  • Receipts and invoices: extracting amounts and dates for expense reports and bookkeeping.
  • Digitizing printed matter: pulling quotes from books, converting old typewritten documents, rescuing text from faxes.
  • Data entry from forms: turning stacks of paper forms into rows in a spreadsheet (pair the output with a CSV conversion).
  • Accessibility: screen readers can't read image-only PDFs; adding a text layer makes documents usable with assistive technology.
  • Translation: text must be machine-readable before it can be machine-translated — OCR is step one for foreign-language documents.

Frequently asked questions

Is OCR the same as converting PDF to Word?

Not quite. If the PDF already has real text, PDF-to-Word conversion just restructures it — no OCR needed. If the PDF is a scan, OCR is the step that recovers the text first. Good converters detect which case you have.

How accurate is OCR on photos taken with a phone?

Good, if you help it: fill the frame, shoot straight-on, avoid shadows, and use a scanning mode that flattens perspective. A careless angled photo can drop accuracy from 99% to unusable, purely from preprocessing issues.

Can OCR read handwriting?

Printed handwriting: partially, with frequent errors. Cursive: poorly, even with specialized handwriting-recognition services. For anything that matters, treat handwriting OCR output as a draft to correct rather than a result to trust.

Does OCR work in languages other than English?

Yes — modern engines ship trained models for dozens of languages. FileMorf's OCR supports 14, including Spanish, French, German, Japanese, and Chinese. Selecting the correct language before running the job noticeably improves accuracy.

Is free OCR really free?

The engines (like Tesseract) are open source, and device-built-in OCR is free. Hosted OCR services typically meter usage because server processing costs money — FileMorf's free workspace includes 20 OCR pages per month, with higher allowances on paid plans.
