Batch OCR using Acrobat Professional

Have you ever received a PDF file that did not contain searchable text? You may know that you can use Acrobat’s OCR (Optical Character Recognition) to add an invisible layer of searchable text on top of the page images. This allows you to select, copy and search the text of a scanned paper document. Great!

What do you do when you have hundreds of TIFFs and image-only PDF files that you need to search for a big case? Working with these documents one at a time is not efficient.

If you have Acrobat Professional, you can batch OCR and let your computer do the work for you.

NOTE: Acrobat 9 and up make this process much easier. Simply select Document>OCR Text Recognition>OCR Multiple Files. If you have Acrobat 9 and you just want to OCR a bunch of files, this is probably all you need! Acrobat X can do OCR as part of an Action, so you can combine OCR with other operations as part of a document processing workflow.

Read on to learn how…

Batch Processing to the Rescue

There are two steps to follow:

  1. Set up a Batch Sequence
  2. Run a Batch Sequence

Set up a Batch Sequence

Scan your documents locally or send them to a PC where Acrobat Pro is installed.

If you have the capability, scan directly to PDF or to an MTIFF (multi-page TIFF). These formats allow all of the pages of a document to be maintained as a single file.

  1. In Acrobat Professional 7, choose Advanced>Batch Processing
    - or -
    In Acrobat Professional 8, choose Advanced>Document Processing>Batch Processing
  2. Click the New Sequence button.
  3. Give the sequence a name.
  4. Click Select Commands
  5. Choose Recognize Text Using OCR and click the Add button.
  6. Double-click the Recognize Text Using OCR entry (right side of the window) to set the OCR options.
    - Set Downsample Images to 300 dpi, then click OK.
  7. Click OK again to get back to the main window.
  8. Click Output Options
    Note: Output Options allows you to specify where the OCR’d files should be written. I suggest writing them to a local drive and copying them to a network store later.
  9. Enable PDF Optimizer and Do not overwrite existing files.
  10. Click the Settings button.
    Adjust the settings to make the smallest possible files, especially for black-and-white (monochrome) files:
    - JBIG2 Lossless is very efficient and preserves the exact appearance of the text.
    - JBIG2 Lossy causes some visual degradation, but can produce files up to 70% smaller than JBIG2 Lossless.
  11. Click OK.
  12. Give the revised settings a name such as “B&W Lossy”.
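
The suggestion above to write the OCR’d files to a local drive and copy them to a network store later can be scripted. The Python sketch below is not part of Acrobat; it is just one possible way to do the copy. It mirrors the Do not overwrite existing files option by skipping any file name that already exists at the destination. The function name `publish_outputs` and the paths in the usage comment are hypothetical.

```python
import shutil
from pathlib import Path


def publish_outputs(local_output, network_store):
    """Copy OCR'd PDFs to the network store without overwriting anything.

    Returns two lists of file names: those copied and those skipped
    because a file with the same name was already in the destination.
    """
    local_output = Path(local_output)
    network_store = Path(network_store)
    network_store.mkdir(parents=True, exist_ok=True)

    copied, skipped = [], []
    for item in sorted(local_output.iterdir()):
        # Only top-level PDF files are considered (case-insensitive match).
        if not item.is_file() or item.suffix.lower() != ".pdf":
            continue
        target = network_store / item.name
        if target.exists():
            skipped.append(item.name)
            continue
        shutil.copy2(item, target)  # copy2 also preserves timestamps
        copied.append(item.name)
    return copied, skipped


# Example call (hypothetical paths, adjust to your own setup):
# copied, skipped = publish_outputs(r"C:\OCR\Output", r"\\server\cases\smith")
```

Reviewing the skipped list after a run is a quick way to spot name collisions before anything on the network store is touched.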

Run a Batch Sequence

Now, all you need to do is to run the batch sequence.

  1. Place all the files you wish to process in a single folder on your hard drive.
  2. In Acrobat Professional 7, choose Advanced>Batch Processing
    - or -
    In Acrobat Professional 8, choose Advanced>Document Processing>Batch Processing
  3. Select the sequence to run
  4. Click OK
  5. Select the folder to process
  6. Click the Select button.
  7. Select the Output Folder
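
Step 1 above asks for all of the files to sit in a single folder. If your scans are scattered across subfolders, a small script can gather them before you open Acrobat. The Python sketch below is one possible approach, not an Acrobat feature: it copies every PDF and TIFF found under a starting folder into one flat batch folder and renames duplicates rather than overwriting them. The function name `gather_inputs`, the extension list, and the paths in the usage comment are assumptions for illustration.

```python
import shutil
from pathlib import Path

# File types treated as OCR candidates (an assumption; adjust as needed).
OCR_EXTENSIONS = {".pdf", ".tif", ".tiff"}


def gather_inputs(source_root, batch_folder):
    """Copy every PDF/TIFF under source_root into one flat batch folder.

    Files with clashing names get a numeric suffix (scan.tif, scan_1.tif)
    so nothing is overwritten. Returns the list of copied target paths.
    """
    source_root = Path(source_root).resolve()
    batch_folder = Path(batch_folder).resolve()
    batch_folder.mkdir(parents=True, exist_ok=True)

    copied = []
    # sorted() materializes the listing first, so newly created copies
    # are never picked up by the walk itself.
    for path in sorted(source_root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in OCR_EXTENSIONS:
            continue
        if batch_folder in path.parents:
            continue  # already inside the batch folder
        target = batch_folder / path.name
        counter = 1
        while target.exists():
            target = batch_folder / f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        shutil.copy2(path, target)
        copied.append(target)
    return copied


# Example call (hypothetical paths, adjust to your own setup):
# gather_inputs(r"C:\Scans\BigCase", r"C:\OCR\Input")
```

The originals are copied, not moved, so the source folders stay intact if you need to rerun the batch.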

That’s it!

Sit back and enjoy a cup of coffee as Acrobat does the work for you.