Reducing the File Size of Scanned PDFs
It seems like a lot of folks are struggling with the size of scanned PDFs. Below are excerpts from two emails I received recently:
My [Fujitsu] ScanSnap makes PDFs that are too big . . . like around 60K per page! What can I do to make these smaller in Acrobat?
I have to eFile [with the Federal Court] and am having to split the filings into many segments to go through the [Court] gateway. The issue seems to be with documents that are scanned on our network scanner. PDFs produced directly from Word are a lot smaller. Is there some trick to reduce the size of scanned files?
Before covering how to reduce the size of scanned documents in detail, let’s discuss four factors that affect the size of scanned images:
- Scanning Resolution
A scan at 600 dpi results in a much larger file than at 300 dpi.
- Color Space
Color and grayscale files result in much larger files than black and white files.
- Physical dimensions of the scanned page
A legal-size scan will be larger than a letter-size scan, with all other factors being equal.
- Compression
Raw scan data can be compressed to make it smaller.
Lossless compression retains the exact appearance of the original. Two common types of lossless compression are ZIP and CCITT Group 4.
Lossy compression makes some (hopefully) non-noticeable visual trade-offs to further reduce file size. JPEG is a common lossy compression method.
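To get a feel for how resolution, color space and page size multiply together, here is a rough Python sketch of the arithmetic for uncompressed scan data. The function name and the bits-per-pixel figures (1 for black and white, 8 for grayscale, 24 for color) are my own illustration, not something Acrobat or your scanner reports, and real files will be much smaller once compression is applied.

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size of one scanned page, in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each row of pixels is padded up to a whole number of bytes.
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Illustrative pages: (label, width in inches, height in inches, dpi, bits per pixel)
pages = [
    ("Letter, 300 dpi, black and white", 8.5, 11, 300, 1),
    ("Letter, 600 dpi, black and white", 8.5, 11, 600, 1),
    ("Legal, 300 dpi, black and white", 8.5, 14, 300, 1),
    ("Letter, 300 dpi, grayscale", 8.5, 11, 300, 8),
    ("Letter, 300 dpi, color", 8.5, 11, 300, 24),
]

for label, w, h, dpi, bpp in pages:
    print(f"{label}: {raw_scan_bytes(w, h, dpi, bpp):,} bytes")
```

Doubling the resolution roughly quadruples the raw data, and moving from black and white to 24-bit color multiplies it by about 24, which is why the scan settings matter so much before compression even enters the picture.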
Ideally, you would control all of the above factors yourself by scanning at 300 dpi in black and white and using an efficient compression algorithm.
Unfortunately, you may not have that option. Many desktop and network scanners offer limited or confusing options, or the scanned PDFs may have arrived from outside your firm.
Legal Scanning Recommendations
In almost all situations, scan at 300 dpi, black and white.
For the purpose of this article we will make a couple of assumptions:
- You have a black and white scanned document of unknown dpi and compression
- You have already OCR’d the document, or don’t need OCR
Read on to learn how to reduce the file size of scanned documents using Acrobat.
Black and White Image Compression
There are three common types of compression used on black and white scanned images:
- CCITT Group 4
- JBIG2 Lossless
- JBIG2 Lossy
For most 300 dpi black and white scans, it can be very difficult to spot any visual differences among them.
Comparison of Compression, 300 dpi, 200% Enlargement
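If you want to convince yourself that lossless really means "bit for bit identical," the following Python sketch round-trips a synthetic page. Two loud assumptions: Python's standard library has no CCITT Group 4 or JBIG2 codec, so zlib (the Deflate method behind ZIP compression) stands in for them, and the page is an artificial pattern of white rows and striped "text" bands rather than a real scan, so the compression ratio says nothing about real documents.

```python
import zlib

WIDTH_PX, HEIGHT_PX = 2550, 3300   # letter-size page at 300 dpi
ROW_BYTES = (WIDTH_PX + 7) // 8    # 1 bit per pixel, rows padded to whole bytes


def synthetic_page():
    """Build a mostly white 1-bit page with striped bands standing in for text."""
    white_row = bytes(ROW_BYTES)                 # all bits 0 (white in this sketch)
    text_row = bytes([0b10101010]) * ROW_BYTES   # alternating black and white pixels
    rows = []
    for y in range(HEIGHT_PX):
        # One band of 20 "text" rows followed by 40 white rows, repeated.
        rows.append(text_row if (y // 20) % 3 == 0 else white_row)
    return b"".join(rows)


page = synthetic_page()
packed = zlib.compress(page, 9)      # lossless compression
restored = zlib.decompress(packed)   # decompress back to raw pixels

print(f"Raw: {len(page):,} bytes, compressed: {len(packed):,} bytes")
print("Identical after round trip:", restored == page)
```

A lossy method would fail that last comparison by design, trading exact pixels for a smaller file.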
Using “Optimize Scanned Image” in Acrobat Standard and Pro
The Optimize Scanned Image feature performs various image clean-up tasks (de-skewing, edge enhancement) and also nicely compresses files.
Here’s how to use this feature:
- Open the PDF you wish to optimize
- Choose Document -> Optimize Scanned PDF...
- The Optimize Scanned Image window appears.
- Choose the appropriate level of compression and click OK.
What do the settings mean?
The slider at the top of the window has six clickable positions:
For 300 dpi black and white scans, only options a, b and f result in different file sizes.
Results for a 4-page scanned document
a, b, c and d = JBIG2 Lossy
e = JBIG2 Lossless
f = CCITT G4
Using Acrobat’s PDF Optimizer to Compress Scanned PDFs
The PDF Optimizer can be used to analyze and selectively compress documents. Sorry, Acrobat Standard users: this feature is in Acrobat Pro and Pro Extended only.
Analyzing File Size of Scanned Documents
To better understand why a document is big, view the statistics available via the PDF Optimizer.
- Open the PDF you wish to analyze
- Choose Advanced -> PDF Optimizer...
- Click the Audit Space Usage... button
- The Audit Space Usage window appears:
The window above reflected the state of a 4-page scanned document:
A) Total file size of about 200K
B) Over 190K was allocated to images!
We can do a lot better than that . . .
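Here is the same audit boiled down to quick arithmetic in Python. The numbers are the rounded figures quoted above (about 200K total, over 190K of images, 4 pages), so treat the results as approximations.

```python
total_kb = 200    # A) total file size, approximate
images_kb = 190   # B) space allocated to images (a lower bound: "over 190K")
pages = 4

image_share = images_kb / total_kb   # fraction of the file taken up by images
per_page_kb = total_kb / pages       # average size of one page

print(f"Images: at least {image_share:.0%} of the file")
print(f"Average: about {per_page_kb:.0f}K per page")
```

With images making up roughly 95% of the file, image compression is the only setting worth spending time on for a document like this.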
Reducing the Size of an Individual Scanned PDF using the PDF Optimizer
- Open the PDF you wish to compress
- Choose Advanced -> PDF Optimizer...
The PDF Optimizer window appears:
- In the list on the left, ensure that only Images and Clean Up are checked:
- At the bottom of the window, set the following for black and white documents:
a) Set the downsampling target to 300 ppi
b) Set the threshold (for images above) to 300 ppi
c) Set compression to JBIG2
d) Choose Lossy or Lossless
- Save your setting so you can easily recall it:
a) Click the Save button at the top of the window
b) Give the setting a name and click OK
Note: The PDF Optimizer may be used in batch mode which allows you to process hundreds of files. See my article on Batch OCR with Acrobat Pro.
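Before setting up a batch run, it can help to know which files are actually worth compressing. The Python sketch below is my own helper and not part of Acrobat: it lists the PDFs in a folder that are at or above a size threshold, largest first. The function name, the folder and the 1 MB threshold are all hypothetical, so adjust them to your own filing limits.

```python
from pathlib import Path


def find_large_pdfs(folder, min_bytes):
    """Return (path, size_in_bytes) for PDFs at or above min_bytes, largest first."""
    hits = []
    # Note: this pattern matches only a lowercase .pdf extension on
    # case-sensitive file systems, and it does not search subfolders.
    for path in Path(folder).glob("*.pdf"):
        size = path.stat().st_size
        if size >= min_bytes:
            hits.append((path, size))
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical values: check the current folder for PDFs of 1 MB or more.
    for path, size in find_large_pdfs(".", 1_000_000):
        print(f"{size / 1024:8.0f}K  {path.name}")
```

The files it reports are the ones to feed into the batch process first.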