Exporting a PDF to Excel
NOTE: I wrote this article for Acrobat 9. In Acrobat X, exporting to Excel is super simple and works great. Just choose File> Save As> Spreadsheet. It’s worth the upgrade for this feature alone!
I received this email from a paralegal at a large law firm recently:
Help! An attorney has asked me to convert PDFs we received in discovery to Excel. The PDFs are tabular in nature (probably originated in Excel). Some are scanned in from paper and others appear to be converted electronically. How do I do this?
Fortunately, Acrobat 9 offers a couple of different ways to export to Excel.
- Select table and open in Excel
This allows you to select a portion of a page and open it in Excel.
Works best when you only need a small part of the table
Better results if the file didn’t originate from a spreadsheet
- Export as Tables in Excel
This method uses some artificial intelligence to convert multipage PDF documents to multiple worksheets in an XML-based spreadsheet file. It works best on files that were converted directly from Excel to PDF.
To open the XML-based file generated by the second method above, you’ll need either:
- Office 2007
- The free Microsoft Office Compatibility Pack for Word, Excel, and PowerPoint 2007 File Formats, for earlier versions of Office
Acrobat will usually do a pretty good job converting the text, but formatting and column widths will look different from the original. Acrobat only copies over the text, so formulas will not convert. Do not expect 100% fidelity.
In the full article, you’ll receive my usual step-by-step instructions.
Converting to Excel from PDF: Copy Table as Spreadsheet
I’ve had better luck using this method for scanned documents and documents which were not originally spreadsheets.
How to use it:
- Open a PDF and run OCR if it was originally scanned
Choose Document > OCR Text Recognition
- Select the Select Text tool (cursor)
- Hold down the ALT (CMD on the Mac) key to make a rectangular selection over a table in the document.
Your cursor will change shape.
- With the text still selected, right-click and choose “Open Table in Spreadsheet”
- The table data will open in Excel
**What are the other options?** Copy as Table will copy the data to the clipboard. From there, you can paste it into Excel or another document. Save as Table will allow you to name the data and save it as a Comma Separated Value (CSV) file.
**Mac Users:** Only Copy as Table and Save as Table are available.
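If you use Save as Table, it can be worth a quick sanity check of the CSV before handing it to the attorney, since OCR'd tables sometimes lose a cell here and there. Here is a minimal Python sketch of that idea. The function name and the sample data are my own inventions for illustration, not anything Acrobat produces, and the sketch assumes the first row holds the column headers.

```python
import csv
import io


def check_table_rows(csv_text):
    """Parse CSV text and list the rows whose column count differs from the first row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [], []
    expected = len(rows[0])  # assumption: the first row is the header row
    # Row numbers are 1-based so they match the row numbers Excel shows.
    ragged = [i for i, row in enumerate(rows, start=1) if len(row) != expected]
    return rows, ragged


# Hypothetical sample standing in for a file produced by Save as Table.
sample = (
    "Date,Payee,Amount\n"
    "2009-01-05,Acme Corp,100.00\n"
    '2009-01-06,Smith,"1,250.00"\n'
    "2009-01-07,Jones\n"
)

rows, ragged = check_table_rows(sample)
print(len(rows), "rows; rows with a different column count:", ragged)
```

To run it against a real file, read the text with `open(path, newline="").read()` and pass it in. Any row numbers it reports are the ones to compare against the PDF by eye.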
Converting to Excel from PDF: Save As Tables in Excel Spreadsheet
This method allows you to export a multipage PDF to multiple tables in an Excel file. It seems to work best on documents that were:
- Converted directly to PDF from Excel
- Converted using Acrobat (rather than a clone)
**Save as Tables works better in Acrobat 9.1** Adobe greatly improved the capability to export to Excel using this method in Acrobat 9.1. Acrobat 9.0 sometimes exported XML tables that Excel could not open. Make sure you update.
How to use it:
- Open the PDF you want to convert
- OCR the document if it was originally scanned.
Choose Document > OCR Text Recognition
- Choose File > Save As
- From the Type list at the bottom of the window, choose Tables in Excel Spreadsheet
- Click Save
**How do I open the file in Excel?** Depending on your file associations, you may not be able to double-click the resulting XML file to open it in Excel. You’ll need to open Excel and choose File > Open.
**Where are all the pages?** Each page in the PDF is converted to a different worksheet in the Excel file. Look at the tabs at the bottom of the screen.
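If you want to confirm that every page made it into the export without opening Excel, a short script can list the worksheets in the XML file. This is only a sketch under an assumption I have not verified against Acrobat’s output: that the file uses the SpreadsheetML 2003 layout, with `Workbook`, `Worksheet` and `Row` elements in the `urn:schemas-microsoft-com:office:spreadsheet` namespace. The sample workbook and its sheet names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the export uses the SpreadsheetML 2003 namespace.
SS = "urn:schemas-microsoft-com:office:spreadsheet"


def worksheet_summary(xml_text):
    """Return (worksheet name, row count) pairs, one per worksheet found."""
    root = ET.fromstring(xml_text)
    summary = []
    for sheet in root.iter(f"{{{SS}}}Worksheet"):
        name = sheet.get(f"{{{SS}}}Name", "")
        row_count = sum(1 for _ in sheet.iter(f"{{{SS}}}Row"))
        summary.append((name, row_count))
    return summary


# Hypothetical two-page export, trimmed down to the bare structure.
sample = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Page 1"><Table>
    <Row><Cell><Data ss:Type="String">Date</Data></Cell></Row>
    <Row><Cell><Data ss:Type="String">2009-01-05</Data></Cell></Row>
  </Table></Worksheet>
  <Worksheet ss:Name="Page 2"><Table>
    <Row><Cell><Data ss:Type="String">Total</Data></Cell></Row>
  </Table></Worksheet>
</Workbook>"""

for name, row_count in worksheet_summary(sample):
    print(name, row_count)
```

For a real export, `ET.parse(path).getroot()` would replace the `ET.fromstring` call. If the number of worksheets matches the number of PDF pages, nothing was dropped. If the script finds no worksheets at all, the file probably uses a different layout than the one assumed here.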
Batch Converting PDF to Excel
Have a lot of PDFs you want to convert to Excel? No problem! This works in any version of Acrobat 9.
- Choose File > Export > Export Multiple Files
- Click the Add Files button at the top of the window and locate your source PDFs
- The Output Options window appears:
A) Click Browse to select a folder for the Excel output
B) If desired, add a prefix or suffix to the filename
C) Change Export to “Tables in Excel”
- Click OK
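After a large batch run, you may want to confirm that every source PDF actually produced an output file. The sketch below compares two lists of filenames. It rests on assumptions of mine that you should check against your own output folder: that each output is named prefix + original name + suffix, and that the extension is `.xml`. The function names and sample filenames are hypothetical.

```python
from pathlib import PurePath


def expected_output_name(pdf_name, prefix="", suffix="", ext=".xml"):
    """Build the output filename we expect for one source PDF (assumed naming scheme)."""
    stem = PurePath(pdf_name).stem  # filename without the .pdf extension
    return f"{prefix}{stem}{suffix}{ext}"


def missing_outputs(pdf_names, output_names, prefix="", suffix="", ext=".xml"):
    """Return the source PDFs that have no matching file among the outputs."""
    # Compare case-insensitively, since Windows and default Mac file systems do.
    present = {name.lower() for name in output_names}
    return [
        pdf
        for pdf in pdf_names
        if expected_output_name(pdf, prefix, suffix, ext).lower() not in present
    ]


# Hypothetical folder listings for illustration.
pdfs = ["ledger_2008.pdf", "ledger_2009.pdf"]
outputs = ["xl_ledger_2008.xml"]

missing = missing_outputs(pdfs, outputs, prefix="xl_")
print("PDFs without an export:", missing)
```

With real folders, `os.listdir` on the source and output folders would supply the two lists. Anything reported as missing is a candidate for converting by hand with one of the methods above.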