Full Text Search of PDF using Adobe Acrobat
Lately, everyone’s been asking me to help them find themselves…
After a talk at the Missouri Solo and Small Firm conference, I chatted with a solo real estate attorney who asked for my advice on developing a searchable article archive from the materials he had collected over the years. “How do I find the articles I need?” he asked.
I also talked to a lawyer who took on a pro bono criminal defense case. “How can I find where my client is mentioned in all the police records I was sent?” she asked.
And, at the LegalTech West show, a workman’s compensation investigator asked how to search medical records. “How can I apply notes to these handwritten medical records and find them later?” he asked.
In this article, I’ll discuss how to use Acrobat Professional to create a full-text index so you can find what you need… fast!
Searching Beyond Text of the Document
Acrobat can find text in the following parts of a PDF:
- Text of the document (regular or OCR)
- Title, Subject, Author, Keyword (metadata)
- Notes and Annotations
- Bookmarks
- PDF Attachments
So, what does this mean for legal professionals?
- You can quickly search for words or phrases across multiple documents to locate key facts, names, places, etc. within the text of those documents.
- You can capture your thinking about a document—in the PDF—while reviewing it using bookmarks and comment tools.
- You can later find documents by the notes and knowledge you’ve applied to them.
That’s powerful.
Setting up for Search
Step 1: Make sure your documents are text searchable by Acrobat
- Use Acrobat Optical Character Recognition (OCR) if you have paper documents or image-only PDFs in your document collection.
- Convert electronic files such as word processing documents, spreadsheets, etc. to PDF.
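For a large collection, it can help to take inventory before converting anything. The following Python sketch is not part of Acrobat. It simply walks a folder and lists the files whose extension is not .pdf, on the assumption that those still need to be converted. The function name `files_needing_conversion` is hypothetical, and the script cannot tell whether an existing PDF is image-only and still needs OCR.

```python
from pathlib import Path


def files_needing_conversion(root):
    """Return files under root (including sub-directories) that are not PDFs.

    Assumption: anything without a .pdf extension still has to be
    converted before Acrobat can index it.
    """
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() != ".pdf"
    )


if __name__ == "__main__":
    # Lists candidates in the current folder; point it at your own collection.
    for path in files_needing_conversion("."):
        print(path)
```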
Step 2: Locate and Segregate Documents
Depending on the type of project you have, you may wish to move similar documents to individual directories.
For example, let’s say you have accumulated several years of legal research on trusts. You may wish to segregate the documents by state or issue.
If you are indexing client files, you may wish to index by client or perhaps even by matter.
There’s no right or wrong way to organize your documents, but you do need to strike a balance between how much time you spend organizing your files and how easy it is to find what you need.
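If your file names already hint at the state, client, or matter, some of the sorting can be scripted. Here is a minimal Python sketch, assuming each PDF's file name contains one of the keywords you care about. The function name `segregate_by_keyword` and the sample keywords are hypothetical. Try it on a copy of your files first, since it moves files and does not check for name collisions.

```python
import shutil
from pathlib import Path


def segregate_by_keyword(source, keywords):
    """Move each PDF in source into a sub-directory named for the first
    keyword found in its file name; files matching no keyword stay put.

    Returns a dict mapping each keyword to the list of moved file names.
    """
    source = Path(source)
    moved = {}
    # Take a snapshot of the folder first so new sub-directories are ignored.
    for pdf in sorted(source.iterdir()):
        if not pdf.is_file() or pdf.suffix.lower() != ".pdf":
            continue
        name = pdf.name.lower()
        for keyword in keywords:
            if keyword.lower() in name:
                target_dir = source / keyword
                target_dir.mkdir(exist_ok=True)
                shutil.move(str(pdf), str(target_dir / pdf.name))
                moved.setdefault(keyword, []).append(pdf.name)
                break
    return moved
```

For example, `segregate_by_keyword("trust-research", ["Missouri", "Kansas"])` would move a file named `missouri_trusts.pdf` into `trust-research/Missouri`, leaving unmatched files where they are for manual sorting.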
Create an Index
Follow these steps to create a full-text search index using Acrobat Professional:
- In Acrobat X, open the Tools pane, then open the Document Processing section and choose Full Text Index with Catalog.
  In Acrobat 9, choose Advanced > Document Processing > Full Text Index with Catalog.
- Click the New Index button
- The Build Index window will appear.
- Give the index a name
- Enter a description of the index
- Choose the directory that will be indexed. All sub-directories will be indexed.
- Click the Build button
- Acrobat will create a .pdx (index) file at the top level of the directory you specified.
- Click the Save button.
- The Index Progress window will appear.
- Note that Acrobat will skip any documents which are secured with an Open password.
Attaching to the Index and Searching
Follow these steps to attach to the index you created:
- Choose Edit > Search, or press:
  Windows: Control-Shift-F
  Macintosh: Command-Shift-F
- Acrobat will split your screen between the Search window and the Document window.
- In the Search window on the left, click on Advanced Search at the bottom.
- In the Advanced Search panel, click on the Look In menu and choose Select Index.
- The Index Selection window will appear.
- Click the Add button.
- Locate the index file (.pdx) that you created earlier. Normally, Acrobat will automatically find it for you.
- Click OK.
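If Acrobat does not find the index for you and you cannot remember where you saved it, a short script can look for it. This Python sketch is an outside helper, not an Acrobat feature. It assumes only that index files end in .pdx, as described above, and the function name `find_index_files` is hypothetical.

```python
from pathlib import Path


def find_index_files(root):
    """Return every .pdx file under root, including sub-directories."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdx"
    )


if __name__ == "__main__":
    # Searches the current folder; point it at the directory you indexed.
    for index_file in find_index_files("."):
        print(index_file)
```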
Searching the Index
Once you select an index, Acrobat will keep it selected so you can search against it.