Creating and Using Custom Redaction Patterns
Acrobat 9 and X offer powerful redaction tools including the ability to search on patterns such as:
- Social Security Numbers
- Email addresses
- Phone numbers
- Dates
- Credit card numbers
While the default patterns are useful, you may need patterns of your own.
Fortunately, you can add them with a bit of work. In this article, I’ll show how to create a pattern and add it to Acrobat.
Regular Expressions and Redaction Patterns
Acrobat takes advantage of regular expressions, often abbreviated as REGEX, to find patterns. A regular expression is a code pattern which describes the attributes of the text we want to find.
For example, you may need to find uniquely patterned account numbers in a banking fraud case, and mark all of the account numbers across many documents.
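In this example, the account numbers used in case documents follow a regular pattern of three alphabetic characters followed by three digits.
The REGEX for the above would be: [A-Za-z][A-Za-z][A-Za-z]\d\d\d
The above pattern would find ABC123, but not 123ABC, A12345, etc. (The shorter-looking \D\D\D\d\d\d also finds ABC123, but \D means "any character that is not a digit," so it would match ---123 as well.)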
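To see the difference between \D and a letters-only character class, here is a minimal sketch in Python. Python's regex flavor is not necessarily identical to Acrobat's, so treat this as a rough check rather than proof of how Acrobat behaves. The sample strings and the `matching` helper are made up for illustration.

```python
import re

# \D matches ANY character that is not a digit (letters, dashes, spaces...).
loose = re.compile(r"\D\D\D\d\d\d")
# Letters-only variant: three alphabetic characters, then three digits.
strict = re.compile(r"[A-Za-z]{3}\d{3}")

# Hypothetical account-number candidates.
samples = ["ABC123", "123ABC", "A12345", "---123"]


def matching(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if pattern.fullmatch(s)]


print(matching(loose, samples))   # ['ABC123', '---123']
print(matching(strict, samples))  # ['ABC123']
```

The loose pattern accepts "---123" because a dash is a non-digit; the letters-only class rejects it.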
Many IT folks, especially those with a UNIX background, are familiar with regular expressions and can do them in their sleep.
I’m just not that geeky. I need some help.
RegexBuddy to the Rescue
RegexBuddy is an inexpensive ($39 US) Windows utility that helps with building regular expressions. It’s a small download and installs in seconds.
RegexBuddy helps in three ways:
- Offers a visual interface for building regular expressions
- Allows you to test regular expressions to see if they work
- Includes a library of pre-built expressions for various postal codes, national IDs, VAT country codes and so on
RegexBuddy includes a well-written help file that will help you get up to speed.
Once you create and test the REGEX, you are ready to add it to Acrobat.
And, for the Mac
Although I haven’t tried it, a similar utility available for the Mac is RegExplorer.
Where does Acrobat store the patterns?
Acrobat stores redaction patterns in an XML file. XML is a lot like HTML, but more structured.
You can edit the file in a text editor like Notepad (Windows) or TextEdit (Mac).
Tip: It’s a bit easier to edit the XML file in a smarter application that understands tags. I like to use Adobe Dreamweaver, but Word 2007 works well, too.
Here’s what we will do in this section:
- Find the right file to edit
- Make a backup of it (just in case!)
- Copy a block of text in the file and move it to the end
- Fill in and change various blocks for the new redaction pattern.
Find the File and Make a Backup Copy
- Quit Acrobat if it is open.
- Find the search pattern file:
  For Acrobat 9:
  - Windows XP
    \Documents and Settings\<username>\Application Data\Adobe\Acrobat\9.0\Preferences\Redaction\<locale>\SearchRedactPatterns.xml
  - Windows 7 and Vista
    \Users\<username>\AppData\Roaming\Adobe\Acrobat\9.0\Preferences\Redaction\<locale>\SearchRedactPatterns.xml
  - Mac Intel
    /Users/<username>/Library/Preferences/Acrobat/9.0_x86/Redaction/<locale>/SearchRedactPatterns.xml
  - Mac PPC
    /Users/<username>/Library/Preferences/Acrobat/9.0/Redaction/<locale>/SearchRedactPatterns.xml
  For Acrobat X:
  - Windows XP
    \Documents and Settings\<username>\Application Data\Adobe\Acrobat\10.0\Preferences\Redaction\<locale>\SearchRedactPatterns.xml
  - Windows 7 and Vista
    \Users\<username>\AppData\Roaming\Adobe\Acrobat\10.0\Preferences\Redaction\<locale>\SearchRedactPatterns.xml
  - Mac Intel
    /Users/<username>/Library/Preferences/Adobe/Acrobat/10.0/Redaction/<locale>/SearchRedactPatterns.xml
- Make a backup copy of the SearchRedactPatterns.xml file.
  e.g. Rename the backup to SearchRedactPatterns.bak
- Right-click on the SearchRedactPatterns.xml file and choose Open With > Notepad
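If you would rather script the backup step, here is a minimal sketch in Python. The `backup_patterns` helper is a hypothetical name, and the example at the bottom assumes the Acrobat X path for Windows 7 and Vista listed above. It searches every locale folder instead of guessing the locale name.

```python
import shutil
from pathlib import Path


def backup_patterns(pattern_file):
    """Copy SearchRedactPatterns.xml to SearchRedactPatterns.bak beside it."""
    source = Path(pattern_file)
    if not source.is_file():
        raise FileNotFoundError(f"Pattern file not found: {source}")
    backup = source.with_suffix(".bak")
    if backup.exists():
        # Refuse to overwrite an earlier backup of the untouched file.
        raise FileExistsError(f"Backup already exists: {backup}")
    shutil.copy2(source, backup)  # copy2 also preserves timestamps
    return backup


if __name__ == "__main__":
    # Assumed location: Acrobat X on Windows 7 or Vista (see the list above).
    base = (Path.home() / "AppData" / "Roaming" / "Adobe" / "Acrobat"
            / "10.0" / "Preferences" / "Redaction")
    for found in base.glob("*/SearchRedactPatterns.xml"):
        print("Backed up to", backup_patterns(found))
```

Quit Acrobat before running it, just as you would for the manual steps.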
Editing the Pattern File
- Open **SearchRedactPatterns.xml** in a text editor if you haven’t already.
- Scroll down to find Entry 4:
<set name="Entry4">
  <str name="displayName">
    <val>Email Addresses</val>
  </str>
  <str name="regEx" translate="no">
    <val>([a-zA-Z0-9_])([a-zA-Z0-9_\-\.])*@([a-zA-Z\-])+\.([a-zA-Z\.]+)</val>
  </str>
  <str name="examples">
    <val>This pattern will search for email addresses.
For example: John.Doe@acme.com John_Doe_1234@acme.gov j-doe@marketing.acme.net</val>
  </str>
</set>
- Copy that block of text and paste the copy just before the last tag in the file, which is </asf>.
- Edit the pasted block to give it a new entry number, display name, REGEX pattern, and descriptive example text:
  A) Change the set name to Entry6
  B) Change the displayName value to the name you want to appear in Acrobat:
     e.g. Canadian Social Insurance Number
  C) Replace the regEx value with your new pattern
  D) Change the description and examples. Optional, but nice for users.
- For example, my new Entry 6 for Canadian Social Insurance numbers looks like this:
<set name="Entry6">
  <str name="displayName">
    <val>Canadian Social Insurance Number</val>
  </str>
  <str name="regEx" translate="no">
    <val>(\b)((\d{3}(-|\s|\.|_)\d{3}(-|\s|\.|_)\d{3})|(\d{9}))(\b)</val>
  </str>
  <str name="examples">
    <val>This pattern will search for 9-digit Canadian Social Insurance numbers, either consecutive or 3 digits plus 3 digits plus 3 digits (separated by punctuation marks).
For example: 123-456-789 123456789</val>
  </str>
</set>
- Save the File
- Restart Acrobat and choose View > Toolbars > Redaction
- Click on the Search and Redact button.
Your new pattern should appear in the list of patterns.
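The manual edit above can also be sketched as a small Python script. This is only an illustration under a few assumptions. The helper names (`build_entry`, `add_entry`, `list_pattern_names`) are hypothetical. The entry layout simply mirrors the blocks shown in this article. The check at the end assumes the file's elements carry no XML namespace. The tiny `<asf>` string stands in for the real file, whose full contents are not reproduced here.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Pattern copied from the Entry6 example in this article.
SIN = r"(\b)((\d{3}(-|\s|\.|_)\d{3}(-|\s|\.|_)\d{3})|(\d{9}))(\b)"


def build_entry(number, display_name, regex, examples):
    """Return a <set> block shaped like the entries shown in the article."""
    return (
        f'<set name="Entry{number}">\n'
        f'  <str name="displayName">\n'
        f'    <val>{escape(display_name)}</val>\n'
        f'  </str>\n'
        f'  <str name="regEx" translate="no">\n'
        f'    <val>{escape(regex)}</val>\n'
        f'  </str>\n'
        f'  <str name="examples">\n'
        f'    <val>{escape(examples)}</val>\n'
        f'  </str>\n'
        f'</set>\n'
    )


def add_entry(xml_text, entry):
    """Insert the entry just before the final </asf> tag."""
    cut = xml_text.rfind("</asf>")
    if cut == -1:
        raise ValueError("no closing </asf> tag found")
    return xml_text[:cut] + entry + xml_text[cut:]


def list_pattern_names(xml_text):
    """Parse the XML (raises ET.ParseError if malformed) and list display names."""
    root = ET.fromstring(xml_text)
    names = []
    for entry in root.iter("set"):
        val = entry.find("str[@name='displayName']/val")
        if val is not None and val.text:
            names.append(val.text.strip())
    return names


if __name__ == "__main__":
    # Rough check in Python's regex flavor, which may differ from Acrobat's.
    for sample in ["123-456-789", "123456789", "1234567890"]:
        print(sample, bool(re.search(SIN, sample)))
    entry = build_entry(6, "Canadian Social Insurance Number", SIN,
                        "For example: 123-456-789 123456789")
    # Stand-in for the real file contents.
    edited = add_entry("<asf>\n</asf>\n", entry)
    print(list_pattern_names(edited))
```

Parsing the edited text before restarting Acrobat is a cheap way to catch a missing or mistyped tag. The `escape` call guards against patterns that contain `&`, `<`, or `>`, which would otherwise break the XML.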
Happy searching!