Creating and Using Custom Redaction Patterns

Acrobat 9 and X offer powerful redaction tools including the ability to search on patterns such as:

Search and Redact Screen

While the default patterns are useful, you may want to create your own patterns.

Fortunately, it is possible to add your own patterns with a bit of work. In this article, I’ll discuss how to create and add your own patterns.

Regular Expressions and Redaction Patterns

Acrobat takes advantage of regular expressions, often abbreviated as REGEX, to find patterns. A regular expression is a code pattern that describes the attributes of the text we want to find.

For example, you may need to find uniquely patterned account numbers in a banking fraud case, and mark all of the account numbers across many documents.

In the example below, the account numbers used in case documents follow a regular pattern of three alphabetic characters followed by three digits.

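The REGEX for the above would be: \D\D\D\d\d\d

Here \d matches any digit and \D matches any character that is not a digit, so the first three positions accept letters but also punctuation or spaces.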

The above pattern would find ABC123, but not 123ABC, A12345, etc.
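A quick way to see what this pattern accepts is to try it on a few strings. The sketch below uses Python's re module, which is my assumption for illustration only: Acrobat's own regex engine may differ in details. The stricter pattern [A-Za-z]{3}\d{3} is not part of the original example; it is shown as one possible way to limit the first three characters to letters.

```python
import re

# Pattern from the article: \D is "any non-digit", \d is "any digit".
loose = re.compile(r"\D\D\D\d\d\d")
# Assumed stricter alternative: exactly three ASCII letters, then three digits.
strict = re.compile(r"[A-Za-z]{3}\d{3}")

samples = ["ABC123", "123ABC", "A12345", "#-+123"]


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


for text in samples:
    print(f"{text!r}: loose={matches(loose, text)} strict={matches(strict, text)}")
```

Both patterns find ABC123 and reject 123ABC and A12345, but only the loose one accepts a string like #-+123. Whether that matters depends on how clean the case documents are.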

Many IT folks, especially those with a UNIX background, are familiar with regular expressions and can do them in their sleep.

I’m just not that geeky. I need some help.

RegexBuddy to the Rescue

RegexBuddy is an inexpensive ($39 US) Windows utility that helps with building regular expressions. It’s a small download and installs in seconds.

RegexBuddy helps in three ways:

  1. Offers a visual interface for building regular expressions (the RegexBuddy Visual Builder).
  2. Allows you to test regular expressions to see if they work. On the RegexBuddy testing screen, yellow indicates a pattern match.
  3. Includes a library of pre-built expressions for various postal codes, national IDs, VAT country codes and so on (the RegexBuddy Library).

RegexBuddy includes a well-written help file that will help you get up to speed.

Once you create and test the REGEX, you are ready to add it to Acrobat.

And, for the Mac

Although I haven’t tried it, a similar utility available for the Mac is RegExplorer.
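If neither tool is at hand, a rough version of the "test it before you add it" step can be scripted. The following is only a sketch in Python, under the assumption that Python's re module treats the pattern the same way Acrobat does, which may not hold for every construct. The helper name check_pattern and the sample strings are made up for illustration; the pattern is the Canadian Social Insurance Number expression used later in this article.

```python
import re

# The Social Insurance Number pattern used later in this article.
SIN_PATTERN = r"(\b)((\d{3}(-|\s|\.|_)\d{3}(-|\s|\.|_)\d{3})|(\d{9}))(\b)"


def check_pattern(pattern, should_match, should_not_match):
    """Return a list of readable failures; an empty list means every sample passed."""
    regex = re.compile(pattern)
    failures = []
    for text in should_match:
        if regex.search(text) is None:
            failures.append(f"expected a match in {text!r}")
    for text in should_not_match:
        if regex.search(text) is not None:
            failures.append(f"unexpected match in {text!r}")
    return failures


failures = check_pattern(
    SIN_PATTERN,
    should_match=["SIN: 123-456-789", "123456789", "123 456 789"],
    should_not_match=["1234567890", "12-345-6789", "ABC123"],
)
print(failures or "all samples behaved as expected")
```

Listing strings that should not match is as useful as listing those that should, since an over-eager pattern will redact text you meant to keep.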

Where does Acrobat store the patterns?

Acrobat stores redaction patterns in an XML file. XML is a lot like HTML, but more structured.

You can edit the file in a text editor like Notepad (Windows) or TextEdit (Mac).

Tip: It’s a bit easier to edit the XML file in a smarter application that understands tags. I like to use Adobe Dreamweaver, but Word 2007 works well, too.

Here’s what we will do in this section:

Find the File and Make a Backup Copy

  1. Quit Acrobat if it is open.
  2. Find the search pattern file:
    For Acrobat 9:
    Win XP: \Documents and Settings\<username>\Application Data\Adobe\Acrobat\9.0\Preferences\Redaction\<locale>\SearchRedactPatterns.xml
    Win 7 and Vista: \Users\<username>\AppData\Roaming\Adobe\Acrobat\9.0\Preferences\Redaction\<locale>\SearchRedactPatterns.xml
    Mac Intel: /Users/<username>/Library/Preferences/Acrobat/9.0_x86/Redaction/<locale>/SearchRedactPatterns.xml
    Mac PPC: /Users/<username>/Library/Preferences/Acrobat/9.0/Redaction/<locale>/SearchRedactPatterns.xml
    For Acrobat X:
    Windows XP: \Documents and Settings\<username>\Application Data\Adobe\Acrobat\10.0\Preferences\Redaction\<locale>\SearchRedactPatterns.xml
    Windows 7 and Vista:
    Mac Intel: /Users/<username>/Library/Preferences/Adobe/Acrobat/10.0/Redaction/<locale>/SearchRedactPatterns.xml
    **Can’t see files on Windows?**
    1. Go to the Control Panel.
    2. Choose Folder Options.
    3. Click on the View tab.
    4. Find Hidden Files and Folders in the list and double-click to open it.
    5. Enable “Show hidden files and folders”.
  3. Make a backup copy of the SearchRedactPatterns.xml file, e.g. name the copy SearchRedactPatterns.bak.
  4. Right-click on the SearchRedactPatterns.xml file and choose Open With > Notepad.
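The backup step can also be scripted if you prefer not to copy the file by hand. This is a minimal sketch in Python, not anything Acrobat provides: the function name backup_pattern_file is hypothetical, and you supply the full path to SearchRedactPatterns.xml yourself from the list above.

```python
import shutil
import sys
from pathlib import Path


def backup_pattern_file(xml_path):
    """Copy the pattern file to a sibling .bak file and return the backup path."""
    source = Path(xml_path)
    if not source.is_file():
        raise FileNotFoundError(f"Pattern file not found: {source}")
    backup = source.with_suffix(".bak")
    if backup.exists():
        # Refuse to overwrite: an existing backup may be the only pristine copy.
        raise FileExistsError(f"Backup already exists: {backup}")
    shutil.copy2(source, backup)  # copy2 also preserves timestamps
    return backup


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: python backup_patterns.py <path to SearchRedactPatterns.xml>
    print(f"Backup written to {backup_pattern_file(sys.argv[1])}")
```

The script deliberately stops if a .bak file is already there, so running it a second time after editing cannot replace the original backup with an edited copy.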

Editing the Pattern File

  1. Open **SearchRedactPatterns.xml** in a text editor if you haven’t already.
  2. Scroll down to find Entry 4:
    <set name="Entry4">
      <str name="displayName">
        <val>Email Addresses</val>
      </str>
      <str name="regEx" translate="no">
        <val>([a-zA-Z0-9_])([a-zA-Z0-9_\-\.])*@([a-zA-Z\-])+\.([a-zA-Z\.]+)</val>
      </str>
      <str name="examples">
        <val>This pattern will search for email addresses.
        For example:</val>
      </str>
    </set>
  3. Copy that block of text and place it just before the last tag in the file, which is </asf>.
  4. Edit the block to add a new entry number, REGEX pattern, and descriptive example text:
    A) Change the entry name from Entry4 to Entry6.
    B) Change the display name to the name you want to appear in Acrobat, e.g. Canadian Social Insurance Number.
    C) Replace the regEx value with the pattern you created and tested.
    D) Change the description and examples. Optional, but nice for users.
    Careful! Do not eliminate any quotes (" ") or brackets (< >) or you will break the code. Only fill in between the opening <val> and closing </val> tags.
    For example, my new Entry 6 for Canadian Social Insurance numbers looks like this:
    <set name="Entry6">
      <str name="displayName">
        <val>Canadian Social Insurance Number</val>
      </str>
      <str name="regEx" translate="no">
        <val>(\b)((\d{3}(-|\s|\.|_)\d{3}(-|\s|\.|_)\d{3})|(\d{9}))(\b)</val>
      </str>
      <str name="examples">
        <val>This pattern will search for 9-digit Canadian Social Insurance numbers, either consecutive or 3 digits plus 3 digits plus 3 digits (separated by punctuation marks).
        For example: 123-456-789 123456789</val>
      </str>
    </set>
  5. Save the file.
  6. Restart Acrobat and choose View > Toolbars > Redaction.
  7. Click on the Search and Redact button. Your pattern should appear in the list.
**I edited the file but the patterns aren’t there!** Make sure that you edited the file in the Preferences folder. There is an identical file one level up that is easy to grab by mistake.
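Another common cause is a damaged tag or quote in the edited file. A quick sanity check is to run the file through an XML parser before restarting Acrobat. The sketch below uses Python's standard xml.etree.ElementTree module. The function name list_patterns is hypothetical, and the sample text is a trimmed stand-in built from the entry shown above; I am assuming the real file has no XML namespace and that entries are <set> elements somewhere under the <asf> root.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for SearchRedactPatterns.xml, modelled on the entry above.
SAMPLE = r"""<asf>
  <set name="Entry6">
    <str name="displayName">
      <val>Canadian Social Insurance Number</val>
    </str>
    <str name="regEx" translate="no">
      <val>(\b)((\d{3}(-|\s|\.|_)\d{3}(-|\s|\.|_)\d{3})|(\d{9}))(\b)</val>
    </str>
  </set>
</asf>"""


def list_patterns(xml_text):
    """Parse the pattern file and return (entry name, display name, regex) tuples.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed.
    """
    root = ET.fromstring(xml_text)
    entries = []
    for entry in root.iter("set"):
        fields = {}
        for field in entry.findall("str"):
            value = field.find("val")
            fields[field.get("name")] = (value.text or "") if value is not None else ""
        if "regEx" in fields:
            entries.append((entry.get("name"), fields.get("displayName", ""), fields["regEx"]))
    return entries


for name, display_name, regex in list_patterns(SAMPLE):
    print(f"{name}: {display_name} -> {regex}")
```

To check the real file, pass its contents instead of SAMPLE, for example the bytes returned by pathlib.Path(...).read_bytes(). Keep in mind that a pattern containing <, > or & has to be written with XML escapes (&lt;, &gt;, &amp;) inside the <val> tags, or the parser will reject the file.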

Happy searching!