Internal search implementation and (a few) best practices

A few weeks ago, a SiteCatalyst user asked me whether Omniture products could measure and help optimize internal search engine data—the keywords that users are searching to find products, content, etc. within your site. This was one of those questions that make me smile, because I can answer confidently and affirmatively. It not only can be done—it probably should be done for just about any site featuring a search engine. After all, how better to determine what your users want than by examining their search tendencies? There are only a handful of chances in the user experience to learn so much valuable information about your user base.

Figure out why internal search terms are valuable and implement around that

Not surprisingly, businesses will use internal search information differently. For example, a retail site is probably interested in the keywords that convert most effectively into orders, as well as the keywords that return no data (since this can sometimes identify holes in your catalog of products). A media site, on the other hand, may be more focused on the traffic generated by each search term, as well as the banner ad click-throughs which follow. While the implementation of internal search measurement may be similar across different business needs, it is nevertheless important to keep in mind why you care in the first place; it ensures that your implementation strategy will provide the data and key metrics that you really want.

Use a Custom Conversion (eVar) variable when you want the search term to persist, so that subsequent conversion metrics can be tied to it. For example, a user may perform an internal search at the very beginning of his/her visit, but not convert until 20 page views later. Use an eVar to allow that search term to receive credit for the order that occurred much later on. Using an eVar will also give you a total count of the number of searches performed in the Instances metric, allowing you to create a calculated metric within this report, [Orders] / [Instances], to see which keywords are most and least effective at producing conversion.
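To make the [Orders] / [Instances] idea concrete, here is a minimal sketch of the same arithmetic done outside SiteCatalyst. The rows, keywords, and numbers are invented for illustration, and the function name is my own; in practice you would simply build the calculated metric in the report.

```javascript
// Hypothetical rows, as if exported from an eVar report: one per search keyword.
// All figures are made up for illustration.
var rows = [
  { keyword: "handmade toys", instances: 200, orders: 10 },
  { keyword: "wooden blocks", instances: 50, orders: 5 },
  { keyword: "gift card", instances: 0, orders: 0 }
];

// Orders / Instances, guarding against division by zero.
function ordersPerSearch(row) {
  return row.instances > 0 ? row.orders / row.instances : 0;
}

// Rank keywords from most to least effective at producing orders.
var ranked = rows
  .map(function (row) {
    return { keyword: row.keyword, rate: ordersPerSearch(row) };
  })
  .sort(function (a, b) { return b.rate - a.rate; });

console.log(ranked);
```

Note that the less popular keyword ranks first here: volume and effectiveness are different questions, which is why the ratio is worth looking at.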

A Custom Traffic (s.prop) variable measuring internal search terms is, as the name suggests, traffic-oriented. It can show you the number of page views, visits, and daily/weekly/monthly unique visitors per keyword on your site. This is ideal for businesses that care about searches per visit (or per visitor), and also for those that care about pathing. I’ll discuss pathing by search keyword more below.

It’s worth noting that I suspect that many of you will want to use both an eVar and an s.prop to capture internal keywords, and this is just fine. In fact, it’s common. Best of both worlds, right?

How to populate these variables is described in the SiteCatalyst Knowledge Base:

There are two recommended approaches for populating a Custom Traffic (s.prop) variable with internal search keyword data. One is to use server-side variables to write out the desired variable and search keyword, and the other is to use the getQueryParam plug-in to capture this data from the query string in the URL of your search results page and pass it into a variable. In the examples below, we will use s.prop3 as the destination variable for your internal search tracking, but you can use any Custom Traffic or Custom Conversion (s.eVar) variable for this purpose.

Server-side approach

The specifics of this method will vary depending on your server-side language of choice and implementation. In short, your server should have access to the search keyword, either in a GET or a POST variable, and you can copy it over to a regular variable, do any desired manipulation, and then write the keyword to the page.

/* You may give each page an identifying name, server, and channel on the next lines. */
s.pageName="Search Results"
s.channel="my site section"
s.prop1="user search"
s.prop2=""
<?php echo 's.prop3="' . $_GET['keyword'] . '"'; ?>
s.prop4=""
s.prop5=""

If the user had searched for “little saplings handmade toys,” the result would be that “little saplings handmade toys” would be written out, and passed into SiteCatalyst on the page load, as the value of s.prop3:

s.prop3="little saplings handmade toys"

getQueryParam approach

A more common option is to allow the getQueryParam plug-in to capture your internal search keywords and to pass them into a variable of your choosing. The majority of site search engine implementations will give you the user’s keyword in the query string of the results page, and SiteCatalyst can grab it. For instance, if the user searched your site for “little saplings handmade toys,” the search results page might have a URL similar to this:

http://www.yoursite.com/search/results.html?q=little+saplings+handmade+toys

In this case, you could use the getQueryParam plug-in to look up the value of the “q” parameter and capture it in s.prop3. (Note that the plus signs are automatically stripped and replaced with spaces.) For example, you might include the following within the s_doPlugins function in your s_code.js file:

s.prop3=s.getQueryParam('q')
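If you are curious what a lookup like this amounts to, here is a rough stand-in written with the standard URL API. To be clear, this is not the plug-in's source: readQueryParam is a name I made up, and returning an empty string for a missing parameter is my own choice for the sketch.

```javascript
// Illustrative stand-in for a query-string lookup.
// This is NOT the getQueryParam plug-in; the function name is hypothetical.
function readQueryParam(pageUrl, name) {
  // searchParams.get() decodes "+" and percent-escapes for us.
  var value = new URL(pageUrl).searchParams.get(name);
  // Assumption for this sketch: a missing parameter becomes an empty string.
  return value === null ? "" : value;
}

var resultsUrl =
  "http://www.yoursite.com/search/results.html?q=little+saplings+handmade+toys";
var keyword = readQueryParam(resultsUrl, "q");

console.log(keyword);
```

Run against the example URL above, the plus signs come back as spaces, which is the behavior you want in your reports.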

Make sure to standardize the case of search terms

The point of passing internal search terms into SiteCatalyst is to determine the popularity of various values over time and their effect on success, however you define it. As such, you probably want to group different case variations of the same term, assuming that these variations return the same results. You don’t care to see “Little Sapling handmade toys” and “little sapling Handmade TOYS” as separate line items, because the search results (and, thus, the user experience based on this search) are almost certainly going to be the same. (NOTE: For sites with a high traffic volume and tons of internal searches, case issues can also increase the total unique values in reports significantly.)

So, if you’re using JavaScript to capture internal search terms, make sure to call toLowerCase() on the value that holds the keyword. For example, you might do something like this, building on the example above:

s.prop3=s.getQueryParam('q').toLowerCase();

For server-side languages, you would use something like the strtolower() function in PHP to do the same thing.
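Since many of you will be setting both a prop and an eVar, it can be tidy to normalize the keyword once and assign it to both. The sketch below is one way that might look. The helper names are hypothetical, eVar3 is just an example slot, the trimming and whitespace collapsing go a step beyond the lower-casing discussed above, and the empty s object only stands in for the real SiteCatalyst object so the snippet runs on its own.

```javascript
// Stand-in for the SiteCatalyst "s" object so this sketch is self-contained.
var s = {};

// Hypothetical helper: lower-case, collapse repeated whitespace, and trim.
function normalizeKeyword(raw) {
  return String(raw || "").toLowerCase().replace(/\s+/g, " ").trim();
}

// Hypothetical helper: copy the normalized keyword into both variable types.
function trackSearchKeyword(tracker, raw) {
  var keyword = normalizeKeyword(raw);
  if (keyword) {
    tracker.prop3 = keyword; // Custom Traffic variable (traffic and pathing)
    tracker.eVar3 = keyword; // Custom Conversion variable (slot is an assumption)
  }
  return tracker;
}

trackSearchKeyword(s, "  Little Sapling  Handmade TOYS ");
console.log(s.prop3, "|", s.eVar3);
```

Skipping the assignment when the keyword is empty keeps blank searches from showing up as their own line item; whether you want that is a judgment call for your site.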

Capture the number of search results—especially zero—in a separate variable

On top of this, you can pass plenty of other useful information into other conversion and traffic variables. For example, if your internal search engine returns the count of results for each search, you can capture this information in an eVar to see how the number of results affects conversion: are the search results targeted and accurate based on what the user is searching for, or do you confuse potential customers with many irrelevant results?

When no results are returned, pass a zero or “null” into this variable, so that you can break down “null” by the keywords which returned no results. This will help you understand what your users are searching for in vain. It can also help you understand where your product metadata isn’t speaking the same language as your potential customers.
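Here is a small sketch of how the result count might be captured next to the keyword. The variable slots (eVar3, eVar4), the function name, and the literal "zero" label are all assumptions on my part; use whatever variables and label suit your report suite.

```javascript
// Stand-in for the SiteCatalyst "s" object so this sketch is self-contained.
var s = {};

// Hypothetical helper: record the keyword and how many results it returned.
function trackResultCount(tracker, keyword, resultCount) {
  tracker.eVar3 = keyword.toLowerCase();
  // Use an explicit label for empty result sets so they are easy to
  // find and break down by keyword. The label itself is arbitrary.
  tracker.eVar4 = resultCount > 0 ? String(resultCount) : "zero";
  return tracker;
}

trackResultCount(s, "Little Saplings Handmade Toys", 0);
console.log(s.eVar3, "->", s.eVar4);
```

A text label rather than a bare number 0 is used here simply to make the empty case unmistakable in the report.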

Use SAINT to combine singular and plural keywords (where appropriate)

Just a brief point here: your search engine may return the same results for singular and plural forms of a search term (e.g., the plural “Little Saplings handmade toys” versus the singular “Little Saplings handmade toy”). In this case, I would recommend passing the search keyword into SiteCatalyst as-is, then (if desired) using SAINT classifications to “group” these similar values. The singular and plural forms of the keyword would both be key values in your SAINT upload, and a single classification column, with the same value for both the singular and plural forms of each keyword, would give you an additional report where these variations are combined into one.
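As a rough illustration of the key-to-classification idea, the sketch below turns a hand-maintained mapping into tab-separated rows. This is a simplification: the real SAINT template has its own header layout, which you should download from SiteCatalyst rather than build by hand, and the grouped label and function name here are my own inventions.

```javascript
// Hand-maintained mapping: raw keyword (the key) -> grouped classification value.
// The grouped label "little saplings handmade toy(s)" is just an example.
var groups = {
  "little saplings handmade toys": "little saplings handmade toy(s)",
  "little saplings handmade toy": "little saplings handmade toy(s)"
};

// Hypothetical helper: one tab-separated row per key.
// This does not reproduce the header lines of a real SAINT template.
function toClassificationRows(mapping) {
  return Object.keys(mapping).map(function (key) {
    return key + "\t" + mapping[key];
  });
}

var classificationRows = toClassificationRows(groups);
console.log(classificationRows.join("\n"));
```

I have deliberately kept the mapping manual instead of stripping a trailing "s" automatically, since that would wrongly merge keywords that your search engine treats differently.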

Pathing on search terms

As mentioned earlier, you can have pathing enabled for Custom Traffic variables, and this allows you to see how users’ interaction with your internal search engine evolves. These reports will display not just individual search keywords and their popularity, but the actual series of searches performed. For example, what does this “search keyword path” tell you?

apple imac > apple bluetooth mighty mouse > apple bluetooth keyboard

There are a few possibilities, but these might be users who are interested in purchasing not only an iMac, but also Bluetooth accessories. Do you need to add a “Recommended Items” section to help users locate these products more easily, so they don’t need to perform search after search after search? If you already do product recommendations, is there a reason users aren’t finding these accessories there?
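If it helps to picture what a pathing report is counting, here is a toy version of the idea. SiteCatalyst does this for you once pathing is enabled; the visit data and function name below are invented and do not reflect any SiteCatalyst export format.

```javascript
// Hypothetical data: the ordered list of searches made in each visit.
var visits = [
  ["apple imac", "apple bluetooth mighty mouse", "apple bluetooth keyboard"],
  ["apple imac", "apple bluetooth mighty mouse", "apple bluetooth keyboard"],
  ["apple imac", "apple bluetooth keyboard"]
];

// Count how often each full sequence of searches occurs.
function countPaths(visitSearches) {
  var counts = {};
  visitSearches.forEach(function (searches) {
    var path = searches.join(" > ");
    counts[path] = (counts[path] || 0) + 1;
  });
  return counts;
}

var pathCounts = countPaths(visits);
console.log(pathCounts);
```

A path that shows up again and again, like the iMac-then-accessories sequence here, is a candidate for a shortcut such as a recommendations module.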

This is just the tip of the iceberg: internal keyword pathing opens a world of powerful optimization opportunities. It is absolutely possible, completely customizable to your needs, and fairly straightforward for you or your developers to implement. As always, please leave a comment with any questions, thoughts, or suggestions that you may have! I’m also available on Twitter, FriendFeed, LinkedIn, or by e-mailing omniture care [at] omniture dot com.