Under the Hood with Visits and Visitors

Today I’d like to discuss how SiteCatalyst keeps track of page views and organizes them into visits and visitors. Since visits and visitors represent KPIs (or are involved in KPIs) for many of our customers, understanding how they work can be immensely helpful both in executing a successful implementation and in interpreting/resolving potential implementation issues.

All of this centers around a persistent cookie called s_vi, which stores the visitor ID provided by Omniture’s servers whenever a user first visits your site. This cookie is set on your specified data collection domain, which is the domain of the Omniture image request (i.e. server call); this domain ends with 2o7.net if your site uses third-party cookies, and is your domain with a special subdomain (e.g., metrics.yoursite.com) if you are using first-party cookies. The s_vi cookie has a lifetime of five years, so it’s about as persistent as can be.

When a user arrives at your site for the first time, the Omniture JavaScript code downloads and executes, and an “image request” is sent to Omniture’s data collection servers. Because this is the user’s first-ever visit to your site, no s_vi cookie exists yet on your data collection domain. This triggers a 302 redirect, which allows the data collection server to generate a visitor ID value and write it to a new s_vi cookie on your data collection domain. Take a look at the screen shot below.

Here you can see the 302 redirect, followed by a 200 status, which indicates a successful and normal image request and transfer of page view data into SiteCatalyst. You will also notice that this second request (highlighted) shows an s_vi cookie value in the last row under “Request Header Value.”

All subsequent page views that this user generates will pass this cookied visitor ID into SiteCatalyst in the HTTP header, and therefore will not generate additional 302 redirects. A couple of points to note:

  1. In the first image request (200 status) following the redirect, you will see an extra parameter passed in the request; “vidn=” will be set to the visitor ID value stored in the cookie. This will not occur on subsequent page views.
  2. This first image request will also contain a parameter, “pccr=true,” which is set to prevent an endless loop of redirects if the cookie cannot actually be set (as may occur when, for example, the user has disabled persistent cookies in his web browser).
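
To make the handshake easier to picture, here is a toy model of it in Python. This is only a sketch of the behavior described above, not Omniture's server code: the function name, the Response class, and the way the ID is generated (random hex) are all made up for illustration. Only the s_vi cookie and the vidn and pccr parameters come from the description above.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Response:
    status: int                                    # 302 (redirect) or 200 (hit accepted)
    set_cookie: Optional[str] = None               # new s_vi value, if one was issued
    redirect_params: Optional[Dict[str, str]] = None  # query parameters for the follow-up request


def handle_image_request(s_vi: Optional[str], params: Dict[str, str]) -> Response:
    """Hypothetical model of how a collection server answers one image request."""
    if s_vi:
        # Returning visitor: the cookie came along in the HTTP header.
        return Response(status=200)
    if params.get("pccr") == "true":
        # We already redirected once and the cookie still is not there,
        # so accept the hit instead of redirecting forever.
        return Response(status=200)
    # First-ever hit: issue an ID (random hex here, purely an assumption)
    # and redirect with vidn and pccr added to the request.
    new_id = secrets.token_hex(8).upper() + "-" + secrets.token_hex(8).upper()
    follow_up = dict(params, vidn=new_id, pccr="true")
    return Response(status=302, set_cookie=new_id, redirect_params=follow_up)
```

Calling handle_image_request(None, {...}) returns the 302; replaying the follow-up parameters with the new cookie returns the 200, and replaying them without a cookie (cookies disabled) also returns a 200 rather than a second redirect.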

Why does this matter?

It matters because this visitor ID is the glue that binds multiple page views together and allows SiteCatalyst to interpret them as distinct visits and visitors.

As page views flow in to Omniture’s data collection and processing servers, SiteCatalyst examines the visitor ID value passed in the s_vi cookie; if it does not have a record of another page view within the last 30 minutes from that same visitor ID in the given report suite, then it knows to count a new visit. If it does have a record of a page view within the last 30 minutes from this visitor ID, then it continues a visit; pathing data can then be derived by comparing timestamps to sequentialize these page views.
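
The 30-minute rule can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not SiteCatalyst's actual processing: the function name and the tuple layout are hypothetical, and I am treating a gap of exactly 30 minutes as still being within the visit.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)


def count_visits(hits):
    """Count visits in an iterable of (report_suite, visitor_id, timestamp) tuples.

    A new visit starts when a visitor ID has no earlier page view in the
    same report suite within the last 30 minutes.
    """
    last_seen = {}  # (report_suite, visitor_id) -> timestamp of latest page view
    visits = 0
    # Process page views in time order, as pathing would.
    for suite, visitor_id, timestamp in sorted(hits, key=lambda hit: hit[2]):
        key = (suite, visitor_id)
        previous = last_seen.get(key)
        if previous is None or timestamp - previous > VISIT_TIMEOUT:
            visits += 1  # nothing recent on record: count a new visit
        last_seen[key] = timestamp
    return visits
```

Note that the lookup key includes the report suite, so the same visitor ID showing up in two report suites is counted separately in each.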

Based on this explanation, you may already have inferred how SiteCatalyst can track unique visitors (daily, weekly, monthly, etc.). If a page view is passed to Omniture servers with a visitor ID that hasn’t been seen in the given report suite during that day, then the page view represents a new daily unique visitor; if the visitor ID has not been seen during the current month, then it is a new monthly unique visitor, and so on.
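
Here is a minimal Python sketch of that idea for a single report suite. The function name and input layout are hypothetical, and real processing is certainly more involved; the point is only that a visitor ID counts once per period, no matter how many page views it generates.

```python
from datetime import datetime


def unique_visitors(hits, period_format):
    """Count distinct visitor IDs per period for one report suite.

    hits: iterable of (visitor_id, timestamp) tuples.
    period_format: strftime pattern naming the period, for example
    "%Y-%m-%d" for daily uniques or "%Y-%m" for monthly uniques.
    """
    seen = set()  # (period, visitor_id) pairs already counted
    counts = {}
    for visitor_id, timestamp in hits:
        period = timestamp.strftime(period_format)
        if (period, visitor_id) not in seen:
            seen.add((period, visitor_id))  # first sighting in this period
            counts[period] = counts.get(period, 0) + 1
    return counts
```

Running the same hits through "%Y-%m-%d" and then "%Y-%m" shows why monthly uniques are not the sum of daily uniques: a visitor who comes back on several days is counted once per day but only once for the month.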

As new visits are detected, SiteCatalyst can also compare the visitor ID to historical data to determine if it has ever been seen before in the given report suite. If so, it is a repeat visitor; if not, it is a new visitor. This enables SiteCatalyst to record the total number of visits for various visitors.

Again, a couple of points to keep in mind:

  1. SiteCatalyst does not track visits or pathing data for users who have disabled persistent cookies in their web browsers, nor does it track this data for users whose browsers are fundamentally incapable of accepting cookies. A primary reason for this is that without the s_vi cookie being set and passed on each page view, SiteCatalyst must fall back on the user’s IP address and user-agent string to derive a visitor ID during data processing.
  2. As has been mentioned previously, the s_vi cookie is set on your data collection domain. You may have multiple report suites and web sites sharing a data collection domain; this should not cause mis-calculation of paths, visits, or unique visitors. While a user who visits multiple sites that use the same data collection domain may therefore pass the same visitor ID into multiple report suites, SiteCatalyst only compares these page views to prior visitor ID records within the given report suite.

A closer look at s_vi

Recently, a growing number of sites have begun using non-JavaScript data collection methods such as the Data Insertion API and full-processing Data Sources. The data passed into SiteCatalyst by these means should, wherever possible, be tied back to visits and visitors established using JavaScript data collection, so that revenue and other metrics are correctly attributed to the users who generated them.

One way to do this is to take ownership of the visitor ID process by assigning and passing your own visitor ID values, which supersede the cookied value when passed correctly. Another way is to read the s_vi cookie, parse it, pull out the actual visitor ID, and pass it using your chosen alternative data collection method.

Here’s a quick example: Some retail sites have recently switched over to the Data Insertion API to pass order data; all other site data is passed using JavaScript. This means that visits and visitors are first counted using JavaScript, but the visit may conclude with a Data Insertion API call. If you are a developer for a retail site considering this switch, you’ll want to ensure that the orders passed using the API are tied to visits that began with JavaScript requests. To do this, you will need to ensure that the same visitor ID passed by JavaScript is passed by the API post.

Warning: If you choose to do this by reading the s_vi cookie, please be aware that the format/syntax within the cookie may change in the future. Omniture has committed to provide as much advance warning as possible should such a change be implemented, but it is worth noting this caveat up front.

While I won’t explain here how you can read and parse a cookie, nor how to set a visitor ID using the Data Insertion API, I will show you the contents of the s_vi cookie below so that you can grab the visitor ID for any purpose.

The important portion of the cookie shown above is 49C083D100005A40-A02087F00000039, and similar pairs of 16-character strings will be found in all other s_vi cookies. This is the visitor ID set by Omniture’s servers on the user’s first page view, and it is the value you would need to capture out of the cookie and pass as the visitor ID using an alternative data collection method.
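
If you do go the cookie-reading route, one cautious way to pull the ID out is to search for the hex pair itself rather than depend on whatever surrounds it. The Python sketch below does that. The function name is my own, and the pattern is an assumption based only on the sample value above; since the cookie format may change, treat this as a starting point and not a guaranteed parser.

```python
import re
from typing import Optional

# Two runs of hex characters joined by a hyphen. The 8-16 length range is a
# deliberately loose assumption so the sample ID above still matches.
VISITOR_ID_PATTERN = re.compile(r"[0-9A-Fa-f]{8,16}-[0-9A-Fa-f]{8,16}")


def extract_visitor_id(s_vi_value: str) -> Optional[str]:
    """Return the first hex-pair visitor ID found in an s_vi value, or None."""
    match = VISITOR_ID_PATTERN.search(s_vi_value)
    return match.group(0) if match else None
```

Returning None when nothing matches lets the calling code fall back to another approach (such as assigning its own visitor ID) instead of passing along a bad value.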

Here are two more items to note regarding the visitor ID stored in the s_vi cookie:

  1. Note that this value is hex-encoded (i.e., the range of characters is 0-F rather than 0-9); it is decoded by Omniture’s servers and is ultimately stored as an integer value.
  2. If your site uses third-party cookies, the s_vi cookie is stored on 2o7.net and your servers will not be able to read it. If this is the case and you are planning to implement a non-JavaScript data collection method, Omniture recommends assigning your own visitor IDs to all users (even on JavaScript-tagged pages, using the s.visitorID variable). Details are available in the SiteCatalyst Implementation Manual.
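
To illustrate the hex encoding mentioned in the first item, this short Python sketch decodes each half of the ID into an integer. The function name is hypothetical, and how Omniture's servers combine or store the two halves is not something I am describing here; the sketch only shows what "hex-encoded" means in practice.

```python
from typing import Tuple


def decode_visitor_id(visitor_id: str) -> Tuple[int, int]:
    """Decode a 'HEX-HEX' visitor ID into a pair of integers."""
    high_hex, low_hex = visitor_id.split("-")
    # int(..., 16) interprets the characters 0-9 and A-F as base-16 digits.
    return int(high_hex, 16), int(low_hex, 16)
```

If you are passing the ID along through another data collection method, you would normally send the string exactly as it appears in the cookie; decoding is shown only for understanding.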

How can this help you?

Perhaps the “value proposition” for this blog post should have come a few thousand words ago, but after trying to articulate it at the beginning of the post, I realized that it would only make sense coming after the bulk of the information. Let me list just a few ways that understanding visitor IDs and the s_vi cookie can help you in your implementation and reporting efforts.

  1. Using a packet monitor, you can check the s_vi value on successive hits as you browse your site. If the s_vi cookie value ever changes from page view to page view, you may have inconsistencies in your implementation that will inflate visit and visitor counts, as well as those of certain other metrics. Specifically, check to ensure that all of your pages use the same s_code.js file, or at least that all of the s_code.js files on the given site are set to use the same data collection domain. If using first-party cookies, this is set in the s.trackingServer and s.trackingServerSecure variables. If using third-party cookies, this is set using the s.visitorNamespace variable. Warning: If you find a discrepancy in these variables, please contact Omniture ClientCare before making changes.
  2. As has just been discussed, understanding how the s_vi cookie works allows you to use the visitor ID to tie data passed using methods other than JavaScript back to visits and visitors recorded on your web site using JavaScript.
  3. Generally, this information gives you a context for understanding visit-, visitor-, and pathing-based data. If you really dig into it, I’ll bet you can even use the information I’ve presented here to figure out how SiteCatalyst causes eVars to persist (hint: it isn’t in a cookie on the user’s computer!). Since knowledge is power, if you know how SiteCatalyst calculates this data, you’re empowered to use it in exciting new ways, and even to invent your own solutions to critical business questions!
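
For the first item, if your packet monitor lets you export the Cookie request header from each hit, a small script can do the comparison for you. The Python below is a rough sketch under that assumption; both function names are my own, and the input is simply a list of Cookie header strings in the order the hits were sent.

```python
from typing import List, Optional


def s_vi_from_cookie_header(cookie_header: str) -> Optional[str]:
    """Return the s_vi value from a 'name=value; name=value' Cookie header."""
    for pair in cookie_header.split(";"):
        name, _, value = pair.strip().partition("=")
        if name == "s_vi":
            return value
    return None


def find_s_vi_changes(cookie_headers: List[str]) -> List[int]:
    """Return the indexes of hits whose s_vi differs from the last one seen.

    A hit with no s_vi cookie at all is also flagged once a value has been seen.
    """
    changes = []
    previous = None  # last s_vi value actually present on a hit
    for index, header in enumerate(cookie_headers):
        current = s_vi_from_cookie_header(header)
        if previous is not None and current != previous:
            changes.append(index)
        if current is not None:
            previous = current
    return changes
```

An empty result means the visitor ID stayed stable across the captured hits; any index in the list points you to the page view worth inspecting.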

Admittedly, this post is fairly dense, and yet could easily be spun off in a number of different directions. If there’s anything you’d like to see as the subject of a future post, please let me know. As always, I’m available by e-mail (omniturecare at omniture dot com) and on Twitter (@OmnitureCare) and I’d love to hear from you!