Measuring “logged in” visitors without cookies

Recently, I participated in a great little conversation on Twitter regarding how to handle a very specific reporting need. A user asked how he could tie individual user IDs on his site back to data from the user’s first visit. For example, this user needs to see the original entry page for various users. This seems fairly straightforward on the surface of it, but when you take into account the effects of clearing cookies, changing browsers, changing machines, etc., it can become much harder.

Typically, we would treat these uncontrollable phenomena that affect unique visitor tracking as a fact of life in the web analytics game. The method I am about to suggest is definitely not for everyone and should only be considered where a system is in place to assign unique IDs to individual visitors in a reliable manner. By this, I mean that the unique ID must be assignable to a user on every single request that the user generates, without exception. It’s an awesome solution in the right environment (intranets, webmail apps, desktop apps, etc.) where you have a way to identify the user on every page view. If there are “logged out” page views, though, please proceed with extreme caution.

I should note that it’s also great for sites that, for whatever reason, do not want to set a cookie. This doesn’t necessarily get around cookie deletion and other similar limitations of visit/visitor measurement, but setting s.visitorID will prevent new s_vi (visitor ID) cookies from being set on users’ computers. Thus, the variable allows you to set your own visitor ID cookie and pass its value into SiteCatalyst.
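As a rough illustration of passing your own cookie's value into SiteCatalyst, here is a minimal sketch. The cookie name my_visitor_id and both helper functions are hypothetical and exist only for this example; s stands for the SiteCatalyst tracking object.

```javascript
// Reads one cookie value out of a cookie string such as document.cookie.
// Returns null when the cookie is not present.
function readCookie(cookieString, name) {
  for (const pair of cookieString.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue; // skip malformed entries
    const key = pair.slice(0, index).trim();
    if (key === name) {
      return decodeURIComponent(pair.slice(index + 1).trim());
    }
  }
  return null;
}

// Copies the value of the (hypothetical) my_visitor_id cookie into s.visitorID.
// Leaves s untouched when the cookie is missing.
function applyVisitorIdFromCookie(s, cookieString) {
  const id = readCookie(cookieString, "my_visitor_id");
  if (id !== null) {
    s.visitorID = id;
  }
  return id;
}
```

In a browser you would call applyVisitorIdFromCookie(s, document.cookie) before the page view is sent. How the cookie gets set in the first place is up to your own site.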

(Admittedly, what I describe below doesn’t completely address the need of the user who originally asked me about it. I think it’s still worth covering, though.)

Beginning with code version H.9, the SiteCatalyst JavaScript code allows you to assign your own visitor ID to users on your site; the Data Insertion API contains this functionality in the form of the visitorID element. This means that you can marry a visitor ID to each user ID on your site and then pass that visitor ID whenever the given user ID views a page on your site. For example:

User ID       Visitor ID
bgaines       16278468165
paurigemma    79018759816
jlebaron      67859175781
cknoch        47698185678

Then, whenever a user hits your site, your servers would detect the user ID (e.g., bgaines), and look up the appropriate visitor ID from the table (16278468165) and place this into the s.visitorID variable:

s.pageName="Intranet Home Page"
s.visitorID="16278468165"
s.channel="Home"
...
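The lookup step might be sketched roughly as follows. This is only an illustration, not Omniture-supplied code: the VISITOR_IDS map stands in for whatever table your servers actually use, and lookupVisitorId and renderTrackingVars are hypothetical names.

```javascript
// Hypothetical lookup table: user ID -> visitor ID (values from the example above).
const VISITOR_IDS = new Map([
  ["bgaines", "16278468165"],
  ["paurigemma", "79018759816"],
  ["jlebaron", "67859175781"],
  ["cknoch", "47698185678"],
]);

// Returns the visitor ID mapped to a user ID, or null if there is no mapping.
function lookupVisitorId(userId) {
  return VISITOR_IDS.has(userId) ? VISITOR_IDS.get(userId) : null;
}

// Builds the variable assignments the server would write into the page.
// Throws instead of emitting a page that lacks s.visitorID.
function renderTrackingVars(userId, pageName, channel) {
  const visitorId = lookupVisitorId(userId);
  if (visitorId === null) {
    throw new Error("No visitor ID mapped for user: " + userId);
  }
  return [
    "s.pageName=" + JSON.stringify(pageName),
    "s.visitorID=" + JSON.stringify(visitorId),
    "s.channel=" + JSON.stringify(channel),
  ].join("\n");
}
```

Calling renderTrackingVars("bgaines", "Intranet Home Page", "Home") produces the three assignments shown above. Failing loudly on an unmapped user is a design choice for this sketch: a page that goes out without s.visitorID is exactly the case this approach needs to avoid.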

This is actually pretty cool. Cookies are a good method for associating visit/visitor behavior (as described in a previous post), but if you can assign your own visitor ID values, you aren’t bound by cookie deletion or any of the other limiting factors mentioned above. When a user logs in, you can track him/her as a unique visitor regardless of browser/computer/cookie status. (Yes, this also means you can track visits for users who do not accept persistent cookies. Huzzah.)

A few warnings straight out of our Knowledge Base:

As described above, you must be able to set the s.visitorID variable on every page of the visit. If you cannot do this, then the s.visitorID implementation is not for you.

Similarly, if set, the s.visitorID value for a visitor should not change (in other words, if I can navigate your site anonymously and then sign in, only then allowing you to know who I am, make sure that this does not cause the visitor ID to change). If this cannot be reliably achieved, we recommend against this implementation strategy.

Any existing Omniture-set visitorID (stored in the s_vi cookie) will be migrated to the new s.visitorID value one time without cliffing the visitor (counting the visitor twice and inflating visitor counts).

We recommend pushing the update at the time of least traffic to the site. When the update is pushed, any active visits will end, causing an inflation of visit counts for the day. If the update happens when traffic is minimal, you reduce the effect of the spike.
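The requirement that a visitor ID never change once it has been set can be sketched as an "assign once, then always reuse" registry. Everything here is an assumption for illustration: createVisitorIdRegistry and randomDigits are hypothetical helpers, the in-memory Map stands in for a persistent store, and the 11-digit format simply mirrors the sample IDs in this post.

```javascript
// Creates a registry that hands out one visitor ID per user ID and never changes it.
// generateId is any function returning a candidate ID string.
function createVisitorIdRegistry(generateId) {
  const byUser = new Map(); // user ID -> visitor ID (stand-in for a database table)
  const used = new Set();   // visitor IDs already handed out

  return {
    getOrAssign(userId) {
      // An existing mapping always wins, so the ID stays stable across visits.
      if (byUser.has(userId)) {
        return byUser.get(userId);
      }
      // Regenerate until the candidate is not already assigned to another user.
      let id = generateId();
      while (used.has(id)) {
        id = generateId();
      }
      used.add(id);
      byUser.set(userId, id);
      return id;
    },
  };
}

// Illustrative generator: a string of random decimal digits.
function randomDigits(length) {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += Math.floor(Math.random() * 10);
  }
  return out;
}
```

A registry built with createVisitorIdRegistry(() => randomDigits(11)) returns the same value every time getOrAssign is called for a given user ID. In a real deployment the mapping would have to live somewhere durable, since losing it would change every user's visitor ID at once.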

The reason it’s so important to be perfectly consistent in assigning visitor IDs when using this system is that if Omniture code ever does not see an s.visitorID value, it will set an s_vi cookie, and begin to count data for a new visit/visitor. If you then begin to pass an s.visitorID value mid-visit, SiteCatalyst will use that instead, thus counting a new visit where there wasn’t really a new visit. Consider a user who views a “logged out” page and receives an Omniture visitor ID in the s_vi cookie. Then the user logs in, and his username is mapped to the visitor ID pulled from the s_vi cookie. Later, using a different computer, the same user hits another “logged out” page and receives a different visitor ID from Omniture. He then logs in using his former username.

Now you’ve got a problem: Either you change your mapping to reflect the new visitor ID, or you change the visitor ID that you’re using for this user. Either way, you’ve just inflated your visit and visitor counts. If you want to track logged-in users across browsers, my recommendation is to track only the logged-in pages of your site so that the non-logged-in hits won’t inflate visitors. Most sites don’t have enough logged-in users for this to be practical, but many do (such as webmail, intranets, on-demand software, etc.).
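One way to picture "track only the logged-in pages" is a small guard around the page view call. This is a sketch under assumptions: trackLoggedInPageView is a hypothetical wrapper, and s is assumed to be the SiteCatalyst tracking object whose t() method sends the page view.

```javascript
// Sends a page view only when a visitor ID is known for the current user.
// Returns true if a hit was sent, false if the page view was skipped.
function trackLoggedInPageView(s, visitorId, pageName) {
  if (!visitorId) {
    // Logged-out page: send nothing, so this hit cannot start a cookie-based visitor.
    return false;
  }
  s.visitorID = visitorId;
  s.pageName = pageName;
  s.t(); // page view call on the tracking object
  return true;
}
```

The trade-off is the one described above: logged-out traffic goes unmeasured in exchange for visitor counts that are not inflated by IDs changing mid-visit.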

As described above, implementing a system such as this can allow you to track visitor-based metrics (e.g., Visit Number, Original Referring Domain, etc.) without relying on cookies. This means that, for some types of web applications, a person who last visited your site in 2006 from Sheboygan, Wisconsin, using IE 6 can be identified, three years later, as the same person now logging in from Liverpool, England, using Firefox 3 in 2009.

As always, please feel free to follow me at OmnitureCare on Twitter and/or FriendFeed. I’m also available by e-mail at omniturecare@omniture.com and would love to hear from you via any of these channels!