Implementing Twitter Data Tracking in Omniture SiteCatalyst

Yesterday, my colleague, Adam Greco, wrote an outstanding post on the solution he debuted last week at Omniture Summit, which allows Twitter data to be pumped into SiteCatalyst, Discover, Data Warehouse, etc. using the Omniture Data Insertion API. Today I’d like to clarify the how-to of the solution so that you can get it up and running as smoothly as possible to begin reporting on your brand’s Twitter presence.

I won’t provide examples regarding the Data Insertion API framework, because of the wide variety in the types of environments where Twitter data can be captured and passed into SiteCatalyst; the developers at your organization can learn how to build this API using the resources at developer.omniture.com. Most importantly, keep in mind that the general implementation principles discussed below can be tweaked and altered according to your unique business needs, but the fundamentals regarding a.) the capture of this data from Twitter and b.) the compilation of this data into a SiteCatalyst image request are somewhat standard across all environments and implementations.

Adam used the example of a hypothetical web analyst at Comcast, who was asked by his CMO to provide data surrounding four key business questions that this solution can answer:

How often is your company mentioned on Twitter?
Is there ever a spike (positive or negative) in brand-related terms (in a week, day or even hourly)?
Who are the people most often mentioning your company on social media tools and who are they communicating with the most?
When are people on social media tools mentioning key product/service features that your Product Managers should know about?

Here is how you can obtain answers to each of these questions using Twitter’s search API and Omniture’s Data Insertion API.

How often is your company mentioned on tools like Twitter?

Queries to Twitter’s search API (using http://search.twitter.com/search.atom?q=omniture, where “omniture” is replaced with a keyword of interest to your business) return XML which follows this general form:

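<entry>
  <id>tag:search.twitter.com,2005:1242044829</id>
  <published>2009-02-23T19:58:41Z</published>
  <title>RT: Omni_man: Want to learn how to integrate Omniture SiteCatalyst and Twitter? Check out my latest blog post: http://is.gd/kzLk</title>
  <updated>2009-02-23T19:58:41Z</updated>
  <twitter:source>twhirl</twitter:source>
  <author>
    <name>jeffjordan (jeffjordan)</name>
    <uri>http://twitter.com/jeffjordan</uri>
  </author>
</entry>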
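To make the parsing step concrete, here is a minimal Python sketch that pulls the useful fields out of each entry. It assumes the response is a standard Atom feed (elements such as `<entry>`, `<id>`, `<title>`, and `<author>` in the Atom namespace); the sample feed and the `parse_entries` helper are hypothetical and written only for illustration.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace; assumed to be what the search feed uses.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample shaped like the search result shown above.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:search.twitter.com,2005:1242044829</id>
    <published>2009-02-23T19:58:41Z</published>
    <title>RT: Omni_man: Want to learn how to integrate Omniture SiteCatalyst and Twitter?</title>
    <updated>2009-02-23T19:58:41Z</updated>
    <author>
      <name>jeffjordan (jeffjordan)</name>
      <uri>http://twitter.com/jeffjordan</uri>
    </author>
  </entry>
</feed>"""


def parse_entries(feed_xml):
    """Return one dict per <entry> with the fields used later in this post."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entries.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "published": entry.findtext("atom:published", default="", namespaces=ATOM_NS),
            "text": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "author_name": entry.findtext("atom:author/atom:name", default="", namespaces=ATOM_NS),
            "author_uri": entry.findtext("atom:author/atom:uri", default="", namespaces=ATOM_NS),
        })
    return entries


if __name__ == "__main__":
    for tweet in parse_entries(SAMPLE_FEED):
        print(tweet["id"], tweet["author_name"])
```

In a real implementation the feed text would come from your HTTP client of choice rather than from a hard-coded string.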

For each “tweet” (Twitter post/comment) returned by this query to the Twitter Search API, you would build and post an Omniture Data Insertion API request similar to this one:

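<request>
  <scXmlVer>1.0</scXmlVer>
  <reportSuiteID>yourrsid</reportSuiteID>
  <pageURL>http://www.yoursite.com</pageURL>
  <pageName>Twitter Mention</pageName>
  <events>event1</events>
</request>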

where event1 is the event being used to count up the number of Twitter brand mentions. This Data Insertion API post, when performed for each result returned by the Twitter API, would produce a result like the screen shot given in Adam’s post.
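As a rough illustration of building that request body in code, here is a Python sketch. The element names (`scXmlVer`, `reportSuiteID`, `pageURL`, `pageName`, `events`, `eVarN`, `visitorID`) are my reading of the Data Insertion API documentation and should be checked against developer.omniture.com; the `build_insertion_request` helper is hypothetical. Posting the XML is left out because the collection endpoint is specific to your account.

```python
import xml.etree.ElementTree as ET


def build_insertion_request(report_suite_id, page_url, page_name,
                            events, evars=None, visitor_id=None):
    """Build the XML body for one Data Insertion API post.

    Element names are assumptions based on the Data Insertion API
    documentation; verify them before using this in production.
    """
    request = ET.Element("request")
    ET.SubElement(request, "scXmlVer").text = "1.0"
    ET.SubElement(request, "reportSuiteID").text = report_suite_id
    if visitor_id is not None:
        ET.SubElement(request, "visitorID").text = visitor_id
    ET.SubElement(request, "pageURL").text = page_url
    ET.SubElement(request, "pageName").text = page_name
    ET.SubElement(request, "events").text = events
    # evars maps an eVar number to its value, e.g. {5: "omniture"}.
    for number in sorted(evars or {}):
        ET.SubElement(request, "eVar%d" % number).text = evars[number]
    return ET.tostring(request, encoding="unicode")


if __name__ == "__main__":
    # One post per tweet returned by the search query.
    print(build_insertion_request(
        "yourrsid", "http://www.yoursite.com", "Twitter Mention", "event1",
        evars={5: "omniture"},
    ))
```

Building the body with an XML library rather than string concatenation means characters such as `&` and `<` in tweet text are escaped automatically.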

A key component of this solution is ensuring that we do not repeatedly count historical tweets. If we make a query to the Twitter API once every 10 minutes, we only want to capture the tweets posted during the last 10 minutes (i.e. since the last query). This can be done by capturing the last number in the <id> element (following the final colon) in the top-most result returned by the Twitter API:

tag:search.twitter.com,2005:1242044829

and then putting that number into the “since_id=” parameter in your next Twitter API call, as follows:

http://search.twitter.com/search.atom?q=omniture&since_id=1242044829

Per the Twitter API documentation, this parameter “returns tweets with status ids greater than the given id.” This ensures that only the relevant tweets entered since the last API call are returned and entered into your Omniture Data Insertion API posts.
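The bookkeeping for since_id can be sketched in a few lines of Python. The helper names below are hypothetical; the URL and the q= and since_id= parameters are the ones described above. Where you persist the newest id between runs (a file, a database row) is up to your environment.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://search.twitter.com/search.atom"


def status_id_from_entry_id(entry_id):
    """Return the numeric status id at the end of an entry id.

    'tag:search.twitter.com,2005:1242044829' becomes '1242044829'.
    """
    return entry_id.rsplit(":", 1)[-1]


def next_search_url(keyword, newest_entry_id=None):
    """Build the next query URL, skipping tweets that were already counted."""
    params = {"q": keyword}
    if newest_entry_id:
        params["since_id"] = status_id_from_entry_id(newest_entry_id)
    return SEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # First run: no stored id yet, so no since_id parameter is sent.
    print(next_search_url("omniture"))
    # Later runs: pass the id of the top-most entry from the previous run.
    print(next_search_url("omniture", "tag:search.twitter.com,2005:1242044829"))
```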

Is there ever a spike (positive or negative) in brand-related terms (in a week, day or even hourly)?

To break down the number of tweets, captured in event1 above, by the keywords being tweeted, simply pass the same value used in the q= parameter in your Twitter API call into an eVar in your Data Insertion API post. Storing the keyword in an eVar also allows you to perform multiple queries of the Twitter API to search for multiple keywords (e.g. if you have multiple products), if desired.

In the example above, the Twitter API query is http://search.twitter.com/search.atom?q=omniture. Here, you would simply pass “omniture” into an eVar in your Data Insertion API post, as follows. (This example uses eVar5 to store the keyword.)

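<request>
  <scXmlVer>1.0</scXmlVer>
  <reportSuiteID>yourrsid</reportSuiteID>
  <pageURL>http://www.yoursite.com</pageURL>
  <pageName>Twitter Mention</pageName>
  <events>event1</events>
  <eVar5>omniture</eVar5>
</request>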

You would then use SiteCatalyst Alerts, as Adam suggested, to report on trends in brand mentions on Twitter.

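Who are the people most often mentioning your company on social media tools and who are they communicating with the most?

As described in Adam’s post, we will use two eVars to store the authors and recipients of tweets. This information is contained in the XML returned by the Twitter Search API, and obtaining it is simply a matter of parsing certain elements within the <entry> element.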

Tweet author data can be obtained using the value of the <name> element within the <author> element in the XML example shown above. For your convenience, another example of these elements is reproduced below.

<author>
  <name>OmnitureCare (Ben Gaines)</name>
  <uri>http://twitter.com/omniturecare</uri>
</author>

The <name> element contains both the Twitter handle (e.g. OmnitureCare) and the friendly name (e.g. Ben Gaines). If you want to capture just the handle (or just the friendly name), you can use a function native to your development environment (such as substr() in PHP) to pull out the portion of the string that you need and place it into a variable on your server. For consistency’s sake, you may want to prepend “@” to the front of the Twitter handle captured in this manner (e.g. @OmnitureCare).
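One way to split the handle from the friendly name is sketched below in Python. It assumes the value always looks like "handle (Friendly Name)", as in the examples above; the `split_author_name` helper is hypothetical, and it falls back to treating the whole string as the handle when that shape is not found.

```python
import re

# Matches "handle (Friendly Name)"; this shape is an assumption based on
# the examples in this post.
NAME_PATTERN = re.compile(r"^\s*(?P<handle>\S+)\s+\((?P<friendly>.*)\)\s*$")


def split_author_name(name_text):
    """Return ('@handle', 'Friendly Name') for an author name string."""
    match = NAME_PATTERN.match(name_text)
    if match is None:
        # Unexpected format: treat the whole value as the handle.
        return "@" + name_text.strip().lstrip("@"), ""
    return "@" + match.group("handle").lstrip("@"), match.group("friendly")


if __name__ == "__main__":
    handle, friendly = split_author_name("OmnitureCare (Ben Gaines)")
    print(handle)    # @OmnitureCare
    print(friendly)  # Ben Gaines
```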

Tweet recipient data must be obtained from the text of the tweet itself. For example:

RT: @Omni_man: Want to learn how to integrate Omniture SiteCatalyst and Twitter? Check out my latest blog post: http://is.gd/kzLk

The recipient here is @Omni_man, so you would need to grab this value out of the tweet text.
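A regular expression is one simple way to pull a recipient out of the tweet text. The Python sketch below treats the first @mention as the recipient, which is my own simplifying assumption; the helper names are hypothetical, and a naive pattern like this will also match the part after "@" in an email address.

```python
import re

# An "@" followed by letters, digits, or underscores.
MENTION_PATTERN = re.compile(r"@(\w+)")


def first_recipient(tweet_text):
    """Return the first @mention in the tweet, or None if there is none."""
    match = MENTION_PATTERN.search(tweet_text)
    return "@" + match.group(1) if match else None


def all_recipients(tweet_text):
    """Return every @mention in the tweet, in order of appearance."""
    return ["@" + name for name in MENTION_PATTERN.findall(tweet_text)]


if __name__ == "__main__":
    text = "RT: @Omni_man: Want to learn how to integrate Omniture SiteCatalyst and Twitter?"
    print(first_recipient(text))  # @Omni_man
```

If a tweet mentions several people, you would need to decide whether to record only the first or to send one Data Insertion API post per recipient.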

Once author and/or recipient have been captured, you can pass them into SiteCatalyst in the same Data Insertion API post that you are using to count a new mention of your brand, as in the example below. We will use eVar10 to store tweet author and eVar11 to store tweet recipient.

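<request>
  <scXmlVer>1.0</scXmlVer>
  <reportSuiteID>yourrsid</reportSuiteID>
  <pageURL>http://www.yoursite.com</pageURL>
  <pageName>Twitter Mention</pageName>
  <events>event1</events>
  <eVar5>omniture</eVar5>
  <eVar10>@JeffJordan</eVar10>
  <eVar11>@Omni_man</eVar11>
</request>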

This implementation will allow the kind of reporting shown in the screen shot below.

When are people on social media tools mentioning key product/service features that your Product Managers should know about?

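This is actually the most straightforward of the implementation requirements given by the CMO in Adam’s example, because the full text of the Twitter post is available in the XML returned by the Twitter Search API. Pass it into another eVar in the same Data Insertion API post. (The example below uses eVar12 purely as a placeholder; use whichever eVar you have reserved for tweet text.)

<request>
  <scXmlVer>1.0</scXmlVer>
  <reportSuiteID>yourrsid</reportSuiteID>
  <pageURL>http://www.yoursite.com</pageURL>
  <pageName>Twitter Mention</pageName>
  <events>event1</events>
  <eVar5>omniture</eVar5>
  <eVar10>@JeffJordan</eVar10>
  <eVar11>@Omni_man</eVar11>
  <eVar12>RT: @Omni_man: Want to learn how to integrate Omniture SiteCatalyst and Twitter? Check out my latest blog post: http://is.gd/kzLk</eVar12>
</request>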

While eVars are limited to 255 characters, Twitter limits tweets to 140 characters, so the full text of a tweet will always fit in an eVar. Having this full text available in SiteCatalyst will, as Adam explained, allow you to search for mentions of a particular product, feature, or service in connection with mentions of your company.

A few other important tips

Omniture recommends storing Twitter data in its own report suite. The reason for this is that the Data Insertion API posts described above will count page views, visits, and visitors, but mentions on Twitter are not the same as user interactions with your web site. It would be incorrect (and could wildly inflate traffic counts) to inject Twitter data into your production report suite.
Twitter currently does not charge for calls to its search API. Omniture does charge for server calls associated with passing Twitter data into SiteCatalyst.
Twitter’s API will return, at most, the 1,500 most recent tweets mentioning the keyword(s) that you specify. Keep this in mind as you plan the “schedule” according to which you will be querying the API; if your brand or product is mentioned very frequently, you may need to run queries more frequently so that you do not miss results. Use the rpp= parameter to specify how many search results per “page” to return, and the page= parameter to specify what page to start from. These parameters allow you to sift through results when your brand is mentioned too often to return in a single query.
Make sure that the eVars storing the keyword, tweet author, tweet recipient, and tweet text are fully subrelated according to the breakdowns you will need to perform. You may not need to fully subrelate all four variables. If you want to break down Tweet Author by Tweet Recipient, but do not need to break down Tweet Recipient by Tweet Text, then you will need the Tweet Author eVar to be fully subrelated, but the Tweet Text eVar would need only basic subrelations. Omniture ClientCare can help you set up full subrelations on these eVars.
To obtain an accurate unique visitor (i.e. unique tweet author) count in the report suite storing your Twitter data, use a hash of the author handle (username) as the value of the <visitorID> element in your Data Insertion API posts.
Using an additional eVar, you can attempt to capture the “mood” of tweets regarding your brand by searching the tweet text for words that suggest positive or negative sentiment.
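Two of the tips above lend themselves to a short Python sketch: paging through results with rpp= and page=, and deriving a stable visitor ID from the author handle. The helper names are hypothetical, and SHA-1 is simply my choice of hash; any hash that always yields the same value for the same handle would serve.

```python
import hashlib
from urllib.parse import urlencode

SEARCH_URL = "http://search.twitter.com/search.atom"


def search_page_url(keyword, page=1, rpp=100):
    """Build a search URL for one page of results."""
    return SEARCH_URL + "?" + urlencode({"q": keyword, "rpp": rpp, "page": page})


def visitor_id_for(handle):
    """Return a stable hex digest to use as the visitor ID for a tweet author.

    The handle is normalised first so that '@JeffJordan' and 'jeffjordan'
    count as the same visitor.
    """
    normalised = handle.strip().lstrip("@").lower()
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(search_page_url("omniture", page=2))
    print(visitor_id_for("@JeffJordan"))
```

Check the length and character restrictions on visitor IDs in the Data Insertion API documentation before settling on a hash.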

Next time, I’ll discuss how you can use the SAINT API to capture Twitter data in a slightly different manner; it’s an alternative methodology with some real upside in certain cases.
