Analytics and Data Layers: A Look Under the Hood

First, let’s bust two common myths:

Confused? Heard something different? Let’s look under the hood and find out.

Webpage tags are used with many systems and platforms in digital marketing, analytics, and application development. The use of these tags for marketing is growing so quickly that it’s hard for large companies to manage these technologies effectively and efficiently. Tag management systems (TMS) like Adobe’s Dynamic Tag Management (DTM) are being adopted rapidly as executives start to understand the strategic advantages of using a TMS to manage these technology components – the same components that enable vital strategic programs like testing and optimization, content personalization, remarketing, retargeting, campaign optimization, customer feedback, and more.

Of course, these “tags” are often nothing more than a way to collect data about our readers, prospects, and customers. This is the data that allows us to successfully measure, analyze, improve, and control our digital initiatives. To be effective, the people, processes, and systems that use and move this critical data from the website to the various endpoints along the chain need to be efficient and consistent. This is our data supply chain, and data collection is the first set of links in the chain.

Data Collection

In software development and in various Web standards, it’s common to separate complex systems into different layers. This is nothing more than splitting the pieces into tiers that relate to each other in different ways. For example, in an HTML document it’s common to separate and think of the HTML code as a “structural” layer, the style rules of CSS documents as a “presentation” layer, and the functions of JavaScript code or tags as a “behavioral” layer.
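As a rough sketch, imagine a hypothetical signup button: the HTML provides its structure, a CSS rule provides its presentation, and a few lines of JavaScript provide its behavior. (The id and style values below are made up for illustration.)

    // Structural layer (HTML):      <button id="signup-button">Sign up</button>
    // Presentation layer (CSS):     #signup-button { background: #0066cc; color: #fff; }
    // Behavioral layer (JavaScript): behavior attached to the structural element.
    var button = document.getElementById('signup-button');
    if (button) {
      button.addEventListener('click', function () {
        console.log('Signup button clicked');
      });
    }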

Using DTM can give us efficient control and great power over each of these “layers,” and to fully leverage this power and control, it’s important to consider the data collection layer – the first links in our data supply chain. The data collection layer simply consists of the data we care about in our page elements, visitor actions, application states, and events in our websites and other digital environments. These elements, actions, states, and events generate the data that feeds our Web analytics tools, our remarketing platforms, our digital campaigns, and our other digital investment opportunities. This is our metadata, the information building blocks we all need to collect, manage, and manipulate in order to report on, analyze, and optimize our online businesses.

How we implement and manage the collection of this data has a significant impact on the value we can earn from our digital investments over time.

Page Elements and Visitor Actions

Webpage elements make up the first part of our data collection layer. Page elements are simply the text, images, and other components in our webpages—our markup, code, and other digital resources like images or videos. Page elements help us answer questions like “how many people clicked on the new hero image on the home page during the recent holiday promotional campaign?” The homepage hero image is the page element of interest here, but of course the image itself isn’t nearly as interesting as the number of clicks it received. To capture the click events and send those event counts to our Web analytics or other systems, we first need to identify the right image before we can register or count the click event.
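As a minimal sketch, assuming the hero image carries a hypothetical id of “hero-image” and that sendToAnalytics() stands in for whatever beacon call your analytics tool actually provides, the capture might look like this:

    // Find the element of interest, then register a click handler on it.
    var hero = document.getElementById('hero-image');
    if (hero) {
      hero.addEventListener('click', function () {
        // Placeholder for the real analytics call (an image request, API call, etc.).
        sendToAnalytics({ event: 'hero-image-click' });
      });
    }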

Although this is an overly simple and common example, identifying other page elements and visitor interactions with those elements can sometimes be a bit more involved. Identifying and selecting specific page elements is sometimes called traversing the Document Object Model (DOM) and is often done with JavaScript or jQuery code. The DOM is basically an org chart or “tree” of the different elements in a webpage.

Once we identify and select elements, we can then capture or “handle” visitor interactions with those elements and send metadata about those elements and interactions to various tools or systems, like Web analytics tools, voice of customer/survey tools, or third-party remarketing or retargeting systems. The good news about capturing data by traversing the DOM is that it can be easy, and DTM makes it even easier: simply use the dropdown identifiers and CSS selectors for your page elements, and you’re done.

The potential bad news here is that the HTML markup of many large websites is often poorly formed, invalid, or difficult to access using common DOM traversal and selection methods. This method of data collection can also be fragile and can break when pages are redesigned or content is updated; when the markup of the page (or application) changes, our data collection has to change in sync to remain consistent. If the markup changes, and no one changes the data collection, we could end up with inconsistent reporting and issues in analysis and validation that are difficult to troubleshoot and correct.

Yes, jQuery can make DOM traversal easier, but it won’t help us obtain the src value from an img element with a specific id attribute if a developer deleted it from the page with the last release.
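That scenario is easy to sketch: the selector below assumes a hypothetical id of “hero-image,” and the length check is what keeps the collection code from quietly failing when the element disappears.

    // Read the src of the img with that id, guarding against the element being removed.
    var $img = jQuery('#hero-image');
    if ($img.length) {
      var imageSrc = $img.attr('src'); // safe to use
    } else {
      // The element is gone; without this check the data simply stops flowing.
    }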

Visitor Actions and Applications

On most websites today, the line between “pages” and “applications” is blurry. Although websites used to have static pages of text and image content that linked to more static pages of text and images, we now have a much more dynamic experience online. Pages, text, images, and videos shrink or expand in response to screen sizes and device types (responsive designs). Full applications now run completely in our browsers, instead of on our desktops (Gmail).

Clicks, swipes, opens, likes, and other interactions that readers, prospects, and customers have with our Web content and applications can often be captured as described above using DOM event handlers registered to specific page elements. Capturing events and interactions that happen when a visitor interacts with a Web application component or feature can be more difficult than capturing simple text or image content interactions, depending on the application design. For example, it’s common to capture text submitted in a form and send it to our Web analytics, CRM, or other systems. It’s also common to use JavaScript to validate or process the form input itself. Capturing form or other application data through the DOM can be challenging, depending on the specific implementation method used, especially as the application code and/or JavaScript in the page executes and interacts with other parts of this behavioral layer.
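A minimal sketch of the form case, assuming a hypothetical search form with an id of “search-form” and a text input named “q” (sendToAnalytics() is again a placeholder for the real beacon call):

    var form = document.getElementById('search-form');
    if (form) {
      form.addEventListener('submit', function () {
        // Read the field value at submit time and send it along with the event.
        var query = form.elements['q'] ? form.elements['q'].value : '';
        sendToAnalytics({ event: 'site-search', searchTerm: query });
      });
    }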

How Can We Make This Easier?

Unique id Attributes

One basic way to improve our data collection process is to make DOM traversal and selection easier. Adding a unique id attribute to each unique container element in our page markup or application code can really help improve the efficiency and effectiveness of any data collection implemented using DOM traversal methods.

For example, we might have a slider in the hero image location on a key landing page. This is typically a large image that slides, rotates, or otherwise changes every few seconds. It’s also common for the hero container element to be marked up as a generic <div> or similar container in the HTML. Adding a unique id attribute to this container element can make it much easier to identify the elements of interest within the container and to enable data capture for visitor interactions with those elements. This makes DOM traversal easier simply because we can start at the container element with the id, instead of starting higher up in the markup or code.
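For illustration, assume the hero slider is marked up roughly like this (the id and file names are hypothetical):

    // Assumed markup:
    //   <div id="hero-slider">
    //     <img src="/img/slide-1.jpg" alt="Spring sale">
    //     <img src="/img/slide-2.jpg" alt="New arrivals">
    //   </div>
    // With a unique id on the container, selection starts at the container
    // instead of walking down from the top of the document.
    var slider = document.getElementById('hero-slider');
    var slides = slider ? slider.querySelectorAll('img') : [];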

Custom Data Attributes

Front-end developers also use another method to add metadata to pages and applications. Adding custom data attributes to individual page elements like paragraphs, sections, images, or div containers is just like adding a “label” to these elements.

Adding these custom data attributes helps us identify specific elements of interest in our pages and applications. In our hero image example, the original business question involved a holiday promotional campaign. When planning and deploying the image assets for this campaign, the developers could easily add a custom data attribute to each image, allowing the individual promotions to be linked to visitor interactions with those image assets. The standard markup for these images barely changes; adding our data attribute simply means adding data-campaign="holiday-promo" to each img element.
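For example, one promotional image might be marked up as shown below (the file name is made up, and the attribute is read here with the standard dataset property; jQuery’s .data() method works as well):

    // Assumed markup:
    //   <img src="/img/holiday-hero.jpg" alt="Holiday promotion" data-campaign="holiday-promo">
    var img = document.querySelector('img[data-campaign]');
    if (img) {
      var campaign = img.dataset.campaign; // "holiday-promo"
    }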

Although this approach can be effective, it does require more careful planning than the unique id additions on container elements. Because this involves adding metadata to individual elements and not just unique container elements, it requires more thought and planning to ensure consistent taxonomies and implementation across our sites. Some also consider it a more fragile method of adding metadata, especially if the communication within and between different Web teams and business units is not timely and managed consistently.

A Data Collector or Data Object

As pages and Web applications are planned, developed, tested, and deployed, we can ensure that the metadata we want to capture is present within the page, screen, or application view. By presenting the exact data we want to capture, at the exact time we want to capture it, we enable one of the most robust, accurate, and consistent forms of data collection currently in use.

JavaScript Object

Two methods commonly used to implement this capability are JSON values and JavaScript objects with properties and values in the markup. In either case, this just means we are surfacing the appropriate values at the appropriate time so our data collection code can pick them up and send them to the appropriate system in a very efficient, effective, and consistent manner. Most front-end developers are familiar with this approach and can usually implement it as part of their existing development work. Again, it’s important to plan and document the data we want to capture before development begins, or at least early on in the development process.
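A minimal sketch of such an object, placed in the page before the data collection code runs (the object and property names here are illustrative, not a required format):

    window.digitalData = {
      page: {
        pageName: 'home',
        category: 'landing'
      },
      campaign: {
        id: 'holiday-promo'
      }
    };
    // Collection code can then read window.digitalData.page.pageName directly,
    // instead of scraping the value out of the DOM.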

Back-end developers who work with CMS templates and other server-side code can also help with this type of metadata. If your CMS can be programmed to dynamically populate the markup of your pages, screens, and views with metadata that might otherwise not be available client-side, we should then have all the required metadata available in the right place, in the right format, at the right time for data collection.
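As a sketch, the same object could be written out by a CMS template, with the {{ ... }} placeholders standing in for whatever templating syntax a given CMS uses:

    window.digitalData = {
      page: {
        pageName: '{{ page.name }}',
        author: '{{ page.author }}',
        publishDate: '{{ page.publishDate }}'
      }
    };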

Better Data Enables Better Decisions

In practice, we usually see and use all of the data collection methods mentioned in combination. Few large company websites have built out a complete data collector or data object model with all the data they want to capture from their webpages and applications. Whichever data collection strategies we choose, adequate planning, documentation, and timely communication across teams can go a long way in helping us ensure that the first link in our data collection supply chain is a strong one.

The People Factor

Scraping the DOM, selecting custom data attributes, and working with data objects are just three methods of working with the metadata within our “data layer.” In practice, any one of these implementations may be too time consuming, too expensive, or too fragile for a particular team to consistently implement and manage over time. The processes, politics, and people in big organizations typically have a greater effect on the degree of success with one or more of these methods than anything related to the particular technology in question.

The Road to Standardization

Like most things on the Web, new technologies and techniques can start out as bleeding-edge, gain wider adoption, and eventually become “standards” or “best practices.” The use of a structured data object with a specific syntax for object names, property names, formats, and value types is a long way from “standard,” but a good start has been made. The Customer Experience Digital Data Community Group hosted by the W3C has put out two reports on its data layer work. The Digital Data Layer 1.0 “Final Report” details the group’s work toward eventually standardizing a data layer format using a JavaScript data collection object. The Customer Experience Digital Data Acquisition Draft details its work toward specifying the parameters for communicating this data to digital analytics and other tools or systems.

Both reports, and all the work by this group, represent an excellent effort by many individuals and help move the conversation forward when we discuss data layers and the tools and systems that use the data. However, a standard is only as good as its adoption: if no one complies, or only a few comply, the standard loses much of its value. Even so, this work is a strong starting point.

Using DTM with or without a Data Layer

The really good news is that you can use DTM today regardless of where or how your source data exists on the Web. In DTM, there are several ways to identify, select, and capture metadata from webpages and applications. Data Elements can be an easy and useful way to capture metadata, regardless of where that data exists in the page. Any time there are values we’ll refer to more than once within DTM, we should definitely consider creating a Data Element to represent and persist those values.
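For example, once a Data Element (say, one named “pageName”) has been defined, it can be referenced from custom rule code or from other DTM fields (the element name here is hypothetical):

    // Read a Data Element from custom code within DTM.
    var pageName = _satellite.getVar('pageName');
    // In DTM's rule and tool configuration fields, the same element can also be
    // referenced with the %pageName% syntax.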

Page Load, Event Based, and Direct Call rules can also make it easier and more efficient to identify, select, and capture metadata, whether or not you decide to use Data Elements. DTM is flexible, so it’s easy to use the system regardless of where your data layer elements exist.

In a future post, we’ll look at specific ways to do this with DTM.