This post, Creating the Perfect Plan for Video Measurement, is part of an ongoing series on video measurement tips, tricks, and best practices.

The first step of any new measurement project is to identify the “whats”, “whys”, and “hows”. Video is no different. In this post we’ll examine the “what” and “why” of video analytics. The “how” deserves its own post, or many posts.

Creating the perfect plan for video measurement starts with understanding the format (“what”) and metadata (“why”) for your video content. Once you’ve identified your cadence and data points, clearly documenting the video requirements can turn video measurement from a capsize into a smooth sail.

Within this post, when I refer to site, I really mean wherever your video is playing, which could be a traditional website, a mobile app, or even a game console. Creating a video measurement plan is platform agnostic. Keeping a consistent video strategy across all your platforms is a best practice. Also, when I refer to content, I’m talking about individual video assets that are played via a video player. In this post we aren’t going to worry about asset formats or video player types; those topics are firmly part of the “how” discussion.

Video on Demand (VOD) is the term used to cover any videos that are not live and that start at the beginning of the video when the user presses play. For instance, replaying sailing races from yesterday or highlights from earlier in the day would be delivered by VOD. Compare that to Live Stream videos, which display video content that has a single start time, just like broadcast TV. Users may jump into a live stream at any point during playback. Live coverage of the America’s Cup, a premier sailing race, would utilize streaming video to broadcast the two-week event.

What: Milestones and Frequency

Video measurement is a balancing act between data granularity and server call frequency. You want to send enough calls to accurately capture playback but not more than you need from a cost perspective. Until certain technology limitations, such as browser close behavior, are changed, we are going to need to continue to worry about granularity vs. frequency. If you don’t send tracking calls frequently enough, you could lose data for users who leave the video before your next tracking call is sent. Just like in sailboat racing, the goal is to go as fast as possible without tipping over.

The biggest decisions in video tracking continue to be the type of milestones used and the frequency of calls sent. The types of video milestones are percentage complete and time elapsed. Percentage milestones are calculated based on the total duration of the video. By relying on percentage complete metrics, the analyst can standardize video consumption across videos of different lengths. Generally, percentage complete frequencies are used for VOD because the video duration is a known value. Time elapsed milestones are based on seconds of video consumed by the user and provide standardized milestones for video consumption without knowing the total video length or the user starting point. Time elapsed milestones are the standard for live streaming video because video duration is not known, and neither is user starting point.
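To make the difference between the two milestone types concrete, here is a minimal Python sketch of how each could be evaluated. The function names, the default milestone values, and the use of seconds as the unit are my own assumptions for illustration; they are not taken from any particular analytics library.

```python
def percent_milestones_reached(playhead_seconds, duration_seconds,
                               milestones=(25, 50, 75, 100)):
    """Percentage milestones crossed so far. Needs a known duration (VOD)."""
    if duration_seconds <= 0:
        raise ValueError("percentage milestones need a known, positive duration")
    percent_complete = 100.0 * playhead_seconds / duration_seconds
    return [m for m in milestones if percent_complete >= m]


def elapsed_milestones_reached(seconds_watched, milestones=(30, 60, 120)):
    """Time elapsed milestones crossed so far. No duration or start point needed."""
    return [m for m in milestones if seconds_watched >= m]
```

The first function cannot run without a duration, which is why percentage milestones fit VOD. The second only needs a count of seconds watched, which is all a live stream can reliably provide.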

Regardless of the type of milestone used, I always recommend sending a call when the video starts to play and another call when the video is complete or the user leaves. Every video, no matter what format, should send a start call. The start call is the most important video tracking call you can send because it captures the initial request and user intention. Video complete calls, and what constitutes a complete (100% or 98% or…), are a little less clear-cut. I recommend sending a 100% complete call, as it is your last chance to collect any video usage data.

The frequency of calls not only indicates how often the tracking calls will be sent but also determines the granularity of the video reports. To identify the frequency of calls that will work best for your content, start by examining the types of video on your site. Are they long-form videos like complete coverage of yesterday’s sailing matches, or are they short-form videos like 30 second clips of racing highlights?

Short-Form VOD videos are more likely to be watched start to finish and they are also, by definition, short. This means you can scale back on your frequency of calls and still achieve high accuracy. I recommend tracking start and complete milestones and throwing in an optional mid-point call depending on the common consumption patterns for your content as well as the length. For example, if your racing highlights videos average one minute in length and the standings are announced within the first 30 seconds, then I’d advocate for a 50% call, since many users will close the video before reaching the end. On the other hand, if your videos average 25 seconds, then a start and complete call will be sufficient.

Long-Form VOD videos tend to have a higher incomplete rate and may also have more scrubbing and pausing, making the frequency of calls that much more important. I like to start with quartile tracking (0%, 25%, 50%, 75%, 100%) for long-form videos. From there it is not uncommon to add additional calls towards the start and end of the video (0%, 10%, 25%, 50%, 75%, 90%, 100%). This will give you a little more accuracy for those users who leave the video shortly after starting or those who leave before the credits roll. If your VOD is very long, like replaying an entire day of sailing races, then increasing the percentage calls to every 10% will help maintain a higher accuracy.
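The VOD recommendations above can be sketched as a small helper that picks a set of percentage milestones from the video length. The cut-off values (30 seconds, 2 minutes, 2 hours) and the function and parameter names are assumptions I have chosen for illustration; set them from the consumption patterns of your own content.

```python
def vod_percent_milestones(duration_seconds, extra_edges=False,
                           start_complete_max=30,   # assumed cut-off
                           mid_point_max=120,       # assumed cut-off
                           very_long_min=2 * 60 * 60):  # assumed cut-off
    """Pick percentage milestones for a VOD asset based on its length."""
    if duration_seconds <= start_complete_max:
        return [0, 100]                      # start and complete only
    if duration_seconds <= mid_point_max:
        return [0, 50, 100]                  # short-form with a mid-point call
    if duration_seconds >= very_long_min:
        return list(range(0, 101, 10))       # very long: every 10%
    if extra_edges:
        return [0, 10, 25, 50, 75, 90, 100]  # quartiles plus early and late calls
    return [0, 25, 50, 75, 100]              # long-form quartiles
```

For example, `vod_percent_milestones(25)` gives start and complete only, while a 30 minute replay gets quartiles unless `extra_edges=True` is passed.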

Live Stream video measurement follows the assumption that users either stop by for a quick fix or stay and watch until the end. Based on this theory it is believed that users are more likely to drop off soon after they start watching and that the longer they watch the less likely they are to leave. Due to this behavior pattern, I recommend front-loading the frequency of calls, which means sending more calls at the beginning of the user experience and fewer calls as the user continues to watch the video. A common cadence is to send calls at start, 30 seconds, 1 minute, 2 minutes, 3 minutes, 4 minutes, 5 minutes, then every 5 minutes until the content stops or the user leaves. Sometimes this gets spread out further to every ten minutes or even every thirty minutes for long live events. By sending more calls at the beginning of the America’s Cup live coverage, it will be easier for the analysts to determine the average time spent during each visit, assuming that most users watch less than five minutes of the action (which is not uncommon).
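The front-loaded cadence described above can be written as a short schedule generator. This is only a sketch: the function names and the `steady_interval_minutes` parameter are hypothetical, and the times are seconds of video watched by the user, not positions in the broadcast.

```python
from itertools import islice


def live_call_times(steady_interval_minutes=5):
    """Yield seconds watched at which a tracking call is sent.

    Front-loaded: start, 30 seconds, then each minute up to 5 minutes,
    then one call per steady interval until the caller stops iterating.
    """
    yield 0
    yield 30
    for minute in range(1, 6):
        yield minute * 60
    elapsed = 5 * 60
    while True:
        elapsed += steady_interval_minutes * 60
        yield elapsed


def first_call_times(count, steady_interval_minutes=5):
    """Convenience wrapper: the first `count` call times as a list."""
    return list(islice(live_call_times(steady_interval_minutes), count))
```

Calling `first_call_times(9)` returns `[0, 30, 60, 120, 180, 240, 300, 600, 900]`. Passing a larger `steady_interval_minutes` spreads out the later calls for long live events while keeping the first five minutes dense.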

Why: Metadata and Context

Once you’ve decided on the milestone type and frequency of the calls, the next step is to determine why you are tracking the videos and which data points will best meet your needs. Video data falls into two areas: metadata and context of playback data. Metadata is data about the video content and is commonly drawn from the video asset management system. Context of playback is data about where and when the video was viewed. The context of playback data may be found within the video player, the site content management system, or the application displaying the video.

A bare bones video implementation will include video title and/or video id. Examples of additional metadata on video assets include video title, show name, talent, season, content tags, and other data about the asset that is critical to your business. There are a variety of ways to bring this data in, either at playback, or later via data classification. For planning purposes, simply identify all the important data points and include them in your documentation. To go back to the sailing regatta, every sailing video is tagged with the sailors featured in the video. To report on the most popular sailors watched, I must include the list of the athletes’ names as a key metadata element.

Context of playback data can be extensive because it comes from multiple levels of device, application, and site. Carefully identify why you need the data before requesting it. Requesting too much data can make implementation costly and difficult, without adding value to the analysis. Common context of playback data includes the parent page on which the video was played, application name, player name, site section, campaign, and other elements that provide more information on how the video was found and from which experience it was watched. Context of playback data may also include tracking social share links and other elements presented within the video player. On the America’s Cup site, the same video can be displayed on multiple pages. To accurately report on which pages a given video was watched, I need to capture the parent page name.
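One way to keep the two kinds of data distinct while planning is to think of each tracking call as a single payload built from both. The sketch below is illustrative only: the `video.` and `context.` key prefixes, the comma delimiter for lists, and the function names are assumptions of mine, not the naming of any analytics tool.

```python
def _as_text(value):
    """Flatten list values (such as several athletes) into one delimited string."""
    if isinstance(value, (list, tuple)):
        return ",".join(str(item) for item in value)
    return str(value)


def build_call_payload(milestone, metadata, context):
    """Combine asset metadata and context of playback into one call payload."""
    payload = {"milestone": milestone}
    for key, value in metadata.items():   # data about the video asset
        payload[f"video.{key}"] = _as_text(value)
    for key, value in context.items():    # data about where it was played
        payload[f"context.{key}"] = _as_text(value)
    return payload


# Placeholder values for the sailing example.
example = build_call_payload(
    "start",
    metadata={"title": "Race highlights", "athletes": ["Sailor A", "Sailor B"]},
    context={"parent_page": "Race results page"},
)
```

Keeping the prefixes separate makes it easy to see, for any requested data point, whether it has to come from the asset management system or from the player, site, or application.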

Creating a Measurement Map

After identifying the “what” and “why” of your video measurement, you will have a long list of data elements. Now it is time to document your plan. The easiest way I’ve found for communicating with developers or fellow analysts, while also creating a document that can be used for QA, is to lay out a simple spreadsheet. This spreadsheet has affectionately been called a “magic decoder ring” and it does just that, by providing a map between the data sent on each call and the variable names within Adobe Analytics.

In the first column of the sheet, list out each of the frequency points within the video where a call will be sent. Every column to the right should contain one data point or variable to be captured. Feel free to put in the variable values here, including props, eVars, and context data. As you read across, each row will show all the data to be collected at each point within the video playback. Don’t forget to include pausing, scrubbing, and any social sharing links if you require tracking calls to be sent.
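If you would rather generate the sheet than type it, the layout described above (one row per call, one column per data point) can be sketched in a few lines of Python. The column names and the placeholder cell values are hypothetical; in a real map each cell would hold the prop, eVar, or context data variable your implementation uses.

```python
import csv
import io


def measurement_map_csv(rows, data_points):
    """Build the map as CSV text.

    rows: {call point: {data point: what to send}}
    data_points: column order for everything to the right of the first column
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["call"] + list(data_points),
        restval="",            # leave a cell blank when nothing is sent
        lineterminator="\n",
    )
    writer.writeheader()
    for call_point, values in rows.items():
        writer.writerow({"call": call_point, **values})
    return buffer.getvalue()


# Placeholder rows for a short-form highlights video.
sheet = measurement_map_csv(
    {
        "start": {"video_name": "<video title>", "athletes": "<athlete names>",
                  "parent_page": "<page name>"},
        "50%": {"video_name": "<video title>"},
        "complete": {"video_name": "<video title>"},
        "pause": {"video_name": "<video title>"},
    },
    data_points=["video_name", "athletes", "parent_page"],
)
```

The resulting text opens directly in a spreadsheet, and blank cells make it obvious which data points are only sent on the start call.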

The following is a sample for the sailing videos: a replay of the entire race using long-form VOD, daily race highlights using short-form VOD, and the live player displaying the regatta in real time. Note the addition of the parent page context and the athlete name metadata. More technical details about these variables and other columns in this example will be part of the next blog post on how to implement.

With the above decisions made and your plan laid out, all that is left is implementation (“how”) and analysis (gold). In my next post I will review the basics of video implementation.