How Adobe’s Enhanced Smart Tags Capability Empowers Marketers to Find the Most Relevant UGC Video

Marketers are spending a whopping $10 billion a year on content in the US alone, with half of that going into content creation. It may come as a surprise, though, that about 20 cents of every dollar spent creating content actually goes to waste due to inefficient processes and output, amounting to a staggering $1 billion yearly loss. Pair that figure with the fact that, in 2018, only 28 percent of marketers reported significant returns on their content marketing campaigns, and we can safely say that content marketing has a serious scaling problem.

As content and screens multiply, leveraging user-generated content (UGC) is the key to alleviating the scaling challenges brands face in an increasingly content-hungry and personalized world. UGC is not only cost effective (in most cases free once the rights are obtained) but also more authentic and better performing: 64 percent of social media users seek out UGC before making a purchase, and UGC videos receive 10x more views than branded videos (source). That's why some of the world's most respected brands, like Apple and Starbucks, have made UGC a pillar of their content strategy for years now. Traditionally, however, this requires marketers to sift through pages of social media posts to find the few golden nuggets they can repurpose.

Turning to AI to find the best UGC for the job

To address these challenges facing marketers, Adobe is tapping into computer vision to help automate the UGC curation efforts that were previously done by hand. Smart Tags, powered by Adobe Sensei, automatically scans images and identifies the key objects, object categories, and aesthetic properties to use as descriptive tags. This allows marketers to filter out image content with tags that do not match their search criteria. Yet while Smart Tags has been an effective tool for images, video is by far the most consumed media type on the web today.

Adobe’s Smart Tags.

Video is growing at a massive pace: according to Cisco, video will account for 82 percent of all web traffic by 2021, and the number of videos posted on Instagram grew 4x last year. This poses a serious challenge for marketers and technologists alike because, to date, curating video content has been laborious: a user needs to manually watch many videos to find relevant footage. Videos are also much heavier than images and have a temporal dimension, making them more challenging to classify, filter, and curate. This is precisely why we partnered with Adobe Research and Adobe's Search team to enhance our current Smart Tags capability in Adobe Experience Manager to handle UGC video classification.

How we built Smart Tags for video in Adobe Experience Manager

We set out to automatically output a set of relevant tags for a given video. This ultimately resulted in the Video Auto Tag Adobe Sensei service, which produces two sets of tags for a video of up to 60 seconds in length. The first set corresponds to the objects, scenes, and attributes depicted in the video, and the second corresponds to the actions depicted in the video. These tags are used to improve search and retrieval of videos.
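To make that concrete, here is a hypothetical example of the two tag sets for a short clip of someone pouring a cup of coffee. The tag names and structure below are purely illustrative and are not the service's actual output format.

```python
# Illustrative only: the tag vocabulary and response structure are hypothetical,
# not the actual Video Auto Tag output schema.
video_auto_tags = {
    "tags": ["coffee", "cup", "kitchen", "indoor", "morning"],  # objects, scenes, attributes
    "action_tags": ["pouring", "drinking"],                     # actions and activities
}
```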

Our system builds on in-house Adobe image auto-tagging technology that was trained on a large collection of images from an internal Adobe image dataset and can predict tags over a large vocabulary. Since a video consists of a sequence of frames, we first apply the image auto-tagger to the frames and then aggregate the outputs across time to produce a final tag set for the video. This process results in a set of tags that typically correspond to the objects, scenes, and attributes depicted in the video, since the image auto-tagger has been trained to make such predictions.
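A minimal sketch of that frame-then-aggregate idea is below, assuming a hypothetical image_tagger callable that returns per-tag confidence scores for a single frame; this is not the Sensei API, and the sampling and pooling choices are illustrative only.

```python
from collections import defaultdict

import cv2  # OpenCV, used here only to decode video frames


def tag_video(path, image_tagger, frame_stride=30, threshold=0.5, top_k=10):
    """Apply an image auto-tagger to sampled frames and pool tag scores over time."""
    capture = cv2.VideoCapture(path)
    pooled = defaultdict(float)

    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % frame_stride == 0:  # sample roughly one frame per second at 30 fps
            for tag, score in image_tagger(frame).items():
                pooled[tag] = max(pooled[tag], score)  # max-pool each tag across time
        index += 1
    capture.release()

    # Keep the highest-confidence tags that clear the threshold.
    ranked = sorted(pooled.items(), key=lambda kv: kv[1], reverse=True)
    return [tag for tag, score in ranked[:top_k] if score >= threshold]
```

Max-pooling is just one reasonable aggregation choice; averaging scores across frames, or requiring a tag to appear in several frames, would trade recall for robustness to single-frame errors.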

In addition to objects, scenes, and attributes, it is important to recognize temporally varying events, namely actions and activities, in a video. Example actions and activities include “drinking” and “jumping”. We address this by adapting the image auto-tagger to predict actions, training it on a curated set of “action-rich” videos whose action labels are derived from user metadata in an internal Adobe video dataset. The action auto-tagger is applied across multiple frames in the video, and the results are aggregated over time to produce the final action tag set for the video.
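Conceptually, the action tagger follows the same frame-then-aggregate recipe, so the two tag sets can be assembled by running both taggers over the video and keeping their outputs separate. A hypothetical composition, reusing the tag_video sketch above with an assumed action_tagger model:

```python
def auto_tag(path, image_tagger, action_tagger):
    """Produce the two tag sets described above (hypothetical wrapper, not the actual service)."""
    return {
        "tags": tag_video(path, image_tagger),          # objects, scenes, attributes
        "action_tags": tag_video(path, action_tagger),  # temporally varying actions and activities
    }
```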

Takeaways for marketers, data scientists, and their development teams

UGC is an essential tool to help reduce content marketing costs, improve the effectiveness of campaigns, and tackle the scale issues marketers face today. Teams supporting these marketers are now empowered through computer vision, and particularly video understanding, to accelerate their workflows and better leverage UGC, the most valuable and popular content format on the web.