MAGIX — the Solution to the Black Box Issue of AI-Driven Personalization

Image source: Adobe Stock / LoloStock.

If your doctor used an artificial intelligence (AI) system to recommend medical treatments and it recommended, “Remove patient’s kidney,” you’d want to know what led the system to that recommendation before scheduling the surgery. An error in AI-driven content personalization is nowhere near as serious as a medical misdiagnosis, but you probably still want to know why the machine-learning model delivered the specific content it did to each visitor. Those insights inform your future personalization and optimization activities, and help you get buy-in from stakeholders and leadership for using AI to personalize.

Until recently, when you used Auto-Target and Automated Personalization, two of the Adobe Sensei AI capabilities in Adobe Target, you couldn’t explain why they delivered the content, offers, or experiences they did. You might get incredible lift, but if you asked the Random Forest algorithm which pieces of content worked, what key segments it discovered (what we call “automated segments”), and what it learned, it might respond with, “Umm… well I used a large number of trees.” And if the activity yielded an inconclusive lift, you’d really want answers to these questions to avoid wasting resources repeating the same approach.

This is the “black box” issue of AI — you don’t understand why the algorithm makes the decisions it makes. This issue is not specific to AI in Adobe Target — it actually represents a significant challenge to the field of AI and machine learning in general. To help overcome this issue and open the door wide for using the AI-driven personalization capabilities in Adobe Target, we on the Adobe Target team set out to deliver those key insights. Those insights, available in beta right now, will soon be generally available in the Personalization Insights reports of Adobe Target Premium.

I thought you’d like to know how Adobe Target uncovers those insights. In this final article of three based on a session that my colleague Shannon Hamilton and I presented at Adobe Summit earlier this year, I’ll explain how Adobe Target delivers insights about its Automated Personalization and Auto-Target activities.

To get the complete picture of what we discussed in that session, I recommend that you first read the previous two posts in this series.

Our approach to developing the Personalization Insights reports

When we on the Adobe Target team considered solving the black box issue, we first looked at current research. We wanted insights about the models that an algorithm built that a human could understand, but none of the techniques currently available really gave an understanding of the overall patterns the model used. We also wanted a system that was algorithm-agnostic — one that would work with Random Forest or just about any other machine-learning algorithm.

We saw a gap that needed to be filled, so we set about developing our own approach to solving this black box issue. We developed a patent-pending technique we call MAGIX: Model Agnostic Globally Interpretable Explanations. MAGIX finds rules that define automated segments that explain the patterns the model uses across the board. Adobe Target displays the automated segments the model determined worked for specific content and the features the model thinks are important. Think of MAGIX as an interpretation layer that sits on top of the machine-learning model that actually determines how visitors are allocated.

How MAGIX works

MAGIX uses a genetic algorithm to help you understand the decisions made by the algorithms that Adobe Target uses to personalize. As input, MAGIX relies on the Adobe Target visitor profiles that the personalization algorithms of Automated Personalization or Auto-Target score to determine which piece of content in the activity is best to deliver to each of your visitors. For example, the machine-learning model determines Sarah’s and Jim’s scores for each offer in an Automated Personalization activity, and determines that Sarah is most likely to convert for Offer A, and Jim is most likely to convert for Offer B.

At a high level, MAGIX takes each visitor profile and:

  1. Generates condition sets, the building blocks of rules for the visitor that reflect what was important about them that resulted in their seeing the specific content they saw. For example, one rule (segment) could be, “If female and from California, show offer A,” while another could be, “If male and from Texas, show offer B.”
  2. Uses a genetic algorithm to merge the condition sets across all visitors, uncovering the automated segments.
  3. De-duplicates the list of automated segments it has identified, ranks them, and sorts them by relevance.

Let’s dig deeper into the details of what happens at each step:

Step 1: Generating condition sets

When MAGIX generates the condition sets, it’s questioning why the machine-learning model showed a given visitor a specific offer or experience. So, in the earlier example, it’s questioning why the model showed Sarah offer A. MAGIX creates these condition sets by asking what would happen if each visitor attribute in the profile were turned off — would the model still show that visitor the same offer, or would it select a different one? This helps MAGIX determine whether a feature is predictive of showing the offer to that visitor.

Simplifying Sarah’s visitor profile, let’s say it has just three attributes associated with it: gender, geolocation (state), and browser. MAGIX turns off each attribute in turn and asks the model to score Sarah again. Turning off her gender or her geolocation changes the offer the model selects, but turning off her browser does not; the model still picks offer A.

If you had additional attributes for your visitors, MAGIX would do the above step for all of them. In the simplified visitor profile example, you see that the rule that defines why Sarah gets offer A is that she is female and from California — her gender and geolocation matter. Her browser type does not appear to influence the model’s decision to show her offer A.
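To make Step 1 concrete, here is a minimal Python sketch of the ablation idea, not Adobe’s actual implementation. The `model_predict` function is a hypothetical toy model standing in for the trained personalization model, and the real system works with model scores rather than a simple if/else.

```python
# Toy stand-in for the trained personalization model (hypothetical).
def model_predict(profile):
    # Female Californians get offer A; everyone else gets offer B.
    if profile.get("gender") == "female" and profile.get("state") == "CA":
        return "offer_A"
    return "offer_B"

def condition_set(profile):
    """Return the offer plus the attributes whose removal changes the
    model's decision -- those attributes form the visitor's rule."""
    baseline = model_predict(profile)
    conditions = {}
    for attr in profile:
        # "Turn off" one attribute and re-score the visitor.
        ablated = {k: v for k, v in profile.items() if k != attr}
        if model_predict(ablated) != baseline:
            conditions[attr] = profile[attr]  # attribute is predictive
    return baseline, conditions

offer, rule = condition_set(
    {"gender": "female", "state": "CA", "browser": "Chrome"}
)
# offer == "offer_A"; rule == {"gender": "female", "state": "CA"}
# (browser is dropped -- removing it doesn't change the decision)
```

Note that the rule MAGIX keeps is exactly the attributes that mattered: gender and state survive the ablation test, browser does not.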

For each visitor, MAGIX generates the set of conditions, or rules, that explains why that visitor received the offer they received. Of course, if you have 10,000 visitors, you’ll have 10,000 condition sets, some of which may be duplicates. Other condition sets may be so granular that they consider 10, 20, 50, or more visitor attributes when deciding to deliver a specific offer. At that level of detail there’s no practical way to apply the information more broadly, and no way for a human to interpret it; it’s information overload. That’s where the next steps help.

Step 2: Merging condition sets

OK, so now you have as many condition sets as you do visitors. In this step, MAGIX merges those conditions to create segments that a human can interpret. Remember, a segment is just a set of conditions like, “female and from California.”

We consider two competing goals when determining how “good” a segment is. On one hand, we want segments that represent as closely as possible how the model actually allocated visitors to different offers; that’s precision. If a segment has the highest precision possible, every visitor in it sees the offer that the model thinks they should see. On the other hand, we want shorter rules that cover a larger number of visitors in the activity so they’re interpretable and actionable; coverage is the share of visitors a segment includes. You don’t want each visitor to have their own unique rule: that’s too detailed, and you might end up with a segment for each visitor. But you also don’t want a single rule that covers all visitors: that’s too general, and the segment would be the entire visitor population.
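To illustrate the trade-off, here is a hedged Python sketch. Adobe has not published the exact fitness function, so this version combines precision and coverage with a harmonic mean, in the spirit of a classic F1 score; the function and data names are invented for illustration.

```python
def segment_fitness(segment, visitors, offer):
    """Score a candidate segment against the model's actual allocations.

    segment:  dict of attribute -> required value, e.g. {"gender": "female"}
    visitors: list of (profile, offer_the_model_chose) pairs
    offer:    the offer this segment claims to explain
    """
    members = [(p, o) for p, o in visitors
               if all(p.get(k) == v for k, v in segment.items())]
    if not members:
        return 0.0
    # Precision: fraction of segment members the model really sent this offer.
    precision = sum(1 for _, o in members if o == offer) / len(members)
    # Coverage: fraction of all visitors the segment includes.
    coverage = len(members) / len(visitors)
    if precision + coverage == 0:
        return 0.0
    # Harmonic mean balances the two competing goals (assumed combiner).
    return 2 * precision * coverage / (precision + coverage)

visitors = [
    ({"gender": "female", "state": "CA"}, "offer_A"),
    ({"gender": "female", "state": "CA"}, "offer_A"),
    ({"gender": "male", "state": "TX"}, "offer_B"),
    ({"gender": "female", "state": "TX"}, "offer_B"),
]
score = segment_fitness({"gender": "female", "state": "CA"}, visitors, "offer_A")
# precision = 1.0, coverage = 0.5, so the score is their harmonic mean, 2/3
```

A segment matching a single visitor would score near zero on coverage; a segment matching everyone would score poorly on precision. The combined score rewards rules that are both faithful and broad.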

When developing MAGIX, we wanted to create segments that balanced both goals. That’s why we decided to use a genetic algorithm. The genetic algorithm gets its name because it takes a “survival of the fittest” approach, in our case to identifying the visitor segments. (For a great introduction to genetic algorithms, read this post on Analytics Vidhya.)

Here’s how the genetic algorithm works in MAGIX:

We start with a population of individuals: the condition sets for each visitor. We apply a “fitness function,” or “F-score,” that we developed to evaluate how fit each condition set is. By design, the F-score maintains a balance between our two competing goals, precision and coverage.

We then crossbreed, or “cross over,” these condition sets with each other, essentially trading the attributes that define one condition set for the attributes that define another, and vice versa. This gives rise to a new generation of condition sets. And just as happens in nature, we throw in a mutation here and there, so we might switch out one attribute in a condition set for a random one. Then we start the process again and repeat it for a specified number of generations. Figure 1 below provides a visual of this process.
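The crossover and mutation steps can be sketched as follows. These operators are illustrative stand-ins, since the exact operators MAGIX uses are not public; condition sets are represented as simple attribute-to-value dicts.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Trade a random subset of attribute conditions between two parents."""
    child_a, child_b = dict(parent_a), dict(parent_b)
    for attr in sorted(set(parent_a) | set(parent_b)):
        if rng.random() < 0.5:  # trade this attribute's condition
            a_val = child_a.pop(attr, None)
            b_val = child_b.pop(attr, None)
            if b_val is not None:
                child_a[attr] = b_val
            if a_val is not None:
                child_b[attr] = a_val
    return child_a, child_b

def mutate(condition_set, attribute_pool, rng, rate=0.1):
    """Occasionally swap one condition for a random attribute/value pair."""
    mutated = dict(condition_set)
    if mutated and rng.random() < rate:
        attr = rng.choice(sorted(mutated))
        del mutated[attr]
        new_attr, values = rng.choice(sorted(attribute_pool.items()))
        mutated[new_attr] = rng.choice(values)
    return mutated

rng = random.Random(42)
a = {"gender": "female", "state": "CA"}
b = {"state": "TX", "browser": "Chrome"}
child_a, child_b = crossover(a, b, rng)
```

Each generation, the children are scored with the fitness function and the fittest condition sets survive to breed the next generation.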


Figure 1. Genetic algorithm process. Image source: Cheng Yu Jade.

Over a number of generations, as you may suspect, we get a highly fit population of condition sets. These condition sets are pretty good rules that define audience segments. The output of the genetic algorithm is a set of automated segments.

Step 3: De-duplicating, ranking, and sorting condition sets

In the final step, we remove any redundant segments and then sort them by F-score to see the top segments that reflect the condition sets, or rules, that the model thought were most important. And we put that in the Personalization Insights report of Adobe Target Premium to explain the model behavior.
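A minimal sketch of this final step, with invented segment tuples: a frozenset of the condition items gives a hashable key, so two segments that state the same conditions in a different order de-duplicate to one.

```python
def dedupe_and_rank(segments):
    """segments: list of (conditions_dict, offer, f_score) tuples.
    Remove redundant segments, keep the best-scoring copy of each,
    and return them sorted by F-score, highest first."""
    best = {}
    for conditions, offer, f_score in segments:
        key = (frozenset(conditions.items()), offer)
        if key not in best or f_score > best[key][2]:
            best[key] = (conditions, offer, f_score)
    return sorted(best.values(), key=lambda s: s[2], reverse=True)

segments = [
    ({"gender": "female", "state": "CA"}, "offer_A", 0.67),
    ({"state": "CA", "gender": "female"}, "offer_A", 0.67),  # duplicate
    ({"state": "TX"}, "offer_B", 0.40),
]
top = dedupe_and_rank(segments)
# → two unique segments, highest F-score first
```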

Discovering important automated segments

The beauty of this process is that MAGIX discovers and reveals automated segments of your visitor audience that are important for specific content. You can also discover automated segments that are important but didn’t respond well to any piece of content. That lets you think about what defines those segments so you can develop new content they may like. Or you may find that people in a segment responded well to a piece of content you would not have expected them to. You can then determine whether it was the image, the copy, or something else, and apply that learning when designing more content.

Figure 2. Personalization Insights report showing automated segments that MAGIX discovered based on a single attribute (Automated Segment 15) and two attributes (Automated Segment 17).

MAGIX also uncovers the visitor attributes, ranked by level of importance, that the model used in its decision-making across all visitors. It does this by using an algorithmic approach to merging important attributes across all visitor profiles. So, for example, if gender were an important attribute in a number of decisions for what content to serve, then the MAGIX algorithm would surface it as an important attribute.
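One simple way to merge importance across profiles, offered as an assumption rather than the production algorithm, is to count how often each attribute appears in the per-visitor condition sets and normalize:

```python
from collections import Counter

def attribute_importance(condition_sets):
    """Rank attributes by how often they appear in per-visitor rules.
    Returns a dict of attribute -> normalized importance, most important
    first. (Illustrative; the production merging algorithm may weight
    attributes differently.)"""
    counts = Counter(attr for cs in condition_sets for attr in cs)
    total = sum(counts.values())
    return {attr: n / total for attr, n in counts.most_common()}

condition_sets = [
    {"gender": "female", "state": "CA"},
    {"gender": "male"},
    {"gender": "female", "browser": "Chrome"},
]
importance = attribute_importance(condition_sets)
# gender appears in all three condition sets, so it ranks highest
```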

Figure 3. Personalization Insights report showing the ranked visitor attributes that influenced the model and their relative importance in the model.

So that’s it. That’s how Adobe Target helps you get the valuable insights you need to explain the lift you’re seeing from its AI-driven personalization to stakeholders and leadership. You can also use those insights to develop new tests and personalization activities to keep optimizing your digital properties. Now the question is: what are you waiting for? It’s time to give AI a try and see the levels of conversion and revenue lift it can deliver.