The Netflix personalized recommendation system helps our members discover great content. That doesn't just mean recommending the right titles, but also displaying the right imagery. The artwork representing a title should capture something compelling about it for each member.
Previously, we used multi-armed bandit algorithms to find the single best artwork for a title, the one that earned the most plays across all members. However, given the enormous diversity in taste, wouldn't it be better if we could find the best artwork for each of our members, highlighting the aspects of a title that are relevant to them?
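The unpersonalized approach above can be sketched as an epsilon-greedy multi-armed bandit. This is a minimal illustration, not Netflix's actual implementation; the artwork names and the epsilon value are assumptions for the example.

```python
import random

class EpsilonGreedyBandit:
    """Unpersonalized multi-armed bandit sketch: learn which single
    artwork earns the most plays across all members."""

    def __init__(self, artworks, epsilon=0.1):
        self.epsilon = epsilon
        self.shown = {a: 0 for a in artworks}     # times each artwork was shown
        self.plays = {a: 0.0 for a in artworks}   # plays it earned

    def select(self):
        # Explore a random artwork with probability epsilon;
        # otherwise exploit the one with the best observed play rate.
        if random.random() < self.epsilon:
            return random.choice(list(self.shown))
        return max(self.shown,
                   key=lambda a: self.plays[a] / max(self.shown[a], 1))

    def update(self, artwork, played):
        self.shown[artwork] += 1
        self.plays[artwork] += float(played)
```

Every member sees the same winning artwork, which is exactly the limitation personalization addresses.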
Consider personalizing the image for the movie Good Will Hunting. Someone who has watched many romantic movies may be interested if we show Matt Damon and Minnie Driver, whereas a member who has watched many comedies might be drawn to Robin Williams.
Whereas typical recommendation settings present multiple selections and learn about member preferences from the item a member selects, we can only select a single piece of artwork to represent each title. As a result, we had to ask ourselves a few questions.
- How does changing artwork (after a member has seen the TV show or film) impact things? Does it reduce the recognizability of the title and make it difficult to visually locate again?
- How does artwork perform in relation to other artwork? Maybe a bold close-up of an actor works on a page because it stands out, but if every title had a similar image then the page as a whole may not seem as compelling.
- How do we find a good pool of artwork for each title? The set of images for a title also needs to be diverse enough to cover a wide potential audience interested in different aspects of the content.
Finally, there are engineering challenges to personalizing artwork at scale: personalized selection for each asset means handling a peak of over 20 million requests per second with low latency.
Contextual bandits approach
Much of the Netflix recommendation engine is powered by machine learning algorithms running on batch data. But for artwork personalization, rather than waiting to collect a full batch of data, train a model, and run an A/B test, we decided to use contextual bandits, an online machine learning framework.
Contextual bandits trade off the cost of gathering training data required for learning an unbiased model on an ongoing basis with the benefits of applying the learned model to each member context. The training data is obtained through the injection of controlled randomization in the learned model’s predictions.
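The idea of injecting controlled randomization into a learned model's predictions can be sketched as follows. This is a simplified, hypothetical example: it uses a coarse context key and empirical play rates where a real system would use a learned model over rich member features, and the epsilon value is an assumption.

```python
import random

class ContextualBandit:
    """Contextual bandit sketch: pick artwork per member context, with
    epsilon randomization injected so the logged data can later train
    an unbiased model."""

    def __init__(self, artworks, epsilon=0.05):
        self.artworks = artworks
        self.epsilon = epsilon
        self.shown = {}   # (context, artwork) -> impressions
        self.played = {}  # (context, artwork) -> plays

    def select(self, context):
        greedy = max(self.artworks, key=lambda a: self._rate(context, a))
        if random.random() < self.epsilon:
            choice = random.choice(self.artworks)
        else:
            choice = greedy
        # Log the propensity (probability this artwork was shown) so
        # that offline evaluation can reweight the data in an unbiased way.
        propensity = self.epsilon / len(self.artworks)
        if choice == greedy:
            propensity += 1 - self.epsilon
        return choice, propensity

    def update(self, context, artwork, played):
        key = (context, artwork)
        self.shown[key] = self.shown.get(key, 0) + 1
        self.played[key] = self.played.get(key, 0) + float(played)

    def _rate(self, context, artwork):
        key = (context, artwork)
        return self.played.get(key, 0.0) / max(self.shown.get(key, 0), 1)
```

The propensity recorded alongside each impression is what makes the logged data usable for unbiased offline evaluation later.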
We also look at the quality of engagement to avoid learning a model that recommends “clickbait” images: ones that entice a member to start playing but ultimately result in low-quality engagement.
To evaluate algorithms prior to deploying them online to real members, we use an offline technique known as replay: comparing, in an unbiased way, what would have happened in historical sessions had we used different algorithms.
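A replay estimator of this kind can be sketched in a few lines. This assumes the historical logs come from uniformly randomized artwork assignments; the event field names are hypothetical.

```python
def replay_evaluate(logged_events, policy):
    """Replay offline evaluation sketch: estimate how a new policy would
    have performed by keeping only the historical sessions where the
    policy's choice matches the randomly logged one."""
    matched, total_reward = 0, 0.0
    for event in logged_events:
        # Each logged event holds the member context, the artwork that
        # was actually (randomly) shown, and the observed reward.
        if policy(event["context"]) == event["shown"]:
            matched += 1
            total_reward += event["reward"]
    # The average reward over matched events estimates the policy's
    # online play rate, unbiased when logging was uniformly random.
    return total_reward / matched if matched else 0.0
```

For example, a policy that shows comedy fans the Robin Williams artwork would be scored only on the logged sessions where that artwork happened to be shown to a comedy fan.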
After experimenting with many different models offline and finding ones that had a substantial increase in replay, we ultimately ran an A/B test to compare the most promising personalized contextual bandits against unpersonalized bandits. As we suspected, the personalization worked and generated a significant lift in our core metrics.
Summarized by Reforge. Original article by Ashok Chandrashekar • Manager, Discovery Research @ Netflix