January 3, 2016

[work term report] Building a Decision Engine for Megaphones on Instagram

Introduction #

Instagram is a popular photo-sharing social network. It follows a publish and subscribe model where users publish their photos to the accounts that follow them (their followers), and see photos from the accounts that they follow (their followings). The photos from each following are aggregated together into the content feed, which is the primary surface of the application. In addition to organic content, the feed contains advertisements, which are pieces of content that companies have paid Instagram to promote. At the top of the feed, Instagram will occasionally show a megaphone. Megaphones are Instagram’s way of self-promoting actions within the app, usually for the purpose of converting low-engaged users to high-engaged users. For example, if a user doesn’t have many followings, Instagram will show them a Suggested Users megaphone to help them discover great content to follow.

Instagram has many different megaphones that it can show to users. Furthermore, Instagram would like to build even more unique experiences that can be selectively targeted with megaphones. As the number of megaphones increases, it raises the question: “How can Instagram show the right megaphone to the right user at the right time?” This report first describes how Instagram built a decision engine to answer that question, and then expands on the answer by explaining how the results of such a decision engine can be analyzed afterwards to provide additional product insight.

Megaphones are Similar to Advertisements #

Users come to Instagram to see organic content in their feed. Megaphones and advertisements have a similar tradeoff in that they displace organic content to generate value. In particular, advertising has been well studied because it provides the revenue backbone for numerous companies, including Instagram. For this reason, it makes sense to look first at how advertisements are targeted, and then see if the process can be co-opted to target megaphones as well.

Traditionally, advertisements weren’t targeted. A content company, such as a television network, would have a certain number of advertising slots. Companies would bid on those slots, and the space would go to the advertisers with the highest bids (Yan). The value of an advertisement was the bid price. Modern Internet companies, like Google and Facebook, were able to improve on this model by personalizing the advertising slots for each user. To achieve this, they introduced a “User Experience” (UX) variable, which represents the per-user impact that each advertisement has on the quality of the product.

This is best understood in the case of Google’s “Cost Per Click” (CPC) ad. In a CPC ad, the company bids on a click, which means they only pay when their advertisement is clicked on (“CPC Bidding”). Since the ideal user experience is to show users advertisements that they want to click on, the UX term can be simplified to Click Through Rate (CTR), which is the probability that the user clicks on the advertisement. The value term is still defined as the bid price. The product of the two, bid price * CTR, is the expected revenue of the advertisement. To choose the optimal CPC ad for each user, Google simply selects the advertisement with the highest expected revenue.

Of all types of advertisements, megaphones are most similar to CPC ads. Like CPC ads, megaphones have a value to the user, but that value is only attained if the user clicks through the megaphone. Therefore, it makes sense for Instagram to target megaphones using a process similar to CPC ads.

OptimalMegaphone = argmax{value * CTR}

This is the high level architecture of the megaphone decision engine. The tricky part is predicting the unknowns, value and CTR, and doing so in a way that is both personalized and scalable.
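
The selection rule above can be sketched in a few lines of Python. This is a minimal illustration rather than Instagram's implementation: the function name `choose_megaphone`, the `find_friends` megaphone, and all of the numbers are hypothetical, and the sketch assumes that per-user value and CTR predictions already exist.

```python
from typing import Dict, Optional


def choose_megaphone(values: Dict[str, float], ctrs: Dict[str, float]) -> Optional[str]:
    """Return the megaphone with the highest expected value (value * CTR).

    Returns None when there are no candidate megaphones.
    """
    if not values:
        return None
    # Expected value is the predicted value scaled by the chance of a click.
    expected = {name: values[name] * ctrs[name] for name in values}
    return max(expected, key=expected.get)


# Hypothetical predictions for a single user.
values = {"suggested_users": 3.0, "find_friends": 5.0}
ctrs = {"suggested_users": 0.10, "find_friends": 0.04}
best = choose_megaphone(values, ctrs)
```

In this made-up case the megaphone with the lower value wins because the user is much more likely to click on it, which is the same effect that lets a lower bid win a CPC auction.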

Predicting the Value of a Megaphone #

The standard way Instagram estimates value is by isolating changes in an A/B test and then analyzing their effects on important metrics (Chopra). In the case of megaphones, the purpose is to promote engagement, so the important metric is Time Spent. Therefore, the standard approach would be to set up a test with one test group for each megaphone, and a control group that doesn’t receive any megaphones, and then compare the increase in average Time Spent for each test group over control. If Megaphone A provided a 4% increase in Time Spent and Megaphone B provided a 2% increase in Time Spent, then Megaphone A would be said to have provided twice as much value as Megaphone B.
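
The group-level comparison can be illustrated as follows. The samples are invented so that the lifts match the 4% and 2% figures in the example above; the function name `relative_lift` and the units (minutes per day) are assumptions made for this sketch.

```python
from statistics import mean
from typing import List


def relative_lift(test: List[float], control: List[float]) -> float:
    """Relative increase in average Time Spent of a test group over control."""
    baseline = mean(control)
    return (mean(test) - baseline) / baseline


# Made-up Time Spent samples (minutes per day) for each group.
control = [20.0, 30.0, 25.0, 25.0]                 # average of 25.0
groups = {
    "megaphone_a": [26.0, 26.0, 26.0, 26.0],       # average of 26.0
    "megaphone_b": [25.5, 25.5, 25.5, 25.5],       # average of 25.5
}
lifts = {name: relative_lift(samples, control) for name, samples in groups.items()}

# Ratio of the two lifts: how much more value Megaphone A provides than B.
ratio = lifts["megaphone_a"] / lifts["megaphone_b"]
```

A real analysis would also need a significance test before trusting the difference between groups; that step is left out here.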

The problem with this approach is that it doesn’t compute the value on a per-user scale. Since the goal of targeting is to show “the right megaphone to the right user”, what’s really needed is a model that estimates the value of a megaphone given a user. Such a model can be constructed by running a simple linear regression on a sample of users that are all equally likely to see any of the megaphones in a given time period. Then the change in Time Spent can be calculated on a per-user level using the value before seeing the megaphone as a baseline. The result is a simple linear regression of the form:

value = DeltaTimeSpent ~ UserAttributes + Megaphone * UserAttributes

where DeltaTimeSpent subtracts the pre-megaphone Time Spent from the post-megaphone Time Spent, UserAttributes is a row vector of features, and Megaphone is a factor variable storing which megaphone the user saw (and zero if the user did not see a megaphone). Megaphone*UserAttributes is an interaction variable that represents the effect that UserAttributes has on the value of a megaphone.

It’s also important to note that the Megaphone*UserAttributes terms can be negative, which happens when showing the megaphone would actually decrease Time Spent. If all the Megaphone*UserAttributes terms are negative, then it is in Instagram’s best interest to not show a megaphone.
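
To make the interaction terms concrete, the sketch below scores one user against an already-fitted model. Everything specific in it is hypothetical: the coefficient values, the attribute list (an intercept, number of followings, days active), and the `find_friends` megaphone are invented for illustration, and the fitting step itself is not shown. It also looks at value alone and ignores CTR.

```python
from typing import Dict, List, Optional

# Hypothetical fitted interaction coefficients: one weight per user attribute
# for each megaphone. Attribute order: [intercept, followings, days_active].
INTERACTION_COEFS: Dict[str, List[float]] = {
    "suggested_users": [2.0, -0.05, 0.0],
    "find_friends": [-1.0, 0.0, 0.1],
}


def megaphone_effect(coefs: List[float], attrs: List[float]) -> float:
    """Predicted change in Time Spent attributable to showing the megaphone."""
    return sum(c * a for c, a in zip(coefs, attrs))


def best_megaphone(attrs: List[float]) -> Optional[str]:
    """Pick the megaphone with the largest predicted effect.

    Returns None when every predicted effect is zero or negative, which
    corresponds to not showing a megaphone at all.
    """
    effects = {name: megaphone_effect(c, attrs) for name, c in INTERACTION_COEFS.items()}
    name = max(effects, key=effects.get)
    return name if effects[name] > 0 else None


new_user = [1.0, 10.0, 2.0]    # few followings, recently joined
veteran = [1.0, 100.0, 5.0]    # many followings
choice_new = best_megaphone(new_user)
choice_veteran = best_megaphone(veteran)
```

With these invented weights the new user is shown Suggested Users, while every megaphone has a negative predicted effect for the veteran, so no megaphone is shown.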

Predicting the CTR of a Megaphone #

In the case of predicting CTR, Instagram can take the model one layer deeper by employing a useful heuristic called Thompson Sampling:

CTR = (α * CTRi + Clicks) / (α + Impressions)

Thompson Sampling relies on the assumption that past behavior will be a good indicator of future behavior (Chapelle). When there’s no historical data (i.e., Impressions = Clicks = 0), then CTR = CTRi, which can be calculated from a regression model similar to the example above. As data is added to the system (i.e., Impressions > 0), CTR begins to approach the trend Clicks / Impressions. The constant α controls the balance between the regression model and the trend; a larger value of α keeps the emphasis on the model for longer. The beauty of Thompson Sampling is that it incorporates user behavior to temper the model’s predictions. This leads to a nice dropoff for users that frequently dismiss megaphones.
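
The formula can be sketched directly in Python. One way to read it is as the mean of a Beta posterior whose prior is centered on CTRi with strength α; under that reading, which is an interpretation rather than something stated in this report, Thompson Sampling as described by Chapelle and Li would draw a random CTR from the posterior instead of using its mean. Both variants are shown. The function names, the prior of 0.10, and α = 20 are hypothetical.

```python
import random


def smoothed_ctr(prior_ctr: float, clicks: int, impressions: int, alpha: float) -> float:
    """CTR = (alpha * CTRi + Clicks) / (alpha + Impressions)."""
    return (alpha * prior_ctr + clicks) / (alpha + impressions)


def sampled_ctr(prior_ctr: float, clicks: int, impressions: int,
                alpha: float, rng: random.Random) -> float:
    """Draw a CTR from the Beta posterior whose mean is smoothed_ctr.

    Assumes 0 < prior_ctr < 1 so that both Beta parameters are positive.
    """
    a = alpha * prior_ctr + clicks
    b = alpha * (1.0 - prior_ctr) + (impressions - clicks)
    return rng.betavariate(a, b)


rng = random.Random(0)

# With no data the estimate falls back to the regression prior.
no_data = smoothed_ctr(0.10, clicks=0, impressions=0, alpha=20.0)

# Ten impressions without a click pull the estimate below the prior.
after_dismissals = smoothed_ctr(0.10, clicks=0, impressions=10, alpha=20.0)

# A single random draw for the same user, as Thompson Sampling would use.
draw = sampled_ctr(0.10, clicks=0, impressions=10, alpha=20.0, rng=rng)
```

Using the random draw rather than the mean gives megaphones with little data an occasional chance to be shown, which is how the sampling variant explores.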

Analyzing the Megaphone Decision Engine for Product Insight #

The megaphone decision engine essentially boils down to a complex equation mapping user attributes to expected megaphone values. Using these results, Instagram can cluster users into four groups:

The users in the left half of this chart are high-engaged, and thus already find value in Instagram. The users in the top half of this chart are getting value from megaphones, which means that Instagram has crafted experiences that will help them find value in Instagram. The users in the bottom right quadrant of this chart are not finding value in Instagram, and there are no existing experiences that Instagram can show them to help them find it. These users represent an opportunity for Instagram to craft new experiences that can help a group of users that is currently under-served. The results of the megaphone decision engine provide a unique opportunity to identify this group of users, so that they can be observed and highlighted in the design of new features.
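
The four groups described above can be assigned with two thresholds, one on engagement and one on the best expected megaphone value. The thresholds, group labels, and the function name `quadrant` are assumptions made for this sketch; the report does not say how the boundaries were chosen.

```python
def quadrant(engagement: float, best_megaphone_value: float,
             engagement_threshold: float, value_threshold: float = 0.0) -> str:
    """Place a user in one of four groups by engagement and megaphone value."""
    high_engaged = engagement >= engagement_threshold
    helped = best_megaphone_value > value_threshold
    if high_engaged and helped:
        return "high-engaged, helped by megaphones"
    if high_engaged:
        return "high-engaged, not helped by megaphones"
    if helped:
        return "low-engaged, helped by megaphones"
    # Low engagement and no megaphone with positive value: the group that
    # existing experiences do not reach.
    return "low-engaged, under-served"


# Hypothetical users: (Time Spent in minutes per day, best expected megaphone value).
users = {"u1": (45.0, 0.3), "u2": (5.0, 0.8), "u3": (4.0, -0.2)}
groups = {uid: quadrant(e, v, engagement_threshold=20.0) for uid, (e, v) in users.items()}
under_served = [uid for uid, g in groups.items() if g == "low-engaged, under-served"]
```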

Conclusions #

Instagram uses megaphones to self-promote actions that it thinks will help improve the experience of low-engaged users. Instagram has multiple megaphones that it can show users, which raises the question: “How can Instagram show the right megaphone to the right user at the right time?” As Instagram increases its number of megaphones, a scalable targeting system will offer a high level of personalization by choosing the best megaphone (or no megaphone) for each user.

Such a system of optimally choosing megaphones is similar to Google’s modern practice of targeting CPC advertisements. This system relies on two terms: a value term representing the potential revenue, and a UX term representing the chance that the revenue will be attained. Mimicking this process, a decision engine can be built for megaphones on Instagram relying on two models: one to predict the potential value of the megaphone, and another to predict the chance that the user will click on the megaphone.

In addition to targeting megaphones, the megaphone decision engine can be used to provide useful product insights. The results of targeting will cluster users into four groups, one of which is a group of low-engaged users not being helped by existing experiences. These users represent an opportunity for Instagram to build new experiences that appeal to a group of previously under-served users.

Recommendations #

As part of the Instagram Growth team, I helped build the first iteration of the megaphone decision engine. My first recommendation is to continue optimizing the system. The megaphone decision engine is built on two models, each of which can be honed by feeding it more data and fine-tuning its parameters. Another opportunity is to expand on DeltaTimeSpent as the megaphone value definition. I would recommend experimenting with value definitions that take into account high-leverage metrics like Reciprocal Followers and Feed Inventory.

My second recommendation is to follow up with the megaphone decision engine by analyzing the results to provide product insights. The results of megaphone targeting provide a unique opportunity to identify a group of users that is currently being under-served. By observing these users we can improve our understanding of them, and hopefully build experiences in the future to help them find value in Instagram.

References #

Chapelle, Olivier, and Lihong Li. “An Empirical Evaluation of Thompson Sampling.” Advances in Neural Information Processing Systems 24 (2011): n. pag. Print.
Chopra, Paras. “The Ultimate Guide To A/B Testing.” Smashing Magazine. N.p., 23 June 2010. Web. 12 Dec. 2015.
“CPC Bidding.” AdWords Help. Google, n.d. Web. 12 Dec. 2015.
Evans, David S. “The Online Advertising Industry: Economics, Evolution, and Privacy.” Journal of Economic Perspectives 23.3 (2009): 37-60. Web.
“How The Auction Works.” Facebook Help. Facebook, n.d. Web. 12 Dec. 2015.
Yahyaa, Saba, Madalina Drugan, and Bernard Manderick. “Thompson Sampling in the Adaptive Linear Scalarized Multi Objective Multi Armed Bandit.” Proceedings of the International Conference on Agents and Artificial Intelligence (2015): n. pag. Web.
Yan, Jun, Ning Liu, Gang Wang, Wen Zhang, Yun Jiang, and Zheng Chen. “How Much Can Behavioral Targeting Help Online Advertising?” Proceedings of the 18th International Conference on World Wide Web (2009): n. pag. Web. 11 Dec. 2015.
