May 1, 2015

Recommending Teams

Recommendation systems are an important part of daily life – we use them to help us browse the internet, buy products, and choose which movie to watch. The way these recommenders work is by predicting the preference that a user will have for an item [1], but the problem is that most recommenders rank items in isolation from one another. People do not rank items in isolation from one another. A person’s preference for an item can be highly dependent on the context that it’s being presented in. For example, I like ice cream more than mashed potatoes, but if there’s steak around, then I’d rather have the mashed potatoes. In other words, my preference for mashed potatoes is higher when the mashed potatoes are put in the context of steak. Recommendation systems can use this context to provide better recommendations.

The primary means that a recommendation system has to control context is through the list of items being recommended. These items are usually selected (and ordered) based on their isolated preference, but when creating contextual recommendations it will be more useful to think of this list as a team. Organizational Behavior defines a team as a group whose individual efforts result in performance that is greater than the sum of the individual inputs [2]. By using context, recommendation systems should be able to recommend a team of items where the user’s preference for the team is actually higher than the sum of the user’s preference for each item individually [3].

Thinking about recommendation as the selection of a team enables recommendation systems to solve a new set of “team-creation” problems. In team-creation problems, the inter-relationships between items are so strong that they overwhelm the traditional recommendation process. Consider a recommendation system for players on a basketball team. Basketball teams are complex. The players on the team must fill a number of predefined roles, and must work together to be successful. A team filled with the best players as determined by a ranked list would likely perform poorly next to a well-balanced team. This makes a basketball team the ideal example for explaining how a recommendation system can improve the quality of its recommendations by recommending items as a team.

There are two types of recommendation systems: content-based and collaborative-based. A content-based recommendation system makes recommendations based on the characteristics of the items being recommended. A collaborative-based recommendation system makes recommendations based on historical relationships between users and items. In general, mathematicians prefer the collaborative-based approach, because it doesn’t care about the specifics of the items being recommending and can therefore be nicely abstracted to many different problems [4]. In the team-creation case, however, the benefit of recommending items as a team comes from utilizing the relationships between items, which is an inherently content-based approach. For this reason, our basketball team recommender will have to be a content-based recommendation system.

Typically content-based recommendation systems rely on heuristics to approximate the preference of items. On common heuristic is similarity (oftentimes with regards to an input query), and one common measure of similarity is the Jaccard index. The Jaccard index is defined as the intersection of two sets divided by the union of those two sets [5]. In particular, a Jaccard index of one indicates high similarity, whereas a Jaccard index of zero indicates low similarity (high diversity). In our basketball team example, we want to construct the opposite of a team of similar players, we want a well-balanced team of different players. In other words, we’ll want a low Jaccard index.

The first step of this recommendation system will be to decompose teams into sets of characteristics that can be compared using the Jaccard index. This can be done similar to how Pandora organizes songs, by classifying them using a set of predefined traits [6]. The next step is to check every possible k-item subset and see which one has the lowest Jaccard index. The subset with the lowest Jaccard index will have the highest diversity of traits, making it the most well-balanced team. The final step is for the recommendation system to recommend the items on the team. If the diversity heuristic is an accurate model of basketball team success, then this team should have the highest combined preference.

Basketball enthusiasts will find something wrong with this example; a well-balanced basketball team isn’t always the best. In some cases, it makes strategic sense to overwhelm the other team in key areas, like shooting or rebounding. This is one of the limitations of content-based recommendation systems: their quality is only as good as the heuristic used. Basketball teams are complex. The relationship between players is intricate, and the formula for success is complicated. This makes the associated heuristics confusing, and the process of developing heuristics time-consuming and difficult. To make matters worse, the more specific the problem domain is, the more domain-specific the heuristic has to be. This makes heuristics un-abstract, and difficult to apply across different scenarios.

Computer Scientists will also find something wrong this example. Namely, the step that checks every 12-item subset operates with a complexity of nC12 (n choose 12, since basketball teams are often comprised of 12 players). That means that even on a field with only 50 applicants, the algorithm would have to check over 100 billion possible combinations. This is obviously not scalable. One way of improving this is by applying another heuristic called Iterated Recommendation. The idea of Iterated Recommendation is to use the previous N items as part of the basis for recommending the N+1th item. Instead of checking all possible teams, this algorithm uses each iteration to add the most optimal item to the existing team. Instead of running in nC12 time, it runs in n¹² time.

Traditionally, recommendation systems have ranked items in isolation. This ignores the relationships between items, which in some cases can be large part of an item’s preference. In team-creation problems, the relationships between items are so important that it makes sense to recommend items as part of a team. Doing so can theoretically create recommendation teams where the user’s preference for all the items together is actually higher than the sum of of their preference for those items separately. This is exemplified in the basketball team recommendation system. Unfortunately, to take advantage of the relationships between items, the recommendation system must be content-based, which means it relies on confusing and domain-specific heuristics. The next article in this series will focus on how machine learning techniques can be used to simplify the selection logic in team-creation problems.

[1] Recommender System (Wikipedia)

[2] Organizational Behavior Notes

[3] Learning Optimal Subsets (Cornell)

[4] Types of Recommendation Systems (Stanford)

[5] Jaccard Index (Wikipedia)

[6] The Music Genome Project (Pandora)

Evaluating Recommendation Systems (Microsoft)

K-means Clustering (Wikipedia)

Multi-Armed Bandits (Wikipedia)

Kudos

Recommending Teams

Now read this

[idea] Opin.io