April 9, 2015

Recommending Groups, or When the Whole is Greater than the Sum of its Parts - old

Perhaps the most important trend throughout history has been that of rising complexity [1]. Recommendation systems help manage this complexity by predicting the preference that a user would give to an item [2]. Recommendation systems are a part of daily life – we use them to help us browse the internet, buy products, and choose which movie to watch. In all of these cases, however, recommenders rank the items in isolation from one another. This is acceptable for things like books and movies, but becomes problematic for something like meals. The foods that make up a meal are highly inter-dependent, and a food recommender would have to take these relationships into account when recommending something to eat. For example, I may like ice cream more than mashed potatoes, but if I’m already being recommended a steak I’d prefer the mashed potatoes. In other words, sometimes a user’s preference for an item is dependent upon the mixture of items being recommended [3]. Using this fact, recommendation systems can be improved by recommending items in groups (optimal subsets instead of ranked lists), where the user’s preference of the group is actually greater than the sum of their preference for each item.

Recommendation systems are categorized on a spectrum between two types: content-based and collaborative-based [4]. A content-based recommendation system makes recommendations based on the characteristics of the items being recommended. It usually does this by asking the user to input some seed query, and then using heuristics to evaluate each item relative to that seed query. A collaborative-based recommendation system makes recommendations based on historical usage data. For example, if users A and B have similar usage data, but user A has a high rating for something user B has never interacted with, then a collaborative-based recommendation system would recommend that item to user B. In particular, a collaborative-based approach is “nice” because it doesn’t care about the specifics of the items being recommended, which means that collaborative-based algorithms can be easily re-used for different types of problems. Recommending optimal subsets is not one of those types of problems. The benefit of recommending items as a group comes from making use of the relationships between items, which is essentially a content-based approach. With that in mind, this article will explore a content-based solution for two common recommendation use-cases, and see how recommending items in optimal subsets leads to better recommendations.

Typically, content-based recommendation engines work by using a comparison measure to evaluate each item based on an input query. When developing a comparison measure that can be abstracted to evaluate groups of items, it will be useful to think of each item as a collection of properties that can each be relatively evaluated. Geometrically, this can be visualized by representing each item as a polygon in an n-dimensional space defined by the item’s properties. In this form, items can be intuitively compared to the input query using the intersection and union of overlapping polygons. More abstractly, this comparison measure is known as the Jaccard index, and is defined as the intersection of two sets divided by the union of those two sets [5]. In particular, a Jaccard index of one indicates high similarity, whereas a Jaccard index of zero indicates high diversity. Since groups of items can be squashed into properties just as easily as individual items, this comparison measure will work well at evaluating groups of items. Furthermore, intersections and unions map nicely to two recommendation use-cases: similarity and diversity.

Recommending similar items is one of the most prevalent applications of recommendation systems. For example, the popular music service Pandora algorithmically fills personalized radio stations with similar sounding songs based on a seed artist or song [6]. The problem is that, although each song in the station may sound similar to the seed, they could potentially sound different from one another. In other words, the Jaccard index of each song individually could be high, but the Jaccard index of the group of songs could be low. Visually (where songs are represented as polygons), the resulting shape of overlapping the recommended items would be fuzzy around the edges. If Pandora truly wanted to optimize for the most similar sounding songs, they should consider the set of recommended items as a whole. For example, by recommending items that are similar to one another, in addition to be being similar to the seed. Visually, the shape created by overlapping these recommended items would be sharp, although not necessarily centered around the seed. It’s interesting to note that this approach mimics the approach behind the k-means clustering algorithm, which creates optimal subsets by adjusting the means to find their optimal spots [7]. In fact, the prevalence of recommendation system solutions in response to this type of problem is in part due to the ease and efficiency of effective clustering algorithms.

A more interesting use-case is recommending items where an ideal subset would contain high variety, or diversity. These types of problems are more difficult than the similarity case, because they can’t be reduced to a clustering problem. For example, a recommendation system for recommending players for a basketball team. A traditional recommendation approach would calculate a preference for each player, and then recommend the players with the highest preference. But taking the top 12 players isn’t likely to produce a strong basketball team, because there are predefined roles on a basketball team. A team of rebounders, even highly skilled rebounders, won’t beat a well-balanced team, because basketball is a complex sport that has many more components than just rebounding. In this situation, recommending a subset of players with a low Jaccard index would be more optimal, because those players would represent a well-balanced set of properties. Visually (representing the players as polygons), the resulting shape of the subset would be widespread and sparse. Of course, the Jaccard index is a naive comparison measure in this scenario, because basketball players are highly dependent on one another, and certainly not in a purely divergent way. To take advantage of these intricate dependencies, the comparison measure would have to incorporate additional domain specific logic. This is one of the limitations of recommendation systems built on content-based heuristics.

Thinking about recommendation as an optimal subset problem enables recommendation systems to recommend items in groups where the preference of the group is higher than the sum of each preferences for each item in the group. This is done by using domain-specific heuristics to take advantage of the relationships between items and, in the case of divergently related items, actually opens recommendation system solutions up to a new type of problem. Unfortunately, the domain-specific nature of this approach requires the recommendation system to rely on a content-based approach, which cannot be abstracted as nicely as a collaborative-based approach. The next article in this series will focus on how machine learning techniques could be incorporated into a recommendation system to gain further abstraction while maintaining the benefits of recommending items as a group.

[1] Crash Course Big History

[2] Recommender System (Wikipedia)

[3] Learning Optimal Subsets (Cornell)

[4] Types of Recommendation Systems (Stanford)

[5] Jaccard Index (Wikipedia)

[6] The Music Genome Project (Pandora)

[7] K-means Clustering (Wikipedia)

Evaluating Recommendation Systems (Microsoft)

Multi-Armed Bandits (Wikipedia)

Kudos

Recommending Groups, or When the Whole is Greater than the Sum of its Parts - old

Now read this

[notes] The Toyota Way Fieldbook