=Paper=
{{Paper
|id=None
|storemode=property
|title=Probabilistic Game Theoretic Algorithms for Group Recommender Systems
|pdfUrl=https://ceur-ws.org/Vol-793/womrad2011_paper6.pdf
|volume=Vol-793
}}
==Probabilistic Game Theoretic Algorithms for Group Recommender Systems==
Probabilistic Game Theoretic Algorithms
for Group Recommender Systems
George Popescu Pearl Pu
EPFL - HCI Group EPFL - HCI Group
IC IIF Station 14 IC IIF Station 14
1015 Lausanne, Switzerland 1015 Lausanne, Switzerland
+41 21 693 1246 +41 21 693 6081
george.popescu@epfl.ch pearl.pu@epfl.ch
ABSTRACT distinct strategies based on these. Social choice theory aims to
Aggregating users’ individual preference and recommending a offer an answer to “which strategy is most effective and will be
common set of items for a group has become a challenging topic most liked by a group of users?” (Hastie and Kameda, 2005).
in group recommender systems and social websites. This issue is With the purpose of determining what strategy people actually
mainly concerned with the following three objectives: eliciting use, Masthoff (2004) found that individuals use computationally
individual users' preferences, suggesting outcomes that maximize simple strategies mentioned above, particularly the average
the overall satisfaction for all users and ensuring that the strategy, the average without misery and the least misery strategy.
aggregation mechanism is resistant to individual users' However, there is no dominant strategy as people switch between
manipulation. Firstly we show how our proposed probabilistic them given a different context. Fairness plays an important role in
weighted-sum algorithm (PWS) works and emphasize on its decision making but members do not have a clear strategy for
advantages. Then we compare PWS with related approaches applying it.
implemented in similar systems using the case of our music Our main research question is to determine “which group
recommender, GroupFun. We describe an experiment design to satisfaction rule best satisfy users expectations”. We propose 4
study users’ perceptions of the algorithms, their perceived fairness algorithms and investigate upon: “which algorithm is best suited
and incentives to manipulate the final recommendation outcome. to meet users’ expectations” for our music recommender system,
We expect our results to show that PWS will be perceived as fair GroupFun and “how users perceive the algorithms’ accuracy”.
and diversity- and discovery-driven, thus enhancing the group's Next we present related work, then considered algorithms together
satisfaction. Our future work will focus on the actual evaluation of with our implementation and future experiment design.
GroupFun using the experiment design presented here.
2. BASELINE AND RELATED WORK
Categories and Subject Descriptors
H1.2 [User/Machine Systems]: human factors; H5.2 [User 2.1 MusicFX
Interfaces]: evaluation/methodology, user-centered design. MusicFX is a music system offering best music match to
employees working out in a fitness center (McCarthy and
General terms Anagnost, 1998). The algorithm aims at selecting the most
Experimentation, Human factors. preferred music genre that maximizes members’ listening
pleasure. For this it computes a group preference index and sums
Keywords squared individual preferences. Then it lists the most popular
Quality measurement, usability evaluation, recommender systems, categories. The system also saves events in its history such as:
quality of user experience, post-study questionnaire. member entrance, member exit, individual preference update,
system parameter adjustment and maximum station play time
1. INTRODUCTION elapsed. Since some individual preference filters may not change,
Group recommender systems use various aggregation strategies to the system opts for a different music configuration according to
suggest a common list of items to a group of users. These two criteria: playing more the music which members like most
strategies aim at increasing the group’s welfare and are based on and playing less the music which members like least. The
users’ votes on items. The social welfare is an aggregate of weighted random selection operator is one strategy used to reduce
individual utilities of all group members. Most common used the likelihood of starvation. Another strategy is limiting the period
deterministic strategies are: plurality voting, utilitarian, approval of time for one genre to be played – regardless of how popular it
voting, least misery, most pleasure, average without misery, is – before the selection algorithm is invoked in order to select a
fairness, most respected person, Borda count, Copeland rule or new station. MusicFX has two important advantages: (1) it
Kemeny scores (Masthoff, 2005). One can easily create other increases the variety of music and (2) it democratizes the music
selection process. Thus it is adaptive to changing preferences of
its users also proposing new songs for them. One drawback of the
WOMRAD 2011 2ndWorkshop on Music Recommendation and system is that it changes music stations abruptly in the middle of
Discovery, colocated with ACM RecSys 2011 (Chicago, US) the songs.
Copyright ©. This is an open-access article distributed under the terms
of the Creative Commons Attribution License 3.0 Unported, which
permits unrestricted use, distribution, and reproduction in any medium,
2.2 PolyLens
provided the original author and source are credited. PolyLens is a collaborative filtering recommender system which
recommends movies to groups of people based on their individual
preferences (O’Connor et al. 2001). It represents a group
extension of the MovieLens recommender system with over outcomes are songs si that are selected in the common playlist.
80,000 users and their ratings for more than 3,500 movies (with a
total of nearly 5 million votes). Users can create and manage We let each user a j submit a numerical vote score( s i , a j ) for
groups, access individual and group recommendations and receive
notification alerts for group invitation. The algorithm uses the each song si that reflects their preference for that song. These
least misery strategy given that the groups formed to watch a votes are given as ratings on a 5-point Likert scale and normalized
movie together tend to be small. As such the group is as happy as so that the scores given by each user sum to 1:
its least happy member. The authors mention that “the social ,
value function and algorithm are unlikely to work well for large , =∑ (1)
groups”. They further note that “it is still an open research ,
question to understand the types of social value functions that best We then assign a joint score to each song that is computed as the
satisfy large groups and to implement them algorithmically”. sum of the scores given by the individual users:
2.3 Voting strategies
An extensive study on a group of television viewers aiming at
= ∑ ∈ , (2)
finding which strategy people use was realized by Masthoff
To choose the songs to be included in a playlist of length k, a
(2005). In the experiment 10 deterministic group voting rules are
compared: plurality voting, utilitarian strategy, Borda count, deterministic method is to choose the k songs with the highest
Copeland rule, approval voting, least misery strategy, most joint rating: weighted sum (DWS):
pleasure strategy, average without misery strategy, fairness
strategy and most respected person strategy. The experiment = ∑ (3)
shows that individuals do not use a clear dominant strategy, but ∈
average, average without misery, and least misery are all plausible
The probabilistic weighted sum (PWS) iteratively selects each of
candidates for implementation. In a different experiment
addressing how people judge the recommendation results the k songs randomly according to the probability distribution:
multiplicative utilitarian strategy is the most promising strategy,
but the other strategies received close scores. In the study of = ∑ (4)
television viewers the hypothesis that social status influences ∈
selection has no statistical dominance. Non-linear utility suits By comparison, the probabilistic sum (PS) method chooses the k
better users’ expectations than a linear one. For instance quadratic
songs with equal probability:
rating scale is appropriate for implementation. Furthermore, there
is strong evidence that human subjects use a series of simple
strategies in different judgment contexts they face. For instance, if = (5)
|"|
one takes the satisfaction of the group to be the average of the
satisfaction of the individuals, then the average strategy performs The least misery (LM) method takes into account the minimum of
best. Taking the minimum better corresponds to the predictions ratings for each user:
which are made by individuals assessing their own needs.
3. ALGORITHMS min & , ',∀ ∈) (6)
In the music domain many users usually form many groups and
listen to many songs. Given the fact that the length of one song is 3.2 Example
of 3 to 4 minutes users usually select a playlist containing several To illustrate how each algorithm works, we consider the following
to lots of songs. This is not the case of movies selection when example. In the next table, user1, user2, and user 3 represent
users need to agree on only one or few movie(s) they would like group members. The score distribution normalized to 1 for each of
to consume given their limited time and the length of a movie: the users is displayed in the respective row, and the joint scores
~2h. Thus, the music domain presents both opportunities and are shown in the table below.
challenges since the recommendation needs to focus on both
diversity and accuracy. Table I. Item selection example using the 4 algorithms
We propose the following 4 algorithms for comparison: User1 Song1: 0.1 Song2: 0.4 Song3: 0.4 Song4: 0.1
• PS (Probabilistic Sum): select the common playlist’
User2 Song1: __ Song2: 0.2 Song3: __ Song4: 0.8
songs probabilistically, each of them having the same
probability to be selected User3 Song1: 0.4 Song2: 0.2 Song3: __ Song4: 0.4
• LM (Least Misery): select songs with the highest Total Song1: 0.5 Song2: 0.8 Song3: 0.4 Song4: 1.3
minimum individual ratings
• DWS (Deterministic Weighted Sum): deterministically
select songs with the highest score The least misery (LM) will choose song 2 and song 3 (each of
them has the minimal rating 0.2). For all other songs the minimum
• PWS (Probabilistic Weighted Sum): compute weighted
score is 0.1. After normalizing the total scores by the sum of
sum and select songs based on their score probabilities. scores, we obtain the following probability distribution for the set
of outcomes.
3.1 General framework
Let A be the set of all users and S the set of all possible outcomes
that can be rated. In our group music recommendation setting, the
Table II. Probability distribution 4. GROUPFUN
P Song1: 0.16 Song2: 0.26 Song3: 0.13 Song 4: 0.43 GroupFun is a web application that helps a group of friends to
agree on a common music playlist for a given event they will
Considering the probability as the final score, the deterministic attend, e.g. a birthday party or a graduation ceremony. Firstly, it is
weighted sum (DWS) will chose songs 4, 2, 1 and 3. Probabilistic implemented as a Facebook plugin connecting users to their
weighted sum (PWS) will choose one song after another using this friends. Secondly, it is a music application that helps individuals
probabilistic distribution. Compared to other social choice based to manage and share their favorite music with groups. In
algorithms, PWS is incentive compatible. That is, it is to the best GroupFun users can listen to their own collection of songs as well
interest of the individual to reveal his/her preferences truthfully. It as their friends’ music. With the collective music database, the
application integrates friends’ music tastes and recommends a
is in fact equivalent to a random dictator method, where the common playlists to them. Therefore, the application aims at
dictator will choose a song randomly with the probabilities given satisfying music tastes of the whole group by aggregating
by its degree of preference – a reasonable method since nobody individual preferences through the use of previously presented
wants to hear the same song over and over again. This is because algorithms.
the probability of a song si to be chosen can be written as:
,
= ||
∑ ∈
||
(7)
or, in other words, the probability of choosing user a j times the
normalized score that user a j has given to song si . Indeed,
User3’s preference for song 1 yields a significant probability that
this song will be included in the playlist, relative to other songs.
3.3 Discussion
The contribution of the PWS algorithm in the paper stands out
with respect to group satisfaction. We expect users to be more
satisfied using PWS than other algorithms given their expectations
to discover the music of other members. Figure 1. “Home” page of GroupFun
Advantages of PWS compared with the other algorithms: In the “Home” page users see 4 playlists: one from a recent event,
one containing popular songs, one from a party and the last one
1. Users are free to choose the number of songs from an older event. They can listen to each song in each of the
2. Ratings are updated permanently playlists.
3. The algorithm is computationally simple
4. Users can negotiate their ratings and trade utility
5. Incentive-compatible truthful property is observed
6. The algorithm favors music diversity
The disadvantages of PWS are:
1. It is difficult to quantify rating differences between distinct
users. The weights given by each user cannot be compared
with the ones given by another since users have different
estimations of their utility.
2. Self-selection effect: most popular songs will receive most
votes - not ideal if long tail distribution is desired.
Since PWS can be interpreted as similar to the random scheme
users have to test it in more recommendation rounds to understand
its inner logic. PWS can be further developed to include the group Figure 2. "My Groups" page of GroupFun
dynamics. One solution is to consider trust and other members’
comments on the songs rated by one user (e.g. “like”/”dislike”). In the “My Group” page users can create groups, upload and rate
their music, invite friends and hear the group’s songs. Finally, in
The PWS algorithm stands out with respect to allowing users to the “My Music” page users see their contribution to GroupFun:
engage in trustful individual preference elicitation and music for each song is displayed the associated group, the rating and its
discovery. By returning to the recommendation list the group will name and artist. Users can also listen to their individual
find a different playlist every-time they are would like to listen to preferences in the same interface. One of the most important
group music. By considering the probabilistic distribution of characteristics of GroupFun is that it combines music, friends and
ratings and an extensive music library the algorithm will mostly groups together. In other words, GroupFun serves as a platform
suggest songs strongly liked by most others. Sometimes it will allowing users to conveniently organize their individual music
recommend unexpected, rating-wise, serendipitous items library, effectively communicate with friends and actively
facilitating music discovery and group enjoyment. participate in social activities.
5. EXPERIMENT DESIGN Outcome change I was interested in changing the
To compare how users perceive the 4 algorithms, we plan to carry intention outcome of the algorithm.
on a between-groups user study. With the results of the
experiment we will be able to make a judgment of the influence of 6. CONCLUSIONS AND FUTURE WORK
both the algorithms and the design on users’ satisfaction. We plan In this paper we presented our research work on the algorithmic
to collect solid user feedback regarding how an algorithm should development and evaluation of our music recommender system
allow group members to arrive at a common decision in a music and Facebook application, GroupFun. The major contribution of
recommendation setting. this paper is the demonstration of the applicability of the PWS
Our hypotheses are that: (1) users will not reveal their preferences algorithm for group recommendation strategies and negotiation. In
strategically as to influence the algorithm’s outcome; (2) they will this context, we analyzed different group recommendation
prefer more PWS than DWS given the increased diversity of the approaches w.r.t. group satisfaction and discussed key satisfaction
recommendations they receive and (3) the group will perceive issues to be taken into account. The PWS algorithm we proposed
more overall satisfaction but less diversity using the LM calculates probabilities for songs to appear in groups’ playlists
algorithm compared with PWS. favoring music diversity and discovery. Using PWS users state
their preference truthfully. They align their decision to that of the
5.1 Procedure group. Furthermore, our current development of GroupFun allows
First we recruit university students, friends who have Facebook users to create groups, rate and share their music profiles with
accounts and other users on the Amazon Mechanical Turk their friends.
platform. All of them will use their own computers in order to To understand how users’ perceive our algorithms and current
connect to the GroupFun application. We consider 4 algorithms interface, we plan to conduct an experiment to compare the 4
implemented for 4 groups of 40 users together with 1 interface. algorithms in a between-subjects study. As such we will evaluate
Each of the algorithms displays a common list of 10 songs to all user satisfaction for music group recommendations. Furthermore,
group members. Users are asked to contribute their music to only to learn more about the perceived ease of use and perceived
one of the groups and fill in an online post-study questionnaire usefulness of our system we plan to invite more members and
assessing their satisfaction. The final music outcome is shown to analyze user feedback in terms of design and functionality. We
all group members after they have finished their tasks. Users can also intend to develop a new version of the algorithm which will
interact with the system in diverse ways such as: upload more or better match users’ behavior and expectations.
less songs, change their ratings, see and hear to their friends’
playlists, etc. 7. ACKNOWLEDGMENTS
Table III. Evaluation of 4 algorithms using the same interface This research was supported by EPFL and the Swiss National
Science Foundation. We thank Elsa Friscira and Laurentiu
PWS DWS LM PS Dascalu for their important contribution to the development of
Interface/Algorithm
GroupFun and GroupFun users for their answers and involvement
40 40 40 40 in the evaluation of our algorithms.
Interface
5.2 Measurements 8. REFERENCES
The first 40 users see the results of the probabilistic weighted sum [1] Hastie, R. and Kameda, T. 2005. The robust beauty of
algorithm, the next those of deterministic weighted sum and so on. majority rules in group decisions. In Psychological Review,
Since individuals who upload more songs expect to see their song Vol. 112, no. 2, 494-508.
names more often in the final list they would prefer to know a [2] Masthoff, J. 2004. Group modeling: Selecting a sequence of
priori the computation rule of the algorithm so that they would television items to suit a group of viewers. User Modeling
adjust the number of songs they upload. Given users’ known self- and User-Adapted Interaction, Vol. 14, no. 1, 37-85.
ratings and group ratings computed by the algorithms we expected
our subjects to identify some differences between the 4 [3] Masthoff, J. 2005. The pursuit of satisfaction: affective state
approaches. Some of the questions from the post-study in group recommender systems. In Computer Science, Vol.
questionnaire are presented in the table below. They were 3538, 297-306.
extracted from a well-known user evaluation model, named [4] McCarthy J.F. and Anagnost, T.D. 1998. MusicFX: an
ResQue [7], that our group has developed. arbiter of group preferences for computer supported
Table IV. Evaluation questions collaborative workouts. Proc. ACM Computer Supported
Cooperative Work, 363-372.
Measurements Questions
[5] O’Connor, M., Cosley, D., Konstan, J.A., Riedl, J. 2001.
Perceived The layout of the system's interface is PolyLens: A recommender system for groups of users. Proc.
attractiveness attractive. ACM European Conference on Computer Supported
Cooperative Work.
Perceived satisfaction The items recommended to me
[6] Popescu G. and Pu P. 2010. Group Recommender Systems as
matched my interest. a Voting Problem. EPFL Technical report.
Perceived helpfulness I took into account the ratings given by [7] Pu P. and Chen L. 2010. A User-Centric Evaluation
my friends. Framework of Recommender Systems. In the 3rd ACM
Conference on Recommender Systems, Barcelona, Spain