<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Improved Data Aggregation Strategy for Group Recommendations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Toon De Pessemier</string-name>
          <email>Toon.DePessemier@UGent.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simon Dooms</string-name>
          <email>Simon.Dooms@UGent.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luc Martens</string-name>
          <email>Luc1.Martens@UGent.be</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>iMinds - Ghent University</institution>
          ,
          <addr-line>G. Crommenlaan 8 box 201, B-9050 Ghent</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <fpage>36</fpage>
      <lpage>39</lpage>
      <abstract>
        <p>Although most recommender systems make suggestions for individual users, in many circumstances the selected items (e.g., movies) are not intended for personal usage but rather for consumption in a group. Group recommendations can assist a group of users in finding and selecting interesting items while considering the tastes of all group members. Traditionally, group recommendations are generated either by aggregating the group members' recommendations into a list of group recommendations or by aggregating the group members' preferences (as expressed by ratings) into a group model, which is then used to calculate group recommendations. This paper presents a new data aggregation strategy for generating group recommendations by combining the two existing aggregation strategies. The proposed aggregation strategy outperforms each individual strategy for different group sizes and in combination with various recommendation algorithms.</p>
      </abstract>
      <kwd-group>
        <kwd>group recommendations</kwd>
        <kwd>aggregation strategy</kwd>
        <kwd>combining techniques</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Categories and Subject Descriptors</title>
      <p>H.3.3 [Information Search and Retrieval]: Information
Filtering; H.5.3 [Information Interfaces and
Presentation]: Group and Organization Interfaces</p>
    </sec>
    <sec id="sec-2">
      <title>1. INTRODUCTION</title>
      <p>
        Although the majority of the currently deployed
recommender systems are designed to generate personal
suggestions for individual users, in many cases content is selected
and consumed by groups of users rather than by
individuals. This strengthens the need for group recommendations,
which provide suggestions that take the tastes of all
group members into account. In the literature, group recommendations
have mostly been generated by one of the following two data
aggregation strategies [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        The first aggregation strategy (aggregating
recommendations) generates recommendations for each individual user
using a general recommendation algorithm. Subsequently,
the recommendation lists of all group members are
aggregated into a group recommendation list, which (hopefully)
satisfies all group members. Different approaches to
aggregate the recommendation lists have been proposed during
the last decade, such as least misery and plurality voting [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
Most of them make a decision based on the algorithm's
prediction score, i.e., a prediction of the user's rating score for
the recommended item. One commonly used way to
perform the aggregation is averaging the prediction scores of
each member's recommendation list. The higher the
average prediction score is, the better the match between the
group's preferences and the recommended item.
      </p>
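      <p>The averaging step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the dictionary layout, and the choice to keep only items predicted for every member are assumptions of the sketch.</p>

```python
from statistics import mean


def aggregate_recommendations(member_predictions):
    """Average each item's predicted score over all group members.

    member_predictions maps a user id to a dict {item id: predicted rating}.
    Only items predicted for every member are kept (an assumption of this sketch).
    """
    per_user = list(member_predictions.values())
    shared_items = set(per_user[0]).intersection(*per_user[1:])
    group_scores = {
        item: mean(scores[item] for scores in per_user) for item in shared_items
    }
    # Highest average prediction first; ties broken by item id for determinism.
    return sorted(group_scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical prediction scores for a group of two users.
predictions = {
    "alice": {"m1": 4.5, "m2": 3.0, "m3": 2.0},
    "bob": {"m1": 3.5, "m2": 4.0, "m3": 1.0},
}
ranking = aggregate_recommendations(predictions)
```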
      <p>
        The second aggregation strategy (aggregating preferences)
combines the users' preferences into group preferences. This
way, the opinions and preferences of individual group
members constitute a group preference model reflecting the
interests of all members. Again, the members' preferences
can be aggregated in different ways, e.g., by calculating the
rating of the group as the average of the group members'
ratings [
        <xref ref-type="bibr" rid="ref1 ref7">7, 1</xref>
        ]. After aggregating the members' preferences,
the group's preference model is treated as a pseudo user in
order to produce recommendations for the group using a
traditional recommendation algorithm.
      </p>
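      <p>A minimal sketch of building such a pseudo user by averaging ratings is given below. The function name and data layout are hypothetical, and averaging an item over only the members who rated it is an assumption of the sketch; the text does not state how partially rated items are handled.</p>

```python
from collections import defaultdict


def build_pseudo_user(member_ratings):
    """Average the members' ratings per item into one group profile.

    An item is averaged over the members who rated it; how to treat items
    rated by only some members is a choice of this sketch, not of the paper.
    """
    collected = defaultdict(list)
    for ratings in member_ratings.values():
        for item, rating in ratings.items():
            collected[item].append(rating)
    return {item: sum(values) / len(values) for item, values in collected.items()}


# Hypothetical ratings of two group members.
ratings = {
    "alice": {"m1": 5, "m2": 3},
    "bob": {"m1": 4, "m3": 2},
}
group_profile = build_pseudo_user(ratings)
```

      <p>The resulting profile can then be passed to any single-user recommendation algorithm as if it were the rating history of one user.</p>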
      <p>
        This paper presents a new data aggregation strategy, which
combines the two existing strategies and outperforms each of
them in terms of accuracy. For both individual data
aggregation strategies, we used the average function to combine
the individual preferences or recommendations. Although a
switching scheme between both aggregation strategies has
already been investigated [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], the proposed combined
strategy is the first to generate group recommendations by using
both aggregation strategies at once, thereby making a more
informed decision.
      </p>
    </sec>
    <sec id="sec-4">
      <title>2. EVALUATING GROUP RECOMMENDATIONS</title>
      <p>A major issue in the domain of group recommender
systems is the evaluation of the accuracy, i.e., comparing the
generated recommendations for a group with the true
preferences of the group. Performing online evaluations or
interviewing groups can be partial solutions but are not feasible
on a large scale or for extensively testing alternative
configurations. For example, in Section 5, five recommendation
algorithms in combination with two data aggregation strategies
are evaluated for twelve different group sizes, thereby leading
to 120 different setups of the experiment. Therefore, we are
forced to perform an offline evaluation, in which synthetic
groups are sampled from the users of a traditional
single-user data set. Since movies are often watched in groups, we
used the MovieLens (100K) data set for this evaluation.</p>
      <p>
        In the literature, group recommendations have been
evaluated several times by using a data set with simulated groups
of users. Baltrunas et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] used the MovieLens data set
to simulate groups of different sizes (2, 3, 4, 8) and different
degrees of similarity (high, random) with the aim of
evaluating the effectiveness of group recommendations. Chen
et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] also used the MovieLens data set and simulated
groups by randomly selecting the members of the group to
evaluate their proposed group recommendation algorithm.
They simulated group ratings by calculating a weighted
average of the group members' ratings based on the users'
opinion importance parameter. Quijano-Sanchez et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
used synthetically generated data to simulate groups of
people in order to test the accuracy of group recommendations
for movies. In addition to this offline evaluation, they
conducted an experiment with real users to validate the results
obtained with the synthetic groups. One of the main
conclusions of their study was that it is possible to realize
trustworthy experiments with synthetic data, as the online user test
confirmed the results of the experiment with synthetic data.
This conclusion justifies the use of an offline evaluation with
synthetic groups to evaluate the group recommendations in
our experiment.
      </p>
      <p>This offline evaluation is based on the traditional
procedure of dividing the data set into two parts: the training set,
which is used as input for the algorithm to generate the
recommendations, and the test set, which is used to evaluate
the recommendations. In this experiment, we ordered the
ratings chronologically and assigned the oldest 60% to the
training set and the most recent 40% to the test set, as this
best reflects a realistic scenario.</p>
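      <p>The chronological split can be sketched as below. The tuple layout (user, item, rating, timestamp) and the function name are assumptions made for illustration; the sample ratings are made up.</p>

```python
def chronological_split(ratings, train_fraction=0.6):
    """Split (user, item, rating, timestamp) tuples into old and recent parts."""
    # Order by timestamp so the training set holds the oldest ratings.
    ordered = sorted(ratings, key=lambda r: r[3])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Made-up ratings with integer timestamps.
sample = [
    ("u1", "m1", 4, 50),
    ("u2", "m2", 3, 10),
    ("u1", "m3", 5, 30),
    ("u3", "m1", 2, 20),
    ("u2", "m4", 1, 40),
]
train, test = chronological_split(sample)
```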
      <p>
        The evaluation procedure was adopted from
Baltrunas et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and is performed as follows. Firstly,
synthetic groups are composed by selecting random users from
the data set. All users are assigned to one group of a
predefined size. Secondly, group recommendations are
generated for each of these groups based on the ratings of the
members in the training set. Since group recommendations
are intended to be consumed in a group and to suit
the preferences of all members of the group simultaneously, all
members receive the same recommendation list. Thirdly,
since no group ratings are available, the recommendations
are evaluated individually as in the classical single-user case,
by comparing (the rankings of) the recommendations with
(the rankings of) the items in the test set of the user
using the Normalized Discounted Cumulative Gain (nDCG)
at rank 5. The nDCG is a standard information retrieval
measure, used to evaluate the recommendation lists [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
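      <p>The text does not spell out the nDCG formula, so the following sketch assumes one common formulation: a gain equal to the test rating, a log2(rank + 1) discount, and zero gain for recommended items that have no rating in the user's test set. The function names and sample ratings are hypothetical.</p>

```python
import math


def dcg_at_k(relevances, k=5):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(recommended, test_ratings, k=5):
    """nDCG@k of a recommendation list against one user's test ratings.

    Items absent from the test set get gain 0 (an assumption of this sketch).
    """
    gains = [test_ratings.get(item, 0) for item in recommended]
    ideal = sorted(test_ratings.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Made-up test ratings of one user.
user_test = {"a": 5, "b": 3, "c": 1}
perfect = ndcg_at_k(["a", "b", "c"], user_test)
reversed_score = ndcg_at_k(["c", "b", "a"], user_test)
```

      <p>Because every member of a group receives the same list, the same recommended list would be scored once per member and the per-user values averaged.</p>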
    </sec>
    <sec id="sec-5">
      <title>3. RECOMMENDATION ALGORITHMS</title>
      <p>
        The effectiveness of the different aggregation strategies
is measured for different sizes of the group and in
combination with various state-of-the-art recommendation
algorithms. The implementation of Collaborative
Filtering (CF) is based on the work of Breese et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This
nearest neighbor CF uses the Pearson correlation metric for
discovering similar users in the user-based approach (UBCF)
or similar items in the item-based approach (IBCF) based
on the rating behavior of the users. As Content-Based
recommender (CB) the InterestLMS predictor of the open
source implementation of the Duine framework [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is adopted
(and extended to consider extra metadata attributes). Based
on the actors, directors, and genres of the content items and
the user's ratings for these items, the recommender builds
a profile model for every user. This profile contains an
estimation of the user's preference for each genre, actor, and
director that is assigned to a rated item, and is used to
predict the user's preference for unseen media items by
matching the metadata of the items with the user's profile. The
hybrid recommender (Hybrid) combines the
recommendations with the highest prediction score of the IBCF
and the CB recommender into a new recommendation list.
The result is an alternating list of the best recommendations
originating from these two algorithms. A user-centric
evaluation comparing different algorithms based on various
characteristics showed that this straightforward combination of
CF and CB recommendations outperforms both individual
algorithms on almost every qualitative metric [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. As
recommender based on matrix factorization, we opted for the open
source implementation of the SVD recommender (SVD)
of the Apache Mahout project [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. This recommender is
configured to use 19 features, which equals the number of
genres in the MovieLens data set, and the number of
iterations is set at 50. To compare the results of the various
recommenders, the popular recommender was introduced
as a baseline. This recommender always generates the same
list of most-popular items for every user, which is based
on the number of received ratings and the mean rating of
each item.
      </p>
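      <p>The alternating list produced by the hybrid recommender can be illustrated with a short sketch. The function name is hypothetical, and skipping an item that was already taken from the other list is an assumption; the text does not say how duplicates are handled.</p>

```python
from itertools import zip_longest


def alternate(first, second, n):
    """Interleave two ranked lists, skipping items already taken."""
    merged = []
    for pair in zip_longest(first, second):
        for item in pair:
            # zip_longest pads the shorter list with None.
            if item is not None and item not in merged:
                merged.append(item)
    return merged[:n]


# Made-up top lists standing in for the IBCF and CB recommendations.
hybrid = alternate(["a", "b", "c"], ["b", "d", "e"], 4)
```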
    </sec>
    <sec id="sec-6">
      <title>4. COMBINING STRATEGIES</title>
      <p>
        Previous research [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] has shown that the choice of aggregation
strategy, in combination with the recommendation algorithm,
has a major influence on the accuracy of the group
recommendations. Certain algorithms (such as CB and UBCF)
produce more accurate group recommendations when the
aggregating preferences strategy is used, whereas other
algorithms (such as IBCF and SVD) obtain a higher
accuracy in combination with the aggregating recommendations
strategy. So, the choice of the aggregation strategy is
crucial for each algorithm in order to obtain the best group
recommendations. Instead of selecting one individual
aggregation strategy, traditional aggregation strategies can be
combined with the aim of obtaining group recommendations
which outperform the group recommendations of each
individual aggregation strategy. In this context, Berkovsky and
Freyne [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] observed that the aggregating recommendations
strategy yields a lower MAE (Mean Absolute Error) than
the aggregating preferences strategy if the user profiles have
a low density (i.e., contain a low number of
consumptions). In contrast, for high-density profiles, the aggregating
preferences strategy resulted in the lowest MAE, thereby
outperforming the aggregating recommendations strategy in
terms of accuracy. Therefore, Berkovsky and Freyne
proposed a switching scheme based on the profile density, which
yielded a small accuracy improvement compared to the
individual strategies. However, their results were obtained in
a very specific setting. They only considered the accuracy
of recommendations generated by a CF algorithm, the MAE
metric was used to estimate the accuracy, and they focused
on the specific use case of recipe recommendations using a
rather small data set (approximately 3300 ratings). Because
of these specific settings, we were not able to obtain an
accuracy improvement by using such a switching scheme on
the MovieLens data set.
      </p>
      <p>Therefore, we propose an advanced data aggregation
strategy which combines both individual aggregation strategies,
thereby yielding an accuracy gain compared to each
individual aggregation strategy for different recommendation
algorithms. This combination of strategies aggregates the
preferences of the users as well as their recommendations with
the aim of merging the knowledge of the two aggregation
strategies into a final group recommendation list. The idea
is that if one of the aggregation strategies comes up with
a less suitable or undesirable group recommendation, the
other aggregation strategy can correct this mistake. This
makes the group recommendations resulting from the
combination of strategies more robust than the group
recommendations based on a single aggregation strategy.</p>
      <p>The two aggregation strategies are combined as follows.
First, group recommendations are calculated by using the
selected recommendation algorithm and the aggregating
preferences strategy. The result is a list of all items, ordered
according to their prediction score. In case of an
individual aggregation strategy, the top-N items on that list are
selected as suggestions for the group. After calculating the
group recommendations using the aggregating preferences
strategy, or in parallel with it, group recommendations are
generated using the chosen algorithm and the aggregating
recommendations strategy. Again, the result is an ordered
list of items with their corresponding prediction score.</p>
      <p>Both of these lists of group recommendations can still
contain items that are less suitable for the group, even at
the top of the list. The next phase will try to eliminate these
items by comparing the two resulting recommendation lists.
Items that are at the top of both lists are probably
interesting recommendations, whereas items at the bottom of both
lists are usually less suitable for the group. Less certainty
exists about the items that are at the top of the
recommendation list that is generated by one of the aggregation
strategies but that are in the middle or even at the bottom
of the recommendation list produced by using the other
aggregation strategy. Therefore, both recommendation lists
are adapted by eliminating these uncertain items in order
to contain only items that appear at the top of both
recommendation lists, thereby reducing the risk of recommending
undesirable or less suitable items to the group. So, items
that are ranked below a certain threshold position in (at
least) one of the recommendation lists generated by the two
aggregation strategies, are removed from both lists. If only
one aggregation strategy is used, identifying uncertain items
based on the results of a complementary recommendation
list is not possible. In this experiment, we opted to exclude
from the recommendation lists all items that are not in
the top-5% of both recommendation lists (i.e., the top-84 of
recommended items for the MovieLens data set). As a
result, the recommendation lists contain only items that are
identified as "the most suitable" by both aggregation
strategies, ordered according to the prediction scores calculated
using either the aggregating preferences strategy or the
aggregating recommendations strategy.</p>
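      <p>The filtering phase can be sketched as follows, assuming each strategy yields a full ranking of item ids (best first). The function name and the toy rankings are hypothetical. The last line reproduces the threshold arithmetic: the MovieLens 100K data set contains 1682 movies, and 5% of 1682 truncates to 84.</p>

```python
def keep_shared_top(pref_ranking, rec_ranking, top_fraction=0.05):
    """Return the items ranked within the top fraction of both lists."""
    cutoff = int(len(pref_ranking) * top_fraction)
    top_pref = set(pref_ranking[:cutoff])
    top_rec = set(rec_ranking[:cutoff])
    # Items missing from either top segment are considered uncertain and dropped.
    return top_pref.intersection(top_rec)


# Toy rankings of 40 items, so the top 5% is the first 2 positions.
pref_ranking = [f"i{n}" for n in range(40)]
rec_ranking = ["i1", "i7"] + [i for i in pref_ranking if i not in ("i1", "i7")]
shared = keep_shared_top(pref_ranking, rec_ranking)

# Threshold position for the 1682 movies of MovieLens 100K.
cutoff_movielens = int(1682 * 0.05)
```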
      <p>Subsequently, the two recommendation lists are combined
into one recommendation list by combining the prediction
scores of each aggregation strategy per item. In this
experiment, we opted for the average as method to combine
the prediction scores. So in the resulting recommendation
list, each item's prediction score is the average of the item's
prediction score generated by the aggregating preferences
strategy and the item's prediction score produced by the
aggregating recommendations strategy. Alternative
combining methods are also possible, e.g., a weighted average
of the prediction scores with weights depending on the
performance of each individual aggregation strategy. Then, the
items are ordered by their new prediction score in order to
obtain the final list of group recommendations.</p>
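      <p>A minimal sketch of this final merging step is shown below, assuming the set of items that survived the filtering phase is already known. The function name, the score dictionaries, and the tie-breaking by item id are assumptions of the sketch.</p>

```python
def combine_scores(pref_scores, rec_scores, shared_items):
    """Average both strategies' prediction scores and rank the shared items."""
    combined = {
        item: (pref_scores[item] + rec_scores[item]) / 2 for item in shared_items
    }
    # Highest combined score first; ties broken by item id for determinism.
    return sorted(combined.items(), key=lambda pair: (-pair[1], pair[0]))


# Made-up prediction scores from the two aggregation strategies.
pref_scores = {"m1": 4.0, "m2": 4.5, "m3": 3.0}
rec_scores = {"m1": 4.5, "m2": 3.5, "m3": 5.0}
final = combine_scores(pref_scores, rec_scores, {"m1", "m2"})
```

      <p>A weighted average, as mentioned in the text, would only change the expression that computes each combined score.</p>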
    </sec>
    <sec id="sec-7">
      <title>5. RESULTS</title>
      <p>
        Our combined aggregation strategy is compared to the
individual aggregation strategies in Figure 1. Since users are
randomly combined into groups and the accuracy of group
recommendations depends on the composition of the
groups, the accuracy varies slightly for each partitioning of
the users into groups. (The exception is the partitioning of the
users into groups of 1 member, which is only possible in 1
way.) Therefore, the process of composing groups by taking
a random selection of users is repeated 30 times and just
as many measurements of the accuracy are performed. So,
the graph shows the mean accuracy of these measurements
as an estimation of the quality of the group
recommendations (on the vertical axis), as well as the 95% confidence
interval of the mean value, in relation to the
recommendation algorithm, aggregation strategy, and the group size.
The group size is indicated on the horizontal axis. The
vertical axis crosses the horizontal axis at the quality level of
the most-popular recommender. The prefix "Combined" of
the bar series stands for the proposed aggregation strategy
which combines the aggregating preferences and
aggregating recommendations strategies. The prefixes "Pref" and "Rec"
indicate the accuracy of the two individual strategies,
the aggregating preferences and aggregating
recommendations strategy, respectively. For each algorithm, only the most
accurate individual strategy is shown: aggregating
preferences for UBCF and CB, aggregating recommendations for
SVD, IBCF, and Hybrid [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>The non-overlapping confidence intervals indicate a
significant improvement of the combined aggregation strategy
compared to the best individual aggregation strategy.
Table 1 shows the results of the statistical t-tests comparing
the mean accuracy of the recommendations generated by
the best individual aggregation strategy and by the
combined aggregation strategy for groups of size 5.
(Similar results are obtained for other group sizes.) The null
hypothesis H0 states that the mean accuracy of the recommendations
generated by using the best individual aggregation
strategy is equal to the mean accuracy of the recommendations
generated by using the combined aggregation strategy. The
small p-values (all smaller than 0.05) confirm the significant
accuracy improvement of our proposed aggregation strategy.</p>
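      <p>The confidence-interval comparison can be sketched with the standard library alone. The sample values below are made up for illustration and are not the paper's measurements; the function names are hypothetical. The constant 2.045 is the two-sided 95% critical value of Student's t distribution for 29 degrees of freedom, matching 30 repetitions.</p>

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t for 29 degrees of freedom.
T_CRITICAL_29 = 2.045


def confidence_interval(samples, t_critical=T_CRITICAL_29):
    """Return (low, high) bounds of the confidence interval of the mean."""
    centre = mean(samples)
    half_width = t_critical * stdev(samples) / sqrt(len(samples))
    return centre - half_width, centre + half_width


def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return not (a[0] > b[1] or b[0] > a[1])


# Illustrative accuracy values for 30 random partitionings (not real results).
best_individual = [0.880 + 0.001 * (i % 3) for i in range(30)]
combined = [0.890 + 0.001 * (i % 3) for i in range(30)]

ci_best = confidence_interval(best_individual)
ci_combined = confidence_interval(combined)
separated = not intervals_overlap(ci_best, ci_combined)
```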
    </sec>
    <sec id="sec-8">
      <title>6. CONCLUSIONS</title>
      <p>This paper presents a new strategy to aggregate the tastes
of multiple users in order to generate group
recommendations. Both existing data aggregation strategies are
combined to make a more informed decision, thereby reducing
the risk of recommending undesirable or less suitable items
to the group. The results show that the combination of
aggregation strategies outperforms the individual aggregation
strategies for various sizes of the group and in combination
with various recommendation algorithms. The proposed
aggregation strategy can be used to increase the accuracy of
(commercial) group recommender systems.</p>
      <p>Figure 1: Comparison of the best individual aggregation strategy and the combined aggregation strategy (vertical axis: mean nDCG; horizontal axis: group size).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Baltrunas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Makcinskas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Ricci</surname>
          </string-name>
          .
          <article-title>Group recommendations with rank aggregation and collaborative filtering</article-title>
          .
          <source>In Proceedings of the fourth ACM conference on Recommender systems, RecSys '10</source>
          , pages
          <fpage>119</fpage>
          –
          <lpage>126</lpage>
          , New York, NY, USA,
          <year>2010</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Berkovsky</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Freyne</surname>
          </string-name>
          .
          <article-title>Group-based recipe recommendations: analysis of data aggregation strategies</article-title>
          .
          <source>In Proceedings of the fourth ACM conference on Recommender systems, RecSys '10</source>
          , pages
          <fpage>111</fpage>
          –
          <lpage>118</lpage>
          , New York, NY, USA,
          <year>2010</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Breese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Heckerman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Kadie</surname>
          </string-name>
          .
          <article-title>Empirical analysis of predictive algorithms for collaborative filtering</article-title>
          .
          <source>In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence</source>
          ,
          <source>UAI'98</source>
          , pages
          <fpage>43</fpage>
          –
          <lpage>52</lpage>
          , San Francisco, CA, USA,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.-L.</given-names>
            <surname>Chen</surname>
          </string-name>
          , L.-C. Cheng, and
          <string-name>
            <given-names>C.-N.</given-names>
            <surname>Chuang</surname>
          </string-name>
          .
          <article-title>A group recommendation system with consideration of interactions among group members</article-title>
          .
          <source>Expert Systems with Applications</source>
          ,
          <volume>34</volume>
          (
          <issue>3</issue>
          ):
          <fpage>2082</fpage>
          –
          <lpage>2090</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>De Pessemier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dooms</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Martens</surname>
          </string-name>
          .
          <article-title>Design and evaluation of a group recommender system</article-title>
          .
          <source>In Proceedings of the sixth ACM conference on Recommender systems, RecSys '12</source>
          , pages
          <fpage>225</fpage>
          –
          <lpage>228</lpage>
          , New York, NY, USA,
          <year>2012</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Dooms</surname>
          </string-name>
          , T. De Pessemier, and
          <string-name>
            <given-names>L.</given-names>
            <surname>Martens</surname>
          </string-name>
          .
          <article-title>A user-centric evaluation of recommender algorithms for an event recommendation system</article-title>
          .
          <source>In Proceedings of the workshop on User-Centric Evaluation of Recommender Systems and Their Interfaces at ACM Conference on Recommender Systems (RECSYS)</source>
          , pages
          <fpage>67</fpage>
          –
          <lpage>73</lpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Masthoff</surname>
          </string-name>
          .
          <article-title>Group modeling: Selecting a sequence of television items to suit a group of viewers</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          ,
          <volume>14</volume>
          :
          <fpage>37</fpage>
          –
          <lpage>85</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Quijano-Sanchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Recio-Garcia</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Diaz-Agudo</surname>
          </string-name>
          .
          <article-title>Personality and social trust in group recommendations</article-title>
          .
          <source>In Proceedings of the 2010 22nd IEEE International Conference on Tools with Artificial Intelligence - Volume 02, ICTAI '10</source>
          , pages
          <fpage>121</fpage>
          –
          <lpage>126</lpage>
          , Washington, DC, USA,
          <year>2010</year>
          . IEEE Computer Society.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <collab>Telematica Instituut / Novay</collab>
          .
          <source>Duine Framework</source>
          ,
          <year>2009</year>
          . Available at http://duineframework.org/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <collab>The Apache Software Foundation</collab>
          .
          <source>Apache Mahout</source>
          ,
          <year>2012</year>
          . Available at http://mahout.apache.org/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>