<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Groups Identification and Individual Recommendations in Group Recommendation Algorithms</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ludovico Boratto</string-name>
          <email>ludovico.boratto@unica.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salvatore Carta</string-name>
          <email>salvatore@unica.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michele Satta</string-name>
          <email>michele_satta@hotmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dipartimento di Matematica e Informatica, Università di Cagliari</institution>
          ,
          <addr-line>Via Ospedale 72, 09124 Cagliari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Matematica e Informatica, Università di Cagliari</institution>
          ,
          <addr-line>Via Ospedale 72, 09124 Cagliari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Dipartimento di Matematica e Informatica, Università di Cagliari</institution>
          ,
          <addr-line>Via Ospedale 72, 09124 Cagliari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Recommender systems usually deal with preferences previously expressed by users, in order to predict new ratings and recommend items. To support recommendation in social activities, group recommender systems were developed. Group recommender systems usually consider predefined, a priori known groups, and just a few existing approaches are able to automatically identify groups. When groups are not already formed, another key aspect of group recommendation is group identification. In this paper a novel algorithm able to identify groups of users and produce recommendations for each group is presented. The algorithm uses individual recommendations and a classic clustering algorithm to identify and model groups. Experimental results show how this approach substantially improves the quality of group recommendations with respect to the state of the art.</p>
      </abstract>
      <kwd-group>
        <kwd>Group Recommendation</kwd>
        <kwd>Collaborative Filtering</kwd>
        <kwd>Clustering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>This work is partially funded by Regione Sardegna
under project CGM (Coarse Grain Recommendation) through
Pacchetto Integrato di Agevolazione (PIA) 2008 "Industria
Artigianato e Servizi".</p>
      <p>Copyright is held by the author/owner(s). Workshop on
the Practical Use of Recommender Systems, Algorithms and
Technologies (PRSAT 2010), held in conjunction with
RecSys 2010. September 30, 2010, Barcelona, Spain.</p>
    </sec>
    <sec id="sec-2">
      <title>1. INTRODUCTION</title>
      <p>
        With the development of Web 2.0, the use of the web has
become increasingly widespread and users have had the chance
to express opinions about shared content that is updated daily. This
generates an amount of data that users cannot handle
directly, so finding relevant information on the Internet
is becoming more and more
difficult [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
      <p>
        Recommender systems have been developed to deal with
information overload and produce personalized content for the
users by exploiting context-awareness in a domain. This is
done by computing a set of previously expressed preferences,
in order to recommend items that are likely of interest to a
user. Collaborative Filtering (CF) [
        <xref ref-type="bibr" rid="ref11 ref15 ref19">11, 15, 19</xref>
        ] is by far the
most successful recommendation technique. The main idea
of CF systems is to use the opinions of a community, in order
to provide item recommendations.
      </p>
      <p>There are contexts and domains where classic
recommendation cannot be used, because the recommendation process
involves more than one person and preferences have to be
combined in order to produce a single recommendation that
satisfies everyone (e.g., people traveling together or going to a
restaurant/museum together). Therefore, in order to
support recommendations in social activities, algorithms able
to provide group recommendations were developed. Group
recommendations are provided according to the way a group
is modeled. Group modeling is the combination of the
preferences expressed by single users into a common group
preference.</p>
      <p>A special type of group recommendation is needed when
technological constraints limit the bandwidth available for
the recommendation. This is, for example, the case of
satellite systems, in which the number of channels is limited and
a personalized TV schedule cannot be produced.</p>
      <p>Another useful application scenario in which limitations are
imposed on the recommendation process is the printing of
recommendation flyers that contain suggested items. Even if
a company has all the data needed to produce a flyer with individual
recommendations for each customer, printing
a different flyer for everyone would be technically too hard
to achieve and too expensive. A possible solution
is to print n different flyers, a number that is affordable in
terms of costs and that can still satisfy users by recommending
interesting items to the recipients of the same flyer.</p>
      <p>In both scenarios, the first result that the
algorithm has to compute is a proper identification of groups,
in order to produce a recommendation that maximizes user
satisfaction. This preliminary phase of the group
recommendation process is not performed by most of the
algorithms in the literature, because they only consider how to
model already existing groups.</p>
      <p>In this paper a novel approach for group recommendation
with automatic identification of groups is proposed.
To make the paper easier to read and to highlight the properties
of the proposed approach, a baseline version of the
algorithm is presented first (BaseGRA, Baseline Group
Recommendation Algorithm). BaseGRA uses a classic
clustering algorithm to identify groups, by exploiting past
preferences expressed by each user of the system. To model the
group, BaseGRA combines the preferences of each user with
the ratings predicted using a CF algorithm for the unrated
items.</p>
      <p>Since the number of items evaluated by a user in a system is
usually much lower than the number of items that can
be evaluated, we considered the fact that the clustering step
may be affected by the well-known problem of sparsity of
the available data.</p>
      <p>The algorithm presented in this paper, named
ImprovedGRA (Improved Group Recommendation Algorithm), has
been developed to overcome this potential problem and
improve the quality of clustering. This is done by using the
predictions of the missing ratings to complete the matrix of
the preferences already expressed by users. The algorithm
predicts individual recommendations, combines them with
the preferences explicitly expressed by users, and uses both
of them as input for a classic clustering algorithm. As
highlighted by the experiments, this leads to an identification of
groups of users with similar preferences and to a high quality
of the predicted results. Individual recommendations and
explicitly provided preferences are also used to model the
groups.</p>
      <p>The proposed approach is the first that combines clustering
of the users with an aggregation of individual
recommendations. In fact, none of the existing recommender systems
that automatically identify groups merges individual
recommendations, and the approaches that merge individual
recommendations deal with groups that have a predefined
structure.</p>
      <p>Another scientific contribution of the approach lies in the
algorithm used to automatically identify groups, which mixes
recommendation and clustering algorithms, leading to a
substantial improvement of the quality of the group
recommendations with respect to the state of the art.</p>
      <p>Moreover, the paper presents an analysis of two more
fundamental aspects of this kind of group recommendation:
homogeneity of group size and homogeneity of recommendation
quality.</p>
      <p>The size of the groups should
be sufficiently homogeneous. For example, if the
recommendation process involves 70000 users and 10 available
channels, it would not be acceptable to have one group with
61000 users and 9 groups with 1000 users each. It would be
a waste of bandwidth to produce recommendations for small
groups and, at the same time, it would be hard for a system
to produce recommendations that gather the preferences of
a large group.</p>
      <p>The quality of the predicted results should not
vary too much between the groups. In other terms, the
system should try to keep a sufficient quality of the predictions
for every group. Providing inadequate recommendations to
any group should always be avoided.</p>
      <p>The rest of the paper is organized in the following way:
section 2 presents related work, considering both group
recommender systems able to automatically identify groups and
group recommender systems that build individual
recommendations; section 3 contains a detailed description of the
baseline group recommendation algorithm, BaseGRA;
section 4 will do the same for the improved algorithm
ImprovedGRA; section 5 describes the experiments we conducted to
evaluate the proposed algorithm and outlines main results;
section 6 contains comments, conclusions and future
developments.</p>
    </sec>
    <sec id="sec-3">
      <title>2. RELATED WORK</title>
      <p>As mentioned in the Introduction, group recommender
systems were developed to support the recommendation process
in activities that involve more than a person.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] the state of the art in group recommendation
is presented. The existing systems were developed for different
domains like web/news pages, tourist attractions, music
tracks, television programs and movies. A classification of
those approaches can be made from two perspectives:
- the type of group considered;
- the way group recommendations are built.
      </p>
      <p>Considering the first classification of the existing systems,
which is based on the type of groups considered, we can
identify four different types of groups, described below.
- Established group: a number of persons who
explicitly choose to be part of a group, because of shared,
long-term interests;
- Occasional group: a number of persons who do
something occasionally together, like visiting a museum. Its
members have a common aim in a particular moment;
- Random group: a number of persons who share an
environment in a particular moment, without explicit
interests that link them;
- Automatically identified group: groups that are
automatically detected considering the preferences of
the users and/or the resources available.</p>
      <p>The second classification of the existing approaches
considers the way group recommendations are built.
There are two ways to build group recommendations,
described in the list below.</p>
      <p>- Merge of individual recommendations into a group
recommendation.
- Merge of the individual preferences to build a group
profile and predict specific recommendations for the
group.</p>
      <p>The approach described in this paper automatically
identifies groups and merges individual recommendations. The
existing approaches for those two categories of group
recommender systems will now be described and differences with
our approach will be highlighted.</p>
      <p>As a general consideration, please note that none of the
approaches that automatically identify groups merges
individual recommendations.</p>
    </sec>
    <sec id="sec-4">
      <title>2.1 Approaches that automatically identify groups</title>
      <p>
        The approach proposed in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] aims to automatically discover
Communities of Interest (CoI) (i.e., a group of individuals
who share and exchange ideas about a given interest) and
produce recommendations for them.
      </p>
      <p>CoI are identified considering the preferences expressed by
users in personal ontology-based profiles. Each profile
measures the interest of a user in concepts of the ontology. The
interest of the users is exploited in order to cluster the concepts.
User profiles are then split into subsets of interests, to link
the preferences of each user with a specific cluster of
concepts. Hence it is possible to define relations among users
at different levels, obtaining a multi-layered interest network
that allows multiple CoI to be found. Recommendations are built
using a content-based CF approach.</p>
      <p>The difference with our approach is that preferences of users
are not expressed through an ontology. Moreover, our
recommendation technique is based on a user-based CF
approach.</p>
      <p>
        The system proposed in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] generates group
recommendations and automatically detects intrinsic communities of users
whose preferences are similar. Communities of users with
similar preferences are identified using a Modularity-based
Community Detection algorithm [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and group
recommendations are predicted for each community. See 5.2 for a more
detailed description of the approach.
      </p>
      <p>This approach, although it pursues the same
goals, differs from the one presented in this paper both in
the way group predictions are built and in the way groups
are identified. Because of these shared goals, the approach was
chosen for comparison with the algorithm presented in this paper.</p>
    </sec>
    <sec id="sec-5">
      <title>2.2 Approaches that merge individual recommendations</title>
      <p>
        PolyLens [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] is a system built to produce recommendations
for groups of users who want to see a movie.
      </p>
      <p>To produce recommendations for each user of the group a
CF algorithm is used. In order to model the group, a "least
misery" (LM) strategy is used: the rating used to
recommend a movie to a group is the lowest predicted rating
for that movie, to ensure that every member is satisfied.
In contrast with the LM strategy used by PolyLens, in our
approach group preferences are built by combining individual
recommendations into a single value that averages the
preferences of the single users.</p>
      <p>We chose a group modeling technique based
on the average of the users' ratings instead of a LM
strategy because it seems better suited to an approach where
large groups are considered. A LM strategy is useful for
small groups and in fact PolyLens handles groups with two
or three users. Even if groups are composed of people with
homogeneous preferences, with a LM strategy a single low rating
expressed by a user for a movie would be enough to produce a
low rating for that movie for the whole group. With large
groups such an approach would probably lead to extremely
low ratings for almost all the movies.</p>
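      <p>The contrast between the two group modeling strategies can be illustrated with a minimal Python sketch. The function names and the toy predicted ratings are assumptions made for illustration only; they are not taken from PolyLens or from the implementation evaluated in this paper.</p>
      <preformat><![CDATA[
```python
def least_misery(predicted_ratings):
    """Group rating under a least misery strategy: the lowest individual rating."""
    return min(predicted_ratings)


def average(predicted_ratings):
    """Group rating as the arithmetic mean of the individual ratings."""
    return sum(predicted_ratings) / len(predicted_ratings)


# Hypothetical group of ten users: nine would like the movie, one would not.
predicted = [4.0] * 9 + [1.0]
lm_rating = least_misery(predicted)   # driven entirely by the single low rating
avg_rating = average(predicted)       # reflects the majority of the group
```
]]></preformat>
      <p>In this toy group one dissenting user is enough to pull the least misery rating down to the minimum, while the average stays close to the rating shared by the other nine users.</p>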
      <p>
        INTRIGUE (INteractive TouRist Information GUidE) [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]
is a system that recommends sightseeing destinations using
the preferences of the group members. The approach merges
individual recommendations and, in order to build group
recommendations, some subgroups are considered more
influential (e.g., disabled people).
      </p>
      <p>In our approach we do not consider a specific domain of
application and every individual recommendation is weighted
equally, so that group recommendations reflect the preferences
of all the users.</p>
      <p>
        The approach presented in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] computes group
recommendations by combining individual recommendations built for
every user and considering a consensus function, which
combines relevance of the items for a user and disagreement
between members.
      </p>
      <p>Since our approach automatically builds groups of users with
similar preferences, we don't expect disagreement to be a
characterizing feature when computing group
recommendations. Therefore this aspect was not considered in our
approach.</p>
      <p>
        The system proposed in [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ] presents a group
recommendation approach based on Bayesian Networks (BN). To
represent users and their preferences a BN is built. The authors
assume that the composition of the groups is a priori known
and model the group as a new node in the network that
has the group members as parents. A collaborative
recommender system is used to predict the votes of the group
members. A posteriori probabilities are calculated to
combine the predicted votes and build the group
recommendation.
      </p>
      <p>The main difference with our approach is that, in order to
combine preferences and build group recommendations, we
do not rely on a Bayesian Network and a posteriori
probabilities.</p>
    </sec>
    <sec id="sec-7">
      <title>3. BASELINE GROUP RECOMMENDATION ALGORITHM (BASEGRA)</title>
      <p>The baseline version of our algorithm identifies groups of
similar users considering the preferences expressed by each
user and models each group using individual
recommendations built for each user of a group.</p>
    </sec>
    <sec id="sec-8">
      <title>3.1 Overview of BaseGRA</title>
      <p>
        The algorithm works in two steps:
1. Using a Ratings Matrix that contains the preferences of
each user, groups of similar users are detected through
the k-means clustering algorithm [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
2. Once the groups have been detected, a group
preference is produced by aggregating the preferences of the
individual users.
      </p>
    </sec>
    <sec id="sec-9">
      <title>3.2 Groups Identification</title>
      <p>The input of the algorithm is a Ratings Matrix M that
associates a set of users to a set of items through a rating.
A rating indicates the level of satisfaction of a user for a
considered item. So each value m<sub>ui</sub> of the Ratings Matrix
is:
        <disp-formula><tex-math><![CDATA[
m_{ui} = \begin{cases}
r_{ui} & \text{if user } u \text{ expressed a preference for item } i \\
\emptyset & \text{if user } u \text{ did not express a preference for item } i
\end{cases}
]]></tex-math></disp-formula>
      </p>
      <p>A rating r<sub>ui</sub> is always such that r<sub>min</sub> ≤ r<sub>ui</sub> ≤ r<sub>max</sub> and
r<sub>ui</sub> &gt; 0. In other words, a rating value is always inside a
fixed range and its value is always positive.</p>
      <p>
        The Ratings Matrix is used as input for the k-means
clustering algorithm [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Since the algorithm's input are the
preferences expressed by each user, the output is a partition
in groups of users with similar preferences.
      </p>
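      <p>The group identification step can be sketched in Python as follows. This is a minimal, illustrative k-means and not the implementation used in the experiments: the function name, the toy matrix, the fixed number of iterations, the seeded random initialization and the encoding of a missing rating as 0.0 (possible because ratings are always positive) are all assumptions of the sketch.</p>
      <preformat><![CDATA[
```python
import random


def kmeans(rows, k, iterations=20, seed=0):
    """Cluster equal-length numeric rows into k groups; returns one label per row."""
    rng = random.Random(seed)
    centroids = [list(row) for row in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each user joins the nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centroids[c])))
            for row in rows
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy Ratings Matrix M: 4 users x 3 items, 0.0 marks a missing rating (assumption).
M = [
    [5.0, 4.0, 0.0],
    [4.0, 5.0, 0.0],
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 4.0],
]
groups = kmeans(M, k=2)
```
]]></preformat>
      <p>On this toy matrix the first two users, who rated the same items highly, end up in one group and the last two in the other. How missing ratings are encoded directly influences the distances, which is the sparsity issue that motivates ImprovedGRA.</p>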
    </sec>
    <sec id="sec-10">
      <title>3.3 Groups Modeling</title>
      <p>The objective of group modeling is to calculate, for each
item, a group rating which will be evaluated in order to
decide which items should be recommended to the group.
In order to model a group, the preferences of each user that
belongs to the group have to be combined.</p>
      <p>An average is a single value that is meant to typify a list of
values. The most common method to calculate such a value
is the arithmetic mean, which also seems an effective way
to combine the preferences of each user in a group, in
order to reach our objective.</p>
      <p>Combining just the preferences expressed by the users would
lead to a poor modeling of the group, since each user usually
gives an explicit preference to a small set of items. This is
especially true when modeling small groups, whose
preferences have to be extracted from a small set of
preferences expressed by a small set of users.</p>
      <p>In order to improve the efficiency of group modeling, our
algorithm completes the Ratings Matrix by adding the individual
recommendations predicted for each user. The result is a
Predicted Ratings Matrix PR that associates each user u
with an item i either through an explicitly expressed rating
r<sub>ui</sub> or through a predicted rating p<sub>ui</sub>.</p>
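      <p>Group modeling through the arithmetic mean can be sketched as follows. The dictionary-based representation of the Predicted Ratings Matrix, the function name and the toy values are assumptions made for this illustration.</p>
      <preformat><![CDATA[
```python
def model_group(predicted_ratings_matrix, members):
    """Return {item: group rating}, the arithmetic mean over the group members."""
    totals, counts = {}, {}
    for user in members:
        for item, rating in predicted_ratings_matrix[user].items():
            totals[item] = totals.get(item, 0.0) + rating
            counts[item] = counts.get(item, 0) + 1
    return {item: totals[item] / counts[item] for item in totals}


# Toy Predicted Ratings Matrix PR: each user has an expressed or predicted
# rating for every item (hypothetical values).
PR = {
    "u1": {"a": 5.0, "b": 2.0},
    "u2": {"a": 4.0, "b": 3.0},
    "u3": {"a": 1.0, "b": 5.0},
}
group_rating = model_group(PR, ["u1", "u2"])
```
]]></preformat>
      <p>Because the matrix is complete, every member contributes to the group rating of every item, which is what makes the mean meaningful even for small groups.</p>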
      <p>
        A predicted rating p<sub>ui</sub> is calculated using a classic
User-Based Nearest Neighbor CF algorithm, proposed in [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
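The algorithm predicts a rating p<sub>ui</sub> for each item i that was
not evaluated by a user u, considering the rating r<sub>ni</sub> of each
similar user n for the item i. A user n similar to u is called a
neighbor of u. Equation 1 gives the formula used to predict
the ratings:
        <disp-formula><label>(1)</label><tex-math><![CDATA[
p_{ui} = \bar{r}_u + \frac{\sum_{n \in neighbors(u)} sim(u,n) \cdot (r_{ni} - \bar{r}_n)}{\sum_{n \in neighbors(u)} sim(u,n)}
]]></tex-math></disp-formula>
      </p>
      <p>Values <inline-formula><tex-math><![CDATA[\bar{r}_u]]></tex-math></inline-formula> and <inline-formula><tex-math><![CDATA[\bar{r}_n]]></tex-math></inline-formula> represent, respectively, the mean of the
ratings expressed by user u and user n. Similarity sim()
between two users is calculated using the Pearson correlation,
a coefficient that compares the ratings of all the items rated
by both the target user and the neighbor (corated items).
Pearson correlation between a user u and a neighbor n is
given in Equation 2. CR<sub>u,n</sub> is the set of corated items
between u and n.
        <disp-formula><label>(2)</label><tex-math><![CDATA[
sim(u,n) = \frac{\sum_{i \in CR_{u,n}} (r_{ui} - \bar{r}_u)(r_{ni} - \bar{r}_n)}{\sqrt{\sum_{i \in CR_{u,n}} (r_{ui} - \bar{r}_u)^2} \, \sqrt{\sum_{i \in CR_{u,n}} (r_{ni} - \bar{r}_n)^2}}
]]></tex-math></disp-formula>
      </p>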
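      <p>Equations 1 and 2 can be sketched in Python as follows. The data layout, the function names and the toy ratings are assumptions of this sketch. The text does not state how neighbors are selected, so the sketch assumes that the neighbors of u are the positively correlated users who rated the item, and it falls back to the mean rating of u when no such neighbor exists.</p>
      <preformat><![CDATA[
```python
from math import sqrt


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(ratings, u, n):
    """Equation 2: Pearson correlation over the items co-rated by u and n."""
    corated = set(ratings[u]) & set(ratings[n])
    mean_u, mean_n = mean(ratings[u].values()), mean(ratings[n].values())
    num = sum((ratings[u][i] - mean_u) * (ratings[n][i] - mean_n) for i in corated)
    den = sqrt(sum((ratings[u][i] - mean_u) ** 2 for i in corated)) * sqrt(
        sum((ratings[n][i] - mean_n) ** 2 for i in corated)
    )
    return num / den if den else 0.0


def predict(ratings, u, item):
    """Equation 1: mean rating of u plus the weighted deviations of the neighbors."""
    mean_u = mean(ratings[u].values())
    # Assumption: neighbors are the positively correlated users who rated the item.
    sims = {
        n: pearson(ratings, u, n)
        for n in ratings
        if n != u and item in ratings[n]
    }
    sims = {n: s for n, s in sims.items() if s > 0}
    if not sims:
        return mean_u  # fallback chosen for this sketch
    num = sum(s * (ratings[n][item] - mean(ratings[n].values()))
              for n, s in sims.items())
    return mean_u + num / sum(sims.values())


# Hypothetical ratings: alice has not rated item "d".
ratings = {
    "alice": {"a": 5.0, "b": 3.0, "c": 4.0},
    "bob": {"a": 4.0, "b": 2.0, "c": 3.0, "d": 4.0},
    "carol": {"a": 1.0, "b": 5.0, "d": 2.0},
}
p_alice_d = predict(ratings, "alice", "d")
```
]]></preformat>
      <p>In this example carol is negatively correlated with alice and is therefore ignored, so the prediction is alice's mean rating shifted by how much bob's rating for the item deviates from bob's own mean.</p>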
    </sec>
    <sec id="sec-12">
      <title>4. IMPROVED GROUP RECOMMENDATION ALGORITHM (IMPROVEDGRA)</title>
      <p>BaseGRA identifies groups of similar users using a Ratings
Matrix, i.e., a matrix that contains all the preferences
expressed by users for the evaluated items.</p>
      <p>However, the number of items rated by users is much lower
than the number of available items. This leads to the
sparsity problem that is common in clustering.</p>
      <p>ImprovedGRA was conceived to improve the quality of the
clustering step of BaseGRA. ImprovedGRA identifies groups
by giving as input to the k-means algorithm not the original
Ratings Matrix M, which contains only the ratings already
expressed by users, but the complete Predicted Ratings Matrix
PR previously presented, in which the predicted values of the
unrated items are added for each user.</p>
      <p>In order to do so, the individual recommendations are
predicted by ImprovedGRA at the beginning of the
computation. Using more values as input for the clustering, the
algorithm should be able to identify better groups, i.e., groups
composed by users having more correlated preferences. This
should lead to a higher overall quality of the group
recommendations.</p>
      <p>As in BaseGRA, the Ratings Matrix is completed with the
individual recommendations in order to improve the efficiency
of group modeling. In conclusion, ImprovedGRA performs the same steps
performed by BaseGRA but computes individual
recommendations before clustering the users. This makes it possible to cluster
the users using more preferences and to identify better groups.
The preferences expressed by users and the individual
recommendations are also used to model the group.</p>
    </sec>
    <sec id="sec-13">
      <title>5. EXPERIMENTS</title>
      <p>In this section we first describe the strategy and aims which
drove our experiments.</p>
      <p>Then a state-of-the-art group recommender system that
automatically identifies groups, chosen for comparison with the
proposed approach, is described.</p>
      <p>Experiments setup and metrics used are then described and,
at the end of the section, results are shown and commented.</p>
    </sec>
    <sec id="sec-14">
      <title>5.1 Experimental Methodology</title>
      <p>In order to evaluate the quality of the system, three aspects
were considered: quality of the predicted ratings,
distribution of the quality between the groups and homogeneity of
the group sizes. The details of each experiment are
described next.</p>
      <sec id="sec-14-1">
        <title>5.1.1 Quality of the predicted ratings evaluation</title>
        <p>The main objective of a recommender system is to produce
high quality predictions. The algorithm presented in this
paper produces group recommendations adapting to the
bandwidth available for the recommendation process.</p>
        <p>In order to evaluate the quality of the predicted ratings for
different bandwidths, i.e., for different numbers of channels
that can be dedicated to the recommendation, we built three
different partitions of the users in groups. A partition is a
set of n groups into which users are subdivided. If groups are
homogeneous, the larger n is, the smaller
the groups are and the better the ratings the system can predict,
because the preferences of a smaller number of users have to be
combined.</p>
        <p>In order to properly evaluate the performance of the
proposed algorithms, we compared them with the results
obtained considering a single group with all the users
(predictions are calculated considering all the preferences expressed
for an item), and the results obtained using no partition of
the users (i.e., the quality of the individual recommendations is
calculated).</p>
        <p>To measure the quality of the predicted ratings, we used the
Root Mean Squared Error (RMSE). This metric was chosen
because it is the most common in the literature.</p>
        <p>In order to analyze the quality of the predictions produced
by each algorithm for different partitions, we produced a
plot that shows the trend of RMSE for each partition in n
groups.</p>
      </sec>
      <sec id="sec-14-2">
        <title>5.1.2 Distribution of quality between the groups evaluation</title>
        <p>A second important aspect that has to be evaluated is how
the quality of the predicted results is distributed between
the groups of a partition.</p>
        <p>A group recommender system should be able to
distribute the quality of the predicted results in a sufficiently
equal way, in order to satisfy the recommendation demand
of all the users of the system.</p>
        <p>To analyze how RMSE is distributed between the groups
produced by ImprovedGRA, a table is presented that contains the mean
value of RMSE for each partition and the number of groups that have
a RMSE value close to or far from the mean.
To compare the different algorithms, we measured the
standard deviation of the RMSE values obtained for every group
of a partition.</p>
      </sec>
      <sec id="sec-14-3">
        <title>5.1.3 Distribution of size between the groups evaluation</title>
        <p>The last aspect we evaluated is how homogeneous the
groups are in terms of size. Indeed, it is not acceptable to have
groups that are too large or too small. At the same time, the
clustering step cannot create a homogeneity that does not
intrinsically exist among the users. To evaluate this trade-off we
measured the standard deviation of the size of the groups
present in a partition.</p>
      </sec>
    </sec>
    <sec id="sec-16">
      <title>5.2 Benchmark algorithm: ModularityBasedGRA</title>
      <p>
        The technique selected for comparison with ImprovedGRA
is the one proposed in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. From now on, the algorithm will
be called ModularityBasedGRA, because of the approach
used to identify groups (based on the Modularity function).
ModularityBasedGRA is an algorithm that generates group
recommendations and automatically detects intrinsic
communities of users whose preferences are similar. The input
is a Ratings Matrix that associates a set of users to a set
of items through a rating. Based on the ratings expressed
by each user, the algorithm evaluates the level of
similarity between users and generates a network that contains the
similarities.
      </p>
      <p>
        A modularity-based Community Detection algorithm
proposed in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is run on the network in order to find partitions
of users in communities. For each community, ratings for all
the items are predicted using an item-based CF algorithm.
Since the Community Detection algorithm is able to produce
a dendrogram, i.e., a tree that contains hierarchical
partitions of the users in communities of increasing granularity,
the quality of the recommendations can be evaluated for the
different partitions.
      </p>
      <p>To achieve the objectives previously outlined, i.e., detect
the communities and produce group recommendations for
them, ModularityBasedGRA computes four steps, described
below.</p>
      <p>Users similarity evaluation. In order to create
communities of users, the algorithm takes as input a Ratings
Matrix and evaluates through a standard metric
(cosine similarity) how similar the preferences of two users
are. The result is a weighted network where nodes
represent users and each weighted edge represents the
similarity value of the users it connects. A post-processing
technique is then introduced to remove noise from the
network and reduce its complexity.</p>
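      <p>The construction of the user similarity network can be sketched as follows. This is not the code of ModularityBasedGRA: the function names and toy ratings are hypothetical, and the sketch assumes that a missing rating contributes zero to the cosine similarity. The post-processing step that removes noise is not specified here and is left out.</p>
      <preformat><![CDATA[
```python
from itertools import combinations
from math import sqrt


def cosine(ratings_u, ratings_v):
    """Cosine similarity between two users; missing ratings count as zero."""
    dot = sum(ratings_u[i] * ratings_v[i] for i in set(ratings_u) & set(ratings_v))
    norm = sqrt(sum(r * r for r in ratings_u.values())) * sqrt(
        sum(r * r for r in ratings_v.values())
    )
    return dot / norm if norm else 0.0


def similarity_network(ratings):
    """Weighted network as {(user, user): similarity}, one entry per pair of users."""
    return {
        (u, v): cosine(ratings[u], ratings[v])
        for u, v in combinations(sorted(ratings), 2)
    }


# Hypothetical ratings for three users.
toy_ratings = {
    "u1": {"a": 3.0, "b": 4.0},
    "u2": {"a": 6.0, "b": 8.0},
    "u3": {"c": 5.0},
}
network = similarity_network(toy_ratings)
```
]]></preformat>
      <p>Users with proportional ratings get the maximum edge weight, while users with no item in common get weight zero; a pruning step would typically drop such weak edges.</p>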
      <p>
        Communities detection. In order to identify intrinsic
communities of users, a Community Detection algorithm
proposed by [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is applied to the users similarity
network and partitions of different granularities are
generated.
      </p>
      <p>Ratings prediction for the items rated by the group.
A group's ratings are evaluated by calculating, for each
item, the mean of the ratings expressed by the users of
the group. In order to predict meaningful ratings, the
algorithm calculates a rating only if an item was
evaluated by a minimum percentage of users in the group.
With this step it is not possible to predict a rating for
each item, so another step was created to predict the
remaining ratings.</p>
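      <p>This step can be sketched as follows. The threshold value is not given in the text, so the parameter <monospace>min_fraction</monospace> and its default are hypothetical, as are the function name and the toy data.</p>
      <preformat><![CDATA[
```python
def rated_group_items(ratings, members, min_fraction=0.5):
    """Mean rating per item, kept only if enough members rated the item.

    min_fraction is a hypothetical threshold: the minimum share of group
    members that must have rated an item.
    """
    by_item = {}
    for user in members:
        for item, rating in ratings[user].items():
            by_item.setdefault(item, []).append(rating)
    return {
        item: sum(values) / len(values)
        for item, values in by_item.items()
        if len(values) / len(members) >= min_fraction
    }


# Hypothetical community of four users.
community_ratings = {
    "u1": {"a": 4.0, "b": 5.0},
    "u2": {"a": 2.0},
    "u3": {},
    "u4": {},
}
community = ["u1", "u2", "u3", "u4"]
group_means = rated_group_items(community_ratings, community)
```
]]></preformat>
      <p>Item "a" is rated by half of the community and receives a group rating, while item "b" is rated by a single user and is left to the following step.</p>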
      <p>Ratings prediction for the remaining items. For some
of the items, ratings could not be calculated by the
previous step. In order to estimate such ratings,
similarity between items is evaluated, and the rating of an
item is predicted with an item-based CF algorithm that
considers the items most similar to it.</p>
      <p>The choice to compare ImprovedGRA with this approach is
motivated by the fact that both approaches produce group
recommendations and automatically identify groups of users.
Moreover, both can be evaluated for different partitions of
users in groups. This allows a direct comparison between
the two approaches.</p>
      <p>Let us also note that even if the aim of the two algorithms
is the same, the two techniques work in completely different
ways: ImprovedGRA clusters users with a classic algorithm
(k-means) after building individual recommendations and
then models the preferences of the groups, while
ModularityBasedGRA clusters users with a Community Detection algorithm
and then builds group recommendations.</p>
    </sec>
    <sec id="sec-17">
      <title>5.3 Experiments Setup</title>
      <p>The experiments were conducted on the MovieLens-1M
dataset, which is composed of 1 million ratings expressed by
6040 users for 3900 movies. In order to evaluate the quality
of the ratings predicted by each algorithm, around
20% of the ratings were extracted as a test set and the rest
of the dataset was used as a training set.
Each group recommendation algorithm was run on the
training set and, for each partition of the users in groups,
ratings were predicted.</p>
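      <p>The hold-out procedure can be sketched as below. How the test ratings were sampled is not stated, so uniform random sampling with a fixed seed is an assumption, and the synthetic triples stand in for the real dataset.</p>
      <preformat>
```python
import random

def split_ratings(ratings, test_share=0.2, seed=0):
    """Shuffle (user, item, rating) triples and hold out a test share."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * test_share)
    # First `cut` triples form the test set, the rest the training set.
    return shuffled[cut:], shuffled[:cut]

# Synthetic stand-in data: 10 users x 10 items, ratings from 1 to 5.
ratings = [(u, i, (u + i) % 5 + 1) for u in range(10) for i in range(10)]
train, test = split_ratings(ratings)
print(len(train), len(test))
```
      </preformat>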
      <p>The obtained values were used to conduct the experiments
previously described.</p>
    </sec>
    <sec id="sec-18">
      <title>5.4 Evaluation metrics</title>
      <p>This section introduces the two metrics used to evaluate
different characteristics of our algorithm: the Root Mean
Squared Error (RMSE) and the standard deviation. Both
metrics compare the obtained results with a reference
value, in order to evaluate the quality of the system.</p>
      <sec id="sec-18-1">
        <title>5.4.1 Root Mean Squared Error (RMSE)</title>
        <p>The quality of the predicted ratings was measured through
the Root Mean Squared Error (RMSE). The metric
compares the test set with the predicted ratings: each rating <italic>r</italic><sub>ui</sub>
expressed by a user <italic>u</italic> for an item <italic>i</italic> is compared with the
rating <italic>p</italic><sub>gi</sub> predicted for the item <italic>i</italic> for the group <italic>g</italic> to which
user <italic>u</italic> belongs. The formula is shown below:</p>
        <p>
          <disp-formula><tex-math>\mathrm{RMSE} = \sqrt{\frac{\sum_{i=0}^{n} (r_{ui} - p_{gi})^{2}}{n}}</tex-math></disp-formula>
where n is the number of ratings available in the test set.</p>
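        <p>As an illustration, the metric could be computed as in the sketch below. The data layout and names are hypothetical, and skipping test ratings for which the group has no prediction is an assumption, not something the text specifies.</p>
        <preformat>
```python
from math import sqrt

def group_rmse(test_ratings, user_group, group_predictions):
    """RMSE between each test rating and the rating predicted for the
    same item for the group the user belongs to."""
    squared = []
    for user, item, rating in test_ratings:
        predicted = group_predictions[user_group[user]].get(item)
        if predicted is None:
            continue  # assumption: ignore items with no group prediction
        squared.append((rating - predicted) ** 2)
    return sqrt(sum(squared) / len(squared))

test_ratings = [("u1", "A", 4), ("u2", "A", 2), ("u3", "B", 5)]
user_group = {"u1": "g1", "u2": "g1", "u3": "g2"}
group_predictions = {"g1": {"A": 3.0}, "g2": {"B": 5.0}}
value = group_rmse(test_ratings, user_group, group_predictions)
print(value)
```
        </preformat>
        <p>The three squared errors in this toy example are 1, 1 and 0, so the result is the square root of 2/3.</p>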
      </sec>
      <sec id="sec-18-2">
        <title>5.4.2 Standard deviation</title>
        <p>The homogeneity of the group sizes and the distribution of
RMSE between the groups were measured with the standard
deviation (computed over the sizes of the groups and over
the RMSE values of the groups, respectively).</p>
        <p>The metric evaluates how much variation there is from the
"average" value. A low standard deviation indicates that the
sizes of the groups (or the RMSE values obtained for the groups) tend to
be close to the mean, while a high standard deviation
indicates that the obtained values are scattered over a large
range.</p>
        <p>
          <disp-formula><tex-math>\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_{i} - \bar{x})^{2}}</tex-math></disp-formula>
        </p>
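        <p>A short sketch of the computation, assuming the population form of the standard deviation (division by N rather than N - 1); the function name and the sample values are illustrative only.</p>
        <preformat>
```python
from math import sqrt
from statistics import pstdev

def std_dev(values):
    """Population standard deviation: root of the mean squared deviation."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

sizes = [2, 4, 4, 4, 5, 5, 7, 9]
print(std_dev(sizes))   # 2.0
print(pstdev(sizes))    # the standard library gives the same value
```
        </preformat>
        <p>For these values the mean is 5 and the squared deviations sum to 32, so the standard deviation is the square root of 32/8, that is 2.</p>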
      </sec>
    </sec>
    <sec id="sec-19">
      <title>5.5 Experimental results</title>
      <p>The first experiment, presented in 5.1.1, aims to evaluate the
quality of the predicted values for a partition of the users in
groups. Figure 1 shows the trend of the RMSE values for
the different partitions of the users in groups.</p>
      <p>For all the algorithms, the quality of the recommendations
improves as the number of groups grows,
since groups get smaller and the algorithms can predict more
precise ratings. The RMSE values decrease notably
when the algorithms start grouping the users (i.e.,
there is a big difference in RMSE between 1 and 4 groups).
The RMSE values continue to decrease for the other
partitions, but the improvement in quality is smaller.</p>
      <p>Comparing the algorithms, we can see that BaseGRA and
ImprovedGRA outperform the benchmark algorithm
ModularityBasedGRA. Moreover, ImprovedGRA performs much
better than BaseGRA:
this indicates that enhancing the Ratings Matrix with
individual recommendations leads to great improvements in the
quality of the predicted results.</p>
      <p>The second experiment, presented in 5.1.2, was conducted
to evaluate how the quality of the predicted values is
distributed between the groups. To do so we measured the
standard deviation of RMSE of the groups in each
partition.</p>
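      <p>A minimal sketch of this measurement, assuming each prediction error has already been paired with the group of the user who expressed the rating. The data layout, the names, and the use of the population standard deviation are assumptions made for illustration.</p>
      <preformat>
```python
from collections import defaultdict
from math import sqrt
from statistics import pstdev

def rmse_per_group(errors):
    """Group-wise RMSE from (group, rating - prediction) pairs."""
    squared = defaultdict(list)
    for group, error in errors:
        squared[group].append(error ** 2)
    return {g: sqrt(sum(v) / len(v)) for g, v in squared.items()}

errors = [("g1", 1.0), ("g1", -1.0), ("g2", 0.0), ("g2", 2.0), ("g3", 1.0)]
per_group = rmse_per_group(errors)
# Spread of the per-group RMSE values within this partition.
spread = pstdev(list(per_group.values()))
print(per_group, spread)
```
      </preformat>
      <p>A spread close to zero would mean that all groups receive predictions of similar quality.</p>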
      <p>[Table 1. Partitions and mean RMSE x̄ of the groups: 4 groups, x̄ = 0.93; 13 groups, x̄ = 0.93; 40 groups, x̄ = 0.96. Number of groups with RMSE r: r = 0.85 (1), r &lt; 0.87 (3), r &lt; 0.90 (15).]</p>
      <p>Table 1 shows, for each partition, the mean of the RMSE
obtained for every group with ImprovedGRA and how the
RMSE is distributed between the groups. As we can see,
the majority of the groups in each partition have an RMSE
value sufficiently close to the mean. This means that RMSE
is distributed quite equally between the groups and our
approach is able to satisfy the recommendation demand for all
the users.</p>
      <p>Figure 2 compares the standard deviation of the RMSE of the
groups for the different approaches. The values for ImprovedGRA
are slightly higher than those of the other approaches.
However, it is important to remember that in this case there
is a trade-off between an equal distribution of RMSE
and the similarity between the users in a group. In fact, the
groups have to be intrinsic in order to improve the quality
of the predicted results. It therefore seems reasonable to lose a
bit of homogeneity in the distribution of the quality in order
to improve the overall quality of the results predicted by
the system. This is the case for ImprovedGRA, in which
RMSE is distributed less equally between the groups but
the quality of the predictions is much higher than with the other
approaches.</p>
      <p>The third experiment, presented in 5.1.3, was conducted to
evaluate how the size of the groups is distributed in each
partition (i.e., how homogeneous the groups are in terms of
size). To do so, we measured the standard deviation of the
size of all the groups in each partition.</p>
      <p>[Table: partitions and mean group size x̄: 4 groups, x̄ = 1510; 13 groups, x̄ = 464.62; 40 groups, x̄ = 151.]</p>
      <p>Figure 3 compares the standard deviation of the size of the
groups for the different approaches. It is important to notice
how the enhancement of the Ratings Matrix made for
ImprovedGRA leads to partitions with more homogeneous groups
than those of BaseGRA.</p>
      <p>The values obtained by ImprovedGRA are slightly higher
than those of ModularityBasedGRA, but also in this case there is a
trade-off between the homogeneity of the group sizes and the
similarity between the users. In fact, it is important to find
partitions of intrinsic groups with similar preferences that can
lead to a high quality of the predicted results. So, a small
loss in the homogeneity of the sizes leads to great improvements
in the quality of the results.</p>
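      <p>As a quick consistency check on the group sizes (a worked calculation, assuming that each of the 6040 users in the dataset is assigned to exactly one group), the mean group size of a partition is the number of users divided by the number of groups:
        <disp-formula><tex-math>\bar{x}_{4} = \frac{6040}{4} = 1510, \qquad \bar{x}_{13} = \frac{6040}{13} \approx 464.62, \qquad \bar{x}_{40} = \frac{6040}{40} = 151</tex-math></disp-formula>
These values agree with the means reported for the partitions into 4, 13, and 40 groups.</p>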
    </sec>
    <sec id="sec-20">
      <title>6. CONCLUSIONS AND FUTURE WORK</title>
      <p>In this paper we presented an algorithm that combines user
clustering with individual recommendations in order to
identify and model groups of users with similar preferences and
improve the quality of group recommendations in systems
that automatically identify groups. In fact, BaseGRA and
ImprovedGRA outperform the benchmark algorithm
ModularityBasedGRA.</p>
      <p>Moreover, ImprovedGRA, which uses an
enhanced Ratings Matrix to identify and model the groups,
is able to produce sufficiently homogeneous groups in terms
of size and distribution of RMSE. Therefore, all three
important objectives that should be achieved by a group
recommender system are reached by the proposed algorithm
ImprovedGRA.</p>
      <p>
        Future developments of the algorithm have been planned for
different steps performed by the algorithm. In [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] several
strategies for group modeling were presented. We are
currently studying how different strategies affect the quality of
group recommendation with groups that are automatically
identified.
      </p>
      <p>
        Recently [
        <xref ref-type="bibr" rid="ref12 ref7">7, 12</xref>
        ] highlighted how different metrics to evaluate
the quality of recommendation lead to completely different
results. As future work, we plan to evaluate our system
with such metrics, in order to capture different aspects of our
system.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Amer-Yahia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chawla</surname>
          </string-name>
          , G. Das, and
          <string-name>
            <given-names>C.</given-names>
            <surname>Yu</surname>
          </string-name>
          . Group recommendation:
          <article-title>Semantics and efficiency</article-title>
          .
          <source>PVLDB</source>
          ,
          <volume>2</volume>
          (
          <issue>1</issue>
          ):
          <volume>754</volume>
          –
          <fpage>765</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ardissono</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Goy</surname>
          </string-name>
          , G. Petrone, and
          <string-name>
            <given-names>M.</given-names>
            <surname>Segnan</surname>
          </string-name>
          .
          <article-title>A multi-agent infrastructure for developing personalized web-based systems</article-title>
          .
          <source>ACM Trans. Internet Technol.</source>
          ,
          <volume>5</volume>
          (
          <issue>1</issue>
          ):
          <volume>47</volume>
          –
          <fpage>69</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ardissono</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Goy</surname>
          </string-name>
          , G. Petrone,
          <string-name>
            <given-names>M.</given-names>
            <surname>Segnan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Torasso</surname>
          </string-name>
          . Intrigue:
          <article-title>Personalized recommendation of tourist attractions for desktop and handset devices</article-title>
          .
          <source>Applied Artificial Intelligence</source>
          ,
          <volume>17</volume>
          (
          <issue>8</issue>
          ):
          <volume>687</volume>
          –
          <fpage>714</fpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V. D.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Guillaume</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lambiotte</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Lefebvre</surname>
          </string-name>
          .
          <article-title>Fast unfolding of communities in large networks</article-title>
          .
          <source>J. Stat. Mech.</source>
          ,
          <year>2008</year>
          (
          <volume>10</volume>
          ):P10008+,
          <year>October 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Boratto</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Carta</surname>
          </string-name>
          .
          <article-title>State-of-the-art in group recommendation and new approaches for automatic identification of groups</article-title>
          . In G. A.
          <string-name>
            <surname>Alessandro</surname>
            <given-names>Soro</given-names>
          </string-name>
          , Eloisa Vargiu and G. Paddeu, editors,
          <source>Information Retrieval and Mining in Distributed Environments</source>
          . Springer Verlag. In press,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Boratto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Carta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chessa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Agelli</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Clemente</surname>
          </string-name>
          .
          <article-title>Group recommendation with automatic identification of users communities</article-title>
          .
          <source>In Web Intelligence/IAT Workshops</source>
          , pages
          <volume>547</volume>
          –
          <fpage>550</fpage>
          . IEEE,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E.</given-names>
            <surname>Campochiaro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Casatta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Turrin</surname>
          </string-name>
          .
          <article-title>Do metrics make recommender algorithms?</article-title>
          <source>International Conference on Advanced Information Networking and Applications Workshops</source>
          ,
          <volume>0</volume>
          :
          <fpage>648</fpage>
          –
          <fpage>653</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I.</given-names>
            <surname>Cantador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. P.</given-names>
            <surname>Superior</surname>
          </string-name>
          .
          <article-title>Extracting multilayered semantic communities of interest from ontology-based user profiles: Application to group modelling and hybrid recommendations. In Computers in Human Behavior, special issue on Advances of Knowledge Management and the Semantic</article-title>
          . Elsevier. In press,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>L. M. de Campos</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          <string-name>
            <surname>Fernandez-Luna</surname>
            ,
            <given-names>J. F.</given-names>
          </string-name>
          <string-name>
            <surname>Huete</surname>
            , and
            <given-names>M. A.</given-names>
          </string-name>
          <string-name>
            <surname>Rueda-Morales</surname>
          </string-name>
          .
          <article-title>Group recommending: A methodological approach based on bayesian networks</article-title>
          .
          <source>In ICDE Workshops</source>
          , pages
          <volume>835</volume>
          –
          <fpage>844</fpage>
          . IEEE Computer Society,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>L. M. de Campos</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          <string-name>
            <surname>Fernandez-Luna</surname>
            ,
            <given-names>J. F.</given-names>
          </string-name>
          <string-name>
            <surname>Huete</surname>
            , and
            <given-names>M. A.</given-names>
          </string-name>
          <string-name>
            <surname>Rueda-Morales</surname>
          </string-name>
          .
          <article-title>Managing uncertainty in group recommending processes</article-title>
          .
          <source>User Model. User-Adapt. Interact.</source>
          ,
          <volume>19</volume>
          (
          <issue>3</issue>
          ):
          <volume>207</volume>
          –
          <fpage>242</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nichols</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. M.</given-names>
            <surname>Oki</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Terry</surname>
          </string-name>
          .
          <article-title>Using collaborative filtering to weave an information tapestry</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>35</volume>
          (
          <issue>12</issue>
          ):
          <volume>61</volume>
          –
          <fpage>70</fpage>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gunawardana</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Shani</surname>
          </string-name>
          .
          <article-title>A survey of accuracy evaluation metrics of recommendation tasks</article-title>
          .
          <source>J. Mach. Learn. Res.</source>
          ,
          <volume>10</volume>
          :
          <fpage>2935</fpage>
          –
          <fpage>2962</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jameson</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Smyth</surname>
          </string-name>
          .
          <article-title>Recommendation to groups</article-title>
          . In P. Brusilovsky,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kobsa</surname>
          </string-name>
          , and W. Nejdl, editors,
          <source>The Adaptive Web: Methods and Strategies of Web Personalization</source>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>J. B. MacQueen.</surname>
          </string-name>
          <article-title>Some methods for classification and analysis of multivariate observations</article-title>
          . In L.
          <string-name>
            <surname>M. L. Cam</surname>
          </string-name>
          and J. Neyman, editors,
          <source>Proc. of the fifth Berkeley Symposium on Mathematical Statistics and Probability</source>
          , volume
          <volume>1</volume>
          , pages
          <fpage>281</fpage>
          –
          <fpage>297</fpage>
          . University of California Press,
          <year>1967</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>T. W.</given-names>
            <surname>Malone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. R.</given-names>
            <surname>Grant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. A.</given-names>
            <surname>Turbak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Brobst</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Cohen</surname>
          </string-name>
          .
          <article-title>Intelligent information-sharing systems</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>30</volume>
          (
          <issue>5</issue>
          ):
          <volume>390</volume>
          –
          <fpage>402</fpage>
          ,
          <year>1987</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Masthoff</surname>
          </string-name>
          .
          <article-title>Group modeling: Selecting a sequence of television items to suit a group of viewers</article-title>
          .
          <source>User Modeling and User-Adapted Interaction</source>
          ,
          <volume>14</volume>
          (
          <issue>1</issue>
          ):
          <volume>37</volume>
          –
          <fpage>85</fpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>M. O'Connor</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Cosley</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          <string-name>
            <surname>Konstan</surname>
            , and
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Riedl</surname>
          </string-name>
          .
          <article-title>Polylens: a recommender system for groups of users</article-title>
          .
          <source>In ECSCW'01: Proceedings of the seventh conference on European Conference on Computer Supported Cooperative Work</source>
          , pages
          <volume>199</volume>
          –
          <fpage>218</fpage>
          ,
          <string-name>
            <surname>Norwell</surname>
          </string-name>
          , MA, USA,
          <year>2001</year>
          . Kluwer Academic Publishers.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ram</surname>
          </string-name>
          .
          <article-title>Intelligent agents and the world wide web: Fact or fiction</article-title>
          ?
          <source>Journal of Database Management</source>
          ,
          <volume>12</volume>
          (
          <issue>1</issue>
          ):
          <volume>46</volume>
          –
          <fpage>49</fpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>P.</given-names>
            <surname>Resnick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Iacovou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Suchak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bergstrom</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          . Grouplens:
          <article-title>An open architecture for collaborative filtering of netnews</article-title>
          .
          <source>In Proceedings of ACM 1994 Conference on Computer Supported Cooperative Work</source>
          , pages
          <volume>175</volume>
          –
          <fpage>186</fpage>
          ,
          <string-name>
            <surname>Chapel</surname>
            <given-names>Hill</given-names>
          </string-name>
          , North Carolina,
          <year>1994</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>J. B. Schafer</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Frankowski</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Herlocker</surname>
            , and
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Sen</surname>
          </string-name>
          .
          <article-title>Collaborative filtering recommender systems</article-title>
          .
          <source>In The Adaptive Web: Methods and Strategies of Web Personalization</source>
          , volume
          <volume>4321</volume>
          of Lecture Notes in Computer Science, chapter
          <volume>9</volume>
          , pages
          <fpage>291</fpage>
          –
          <fpage>324</fpage>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>