<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Improving Social Recommendations by applying a Personalized Item Clustering policy</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Georgios Alexandridis</string-name>
          <email>gealexandri@islab.ntua.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Georgios Siolas</string-name>
          <email>gsiolas@islab.ntua.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Stafylopatis</string-name>
          <email>andreas@cs.ntua.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Electrical and, Computer Engineering, National Technical University</institution>
          ,
          <addr-line>of Athens, Zografou, 157 80</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>8</lpage>
      <abstract>
        <p>In online Recommender Systems, people tend to consume and rate items that are not necessarily similar to one another. This phenomenon is a direct consequence of the fact that human taste is in uenced by many factors that cannot be captured by pure Content-based or Collaborative Filtering approaches. For this reason, a desirable property of Recommender Systems would be to identify correlations between seemingly di erent items that might be of interest to a particular user. This course of action is expected to improve the novelty and the diversity of the recommendations and therefore increase user satisfaction. In this paper, we address this problem by proposing a socially-aware personalized item clustering recommendation algorithm. We are trying to locate patterns between the items that a user has evaluated by grouping them into different clusters according to the rating behavior of the members of his Personal Network, which includes the individuals in his direct social network and those other persons that the user exhibits a similar item evaluation behavior. Once the clustering phase has been completed, we use each cluster's members as seed items in order to construct an item consumption network. Then, by performing a random walk on the aforementioned network, we are able to produce recommendations that are accurate and at the same time novel and diverse. Preliminary results reveal the potential of this idea.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Categories and Subject Descriptors</title>
      <p>H.3.3 [Information Search and Retrieval]: Clustering;
G.3 [Probability and Statistics]: Stochastic processes;
F.1.2 [Modes of Computation]: Probabilistic
computation
social recommendation, personal network, spectral
clustering, item consumption network, random walk</p>
    </sec>
    <sec id="sec-2">
      <title>1. INTRODUCTION</title>
      <p>Research over Recommended Systems (RS) has grown
almost exponentially over the past years. Initial approaches
tried to model the user to item interaction evident in every
RS in a plethora of ways; collaborative ltering algorithms
exploited the said relationship in order to infer similarities
and dissimilarities in taste between users that are not
otherwise related to one another (implicit user to user
interaction). On the other hand, content-based approaches tried to
estimate item relevance by accumulating all available
evaluations for each item and then compare them in some metric
space. Lastly, hybrid systems combined both approaches,
along with other possible sources of information (i.e. in the
form of metadata) in an e ort to produce better and more
meaningful recommendations.</p>
      <p>It is an indisputable fact that the recommendation
process has an inherent social dimension as well. Apart from
the fact that opinion and taste are formulated by a person's
interaction with his environment, it is also a common
everyday practice to turn to family, friends and acquaintances
when we want to make a decision or a purchase. It could be
further elaborated that we also tend to a liate and establish
bonds with people that we share the same interests with.</p>
      <p>
        The emergence of Web 2.0 technologies and the advent of
Online Social Networks (OSNs) introduced the social
relationships into the digital era and inevitably in the
recommendation process, brining about the so-called Social
Recommender Systems (SRS). The user to item interaction of
traditional RS has been extended in order to include explicit
user to user connections. Those ties are extremely useful for
the recommendation process as one of the most
fundamental characteristics of social networks is Homophily [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. This
term means that the members of a social network tend to
be more similar to those individuals they are connected with
than other persons with whom they don't share direct
relationships. Or, more strictly speaking, the acquaintances
of any given member of an OSN do not constitute random
samples drawn from the whole of the underlying population.
This observation has also been experimentally con rmed by
the work of Singla et at [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>Homophily is in uenced by many factors such as age,
gender, educational level, ethnicity etc. In our work, we plan
to utilize this phenomenon in a novel approach by trying to
examine the extend to which an individual's interests are
inuenced by the people he knows. We start by constructing
a personal network for each user; one that includes the
persons he has some interactions with, both direct and indirect.
This personal network is the basis for the item clustering
algorithm that places the items that this user has already
consumed in di erent clusters, according to the consumption
behavior of her peers. Once the clustering phase has been
completed, the items of each cluster are expected to share
some common characteristics. Therefore, using them as seed
items and by performing a random walk on a specially
constructed item consumption network for this purpose, we will
be able to get recommendations that will be accurate, novel
and diverse.</p>
    </sec>
    <sec id="sec-3">
      <title>RELATED WORK</title>
      <p>
        The application of clustering methodologies into the
traditional RS domain is not new [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Researches have been using
them in order to uncover the implicit relationships between
users (or items) based on the available evaluations. In most
cases, the purpose of clustering is to aide or to substitute
user and item neighborhood formation.
      </p>
      <p>
        The fusion of explicit user ties, in the form of social
information, into the clustering process is a relatively novel
research eld. Most approaches use them as an alternative
information source, upon which they apply existing
clustering algorithms. For example, the authors of [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] propose a
methodology to compute the inferred trust value between
any pair of users and then perform a correlation
clustering of the user space based on those values. They introduce
their method into traditional memory-based algorithms (CF
and trust-based) and witness a relative improvement in the
recommendation accuracy.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], the authors apply the a nity propagation clustering
algorithm for the dynamic identi cation of the user clusters.
The distance measure they employ is the Jaccard coe cient;
the number of common neighbors in the trust network
between a pair of users. Their experiments have shown that
their system outperforms other traditional clustering
techniques (such as k-means) in terms of the accuracy of the
recommendations as the number of user clusters grows.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], the authors perform a hierarchical clustering of
the user's social relationships in order to form user
neighborhoods. They use modularity, a graph-based criterion to
stop the cluster formation. In short, modularity counts the
fraction of the in-cluster edges against the number of edges
pointing to other clusters. In the next step, those
neighborhoods are fed into user-based and item based CF algorithms
that produce the recommendations. They have tested their
approach into two di erent datasets and have witnessed
improvement both in the accuracy and the usefulness of the
recommendations as measured by metrics such as precision
and recall.
      </p>
      <p>Our approach di ers from those mentioned above in three
key aspects; rstly, the clustering level is local
(personalized) instead of system-wide, rendering our algorithm more
scalable in terms of the dimensionality of the data involved
in the computations. Secondly, while most approaches are
geared towards grouping similar users, our approach is
focused on locating similar items through the identi cation of
common access patterns in the personal network. Finally,
the social network information is used in the ltering phase
rather than the clustering algorithm itself.</p>
    </sec>
    <sec id="sec-4">
      <title>THE PERSONAL NETWORK</title>
      <p>So far, a person's taste has been discussed in relation to
other people. However, taste itself is also a multidimensional
concept; after all most people are not con ned to having a
single interest only. For example, if someone has an interest
in gardening and in playing music, she is expected to
evaluate gardening and music books. And since not all people
who are interested in gardening are also interested in music,
his rating behavior might seem bizarre from the RS point
of view; indeed, even in large scale RS, few people would
be expected to rate items of both the aforementioned
categories.</p>
      <p>We may further elaborate on this example by
considering that this user's social network would consist of roughly
two categories of people; those she's met via her hobbies
and other, more general acquaintances. It is therefore quite
likely that the target user and some of her gardening peers
would have evaluated some gardening items in common. Our
intuition is to try to locate such relationships in the RS and
it is for this reason that we are performing a personalized
clustering of the consumed items. The personalized
clustering in this sense means that not every item evaluation in the
RS is taken into account. Instead, they are ltered by those
users that are explicitly (through the social network) or
implicitly (through similar consumption behavior) related to
the target user.</p>
      <p>Our proposed algorithm works in two stages; initially, the
ratings of a speci c user above a certain relevance
threshold are retrieved. Then, these items are put into one or
more di erent clusters according to some well-de ned
criteria. Following, the items of each cluster are used as a seed
in order to construct an item consumption network. This
structure is traversed in a random walk fashion for the
recommendation of new items to the user.
3.1</p>
    </sec>
    <sec id="sec-5">
      <title>Relationship Types</title>
      <p>
        The explicit user to user ties that are evident in OSN in
general and in SRS in particular lead to a variety of
connections among them. It is therefore necessary to de ne
the relationship types that would be considered valid for
the creation of their personal network. The discipline of
social network analysis examines simple and advanced
structures present in social networks [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. However, the nature of
our research places the emphasis on Dyadic Relationships.
They constitute the most common relationship type in OSN
and they directly involve two persons or actors. Those links
may either be symmetric (or bidirectional) as in the case of
Friendship or non-symmetric (directional). The most
common directional link in SRS is Trust, where a user expresses
his opinion on another user's behavior. These remarks are
usually private to the user who has issued them and
signify how useful she has found the interaction with the other
user. Trust statements are either binary (i.e. trust/distrust)
or assume a broader set of values (usually in the [0; 1]
interval). The SRS that incorporate trust statements in the
recommendation process are also know as Trust-aware
Recommender Systems.
3.2
      </p>
    </sec>
    <sec id="sec-6">
      <title>Network Formation</title>
      <p>As it has been stated before, we are interested in
providing a personalized ltering of the available items in the
RS for each speci c user. The basis of our approach is the
Personal Network (PN) of each user which is constructed
by two di erent but not necessarily distinct pools of users;
those that are part of the target user's social network and
those that are similar (de ned by some notion of similarity)
to him. Members of the personal network of a said user may
be further categorized in the following groups</p>
      <p>Users in the direct social network of the target user
that bear a similarity to her
s7
s6</p>
      <p>t2
s3, t3
ut</p>
      <p>Most widely-adopted indices of similarity in RS are the
Pearson correlation coe cient, the cosine similarity and the
Manhattan similarity. The rst two measure similarity quite
well when there is a large overlap between the items rated
by di erent users, while the latter is more suitable in those
cases where the overlap in ratings between di erent users is
small.</p>
      <p>Generally, the social and the similarity graphs could be
accessed in a number of ways and indeed there is a wealth of
methodologies in the SRS literature. As an initial approach,
in order to estimate user proximity, we opted for weighted
path counting criteria that are dependent on the structure
of the PN. For example, in Figure 1, u3 is considered to be
the most \close" user to the target user ut since he belongs
both to his direct social network (edge weight t3) , the direct
social network of ut's friend, u2 (edge weight t2;1) and nally
to ut's similarity network (edge weight s3). Following, users
u4 and u7 may be reached by two simple paths of lengths
one and two that originate from ut. In order to estimate
who is most proximate to him, we compute a total value of
each path by accumulating the respective weights.
4.1</p>
    </sec>
    <sec id="sec-7">
      <title>PERSONALIZED CLUSTERING</title>
    </sec>
    <sec id="sec-8">
      <title>The Item-to-Item Adjacency Matrix</title>
      <p>After determining the importance of each member in a
user's PN, we proceed into creating the item matrix A. Let
It be the items that have already been evaluated by target
user ut above a given relevance threshold rrel (that might
be dependent on ut). Then A is the n n adjacency matrix
i2
1
3
i1
i5
4
3
i7
i4
i3
(where n = jItj) whose ai;j and aj;i elements denote the
frequency items i; j 2 It have been accessed together (above
the relevance threshold) by members of ut's PN. Equation
1 displays an example item matrix for a user who has
evaluated 7 items.
It should be noted that since not all users in the PN are
equally proximate to the target user, the elements of the
matrix are not integer sums in the general case; indeed the
contribution of each peer in the frequency of the access
pattern between two items is not constant (i.e 1) but it is
actually scaled according to his proximity to the target user
(subsection 3.2).</p>
      <p>
        By de nition, matrix A is symmetric and may be viewed
as the adjacency matrix of an undirected graph whose nodes
are the items already evaluated by ut and whose edges
represent evaluation patterns on the same itemset by users in
ut's PN. Naturally, one would expect some items to be
accessed together more often than others and this phenomenon
is re ected on the graph by the formation of item clusters
(Figure 2). We wish to distinguish those clusters and for
this reason we apply a spectral clustering algorithm [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] on
the matrix A that is outlined in the following subsection.
2. Compute the normalized symmetric Laplacian matrix
L = I D 1=2AD1=2
3. Compute the n eigenvalues of L 1 2 n
and the corresponding eigenvectors v1 : : : vn. Since L
is symmetric, all its eigenvalues are real, but not
necessarily di erent; that is some eigenvalues might appear
more than once, or more formally, have a multiplicity
larger than 1.
4. According to the Perron-Frobenius theorem, the
smallest eigenvalue of a non-negative symmetric matrix is
always 0 and its multiplicity k speci es the connected
components of the graph induced by the similarity
matrix A. We set k as the number of the desired item
clusters.
5. Construct the n k matrix U whose column vectors
are the k largest eigenvectors of L, v1 : : : vk.
6. Cluster the n row vectors of U into k clusters (C1 : : : Ck)
using an appropriate clustering methodology (e.g.
kmeans clustering). Then place each item ot its
corresponding cluster by using a distance criterion.
      </p>
      <p>The eigen decomposition of step 3 is not a time consuming
task, since in the overwhelming majority of the cases, matrix
A (and consequently matrix L) are of low dimensionality (i.e.
n &lt; 100).</p>
    </sec>
    <sec id="sec-9">
      <title>THE ITEM CONSUMPTION NETWORK</title>
      <p>
        For each cluster created in the previous step of our
algorithm, we construct an Item Consumption Network (ICN)
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. An example of such a network is illustrated in Figure
3. The black dotted nodes s1 to s3 are all members of the
same cluster (interconnected with continuous edges). The
gray nodes i1 to i4 represent other items that have been
evaluated by members of the target user's PN but not by
the target user herself. An edge from a node of the
former category to a node in the latter represents the fact that
these two items have been accessed together by members
of ut's PN the number of times indicated by the weight of
the respective edge. In the aforementioned gure, items s2
and i1 have been evaluated together by 4 peers in ut's PN.
Finally the number in parentheses at each node shows the
number of evaluations this particular item has received from
the members of the PN.
      </p>
      <p>
        In order to produce recommendations, the ICN is modeled
as a graph with the purpose of performing a modi ed
random walk on it. Indeed, the ICN graph has the properties
of being connected and non-bipartite and for this reason it
may be viewed as a symmetric time-reversible nite Markov
chain [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The symmetric property is easily deduced from
the fact that the graph is undirected (Figure 3). The term
modi ed is a consequence of the fact that pi;j is not set to
be inversely proportional to the degree of the current node,
as in uniform random walks. Instead, it is the result of the
computation of the edge weight and the number of the
evaluations of both the current and the following node in the
walk.
      </p>
      <p>
        A fundamental property of the random walks on nite
Markov chains is that they reach their steady state
distribution regardless of which is the starting node each time
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. If M is the m m symmetric transition matrix of the
chain (where m is the cardinality of the itemset in the ICN)
such that M = (pi;j ); i; j 2 V; (i; j) 2 E then the following
equation holds true
      </p>
      <p>M T
=
(2)
which may be interpreted as the fact that the matrix M has
its largest eigenvalue equal to 1 and that the corresponding
left eigenvector is the steady state distribution .
Consequently, if we perform an eigen decomposition on matrix
M , we are able to retrieve the steady state vector whose
entries correspond to the probability of each item node being
visited by the random walk.</p>
      <p>
        In some cases, however, the dimensionality of M could
be large and therefore eigen decomposition may become a
time consuming task. For this reason, we follow the iterative
approach outlined below in order to approximate [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]:
1. Create the m 1 vector r and set its entries to 0 apart
from those which correspond to the seed items that
assume the value jS1j , (jSj is the cardinality of the seed
itemset)
2. Compute w = M
      </p>
      <p>r.
3. While dist(w; r) &gt; . Here dist(w; r) denotes a
function that computes vector distance in a metric space
(e.g. euclidean distance) and is the vector similarity
threshold (e.g. = 10 4)
(a) Set r = w
(b) Compute w = M</p>
      <p>r
4. End While
5. Return as recommendations the N non-seed nodes with
the largest probability value in w.
6.</p>
    </sec>
    <sec id="sec-10">
      <title>EXPERIMENTS</title>
      <p>
        We performed a series of initial experiments of our
algorithm and of other reference content-based, collaborative
and trust based systems on the Epinions dataset crawled
from the Epinions website by Massa and Bhattacharje [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
Table 1 outlines the characteristics of this dataset. It should
be noted that trust statements in this case are unweighted
(binary). A rst remark is that the dataset is extremely
sparse both in terms of the ratings' density and of the
connectedness of the trust network as measured by the global
clustering coe cient. Besides sparsity, the dataset contains
a large proportion of users and items with too few ratings.
These characteristics greatly a ect the quality of the
produced recommendations.
      </p>
      <p>Another peculiarity of this speci c dataset is that the
ratings of the items are not uniformly distributed in the rating
scale (1 to 5) but are skewed towards the upper scale (4 and
5) by a ratio of 1 to 3. This is attributed to the behavioral
phenomenon of users not rating items they've consumed in
general, but of predominantly rating items they have both
consumed and liked. Therefore, any RS that would blindly
recommend an item with a preference rating between 4 and
5 would exhibit satisfactory performance. This observation
is both an indication that RS e ciency should not solely
rely on optimizing the accuracy of the predicted preference
value and a justi cation of examining other aspects of RS
performance, such as the novelty and the diversity of the
recommendations (that will be discussed below).</p>
      <p>Since one of our stated goals has been the production of
accurate recommendations, we have chosen to evaluate RS
performance on the Statistical Accuracy metrics. Their
purpose is to measure how close the recommendation ru;i is
d
to the actual rating ru;i. The most widely used metric in
this sense is the Root Mean Square Error (RMSE), which is
de ned over a Test Set T as:</p>
      <p>v
RM SE = tuu 1 jT j</p>
      <p>X(rdu;i
jT j n=1
ru;i)2
(3)
where jT j is the cardinality of the test set.</p>
      <p>Another important evaluation metric of RS is the Ratings'
Coverage; that is the percentage of ratings for which the
system manages to produce recommendations. It should be
pointed out that a RS which exhibits satisfactory results in
the statistical accuracy metrics is still considered to perform
poorly if it manages to produce recommendations only for
a handful of users or items. More formally, the Rating's
Coverage is de ned as</p>
      <p>Coverage = 100 jTRj
jT j
where jTRj is the cardinality of the set of the items for which
the RS produced recommendations (generally, TR T ).</p>
      <p>
        Our objective has also been to measure the quality of the
recommendations is terms of how novel the new items are,
compared to what the target user has already evaluated.
Clearly, a RS that makes palpable recommendations is
considered to be of low performance even if it is accurate. In our
experiments, we used the de nition of Distance-based Item
Novelty [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which models the novelty of each item (in a list
of recommended items) with respect to all the other already
evaluated items on an euclidean space. More speci cally, it
(4)
is de ned as the average distance between the item i at hand
and the other already consumed items (that form the set S):
novelty(ijS) = X p(jjS)d(i; j)
      </p>
      <p>j2S
where p(jjS) denotes the probability of item j to have been
already evaluated and d(i; j) is the distance measure. Most
commonly, distance is related to similarity via the relation
d(i; j) = 1 sim(i; j). The similarity measure can be any of
the similarity measures mentioned before (Pearson, cosine,
etc). In our experiments, we used the Manhattan distance
measure which is de ned as the absolute distance between
items i and j (Equation 6)
d(i; j) = jru;i
rdu;ij</p>
      <p>
        The last evaluation measure we have examined is diversity.
Diversity measures how di erent are the recommended items
in a list from one another. Ideally, a RS that is able to
capture the multitude of human taste should be equally able
to propose items that would ful ll most of the interests of
its users. We used the de nition of Intra-List Diversity [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
diversity(Rju) =
2
      </p>
      <p>X d(ik; in)
jRj(jRj + 1) k&lt;n
where R is the list of n recommended items (n = jRj) and
ik 2 R.
6.1</p>
    </sec>
    <sec id="sec-11">
      <title>Other Systems</title>
      <p>In order to thoroughly examine the performance of our
proposed algorithm, we have implemented a number of RS
(both traditional and social) and have evaluated them on
the same Epinions dataset.
6.1.1</p>
      <p>Baseline Systems</p>
      <p>Those systems are given for reference purposes, in order
estimate the relative performance improvement of the other
approaches. ItemMean recommends each item according to
the mean value of the evaluations received for that item so
far, treating each rater equally (that is, without considering
if there exist any relationships between the raters). On the
other hand, UserMean recommends each item according to
the mean value of the evaluations given by the target user
to other items so far, treating each item equally (and not
considering any correlations in-between the items).
6.1.2</p>
      <p>Collaborative Filtering and Item-Based
Recommendation</p>
      <p>
        At the heart of those traditional RS is the prediction
formula proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]:
(5)
(6)
(7)
(8)
jNj
      </p>
      <p>P wut;uc (ut
r[ut;it = rut + i=1
rut;it )
jNj
P wut;uc
i=1
where ut is the target user and uc 2 UN all of his neighbors
with whose the similarity value wut;uc is computable.
Equation 8 takes an equivalent form for the item based
recommender, where rut is substituted by rit and wut;uc by wit;ic
and ic 2 IN is now the set of items similar to it (whose
similarity value wit;ic is computable).</p>
      <p>
        Trust-based systems are roughly put into two large
categories according to the way trust values are processed [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
The rst approach is to accumulate all available trust
statements in the RS in order to estimate the system-wide in
uence of each user. The RS that follow this principle are also
known as reputation systems and rely on global trust metrics
in order to calculate the reputation of each and every user.
The second approach is to examine trust in a user-centric
level; the emphasis is placed on each individual user and the
RS departs from her in order to explore her trust network.
      </p>
      <p>
        Since our algorithm processes the trust network on a
local rather than a global level, we sought comparisons with
two other SRS based on local trust metrics. The st SRS is
based on the gradual trust metric MoleTrust [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] proposed by
Massa and Avesani. The trust graph is rstly transformed
into an acyclic form (a tree) by removing all loops in it and
then the trust statements are accumulated in a depth- rst
fashion, starting from each user, up to each and very other
user (in the trust network). The propagation horizon
determines the length of the exploration; the most common
forms being MoleTrust-1, where only the users that target
user trusts are considered, and MoleTrust-2, where the
exploration also includes those trusted by those the target user
trusts. More formally, if Tut is the set that includes all users
in ut's trust network that have rated item it (which has
not been evaluated by the target user yet), then the
recommendation value r[ut;it is approximated using the following
formula (trust-based collaborative ltering ):
r[ut;it = rut +
      </p>
      <p>P tut;u(ru;it
u2Tut</p>
      <p>P tut;u
u2Tut
ru)
(9)
where rut is the mean of the ratings ut has provided so far.</p>
      <p>
        The second RS is based on another popular gradual trust
metric, TidalTrust, proposed by Golbeck [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. TidalTrust is
di erent from MoleTrust in the sense that no propagation
horizon is required for the accumulation of trust; instead
the shortest path from the target user to each other user
in the trust network is computed. All trust paths above a
prede ned threshold form the Web of Trust (WOT) for that
particular user. If there exist more than one trust paths
between two users, then the one with the biggest trust value
is chosen. If W OTut is the set that includes those users
in ut's web of trust network that have rated item it, then
the recommendation value r[ut;it is approximated using the
formula (trust-based weighted mean):
r[ut;it =
      </p>
      <p>P tut;uru;it
u2Tut</p>
      <p>P tut;u
u2Tut
(10)
6.2</p>
    </sec>
    <sec id="sec-12">
      <title>Results</title>
      <p>Table 2 summarizes some rst results on the set of the
performance metrics presented earlier in this section. All
results were obtained on the Epinions dataset by using the
leave-one-out validation methodology, for a list of 5
recommended items. Naturally, a lower RMSE score means more
accurate predictions while higher scores for coverage, novelty
and diversity signify that the RS is able to produce
recommendations for more of its users, that are more novel and
more diverse respectively (the last three metrics are given
in the percentage scale). A RS is then considered to
outperform another if it achieves a better outcome on all (or most
of) the performance metrics.</p>
      <p>A rst observation is that our algorithm exhibits a steady
performance lead in terms of the accuracy, the novelty and
the diversity of the recommendations. We attribute these
encouraging results, especially the accuracy of the
recommendations, to the way the personal network of each user
is processed and his in uential neighbors are located
(subsection 3.2). Moreover, the increased novelty and diversity
of the recommended items are due to the personalized item
clustering strategy that we have followed. Indeed, in many
cases, this technique manages to capture the diverse
interests of each individual user.</p>
      <p>Some of the trust-based system display a very satisfactory
behavior in terms of the ratings' coverage. This stems from
the aggressive manner in which they process the trust
network, by accumulating all users up to a certain depth. It
is for this reason, of not being selective, that they fail to
achieve better results in terms of the accuracy of the
recommendations, even though they are able to cover more users.
However, trust statements in the dataset used in the
experiments are binary; had they assumed values in a broader
range, then the Equations 9 and 10 would have revealed
their full potential.</p>
      <p>Lastly, the baseline systems may exhibit very satisfactory
results in terms of the accuracy and coverage metrics
(particularly ItemMean) but those are rather the consequences
of the peculiarities of the dataset at hand. As it has
already been discussed before, a RS that would recommend
any item with a value in the range [4; 5] would exhibit
satisfactory results. It is obvious, however, that this approach
is not plausible for any practical RS.</p>
    </sec>
    <sec id="sec-13">
      <title>CONCLUSIONS AND FUTURE WORK</title>
      <p>In this work, we proposed a novel social recommendation
methodology based on personalized item clustering. Even
though the initial results are satisfactory, we feel there is
room for further enhancements on our algorithm. More
speci cally, it would be desirable to improve the procedure
of the formation of the personal network. A line of research
we are currently examining is to model trust and similarity
as the marginal probabilities of an unknown joint probability
distribution that we would like to approximate. Then user
proximity would be estimated by directly sampling from the
aforementioned distribution.</p>
      <p>The spectral clustering algorithm could also be further
developed by introducing some criteria that would examine
the size and the quality of the produced clusters. This may
be achieved by considering di erent clustering policies, such
as the fuzzy k-means clustering, along with di erent distance
functions (e.g. Chebysev of Mahalanobis distance) for the
placement of the items within the clusters.</p>
      <p>Finally, the properties of the random walk on the item
consumption network (i.e. mixing rate) may be ne-tuned
if we consider di erent transition probabilities, ones that
would incorporate more information about the
characteristics of each item node. A possible line of research in this
area is to use more ne-grained approaches to model the
relationship between the items.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bobadilla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ortega</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hernando</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Gutierrez</surname>
          </string-name>
          .
          <article-title>Recommender systems survey</article-title>
          .
          <source>Knowledge-Based Systems</source>
          ,
          <volume>46</volume>
          (
          <issue>0</issue>
          ):
          <volume>109</volume>
          {
          <fpage>132</fpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Breese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Heckerman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Kadie</surname>
          </string-name>
          .
          <article-title>Empirical analysis of predictive algorithm for collaborative ltering</article-title>
          .
          <source>In Proceedings of the 14 th Conference on Uncertainty in Arti cial Intelligence</source>
          , pages
          <fpage>43</fpage>
          {
          <fpage>52</fpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vargas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <article-title>Novelty and diversity metrics for recommender systems: Choice, discovery and relevance</article-title>
          .
          <source>In International Workshop on Diversity in Document Retrieval (DDR 2011) at the 33rd European Conference on Information Retrieval (ECIR</source>
          <year>2011</year>
          ), Apr.
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>DuBois</surname>
          </string-name>
          , J. Golbeck,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kleint</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Srinivasan</surname>
          </string-name>
          .
          <article-title>Improving recommendation accuracy by clustering social networks with trust</article-title>
          .
          <source>In Recommender Systems &amp; the Social Web</source>
          , volume
          <volume>532</volume>
          , pages
          <issue>1{8</issue>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>X. Z. Georgios</given-names>
            <surname>Pitsilis</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          .
          <source>5th IFIP WG 11</source>
          .11 International Conference,
          <string-name>
            <surname>IFIPTM</surname>
          </string-name>
          <year>2011</year>
          , Copenhagen, Denmark, June 29 - July 1,
          <year>2011</year>
          . Proceedings, chapter
          <article-title>Clustering Recommenders in Collaborative Filtering Using Explicit Trust Information</article-title>
          , pages
          <volume>82</volume>
          {
          <fpage>97</fpage>
          . 358. Springer Berlin Heidelberg,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Golbeck</surname>
          </string-name>
          .
          <article-title>Computing and applying trust in web-based social networks</article-title>
          .
          <source>PhD thesis</source>
          , College Park, MD, USA,
          <year>2005</year>
          . AAI3178583.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xiang</surname>
          </string-name>
          , E. Chen,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Bao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zheng</surname>
          </string-name>
          .
          <article-title>In uential seed items recommendation</article-title>
          .
          <source>In Proceedings of the sixth ACM conference on Recommender systems, RecSys '12</source>
          , pages
          <fpage>245</fpage>
          {
          <fpage>248</fpage>
          , New York, NY, USA,
          <year>2012</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Lovasz</surname>
          </string-name>
          .
          <source>Random walks on graphs: A survey</source>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Massa</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Avesani</surname>
          </string-name>
          .
          <article-title>Trust metrics on controversial users: balancing between tyranny of the majority and echo chambers</article-title>
          .
          <source>International Journal on Semantic Web and Information Systems</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>Massa</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Bhattacharjee</surname>
          </string-name>
          .
          <article-title>Using trust in recommender systems: An experimental analysis</article-title>
          .
          <source>In In Proceedings of iTrust2004 International Conference</source>
          , pages
          <volume>221</volume>
          {
          <fpage>235</fpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Weiss</surname>
          </string-name>
          .
          <article-title>On spectral clustering: Analysis and an algorithm</article-title>
          .
          <source>In ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS</source>
          , pages
          <volume>849</volume>
          {
          <fpage>856</fpage>
          . MIT Press,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>M. C. Pham</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Cao</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Klamma</surname>
            , and
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Jarke</surname>
          </string-name>
          .
          <article-title>A clustering approach for collaborative ltering recommendation using social network analysis</article-title>
          .
          <source>Journal of Universal Computer Science</source>
          ,
          <volume>17</volume>
          (
          <issue>4</issue>
          ):
          <volume>583</volume>
          {604, feb
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Singla</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Richardson</surname>
          </string-name>
          .
          <article-title>Yes, there is a correlation: - from social networks to personal behavior on the web</article-title>
          .
          <source>In Proceedings of the 17th international conference on World Wide Web, WWW '08</source>
          , pages
          <fpage>655</fpage>
          {
          <fpage>664</fpage>
          , New York, NY, USA,
          <year>2008</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Sun</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          .
          <article-title>A survey of models and algorithms for social in uence analysis</article-title>
          . In C. C. Aggarwal, editor,
          <source>Social Network Data Analytics</source>
          , pages
          <volume>177</volume>
          {
          <fpage>214</fpage>
          .
          <string-name>
            <surname>Springer</surname>
            <given-names>US</given-names>
          </string-name>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wasserman</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Faust</surname>
          </string-name>
          .
          <source>Social Network Analysis: Methods and Applications</source>
          . Cambridge University Press, 1st edition,
          <year>November 1994</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>