<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Efficiency Improvement of Neutrality-Enhanced Recommendation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Toshihiro Kamishima, Shotaro Akaho, and Hideki Asoh</string-name>
          <email>mail@kamishima.net, s.akaho@aist.go.jp, h.asoh@aist.go.jp</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jun Sakuma</string-name>
          <email>jun@cs.tsukuba.ac.jp</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Tsukuba</institution>
          ,
          <addr-line>1-1-1 Tennodai, Tsukuba, 305-8577</addr-line>
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Institute of Advanced Industrial Science and Technology (AIST)</institution>
          ,
          <addr-line>AIST Tsukuba Central 2, Umezono 1-1-1, Tsukuba, Ibaraki, 305-8568</addr-line>
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper proposes an algorithm for making recommendations so that neutrality from a viewpoint specified by the user is enhanced. This algorithm is useful for avoiding decisions based on biased information. This problem has been pointed out as the filter bubble, which is the biasing influence of personalization technologies on social decisions. To provide a neutrality-enhanced recommendation, we must first assume that a user can specify a particular viewpoint from which the neutrality can be applied, because a recommendation that is neutral from all viewpoints is no longer a recommendation. Given such a target viewpoint, we implement an information-neutral recommendation algorithm by introducing a penalty term to enforce statistical independence between the target viewpoint and a rating. We empirically show that our algorithm enhances the independence from the specified viewpoint.</p>
      </abstract>
      <kwd-group>
        <kwd>recommender system</kwd>
        <kwd>neutrality</kwd>
        <kwd>fairness</kwd>
        <kwd>filter bubble</kwd>
        <kwd>collaborative filtering</kwd>
        <kwd>matrix factorization</kwd>
        <kwd>information theory</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>A recommender system searches for items or
information that is estimated to be useful to a user based on the
user's prior behaviors and the features of items. Over the
past decade, such recommender systems have been
introduced and managed at many e-commerce sites to promote
items sold at those sites. The influence of personalization
technologies such as recommender systems or personalized
search engines on people's decision making is considerable.
For example, at a shopping site, if a customer checks a
recommendation list and finds five-star-rated items, he/she will
more seriously consider buying these strongly recommended
items. These technologies have thus become an
indispensable tool for users. However, the problem of the filter bubble,
which is the unintentional bias or the limited diversity of
information provided to users, has accompanied the growing
influence of personalization algorithms.</p>
      <p>
        The term filter bubble was recently coined by Pariser [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
Due to the strong influence of personalized technologies,
the topics of information provided to users are becoming
restricted to those originally preferred by them, and this
restriction is not perceived by users. In this way, each
individual is metaphorically enclosed in his/her own separate
bubble. Pariser claimed that users lose the opportunity to
find new interests because of the limitations of the bubbles
created around their original interests, and that sharing
reasonable yet opposing viewpoints on public issues affecting
our society is thus becoming more difficult. To discuss this
filter bubble problem, a panel discussion was held at the
RecSys 2011 conference [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>During the RecSys panel discussion, panelists made the
following assertions about the filter bubble problem. The
diversity of topics is certainly biased by the influence of
personalization. At the same time, it is impossible to make
recommendations that are absolutely neutral from any
viewpoint, and thus there is a trade-off between focusing on
topics that better fit users' interests or needs and enhancing the
variety of provided topics. To address this problem, the
panelists also pointed out several possible directions: taking
into account users' immediate needs as well as their
long-term needs; optimizing a recommendation list as a whole;
and providing tools for perspective-taking.</p>
      <p>To our knowledge, there is no major tool that enables
users to control their perspective to address this filter
bubble problem. We therefore advocate a new
information-neutral recommender system that guarantees the neutrality
of recommendations. As pointed out during the RecSys 2011
panel discussion, it is impossible to make a recommendation
that is absolutely neutral from all viewpoints, and we
therefore focus on neutrality from a viewpoint or type of
information specified by the user. For example, users can specify a
feature of an item, such as a brand, or a user feature, such as
a gender or an age, as a viewpoint. An information-neutral
recommender system is designed so that these specified
features will not influence the recommendation results. This
system can also be used to ensure fair treatment of content
providers or product suppliers or to avoid the use of
information that is restricted by law or regulation.</p>
      <p>
        Last year at this Decisions workshop, we borrowed the
idea of fairness-aware data mining, which we had proposed
earlier [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], to build an information-neutral recommender
system of the type described above [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. To enhance
neutrality or independence in recommendations, we introduced a
constraint term that represents the mutual information
between a recommendation result and a specified viewpoint.
The naive implementation of this constraint term did
indeed enhance the neutrality of recommendations, but there
remained serious shortcomings in its scalability. In this
paper, therefore, we advocate several new formulations of this
constraint term that are more scalable.
      </p>
      <p>Our contributions are as follows. First, we present a
definition of neutrality in recommendation based on the
consideration of why it is impossible to achieve an absolutely
neutral recommendation. Second, we propose a method to
enhance the neutrality of a probabilistic matrix
factorization model. Finally, we demonstrate that the neutrality of
a recommender system can be enhanced.</p>
      <p>In section 2, we discuss the filter bubble problem and the
concept of neutrality in recommendation, and define the
goal of an information-neutral recommendation task. An
information-neutral recommender system is proposed in
section 3, and the experimental results of its application are
shown in section 4. Sections 5 and 6 cover related work and
our conclusion, respectively.</p>
    </sec>
    <sec id="sec-2">
      <title>2. INFORMATION NEUTRALITY</title>
      <p>In this section, we discuss information neutrality in
recommendation based on an examination of the filter bubble
problem and the ugly duckling theorem.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 The Filter Bubble Problem</title>
      <p>
        We will first summarize the filter bubble problem posed
by Pariser and the panel discussion about this problem held
at the RecSys 2011 conference. The filter bubble problem
is the concern that personalization technologies narrow and
bias the topics of information provided to people, who do
not notice this phenomenon [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>
        Pariser demonstrated the following examples in a TED
talk about this problem [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Users of the social network
service Facebook specify other users as friends with whom
they then can chat, have private discussions, and share
information. To help users find their friends, Facebook
provides a recommendation list of others who are expected to
be related to a user. When Pariser started to use
Facebook, the system showed a friend recommendation list that
consisted of both conservative and progressive people.
However, because he more frequently selected progressive people
as friends, conservative people were increasingly excluded
from his recommendation list by a personalization
functionality. Pariser claimed that, in this way, the system excluded
conservative people without his permission and that he lost
the opportunity to be exposed to a wide variety of opinions.
      </p>
      <p>Pariser's claims can be summarized as follows. First,
personalization technologies restrict an individual's
opportunities to obtain information about a wide variety of topics.
The chance to gain knowledge that could ultimately enhance
an individual's life is lessened. Second, the individual
obtains information that is too personalized; thus, the amount
of shared information and shared debate in our society is
decreased. Pariser asserts that this loss of shared information,
caused by the personalization of information, is a serious
obstacle to building social consensus.</p>
      <p>
        RecSys 2011 featured a panel discussion on this filter
bubble problem [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The panel concentrated on the following
three points: (a) Are there filter bubbles? (b) To what
degree is personalized filtering a problem? and (c) What
should we as a community do to address the filter bubble
problem? Among these points, we focus on point (c).
The panelists presented several directions to explore in
addressing the filter bubble problem. First, a system could
consider users' immediate needs as well as their long-term
needs. Second, instead of selecting individual items
separately, a recommendation list or portfolio could be
optimized as a whole. Finally, a system could provide tools
for perspective-taking to see the world through other
viewpoints.
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Neutrality in Recommendation</title>
      <p>
        Among the directions for addressing the filter bubble, we
here take the approach of providing a tool for
perspective-taking. Before presenting this tool, we explore the notion
of neutrality based on the ugly duckling theorem. The ugly
duckling theorem is a classical theorem in the pattern
recognition literature that asserts the impossibility of classification
without weighing certain features or aspects of objects as
more important than others [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Consider a case in which
2<sup>n</sup> ducklings are represented by n binary features and are
classified into positive or negative classes based on these
features. It is easy to show that the number of possible
decision rules based on these features to discriminate an ugly
duckling and a normal duckling is equal to the number of
patterns to discriminate any pair of normal ducklings. In
other words, every duckling resembles a normal duckling and
an ugly duckling equally. This counterintuitive conclusion
is deduced from the premise that all features are treated
equally. Attention to an arbitrary feature such as black
feathers makes an ugly duckling ugly. When we classify
something, we of necessity weigh certain features, aspects,
or viewpoints of classified objects. Because recommendation
is considered a task for classifying whether items are
interesting or not, certain features or viewpoints inevitably must be
weighed when making a recommendation. Consequently, the
absolutely neutral recommendation is impossible, as pointed
out in the RecSys panel.
      </p>
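      <p>The counting argument behind the ugly duckling theorem can be checked by brute force for small n. The following Python sketch assumes that a decision rule is any assignment of positive or negative labels to the 2<sup>n</sup> possible feature vectors; the function name discriminating_rule_counts is hypothetical and is introduced only for this illustration.</p>
      <preformat><![CDATA[
```python
from itertools import combinations, product


def discriminating_rule_counts(n):
    """For every pair of ducklings, count the decision rules that give
    the two ducklings different labels."""
    # Each duckling is one of the 2**n binary feature vectors.
    ducklings = list(product([0, 1], repeat=n))
    # Assumed notion of a decision rule: any 0/1 labeling of all ducklings.
    rules = list(product([0, 1], repeat=len(ducklings)))
    counts = {}
    for i, j in combinations(range(len(ducklings)), 2):
        pair = (ducklings[i], ducklings[j])
        counts[pair] = sum(1 for rule in rules if rule[i] != rule[j])
    return counts


counts = discriminating_rule_counts(2)
print(len(counts), set(counts.values()))
```
]]></preformat>
      <p>For n = 2, each of the six pairs is separated by 8 of the 16 possible rules, so no pair of ducklings is more distinguishable than any other unless some features are weighted more heavily than others.</p>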
      <p>We propose a neutral recommendation framework other
than the absolutely neutral recommendation. Recalling the
ugly duckling theorem, we must focus on certain features
or viewpoints in classification. This fact indicates that it is
feasible to make a recommendation that is neutral from a
specific viewpoint instead of all viewpoints. We hence
advocate an information-neutral recommender system (INRS)
that enhances the neutrality in recommendation from the
viewpoint specified by a user. In Pariser's Facebook
example, a system could enhance the neutrality so that
recommended friends are both conservative and progressive, but
the system would be allowed to make biased decisions in
terms of the other viewpoints, e.g., the birthplace or age of
friends.</p>
      <p>We formally model this neutrality by the statistical
independence between recommendation results and viewpoint
values, i.e., Pr[R|V] = Pr[R]. This means that the same
recommendations are made for the cases where all
conditions are the same except for the viewpoint values. In other
words, in the information-theoretic sense, no information about
viewpoint features influences the recommendation results.
An INRS hence tends to be less accurate, because the usable
information is decreased. In the example of a friend
recommendation, no matter what a user's political conviction
is, the conviction is ignored and excluded in the process of
making a recommendation.</p>
      <p>
        We wish to emphasize that neutrality is distinct from
recommendation diversity, which is the attempt to recommend
items that are mutually less similar. Topic diversification is
one of the proposed techniques for enhancing diversity by
excluding similar items from a recommendation list [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. The
constraint term in [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] is designed to exclude similar items
from a final list. Therefore, while neutrality involves the
relation between recommendations and single viewpoint
features, diversity concerns the mutual relation among
recommendations. Conversely, enhancing the diversity cannot
suppress the use of specific information, and an INRS is allowed
to offer mutually similar items. In the case of the friend
recommendation, if a progressive person is recommended as a
friend, the INRS will also recommend another person whose
conditions other than political convictions are the same. In the
case of the diversified recommendation, one of the two persons
would not be recommended because the two persons are very
similar.
      </p>
      <p>
        The INRS is beneficial not only for users but also for
system managers. It can be used to ensure the fair
treatment of content providers or product suppliers. The
Federal Trade Commission has been investigating Google to
determine whether the search engine ranks its own services
higher than those of competitors [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. E-commerce sites want
to treat their product suppliers fairly when making
recommendations to their customers. If a brand of providers or
suppliers is specified as a viewpoint, a system can make
recommendations that are neutral in terms of the items' brands.
An information-neutral recommendation is also helpful for
avoiding the use of information that is restricted by law or
regulation. For example, the use of some information is
prohibited for the purpose of making recommendations by
privacy policies. In this case, by treating the prohibited
information as a viewpoint, recommendations can be neutral
in terms of the prohibited information.
      </p>
    </sec>
    <sec id="sec-5">
      <title>3. THE INFORMATION-NEUTRAL RECOMMENDER SYSTEM</title>
      <p>We formalize the task of information-neutral
recommendation and present an algorithm for performing this task.</p>
    </sec>
    <sec id="sec-6">
      <title>3.1 Task Formalization</title>
      <p>
        Recommendation tasks can be classified into three types:
recommending good items that meet a user's interest,
optimizing the utility of users, and predicting ratings of
items for a user [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Among these tasks, we here
concentrate on the task of predicting ratings. X ∈ {1, …, n} and
Y ∈ {1, …, m} denote random variables for the user and
item, respectively. An event (x, y) is an instance of a pair
(X, Y). R denotes a random variable for the rating of Y
as given by X, and its instance is denoted by r. We here
assume that the domain of ratings is the set of real values.
These variables are shared with the original rating
prediction task.
      </p>
      <p>To enhance information neutrality in recommendation,
we additionally introduce a viewpoint random variable, V,
which indicates the viewpoint feature from which the
neutrality is enhanced. This variable is specified by a user, and
its value depends on various aspects of an event. Possible
examples of viewpoint variables are a user's gender, which
is part of the user component of an event, a movie's release
year, which is part of the item component of an event, and
the timestamp when a user rates an item, which would
belong to both elements in an event. In this paper, we restrict
the domain of a viewpoint variable to a binary type, {0, 1},
for simplicity. A training sample consists of an event, (x, y),
a viewpoint value for the event, v, and a rating value for
the event, r. A training set is a set of N training samples,
D = {(x<sub>i</sub>, y<sub>i</sub>, v<sub>i</sub>, r<sub>i</sub>)}, i = 1, …, N.</p>
      <p>Given a new event, (x, y), and its corresponding
viewpoint value, v, a rating prediction function, r̂(x, y, v),
predicts a rating of the item y by the user x, and satisfies
r̂(x, y, v) = E<sub>Pr[R|x,y,v]</sub>[R]. This rating prediction function is
estimated by optimizing an objective function having three
components: a loss function, loss(r*, r̂), a neutrality term,
neutral(R, V), and a regularization term, reg. The loss
function represents the dissimilarity between a true rating value,
r*, and a predicted rating value, r̂. The neutrality term
quantifies the expected degree of neutrality of the predicted
rating values from a viewpoint expressed by a viewpoint
feature, and a larger value indicates a higher level of
neutrality. The aim of the regularization term is to avoid
over-fitting. Given a training sample set, D, the goal of
the information-neutral recommendation (predicting rating
case) is to acquire a rating prediction function, r̂(x, y, v), so
that the expected value of the loss function is as small as
possible and the neutrality term is as large as possible. We
formulate this goal by finding a rating prediction function,
r̂, so as to minimize the following objective function:
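<disp-formula><label>(1)</label><tex-math><![CDATA[\sum_{D} \mathrm{loss}\bigl(r^{*}, \hat{r}(x, y, v)\bigr) - \eta\, \mathrm{neutral}(R, V) + \lambda\, \mathrm{reg}(\Theta),]]></tex-math></disp-formula></p>
      <p>where η &gt; 0 is a neutrality parameter to balance between the
loss and the neutrality, λ &gt; 0 is a regularization parameter,
and Θ is a set of model parameters. The neutrality term enters with a
negative sign because the objective is minimized while a larger
neutrality term indicates a higher level of neutrality.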
      </p>
    </sec>
    <sec>
      <title>3.2 Probabilistic Matrix Factorization Model</title>
      <p>
In this paper, we adopt a probabilistic matrix
factorization model [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] to predict ratings, because this model is
highly effective in its prediction accuracy as well as efficient
in its scalability. Though there are several minor variants
of this model, we here use the following model defined as
equation (3) in [
        <xref ref-type="bibr" rid="ref9">9</xref>
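        ]:
<disp-formula><label>(2)</label><tex-math><![CDATA[\hat{r}(x, y) = \mu + b_x + c_y + \mathbf{p}_x^{\top} \mathbf{q}_y,]]></tex-math></disp-formula>
where μ, b<sub>x</sub>, and c<sub>y</sub> are global, per-user, and per-item bias
parameters, respectively, and p<sub>x</sub> and q<sub>y</sub> are K-dimensional
parameter vectors, which represent the cross effects between
users and items. We then adopt the following squared loss
with a regularization term:
<disp-formula><label>(3)</label><tex-math><![CDATA[\sum_{(x_i, y_i, r_i) \in D} \bigl(r_i - \hat{r}(x_i, y_i)\bigr)^2 + \lambda\, \mathrm{reg}(\Theta).]]></tex-math></disp-formula>
      </p>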
      <p>
        This model has been proved to be equivalent to assuming that
true rating values are generated from a normal distribution
whose mean is given by equation (2). If all samples over all X and Y
are available, the objective function is convex, and
globally optimal parameters can thereby be derived by a simple
gradient descent method. Unfortunately, because not all
samples are observed, the loss function (3) is non-convex, and
only local optima can be found. However, it is empirically
known that a simple gradient method succeeds in finding a
good solution in most cases [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
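      <p>As a rough illustration of equations (2) and (3), the following Python sketch evaluates the prediction function and the regularized squared loss on toy data. The function names, the toy parameter values, and the choice of reg as a plain sum of squared biases and factors (with the global bias left unpenalized) are assumptions made for this sketch, not details taken from the original implementation.</p>
      <preformat><![CDATA[
```python
def predict(mu, b, c, p, q, x, y):
    """Equation (2): global bias + user bias + item bias + inner product."""
    cross = sum(pk * qk for pk, qk in zip(p[x], q[y]))
    return mu + b[x] + c[y] + cross


def squared_loss_objective(params, samples, lam):
    """Equation (3): squared errors over observed samples plus lam * reg.

    reg is assumed here to be the sum of squared biases and factors.
    """
    mu, b, c, p, q = params
    loss = sum((r - predict(mu, b, c, p, q, x, y)) ** 2 for x, y, r in samples)
    reg = sum(t * t for t in b) + sum(t * t for t in c)
    reg += sum(t * t for row in p for t in row)
    reg += sum(t * t for row in q for t in row)
    return loss + lam * reg


# Toy example: 2 users, 2 items, K = 1 latent factor (hypothetical values).
params = (3.0, [0.5, -0.5], [0.2, -0.2], [[1.0], [-1.0]], [[0.5], [0.1]])
samples = [(0, 0, 4.0), (1, 1, 2.0)]  # (user, item, rating)
print(predict(*params, 0, 0))
print(squared_loss_objective(params, samples, lam=0.01))
```
]]></preformat>
      <p>Only observed (user, item, rating) triples enter the sum, which is the reason the loss is non-convex in the parameters even though each prediction is a simple bilinear form.</p>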
      <p>We then extend this model to enhance the information
neutrality. First, we modify the model of equation (2) so
that it is dependent on the viewpoint value, v. For each value
of V, 0 and 1, we prepare a parameter set, μ<sup>(v)</sup>, b<sub>x</sub><sup>(v)</sup>, c<sub>y</sub><sup>(v)</sup>,
p<sub>x</sub><sup>(v)</sup>, and q<sub>y</sub><sup>(v)</sup>. One of the parameter sets is chosen according
to the viewpoint value, and we get the rating prediction
function:
<disp-formula><label>(4)</label><tex-math><![CDATA[\hat{r}(x, y, v) = \mu^{(v)} + b_x^{(v)} + c_y^{(v)} + {\mathbf{p}_x^{(v)}}^{\top} \mathbf{q}_y^{(v)}.]]></tex-math></disp-formula>
By substituting equation (4) into equation (1) and
adopting a squared loss function as in the original probabilistic
matrix factorization case, we obtain an objective function of
an information-neutral recommendation model:</p>
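      <p><disp-formula><label>(5)</label><tex-math><![CDATA[\sum_{(x_i, y_i, r_i, v_i) \in D} \bigl(r_i - \hat{r}(x_i, y_i, v_i)\bigr)^2 - \eta\, \mathrm{neutral}(R, V) + \lambda\, \mathrm{reg}(\Theta),]]></tex-math></disp-formula>
where the regularization term is a sum of L<sub>2</sub> regularizers of
parameter sets for each value of v except for global biases,
μ<sup>(v)</sup>. As the objective is minimized and a larger neutrality term
indicates a higher level of neutrality, the neutrality term enters
with a negative sign. Model parameters, Θ<sup>(v)</sup> = {μ<sup>(v)</sup>, b<sub>x</sub><sup>(v)</sup>, c<sub>y</sub><sup>(v)</sup>, p<sub>x</sub><sup>(v)</sup>, q<sub>y</sub><sup>(v)</sup>},
for v ∈ {0, 1}, are estimated so as to minimize this objective.
Once we learn the parameters of the rating prediction
function, we can predict a rating value for any event by applying
equation (4).</p>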
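      <p>The viewpoint-dependent model can be sketched in Python as follows. One parameter set is kept per viewpoint value, the prediction of equation (4) selects the set by v, and the objective combines the squared loss, a caller-supplied neutrality term, and a regularizer. The function names, the toy values, and the stand-in neutrality function are hypothetical. The neutrality term is subtracted here because the text describes a larger value as more neutral while the objective is minimized; this sign convention should be read as an assumption of the sketch.</p>
      <preformat><![CDATA[
```python
def predict_v(param_sets, x, y, v):
    """Equation (4): choose the parameter set for viewpoint value v and
    apply the bias-plus-inner-product form of equation (2)."""
    mu, b, c, p, q = param_sets[v]
    cross = sum(pk * qk for pk, qk in zip(p[x], q[y]))
    return mu + b[x] + c[y] + cross


def objective(param_sets, samples, neutral, eta, lam):
    """Squared loss - eta * neutral + lam * reg.

    `neutral` is any callable whose value is larger when predictions are
    more neutral. reg is an assumed sum of squares over all parameters
    except the global biases.
    """
    loss = sum((r - predict_v(param_sets, x, y, v)) ** 2
               for x, y, v, r in samples)
    reg = 0.0
    for _mu, b, c, p, q in param_sets:
        reg += sum(t * t for t in b) + sum(t * t for t in c)
        reg += sum(t * t for row in p for t in row)
        reg += sum(t * t for row in q for t in row)
    return loss - eta * neutral(param_sets, samples) + lam * reg


def toy_neutral(param_sets, samples):
    """Hypothetical stand-in: negative squared gap between global biases."""
    return -(param_sets[0][0] - param_sets[1][0]) ** 2


# Toy data: two parameter sets that differ only in their global bias.
zeros = ([0.0, 0.0], [0.0, 0.0], [[0.0], [0.0]], [[0.0], [0.0]])
param_sets = [(3.0,) + zeros, (4.0,) + zeros]
samples = [(0, 0, 0, 3.0), (1, 1, 1, 5.0)]  # (user, item, viewpoint, rating)
print(objective(param_sets, samples, toy_neutral, eta=2.0, lam=0.01))
```
]]></preformat>
      <p>Keeping the neutrality term as a separate callable mirrors the structure of the objective: the loss and the regularizer stay fixed while different neutrality terms are plugged in.</p>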
    </sec>
    <sec id="sec-7">
      <title>3.3 Neutrality Term</title>
      <p>Now, all that remains is to define a neutrality term. As
described in section 2.2, we formalize the neutrality as the
statistical independence between a recommendation result
and a viewpoint feature. We propose neutrality terms that
are based on mutual information and Calders-Verwer's
discrimination score, both of which quantify the degree of
independence between R and V.</p>
      <sec id="sec-7-1">
        <title>3.3.1 Mutual Information</title>
        <p>
          We first use the same idea as in [
          <xref ref-type="bibr" rid="ref8">8</xref>
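          ] and quantify the degree
of the neutrality by negative mutual information under the
assumption that neutrality can be regarded as statistical
independence. Negative mutual information between R and
V is defined as:
<disp-formula><label>(6)</label><tex-math><![CDATA[\begin{aligned} -I(R; V) &= -\sum_{V} \int \Pr[R, V] \log \frac{\Pr[R \mid V]}{\Pr[R]} \, dR \\ &\approx -\frac{1}{N} \sum_{(x_i, y_i, v_i) \in D} \log \frac{\Pr[\hat{r}_i \mid v_i]}{\sum_{v \in \{0, 1\}} \Pr[\hat{r}_i \mid v] \Pr[v]}, \end{aligned}]]></tex-math></disp-formula>
where r̂<sub>i</sub> is derived by applying (x<sub>i</sub>, y<sub>i</sub>, v<sub>i</sub>) ∈ D to
equation (4). The marginalization over R and V is approximated
by the sample mean over D in the second line, and we use
a sample mass function as Pr[V]. Pr[R|V] can be derived
by marginalizing Pr[R|X, Y, V] Pr[X, Y] over X and Y. We
again approximate this marginalization by the sample mean
and get:
<disp-formula><label>(7)</label><tex-math><![CDATA[\Pr[r \mid v] \approx \frac{1}{|D^{(v)}|} \sum_{(x_i, y_i) \in D^{(v)}} \mathrm{Normal}\bigl(r;\, \hat{r}(x_i, y_i, v),\, V_{D^{(v)}}(R)\bigr),]]></tex-math></disp-formula>
where Normal(·) is a pdf of a normal distribution, D<sup>(v)</sup>
consists of all training samples whose viewpoint values are equal
to v, and V<sub>D<sup>(v)</sup></sub>(R) is a sample variance,
<disp-formula><tex-math><![CDATA[V_{D^{(v)}}(R) = \frac{1}{|D^{(v)}|} \sum_{r_i \in D^{(v)}} \bigl(r_i - M_{D^{(v)}}(\{\hat{r}\})\bigr)^2,]]></tex-math></disp-formula>
where M<sub>D</sub>({r̂}) is
<disp-formula><tex-math><![CDATA[M_{D}(\{\hat{r}\}) = \frac{1}{|D|} \sum_{(x_i, y_i, v_i) \in D} \hat{r}(x_i, y_i, v_i).]]></tex-math></disp-formula>
        </p>
        <p>The distribution in equation (7) is very hard to manipulate because it is a mixture
distribution with an enormous number of components. We
hence took the approach of directly modeling Pr[r|v], and
used two types of models.</p>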
        <p>
          The first one is a histogram model, which was proposed
in our preliminary work [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. Though rating values are
treated as real values, they are originally discrete scores.
Therefore, a set of predicted ratings, {r̂<sub>i</sub>}, is divided
into bins. Given a set of intervals, {Int<sub>j</sub>}, for example
{(−∞, 1.5], (1.5, 2.5], …, (4.5, ∞)} in a five-point-scale case,
predicted ratings are placed into the bins corresponding
to these intervals. By using these bins, Pr[r|v] is modeled by a
multinomial distribution:
<disp-formula><label>(8)</label><tex-math><![CDATA[\Pr[\hat{r} \mid v] \approx \prod_{j=1}^{\#\mathrm{Int}} \left( \frac{\sum_{(x_i, y_i) \in D^{(v)}} I[\hat{r}(x_i, y_i, v) \in \mathrm{Int}_j]}{|D^{(v)}|} \right)^{I[\hat{r} \in \mathrm{Int}_j]},]]></tex-math></disp-formula>
where I[r ∈ Int] is an indicator function and #Int is the
number of intervals. We refer to this model as mi-hist, which is an
abbreviation of mutual information modeled by a histogram
model.</p>
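        <p>A minimal Python sketch of the histogram model and of the resulting sample-mean estimate of negative mutual information is given below. It assumes the five-point-scale intervals mentioned above and takes the predicted ratings already grouped by viewpoint value; the function names and the dictionary layout are hypothetical choices for this illustration.</p>
        <preformat><![CDATA[
```python
import math


def bin_index(r_hat, edges=(1.5, 2.5, 3.5, 4.5)):
    """Index of the interval (-inf, 1.5], (1.5, 2.5], ..., (4.5, inf)
    that contains r_hat (intervals are closed on the right)."""
    return sum(1 for e in edges if r_hat > e)


def histogram_model(preds, n_bins=5):
    """Relative frequency of each bin among the predicted ratings of one
    viewpoint group; this is the bracketed ratio in the histogram model."""
    counts = [0] * n_bins
    for r_hat in preds:
        counts[bin_index(r_hat)] += 1
    return [cnt / len(preds) for cnt in counts]


def neg_mutual_information_hist(preds_by_v):
    """Sample-mean estimate of -I(R; V) under the histogram model.

    preds_by_v maps each viewpoint value to its list of predicted ratings.
    """
    n = sum(len(p) for p in preds_by_v.values())
    pr_v = {v: len(p) / n for v, p in preds_by_v.items()}
    pr_r_given_v = {v: histogram_model(p) for v, p in preds_by_v.items()}
    total = 0.0
    for v, preds in preds_by_v.items():
        for r_hat in preds:
            j = bin_index(r_hat)
            pr_r = sum(pr_r_given_v[u][j] * pr_v[u] for u in pr_v)
            total += math.log(pr_r_given_v[v][j] / pr_r)
    return -total / n


print(neg_mutual_information_hist({0: [1.0, 2.0], 1: [1.2, 2.2]}))
print(neg_mutual_information_hist({0: [1.0, 1.0], 1: [5.0, 5.0]}))
```
]]></preformat>
        <p>The estimate is zero when both groups fall into the same bins with the same frequencies and becomes negative as the two histograms separate. Because bin membership changes abruptly with the predicted rating, the estimate is piecewise constant in the model parameters.</p>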
        <p>However, because this model has discontinuous points, we
develop a second approach, which is to model Pr[r̂|v] by
a single normal distribution, which is continuous and easy
to handle. Formally,
<disp-formula><label>(9)</label><tex-math><![CDATA[\Pr[\hat{r} \mid v] \approx \mathrm{Normal}\bigl(\hat{r};\, M_{D^{(v)}}(\{\hat{r}\}),\, V_{D^{(v)}}(\{\hat{r}\})\bigr),]]></tex-math></disp-formula>
where V<sub>D</sub>({r̂}) is a sample variance over predicted ratings
r̂<sub>i</sub> from samples in D. We refer to this model as mi-normal,
which is an abbreviation of mutual information modeled by
a normal distribution model.</p>
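        <p>The normal-distribution variant can be sketched in the same way. Each viewpoint group is summarized by the sample mean and sample variance of its predicted ratings, and the marginal over predicted ratings is the two-component mixture weighted by the sample frequencies of the viewpoint values. The function names are hypothetical, and the sketch assumes that each group has a non-zero variance.</p>
        <preformat><![CDATA[
```python
import math


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Sample variance with divisor len(values)."""
    m = mean(values)
    return sum((t - m) ** 2 for t in values) / len(values)


def normal_pdf(r, m, var):
    return math.exp(-(r - m) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def neg_mutual_information_normal(preds_by_v):
    """Sample-mean estimate of -I(R; V) with one normal density per group.

    preds_by_v maps each viewpoint value to its list of predicted ratings;
    every group is assumed to have a strictly positive variance.
    """
    n = sum(len(p) for p in preds_by_v.values())
    pr_v = {v: len(p) / n for v, p in preds_by_v.items()}
    stats = {v: (mean(p), variance(p)) for v, p in preds_by_v.items()}
    total = 0.0
    for v, preds in preds_by_v.items():
        for r_hat in preds:
            cond = normal_pdf(r_hat, *stats[v])
            marg = sum(normal_pdf(r_hat, *stats[u]) * pr_v[u] for u in pr_v)
            total += math.log(cond / marg)
    return -total / n


print(neg_mutual_information_normal({0: [3.0, 4.0], 1: [3.0, 4.0]}))
print(neg_mutual_information_normal({0: [1.0, 1.2], 1: [4.8, 5.0]}))
```
]]></preformat>
        <p>The marginal density in the denominator is a mixture of two normal densities, so the logarithm does not simplify into a closed form even though each conditional density is a single normal distribution.</p>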
        <p>Unfortunately, it is not easy to derive an analytical form of
gradients for these neutrality terms. This is because the
discretization is a discontinuous transformation in the mi-hist
case, and Pr[r̂] is a normal mixture, which is not a member
of an exponential family, in the mi-normal case. We therefore adopt
the Powell optimization method for this class of neutrality
terms, because it can be applied without computing
gradients. However, this optimization method is too slow to
apply to a large data set, and its lack of scalability is a serious
deficit. In our implementation, these methods failed to
complete the processing of the 100k data set in several days, whereas
the methods described in the next section could process this
data set in minutes.</p>
      </sec>
      <sec id="sec-7-2">
        <title>3.3.2 Calders-Verwer’s Discrimination Score</title>
        <p>
          To develop a neutrality term whose gradients can be
derived in analytical form, we borrowed an idea in
discrimination-aware data mining [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. We here introduce
Calders and Verwer's approach used in [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. They proposed
a score to measure the degree of socially discriminative
decision, which is here referred to as the CV score. This CV score
is defined as the difference between distributions of the target
variable given V = 0 and V = 1.
To reduce the influence of V on R, they tried to learn a
classification model that would make the two distributions,
Pr[R|V = 0] and Pr[R|V = 1], similar by causing the CV
score to approach zero. It is easy to show that this
process enforces the statistical independence between V and
R [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Based on this idea, we design two types of neutrality
terms that would make the two distributions Pr[R|V = 0]
and Pr[R|V = 1] similar.
        </p>
        <p>We design the first type of neutrality term so as to match
the first-order moment of the two distributions, i.e., the
means. It is formally defined as
<disp-formula><label>(11)</label><tex-math><![CDATA[-\bigl(M_{D^{(0)}}(\{\hat{r}\}) - M_{D^{(1)}}(\{\hat{r}\})\bigr)^2.]]></tex-math></disp-formula>
We refer to this neutrality term as m-match, which is an
abbreviation of mean matching. The second type is designed
to constrain the model so that the same ratings are predicted for the
same value pair, x and y, irrespective of the viewpoint values.
This neutrality term is formally defined as
<disp-formula><label>(12)</label><tex-math><![CDATA[-\sum_{(x_i, y_i) \in D} \bigl(\hat{r}(x_i, y_i, 0) - \hat{r}(x_i, y_i, 1)\bigr)^2.]]></tex-math></disp-formula>
We refer to this neutrality term as r-match, which is an
abbreviation of rating matching.</p>
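        <p>Both CV-score-based terms are short enough to write out directly. In the Python sketch below they carry a leading minus sign so that a larger value means more neutral, in line with the description of the neutrality term in section 3.1; this sign convention, the function names, and the toy prediction function are assumptions made for illustration.</p>
        <preformat><![CDATA[
```python
def m_match(preds_v0, preds_v1):
    """Mean matching: negative squared difference between the mean
    predicted ratings of the V = 0 group and the V = 1 group."""
    mean0 = sum(preds_v0) / len(preds_v0)
    mean1 = sum(preds_v1) / len(preds_v1)
    return -(mean0 - mean1) ** 2


def r_match(predict, events):
    """Rating matching: negative sum, over all (x, y) events, of the squared
    difference between the predictions under v = 0 and under v = 1."""
    return -sum((predict(x, y, 0) - predict(x, y, 1)) ** 2 for x, y in events)


def toy_predict(x, y, v):
    """Hypothetical predictor whose output shifts by 0.5 when v = 1."""
    return 3.0 + 0.5 * v


print(m_match([3.0, 4.0], [4.0, 5.0]))
print(r_match(toy_predict, [(0, 0), (1, 2)]))
```
]]></preformat>
        <p>Note that r_match calls the prediction function twice per event, once for each viewpoint value, including the value that was not actually observed for that event.</p>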
        <p>Because both types of neutrality terms are simple quadratic
polynomials, it is very easy to derive analytical forms of their
derivatives. We hence used a conjugate gradient method for
these neutrality terms in optimization, which is much more
efficient than the Powell method. Even if the data set
becomes larger, more scalable optimizers, e.g., a stochastic
gradient method, can be used because the gradients can be
analytically calculated.</p>
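        <p>To illustrate why the derivatives are easy to obtain, the following Python sketch writes the partial derivatives of the m-match term with respect to each predicted rating and checks them against central finite differences. It assumes m-match is the negative squared difference of the two group means; gradients with respect to the model parameters would follow by the chain rule through equation (4), which is omitted here. All function names are hypothetical.</p>
        <preformat><![CDATA[
```python
def mean(values):
    return sum(values) / len(values)


def m_match(preds0, preds1):
    """m-match term: negative squared difference of the two group means."""
    return -(mean(preds0) - mean(preds1)) ** 2


def m_match_grad(preds0, preds1):
    """Analytical partial derivatives of m-match with respect to each
    predicted rating in the V = 0 group and in the V = 1 group."""
    diff = mean(preds0) - mean(preds1)
    grad0 = [-2.0 * diff / len(preds0)] * len(preds0)
    grad1 = [2.0 * diff / len(preds1)] * len(preds1)
    return grad0, grad1


def numeric_grad(preds0, preds1, eps=1e-6):
    """Central finite differences, used only to check the formula above."""
    def partial(group, idx):
        plus = [list(preds0), list(preds1)]
        minus = [list(preds0), list(preds1)]
        plus[group][idx] += eps
        minus[group][idx] -= eps
        return (m_match(*plus) - m_match(*minus)) / (2.0 * eps)

    return ([partial(0, i) for i in range(len(preds0))],
            [partial(1, i) for i in range(len(preds1))])


preds0, preds1 = [3.0, 4.0, 5.0], [2.0, 3.0]
print(m_match_grad(preds0, preds1))
print(numeric_grad(preds0, preds1))
```
]]></preformat>
        <p>The derivative is the same for every prediction within a group, so one pass over the data suffices to compute it; this is what makes gradient-based optimizers practical for this term.</p>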
        <p>These terms have the additional merit of being less
frequently trapped by local minima, because they are simple
quadratic formulae. However, it is not straightforward
to extend these CV-score-based neutrality terms so that
they are applicable to the case in which a viewpoint
variable is multivariate discrete or continuous, as is possible with
mutual-information-based neutrality terms. When comparing
m-match and r-match, the computation time for r-match is
roughly twice that for m-match, because a rating prediction
function must be evaluated for the cases of both V = 0 and
V = 1 to compute r-match. r-match more strictly formulates
neutrality than m-match. In the case of m-match, because
the neutrality is enhanced on average over the user
population, the neutrality of one user might be greatly enhanced,
but that of another might not. On the other hand, r-match
is designed so that neutrality is uniformly enhanced almost
everywhere over the domain of users and items. Unlike
m-match, r-match treats counterfactual cases. For example,
when the gender of a user is a viewpoint, even though the
gender does not change, ratings in such a counterfactual case
must be computed to use the r-match term. This
may be semantically improper.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>4. EXPERIMENTS</title>
      <p>We implemented our information-neutral recommender
system and applied it to a benchmark data set. We examined
the four types of neutrality terms proposed in section 3.3.</p>
    </sec>
    <sec id="sec-9">
      <title>4.1 Data Set</title>
      <p>
        We used a Movielens 100k data set [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] in our
experiments. Unfortunately, neither of the
mutual-information-based methods in section 3.3.1, mi-hist and mi-normal, was
able to process this entire data set. Therefore, we shrank the
Movielens data set by extracting events whose user ID and
item ID were less than or equal to 200 and 300, respectively.
This shrunken data set contained 9,409 events, 200 users, and 300 items.
For the scalable m-match and r-match methods, we applied a
larger data set as described in section 4.4.
The purpose of experiments on this small set was to compare
the characteristics of all four neutrality terms. The
mutual-information-based methods more strictly modeled the
distribution over R and V than the CV-score-based methods
described in section 3.3.2, m-match and r-match. If the
CV-score-based methods behaved similarly to the
mutual-information-based methods, we would be able to conclude
that CV-score-based methods can enhance the neutrality
and are scalable.
      </p>
      <p>
        We tested the following two types of viewpoint variable.
The first type of variable, Year, represents whether a movie's
release year is newer than 1990, which is part of the item
component of an event. In [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], Koren reported that older
movies have a tendency to be rated more highly, perhaps
because masterpieces are more likely to survive, and thus the
set of older movies has more masterpieces. When adopting
Year as a viewpoint variable, our recommender enhances the
neutrality from this masterpiece bias. The second type of
variable, Gender, represents the user's gender, which is part
of the user component of an event. We expect that the movie
ratings would depend on the user's gender.
      </p>
    </sec>
    <sec id="sec-10">
      <title>4.2 Experimental Conditions</title>
      <p>
        We optimized the objective function (5) with the neutrality
terms mi-hist or mi-normal by the Powell method, and that
with the terms m-match or r-match by the conjugate gradient
method implemented in the SciPy package [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. To
initialize the model parameters, events in a training set, D,
were first divided into two sets according to their
viewpoint values. For each value of the viewpoint variable, the
parameters were initialized by minimizing the objective
function of the original probabilistic matrix factorization model
(equation (3)). For convenience of implementation, the loss
term of the objective was rescaled by dividing it by the
number of training examples, and the L2 regularizer was
scaled by dividing it by the number of parameters. The four
types of neutrality terms were rescaled so that their
magnitudes became roughly equal. Because the
original rating values are 1, 2, ..., 5, we adopted five bins,
(−∞, 1.5], (1.5, 2.5], ..., (4.5, ∞), for the mi-hist term. We
used a regularization parameter λ = 0.01 and K = 1
latent factors, where K is the size of the vectors p(v) or
q(v). It should be noted that this data set was so small that
the prediction performance was degraded if K &gt; 1. Although
the performance without cross terms, i.e., K = 0,
was better than that with K = 1, we tested the
model having the minimum cross terms. Our experimental
code is available at http://www.kamishima.net/inrs/.
      </p>
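      <p>To illustrate the two optimizers and the rescaling described above, the following sketch minimizes a toy objective with scipy.optimize.minimize. It is not the experimental code: a small linear model stands in for the matrix factorization model, the data are synthetic, and the names objective and gradient are hypothetical. The Powell method needs only function values, whereas the conjugate gradient method is given an analytical gradient here.</p>
      <preformat><![CDATA[
```python
import numpy as np
from scipy.optimize import minimize

# Synthetic stand-in data (an assumption, not the Movielens data).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=50)
lam = 0.01  # regularization parameter


def objective(w):
    # Loss divided by the number of training examples,
    # L2 regularizer divided by the number of parameters.
    loss = np.sum((y - X @ w) ** 2) / len(y)
    reg = lam * np.sum(w ** 2) / w.size
    return loss + reg


def gradient(w):
    # Analytical gradient of `objective`.
    g_loss = -2.0 * X.T @ (y - X @ w) / len(y)
    g_reg = 2.0 * lam * w / w.size
    return g_loss + g_reg


w0 = np.zeros(3)
# Derivative-free search, usable when no gradient is available.
res_powell = minimize(objective, w0, method="Powell")
# Conjugate gradient, using the analytical gradient.
res_cg = minimize(objective, w0, jac=gradient, method="CG")
```
]]></preformat>
      <p>On this smooth toy objective both optimizers should reach nearly the same parameters; the practical difference lies in how many function evaluations each requires.</p>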
      <p>
        We evaluated our experimental results in terms of
prediction errors and the degree of neutrality. Prediction errors
were measured by the mean absolute error (MAE) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This
index is defined as the mean of the absolute difference
between the observed rating values and predicted rating
values. A smaller value of this index indicates better prediction
accuracy. To measure the degree of neutrality, we adopted the
mutual information between the predicted ratings and
viewpoint values. Smaller mutual information indicates a
higher level of neutrality. Mutual information was
normalized into the range [0, 1] by employing the geometric mean
as described in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], yielding the normalized mutual information (NMI). Note that the distribution Pr[r̂|v] is
required to compute this mutual information, and we used the
same histogram model as in equation (8). We performed a
five-fold cross-validation procedure to obtain evaluation
indices of the prediction errors and the neutrality measures.
      </p>
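      <p>The two evaluation indices can be sketched as below. This is a hedged illustration rather than the evaluation code itself: it assumes that the geometric-mean normalization divides the mutual information by the square root of the product of the two entropies, that the histogram model can be approximated by empirical counts over the five rating bins, and that the viewpoint is coded as 0 or 1. The function names mae, entropy, and nmi are hypothetical.</p>
      <preformat><![CDATA[
```python
import numpy as np

# Inner edges of the five bins (-inf, 1.5], (1.5, 2.5], ..., (4.5, inf).
EDGES = np.array([1.5, 2.5, 3.5, 4.5])


def mae(observed, predicted):
    """Mean absolute difference between observed and predicted ratings."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(observed - predicted)))


def entropy(p):
    """Entropy (in nats) of a probability vector, ignoring zero cells."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))


def nmi(predicted, viewpoint):
    """Mutual information between binned predictions and a binary
    viewpoint, normalized by the geometric mean of the entropies
    (an assumed form of the normalization)."""
    bins = np.digitize(predicted, EDGES, right=True)  # bin index 0..4
    v = np.asarray(viewpoint, dtype=int)              # 0 or 1
    joint = np.zeros((5, 2))
    np.add.at(joint, (bins, v), 1.0)                  # empirical counts
    joint /= joint.sum()
    h_r = entropy(joint.sum(axis=1))
    h_v = entropy(joint.sum(axis=0))
    if h_r == 0.0 or h_v == 0.0:
        return 0.0
    mi = h_r + h_v - entropy(joint.ravel())
    return float(max(mi, 0.0) / np.sqrt(h_r * h_v))
```
]]></preformat>
      <p>With this definition, predictions that are fully determined by the viewpoint give a value of 1, and predictions whose binned distribution is identical for both viewpoint values give 0.</p>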
    </sec>
    <sec id="sec-11">
      <title>Experimental Results</title>
      <p>Experimental results for the four types of neutrality terms
are shown in Figure 1. The MAE was 0.903 when the
rating being offered was held constant at 3.74, which is the
mean rating over all ratings in the training data. This
approximately simulates the case of randomly recommending
items, and can be considered the most unbiased and
unintentional recommendation. We call this case random
prediction. On the other hand, when applying the original
probabilistic matrix factorization model (equation (2)), the MAE
was 0.759. Because enhancing
neutrality generally worsens the prediction accuracy,
as discussed in section 2.2, this error level can be considered
a lower bound. We call this case basic prediction.</p>
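      <p>The random prediction baseline amounts to predicting the training mean for every event. A minimal sketch follows; the function name constant_baseline_mae and the toy ratings are hypothetical and are not the values from the experiments.</p>
      <preformat><![CDATA[
```python
import numpy as np


def constant_baseline_mae(train_ratings, test_ratings):
    """Predict the mean training rating for every test event
    and return (mean rating, MAE on the test events)."""
    mean_rating = float(np.mean(train_ratings))
    errors = np.abs(np.asarray(test_ratings, dtype=float) - mean_rating)
    return mean_rating, float(np.mean(errors))


# Hypothetical toy ratings, chosen only to make the arithmetic easy.
train = [5, 4, 3, 4]
test = [5, 3]
mean_rating, baseline = constant_baseline_mae(train, test)
```
]]></preformat>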
      <p>In Figures 1(a) and (c), the prediction errors were
better than those of random prediction. Overall, the increase in MAE
as the neutrality parameter, η, increased was not very great for any
of the neutrality terms. The errors for the r-match term
sometimes decreased even when η was increased. As described
in section 4.2, the model without cross terms performed
better. We think that the effects of the cross terms were
suppressed by the strong restriction of the r-match term, and
thus the MAEs improved. Turning to Figures 1(b) and (d), the
results obtained with the r-match term and with the other
three terms were clearly contrasted. The three terms,
mi-hist, mi-normal, and m-match, successfully enhanced
neutrality for the Year data, but enhanced neutrality less
for the Gender data. Conversely, the r-match term was able
to enhance neutrality for the Gender data, but it failed to
do so for the Year data. We suspect that this distinction
was caused by the original degree of independence between predicted
ratings and viewpoint values. By comparing the NMIs at
η = 0.01 in Figures 1(b) and (d), we found that the
dependence between ratings and viewpoint values for the Year
data was larger than that for the Gender data. Additionally, as
described in section 3.3.2, while the r-match term is designed
so that neutrality is uniformly enhanced over the domain of
users and items, the other three terms are designed
to enhance neutrality on average. In the case of the Year
data, the three terms could enhance neutrality on average
because the neutrality was low when η was small. However,
the restriction of the r-match term appears to have been too
strong for this data set. On the other hand, because the
average neutrality for the Gender data was high from the
beginning, the three terms failed to improve the neutrality, whereas
the stronger neutrality-enhancement ability of the r-match term
was effective in this case.</p>
      <p>[Figure 3, panel (b): NMI for the Year and Gender data sets.]</p>
      <p>To further investigate this phenomenon, we show the changes
in mean predicted ratings in Figure 2. Two types of
neutrality terms, m-match and r-match, were examined. First,
we focus on the case where η = 0.01, in which the
neutrality term had little influence. Comparing Figures 2(a)
and (b), the difference between the mean ratings for old and
new movies was much larger than the difference between the
mean ratings given by male and female users. In particular,
while the former difference was 0.36, the latter difference
was 0.024. This result again indicates that a higher level
of neutrality was achieved for the Gender data than for the
Year data. For the Year data, the m-match term successfully
reduced the difference between the two means as η increased,
but the r-match term failed to do so. For the Gender data,
both terms failed to reduce the difference between the two
means, because the difference was already small and the
constraint terms were not effective.</p>
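      <p>The quantity compared in this paragraph is the gap between the mean predicted ratings of the two viewpoint groups. It can be computed as in the sketch below, which assumes a binary viewpoint coded as 0 or 1; the function name mean_gap and the toy values are hypothetical.</p>
      <preformat><![CDATA[
```python
import numpy as np


def mean_gap(predicted, viewpoint):
    """Absolute difference between the mean predicted ratings
    of the two viewpoint groups (viewpoint coded as 0 or 1)."""
    predicted = np.asarray(predicted, dtype=float)
    viewpoint = np.asarray(viewpoint, dtype=int)
    mean_0 = predicted[viewpoint == 0].mean()
    mean_1 = predicted[viewpoint == 1].mean()
    return float(abs(mean_0 - mean_1))


# Hypothetical predictions: group 0 averages 4.0, group 1 averages 3.5.
gap = mean_gap([4.0, 4.0, 3.5, 3.5], [0, 0, 1, 1])
```
]]></preformat>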
    </sec>
    <sec id="sec-12">
      <title>Experiments on a Larger Data set</title>
      <p>
        To show that our new neutrality terms are applicable to
larger data sets, we applied an INRS to the entire Movielens
100k data set, which contains 10 times as much data as the
data set in the previous section. In our preliminary work [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
a data set of this size could not be processed. The MAEs of
random and basic predictions for this data set were 0.945
and 0.750, respectively. We adopted the m-match neutrality
term, and the other conditions were set as in section 4.2
except for K = 3. Figure 3 shows the changes in the MAE
and NMI as η increased. Trends similar
to those in Figure 1 were observed. While neutrality was
successfully enhanced without sacrificing prediction accuracy
for the Year data, the m-match term was not effective for
the Gender data.
      </p>
      <p>[Figure 1. Surviving panel titles: (c) Prediction error (MAE) for Gender data; (d) Degree of neutrality (NMI) for Gender data.]
NOTE: Subfigures (a) and (b) are results on the Year data set, and Subfigures (c) and (d) are results on the Gender data
set. Subfigures (a) and (c) show the changes in prediction errors measured by the mean absolute error (MAE, on a linear
scale). A smaller value of this index indicates better prediction accuracy. Subfigures (b) and (d) show the changes in the
normalized mutual information (NMI, on a log scale). A smaller NMI indicates a higher level of neutrality. The X-axes
(log scale) of these figures represent the values of the neutrality parameter, η, which balances the prediction accuracy and the
neutrality. This parameter was varied from 0.01, at which the neutrality term was almost completely ignored, to 100,
at which the neutrality was strongly enhanced.
      </p>
      <p>[Figure 2. Mean predicted ratings: (a) Year; (b) Gender.]
NOTE: In both figures, the X-axes (log scale) represent the values of the neutrality parameter, η, and the Y-axes represent
the mean predicted ratings for each viewpoint value. Subfigure (a) shows the mean predicted ratings
when the viewpoint variable is Year. Means for the movies before 1990 are designated as "old," and those after 1991 are
designated as "new." Subfigure (b) shows the mean predicted ratings when the viewpoint variable is Gender. Means of the
ratings given by males and females are represented by "M" and "F," respectively.</p>
      <p>Finally, we should comment on the computational time.
Generally, the terms based on mutual information were much
slower than those based on the CV score. This is because
analytical forms of the gradients can be derived for the m-match
and r-match terms. Comparing these two terms,
the former was found to be faster, as described in
section 3.3.2. Empirically, as η increased, the convergence
of the optimizers became slower, because the neutrality terms
are less smooth than the loss term and thus harder to
optimize. The influence of an increase in η was more serious
for the r-match term than for the m-match term.</p>
    </sec>
    <sec id="sec-13">
      <title>RELATED WORK</title>
      <p>
        We adopted techniques for fairness-aware or
discrimination-aware data mining to enhance neutrality. Fairness-aware
data mining is a general term for mining techniques designed
so that sensitive information does not influence the
mining results. Pedreschi et al. first advocated such mining
techniques, which emphasized the unfairness in association
rules whose consequents include serious determinations [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Another technique of fairness-aware data mining focuses on
classification designed so that the influence of sensitive
information on classification results is reduced [
        <xref ref-type="bibr" rid="ref2 ref8">8, 2</xref>
        ]. These
techniques would be directly useful in the development of an
information-neutral variant of content-based recommender
systems, because content-based recommenders can be
implemented by standard classifiers.
      </p>
      <p>
        Because information-neutral recommenders can be used
to avoid the exploitation of private information, these
techniques are related to privacy-preserving data mining [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. To
protect private information contained in rating information,
dummy ratings were added [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
    </sec>
    <sec id="sec-14">
      <title>CONCLUSION</title>
      <p>In this paper, we proposed an information-neutral
recommender system that enhances neutrality from the viewpoint
specified by a user. This system is useful for alleviating the
filter bubble problem. We then developed an
information-neutral recommendation algorithm by introducing several
types of neutrality terms. Because the neutrality term in our
preliminary work had poor scalability, we proposed a new
and more efficient neutrality term. Finally, we demonstrated
that neutrality in recommendation could be enhanced by our
algorithm without sacrificing the prediction accuracy.</p>
      <p>There are many functionalities required for this
information-neutral recommender system. We plan to explore other
types of neutrality terms that can more exactly evaluate
the independence between a target variable and a
viewpoint variable while maintaining efficiency. Because
viewpoint variables are currently restricted to the binary type, we
will also try to develop a neutrality term that can deal with a
viewpoint variable that is multivariate discrete or
continuous. Though our current technique is mainly applicable to
the task of predicting ratings, we will develop another
algorithm for the task of recommending good items.</p>
    </sec>
    <sec id="sec-15">
      <title>ACKNOWLEDGMENTS</title>
      <p>We would like to thank the Grouplens research lab for
providing a data set. This work is supported
by MEXT/JSPS KAKENHI Grant Numbers 16700157,
21500154, 23240043, 24500194, and 25540094.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C. C.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          and P. S. Yu, editors.
          <source>Privacy-Preserving Data Mining: Models and Algorithms</source>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Calders</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Verwer</surname>
          </string-name>
          .
          <article-title>Three naive Bayes approaches for discrimination-free classification</article-title>
          .
          <source>Data Mining and Knowledge Discovery</source>
          ,
          <volume>21</volume>
          :
          <fpage>277</fpage>
          –
          <fpage>292</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Forden</surname>
          </string-name>
          .
          <article-title>Google said to face ultimatum from FTC in antitrust talks</article-title>
          . Bloomberg, Nov. 13,
          <year>2012</year>
          . http://bloom.bg/PPNEaS
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] Grouplens Research Lab, University of Minnesota. http://www.grouplens.org/
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gunawardana</surname>
          </string-name>
          and
          <string-name>
            <given-names>G.</given-names>
            <surname>Shani</surname>
          </string-name>
          .
          <article-title>A survey of accuracy evaluation metrics of recommendation tasks</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>10</volume>
          :
          <fpage>2935</fpage>
          –
          <fpage>2962</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kamishima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Akaho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Asoh</surname>
          </string-name>
          , and
          <string-name>
            <surname>J. Sakuma.</surname>
          </string-name>
          <article-title>Considerations on fairness-aware data mining</article-title>
          .
          <source>In Proc. of the IEEE Int'l Workshop on Discrimination and Privacy-Aware Data Mining</source>
          , pages
          <volume>378</volume>
          –
          <fpage>385</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kamishima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Akaho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Asoh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Sakuma</surname>
          </string-name>
          .
          <article-title>Enhancement of the neutrality in recommendation</article-title>
          .
          <source>In Proc. of the 2nd Workshop on Human Decision Making in Recommender Systems</source>
          , pages
          <fpage>8</fpage>
          –
          <fpage>14</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kamishima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Akaho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Asoh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Sakuma</surname>
          </string-name>
          .
          <article-title>Fairness-aware classifier with prejudice remover regularizer</article-title>
          .
          <source>In Proc. of the ECML PKDD</source>
          <year>2012</year>
          ,
          <string-name>
            <surname>Part</surname>
            <given-names>II</given-names>
          </string-name>
          , pages
          <volume>35</volume>
          –
          <fpage>50</fpage>
          ,
          <year>2012</year>
          . [LNCS 7524].
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Koren</surname>
          </string-name>
          .
          <article-title>Factorization meets the neighborhood: A multifaceted collaborative filtering model</article-title>
          .
          <source>In Proc. of the 14th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining</source>
          , pages
          <volume>426</volume>
          –
          <fpage>434</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Koren</surname>
          </string-name>
          .
          <article-title>Collaborative filtering with temporal dynamics</article-title>
          .
          <source>In Proc. of the 15th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining</source>
          , pages
          <volume>447</volume>
          –
          <fpage>455</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>E. Pariser.</surname>
          </string-name>
          <article-title>The filter bubble</article-title>
          . http://www.thefilterbubble.com/
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>E. Pariser.</surname>
          </string-name>
          <article-title>The Filter Bubble: What The Internet Is Hiding From You</article-title>
          . Viking,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          .
          <article-title>Discrimination-aware data mining</article-title>
          .
          <source>In Proc. of the 14th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining</source>
          , pages
          <volume>560</volume>
          –
          <fpage>568</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Resnick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Konstan</surname>
          </string-name>
          ,
          and
          <string-name>
            <surname>A. Jameson.</surname>
          </string-name>
          <article-title>Panel on the filter bubble</article-title>
          .
          <source>The 5th ACM Conf. on Recommender Systems</source>
          ,
          <year>2011</year>
          . http://acmrecsys.wordpress.com/2011/10/25/panel-on-the-filter-bubble/
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R.</given-names>
            <surname>Salakhutdinov</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Mnih</surname>
          </string-name>
          .
          <article-title>Probabilistic matrix factorization</article-title>
          .
          <source>In Advances in Neural Information Processing Systems</source>
          <volume>20</volume>
          , pages
          <fpage>1257</fpage>
          –
          <fpage>1264</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          Scipy.org. http://www.scipy.org/
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Watanabe</surname>
          </string-name>
          .
          <article-title>Knowing and Guessing – Quantitative Study of Inference and Information</article-title>
          . John Wiley &amp; Sons,
          <year>1969</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>U.</given-names>
            <surname>Weinsberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhagat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ioannidis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Taft</surname>
          </string-name>
          .
          <article-title>BlurMe: Inferring and obfuscating user gender based on ratings</article-title>
          .
          <source>In Proc. of the 6th ACM Conf. on Recommender Systems</source>
          , pages
          <fpage>195</fpage>
          –
          <fpage>202</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Hurley</surname>
          </string-name>
          .
          <article-title>Avoiding monotony: Improving the diversity of recommendation lists</article-title>
          .
          <source>In Proc. of the 2nd ACM Conf. on Recommender Systems</source>
          , pages
          <fpage>123</fpage>
          –
          <fpage>130</fpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <surname>C.-N. Ziegler</surname>
            ,
            <given-names>S. M.</given-names>
          </string-name>
          <string-name>
            <surname>McNee</surname>
            ,
            <given-names>J. A.</given-names>
          </string-name>
          <string-name>
            <surname>Konstan</surname>
            , and
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Lausen</surname>
          </string-name>
          .
          <article-title>Improving recommendation lists through topic diversification</article-title>
          .
          <source>In Proc. of the 14th Int'l Conf. on World Wide Web</source>
          , pages
          <volume>22</volume>
          –
          <fpage>32</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>