<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Recommendation from intransitive pairwise comparisons</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marta Crispino</string-name>
          <email>marta.crispino@phd.unibocconi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valeria Vitelli</string-name>
          <email>valeria.vitelli@medisin.uio.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elja Arjas</string-name>
          <email>earjas@mappi.helsinki.fi</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arnoldo Frigessi</string-name>
          <email>arnoldo.frigessi@medisin.uio.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DEC, Bocconi University</institution>
          ,
          <addr-line>Milano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>OCBE, University of Oslo</institution>
          ,
          <addr-line>Oslo</addr-line>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
<institution>University of Helsinki and OCBE, University of Oslo</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2016</year>
      </pub-date>
      <issue>2</issue>
      <abstract>
<p>In this paper we propose a fully Bayesian probabilistic method to learn preferences from non-transitive pairwise comparison data. Such lack of transitivity easily arises when the number of pairwise comparisons is large and they are given sequentially, without allowing for consistency checks. We develop a Bayesian Mallows model able to handle such data through a latent layer of uncertainty which captures the generation of preference misreporting. We then construct an MCMC algorithm, and test the procedure on simulated data.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
<p>We consider pairwise preference data of the form "x is
preferred to y", denoted x ≻ y, where x and y are some items
of interest. The challenge with this kind of data is that
pairwise preferences are not always transitive. For instance,
the data coming from a single user may contain preferences
of the form {x ≻ y, y ≻ z, z ≻ x}. Such non-transitivity
of preferences arises for many reasons, including the user's
inattentiveness, actual ambiguity in the user's preferences,
multiple users sharing the same account, and varying framing
of the data collection. These situations are so common that
most pairwise comparison data are in fact non-transitive,
thus creating the need for methods able to learn users'
preferences from data that lack logical transitivity.</p>
      <p>
In this paper we assume that non-transitive data arise
because users make mistakes, i.e. they switch the order of the
two compared items, either inadvertently or because
the compared items have rather similar ranks for the user.
The developed Bayesian methodology provides the
posterior distribution of the consensus ranking of a homogeneous
group of users, as well as of the estimated individual
rankings for each user. The consensus ranking can be seen as
a compromise formed from the individual pairwise
preferences. The estimated individual rankings can be used
for personalized recommendation. Inference is based on an
adapted version of the Markov Chain Monte Carlo algorithm
in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
Previous work on pairwise preferences includes the
Bradley-Terry model [
        <xref ref-type="bibr" rid="ref1">1</xref>
] (which, however, estimates no personalized
preferences) and, within the framework of the Mallows model, [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]
and [
        <xref ref-type="bibr" rid="ref3">3</xref>
], where transitivity is assumed.
      </p>
    </sec>
    <sec id="sec-2">
<title>2. METHODS</title>
<p>Let N users independently express their preferences
between pairs of items in O = {O1, ..., On}. We assume that
each user j receives a possibly different subset Cj = {Cj,1, ...,
Cj,Mj} of pairs, of size Mj ≤ n(n-1)/2. Let Bj =
{Bj,1, ..., Bj,Mj} be the set of preferences given by user j,
where Bj,m is the order she gives to the pair Cj,m. For
example, if Cj,m = (O1, O2), then Bj,m = (O1 ≻ O2) means that
O1 is preferred to O2. The considered data are incomplete,
since not all items are observed by each user. We assume no
ties in the data: users are forced to express their preference
for all pairs in the list Cj assigned to them, and indifference
is not permitted.</p>
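<p>To make the data structure concrete, the following Python sketch (our own illustration with hypothetical names, not part of the paper) encodes a user's preference set as directed pairs and checks transitivity by cycle detection, flagging triples such as {O1 ≻ O2, O2 ≻ O3, O3 ≻ O1}:</p>

```python
# Sketch (not from the paper): represent a user's pairwise preferences
# and check whether they are transitive, i.e. free of preference cycles.
# Items are labelled 1..n; a pair (a, b) means "a is preferred to b".

def is_transitive(preferences, n):
    """Return True if the directed preference graph has no cycle."""
    adj = {i: [] for i in range(1, n + 1)}
    for a, b in preferences:
        adj[a].append(b)
    state = {i: 0 for i in range(1, n + 1)}  # 0=unseen, 1=on stack, 2=done

    def dfs(u):
        state[u] = 1
        for v in adj[u]:
            if state[v] == 1:          # back edge: a preference cycle
                return False
            if state[v] == 0 and not dfs(v):
                return False
        state[u] = 2
        return True

    return all(dfs(i) for i in range(1, n + 1) if state[i] == 0)

# A consistent user: O1 over O2, O2 over O3, O1 over O3.
print(is_transitive([(1, 2), (2, 3), (1, 3)], 3))  # True
# The intransitive triple from the introduction.
print(is_transitive([(1, 2), (2, 3), (3, 1)], 3))  # False
```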
<p>The main assumption is that each user j has a personal
latent ranking, R̃j = (R̃j1, ..., R̃jn) ∈ Pn (the space of
n-dimensional permutations), distributed according to the
Mallows model</p>
<p>P(R̃j | ρ, α) = exp{-(α/n) d(R̃j, ρ)} 1_Pn(R̃j) / Zn(α).</p>
<p>
We assume conditional independence between R̃1, ..., R̃N
given ρ and α. Here ρ ∈ Pn is the shared consensus ranking,
α &gt; 0 is the scale parameter (describing the concentration
around the shared consensus), d(·, ·) : Pn × Pn → [0, ∞) is
any right-invariant distance function (e.g. Kendall,
Spearman or footrule), and Zn(α) is the normalizing constant.
See [
<xref ref-type="bibr" rid="ref3">3</xref>
] for more details.
</p>
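<p>For intuition, the Mallows density can be evaluated exactly for small n by enumerating Pn to compute Zn(α); by right-invariance Zn depends on α only, not on ρ. A minimal Python sketch (illustrative only, not the paper's implementation, which relies on the MCMC of [3]), using the footrule distance:</p>

```python
# Sketch: evaluate the Mallows density exp{-(alpha/n) d(r, rho)} / Zn(alpha)
# by brute-force enumeration of all permutations (feasible only for small n).
# Uses the footrule (l1) distance, which is right-invariant.
from itertools import permutations
from math import exp

def footrule(r, s):
    return sum(abs(a - b) for a, b in zip(r, s))

def mallows_pmf(r, rho, alpha):
    n = len(rho)
    # Right-invariance makes Zn depend on alpha only, not on rho.
    z = sum(exp(-(alpha / n) * footrule(p, rho))
            for p in permutations(range(1, n + 1)))
    return exp(-(alpha / n) * footrule(r, rho)) / z

rho = (1, 2, 3, 4)
# The consensus itself is the modal ranking.
print(mallows_pmf(rho, rho, alpha=3.0))
```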
<p>We model the situation where each user j, when
announcing her pairwise preferences, mentally checks where the items
under comparison are in her latent ranking R̃j. Then, if the
user is consistent with R̃j, the pairwise orderings in Bj are
induced by R̃j according to the rule: (Ok ≻ Oi) ⟺
R̃jk &lt; R̃ji. In this case Bj contains only transitive
preferences. However, if the user is not fully consistent with her
own latent ranking, the pairwise orderings in Bj can be
non-transitive. In order to deal with this situation, we propose
a probabilistic model based on the assumption that
non-transitivities are due to mistakes in deriving the pair order
from the latent ranking R̃j. The likelihood assumed for a set
of preferences Bj is</p>
<p>P(Bj | ρ, α) = Σ_{Rj ∈ Pn} P(Bj | R̃j = Rj) P(R̃j = Rj | ρ, α),</p>
<p>where P(Bj | R̃j = Rj) is the probability of ordering the pairs
in Cj as in Bj (possibly generating non-transitivities), when
the latent ranking for user j is Rj. It is therefore the
probability of making mistakes instead of just following Rj.
The posterior density is then</p>
<p>π(ρ, α | B1, ..., BN) ∝ π(α) π(ρ) Π_{j=1}^{N} [ Σ_{Rj ∈ Pn} P(Bj | Rj) P(Rj | ρ, α) ],</p>
<p>where we assume a gamma prior, π(α), for α and a uniform
prior on Pn, π(ρ), for ρ. We suggest two models for the
probability of making a mistake: the Bernoulli model (BM)
to handle random mistakes, and the logistic model (LM) for
mistakes that are due to difficulty in ordering similar items.
In both models we assume that any two pair comparisons
made by a user are conditionally independent given her
latent ranking: (Bj,m1 ⊥ Bj,m2) | R̃j for all m1, m2 = 1, ..., Mj. In
BM we assume the following Bernoulli-type model for the
probability of making a mistake:</p>
<p>P(Bj,m = B̃j,m(R̃j) | θ, R̃j) = θ,  θ ∈ [0, 0.5),</p>
<p>where B̃j,m(R̃j) is the reversed preference order, i.e. a
mistake. We assign to θ a Beta distribution truncated to
the interval [0, 0.5) as prior. In LM we assume the following
logistic-type model for the probability of making a mistake:</p>
<p>logit P(Bj,m = B̃j,m(R̃j) | R̃j, β0, β1) = β0 + β1 d(R̃j,m),</p>
<p>where d(R̃j,m) is the footrule or l1 distance between the
individual ranks of the items compared: if Bj,m = (O1 ≻ O2),
d(R̃j,m) = |R̃j1 - R̃j2|. We assign to β1 an exponential prior
on the negative support, π(β1) = λ1 exp{λ1β1} 1(β1 &lt; 0) with λ1 &gt; 0,
and to β0, conditioned on β1, an exponential prior on the
shifted support (-∞, -β1), π(β0 | β1) = λ0 exp{λ0(β0 + β1)} 1(β0 &lt; -β1)
with λ0 &gt; 0. These choices are motivated by the fact that we want
to model a negative dependence between the distance of the items and
the probability of making a mistake. Also, we want to force
the probability of making a mistake when the items have
ranks differing by 1 to be less than 0.5.</p>
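<p>Under BM, conditional independence makes the likelihood of a user's whole preference set factorize into Bernoulli terms. A Python sketch of this conditional probability (our own illustration with hypothetical names; under LM the constant θ would be replaced by the pair-specific logistic probability above):</p>

```python
# Sketch: P(Bj | Rj, theta) under the Bernoulli mistake model (BM).
# rank[i] is the latent rank of item i (1 = best). Each announced pair
# (a, b), meaning "a preferred to b", is a mistake iff rank[a] > rank[b];
# mistakes occur independently with probability theta in [0, 0.5).

def bm_likelihood(preferences, rank, theta):
    mistakes = sum(1 for a, b in preferences if rank[a] > rank[b])
    correct = len(preferences) - mistakes
    return (theta ** mistakes) * ((1 - theta) ** correct)

rank = {1: 1, 2: 2, 3: 3}           # latent ranking of O1, O2, O3
b_j = [(1, 2), (2, 3), (3, 1)]      # intransitive: (3, 1) contradicts rank
print(bm_likelihood(b_j, rank, 0.1))  # one mistake: 0.1 * 0.9 * 0.9
```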
    </sec>
    <sec id="sec-3">
<title>3. EXPERIMENTAL RESULTS</title>
<p>We performed a number of experiments on simulated data,
generated according to the model in Section 2 with BM
noise, a fixed ρTRUE and n = 10 items. For various
values of θ we sampled latent rankings R̃j,TRUE from the
Mallows density centered at ρTRUE, from which we generated
the individual non-transitive pair comparisons. In order
to assess the performance of our methods, in Figure 1 we
plotted the posterior CDF of the footrule distance between the
estimated consensus ranking and the true generating one,
df(ρ, ρTRUE) = Σ_{i=1}^{n} |ρi - ρi,TRUE|, for varying parameters
N, θ, and M (the parameter of a Poisson from which the
number of comparisons assigned to each user is sampled).
As expected, the performance of the method improves as
the number of users N increases (Figure 1, top right),
becomes worse as the probability of making mistakes increases
(top left), improves as the dispersion of the individual latent
rankings R̃j,TRUE around ρTRUE decreases, i.e. when α
increases (bottom left), and becomes worse when the number
of pairwise comparisons diminishes, i.e. when M decreases
(bottom right). The new method performs generally well
even if the number of pair comparisons per user is around
one third of all possible pair comparisons (M = 15). We
then studied the precision of the estimated individual
preferences by comparing R̃j with R̃j,TRUE in terms of top-3
detection. For each user j, we found the triplet of items
D3j = {Oi1, Oi2, Oi3} with the maximal estimated posterior
probability of being ranked jointly among the top 3 items.
Let H5j be the set of 5 items with the 5 highest ranks in
R̃j,TRUE. We computed the percentage of users for which
D3j ⊆ H5j. This success percentage was 60%-85% when the
data were generated with α = 2 and Mj = 15 for all j, which
is a hard problem. For easier experiments, with larger α
(ranging from 3 to 4) and larger M (up to 30), the
success percentages increased to 90%-100%.</p>
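<p>The two evaluation criteria used above are straightforward to compute. A Python sketch (our own illustration; names hypothetical) of the footrule error df and of the top-3 success check D3j ⊆ H5j:</p>

```python
# Sketch of the two evaluation criteria from this section.

def footrule_error(rho, rho_true):
    """df(rho, rho_true) = sum_i |rho_i - rho_true_i| for two rank vectors."""
    return sum(abs(a - b) for a, b in zip(rho, rho_true))

def top3_success(d3, rank_true):
    """True if the 3 items in d3 all lie among the 5 best-ranked items
    in rank_true (a dict item -> true latent rank, 1 = best)."""
    h5 = set(sorted(rank_true, key=rank_true.get)[:5])
    return set(d3).issubset(h5)

rho_true = [1, 2, 3, 4, 5]
print(footrule_error([2, 1, 3, 4, 5], rho_true))   # 2: items 1 and 2 swapped
rank_true = {i: i for i in range(1, 11)}           # 10 items, item i at rank i
print(top3_success([1, 3, 5], rank_true))          # True
print(top3_success([1, 2, 6], rank_true))          # False
```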
    </sec>
    <sec id="sec-4">
<title>4. CONCLUSIONS</title>
      <p>
        We extended the Bayesian method for learning preferences
in [
        <xref ref-type="bibr" rid="ref3">3</xref>
] to non-transitive pairwise comparison data. We
produce estimates of the consensus ranking of all items under
consideration, as well as a personal estimated ranking of
all items for each user. This can be directly used for
personalized recommendations. We also obtain posterior
distributions for these preferences, allowing quantification of the
uncertainty of recommendations of interest.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Caron</surname>
          </string-name>
          ,
<string-name>
<given-names>A.</given-names>
<surname>Doucet</surname>
</string-name>
,
<article-title>Efficient Bayesian Inference for generalized Bradley-Terry Models</article-title>
          ,
          <source>Journal of Computational and Graphical Statistics</source>
          ,
          <volume>21</volume>
          (
          <issue>1</issue>
          ),
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Lu</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Boutilier</surname>
          </string-name>
,
<article-title>Effective sampling and learning for Mallows models with pairwise preference data</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>V.</given-names>
            <surname>Vitelli</surname>
          </string-name>
,
<string-name>
<given-names>Ø.</given-names>
<surname>Sørensen</surname>
</string-name>
,
          <string-name>
            <given-names>A.</given-names>
            <surname>Frigessi</surname>
          </string-name>
and
<string-name>
<given-names>E.</given-names>
<surname>Arjas</surname>
</string-name>
,
<article-title>Probabilistic preference learning with the Mallows rank model</article-title>
          ,
          <source>ArXiv</source>
          e-prints,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>