<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A graphical view of distance between rankings: The Point and Area measures (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Giorgio Maria Di Nunzio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gianmaria Silvello</string-name>
          <email>silvellog@dei.unipd.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Information Engineering</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Padua</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In Information Retrieval (IR), measuring the distance between rankings is a way for comparing evaluation measures and assess system rankings. In this paper, we present a variation of the Spearman foot rule which allows us to de ne two measures that have nice analytical and geometrical properties that can be e ectively used to compare di erent rankings and to evaluate IR experiments. A Web application that shows how these measures behave from the graphical point of view is available at: https://gmdn.shinyapps.io/ShinyPointArea/</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Search engines e ectiveness can be measured by analyzing their visible outcomes
which are lists of documents ranked in descending order of relevance to a given
topic. In this context, the correlation among rankings can be used to assess the
search engines e ectiveness; in fact, when one of the two rankings is the ideal one
{ i.e. the best obtainable result in a laboratory based evaluation { the correlation
between two rankings becomes a measure of e ectiveness. A high correlation
between a ranking under evaluation and the ideal indicates a good behavior of
the search engine being tested. Two standard de-facto measures of distance are
the Spearman foot rule [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and the Kendall rank distance [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Both measures are
very easy to calculate and to interpret, but they lack some properties when it
comes to evaluate search engines e ectiveness. A review and a classi cation of
ranking similarity measures (including extensions of the Spearman foot rule and
the Kendall rank distance) has been presented by Webber et alii in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. This
classi cation is based on two main properties: conjointness and weightedness.
Conjointness is the property of dealing with complete (conjoint) or partial
(nonconjoint) rankings; weightedness is the property of being able (weighted) or not
being able (unweighted) to weight misplacements at the top of the list more than
at its bottom.
      </p>
      <p>In this paper, we present a variation of the Spearman foot rule leading to
the de nition of two new measures: a point-wise (qualitative) measure and an
area-wise (quantitative) measure which can be classi ed as non-conjoint and
unweighted. 1 Point-wise and area-wise measures present analytical and
geometrical properties that can be e ectively used to compare di erent rankings and
to evaluate IR experiments. Furthermore, the point-wise measure provides an
intuitive and e ective graphical interpretation that can be used for performing
a qualitative analysis on rankings comparison. Whereas, the area-wise measure
can be used for a quantitative analysis given that its normalized version o ers
a simple measure of correlation between rankings. Moreover, as a by-product
of this approach, we also obtain an original reformulation of the Kendall rank
distance computed at each rank.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Methodological Approach</title>
      <p>Given a set of documents D = fd1; : : : ; di; : : : ; dng, we consider two rankings r
and r as two permutations without repetitions of D. 2 We can use a method
idx (di) to extract the index of document di within r . In general, the problem
is to nd the index of a document of r in the another ranking r . To this
purpose, we de ne the function F ; (k) as:</p>
      <p>F ; (k) = idx (r (k)))
This function translates the k-th document in r and returns the index of
that document in the ranking r . For example, let ra = [d2; d1; d4; d3] and
rb = [d1; d4; d2; d3] be two instances of r and r . Then, for k = 1, Fa;b(1) =
idxb (ra(1)) = idxb (d2) = 3.</p>
      <p>The de nition of Spearman footrule can be rewritten using Eq. 1 as:
P ; (i) =</p>
      <p>
        i
X(F ; (k)
k=1
k) :
1 The point-wise and area-wise measures already contain a parameter for weighting
the top and the bottom of a ranking list di erently. For the purpose of this paper,
we present the unweighted version of the measures.
2 There are important research areas in IR which require distance measures on
incomplete permutations (see Webber et alii [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], Fagin et alii [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] for Web search results and
Angelini et alii [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] for search engine failure analysis), also known as the property of
conjointness of a distance measure. For space reasons, we cannot address this issue
in this paper and leave the solution of this problem to future works.
which is the total element-wise misplacements between the two lists r and r .
3
      </p>
    </sec>
    <sec id="sec-3">
      <title>Point and Area Measures</title>
      <p>The two measures we present in this paper derive from a variation of the
Spearman foot rule. By removing the absolute value from the Spearman foot rule, we
obtain the point-wise measure:</p>
      <p>i
S ; (i) = X jF ; (k)
k=1
kj :
(1)
(2)
(3)
As a result, we make a distinction between negative and positive misplacements:
when F ; (k) &gt; k, we obtain a positive error because the document at rank k
should have been ranked higher in r ; when F ; (k) &lt; k, we obtain a negative
value because we nd a highly relevant document lower in the list, which means
that we are recovering a misplacement occurred earlier. Ultimately, the
pointwise goes to zero when the last element of the list is computed. When the
pointwise measure returns a positive number at any rank i, it means that there is
still some non-recovered misplacement in rb. On the other hand, every time
P ; (i) = 0 it means that the two rankings have retrieved the same elements at
rank i. Compared to Spearman, the measure P ; (i) gives us some additional
information about the distance between a given ranking and the ideal one at rank
i: we can tell how far we are from the ideal ranking. However, the point-wise
measure does not tell how bad a ranking is before rank i.</p>
      <p>The area-wise measure considers the area formed by the segments between
two adjacent points P ; (k 1) and P ; (k) and the x-axis. As an approximation
of the area to be measured, we use a linear interpolation between adjacent points
and we de ne the A ; (i) as the as a sum of all the trapezoids formed with the
x-axis from rank 1 to i:</p>
      <p>i
A ; (i) = X
k=1</p>
      <p>P ; (k
1) + P ; (k)
2
h
where h is the height of each trapezoid, and P ; (0) = 0 by assumption. The
height h of each trapezoid can be a constant (for example h = 1), or it can be
used as a tuning parameter for weighting errors di erently according to the rank
of the elements.</p>
      <p>The area-wise measure can be used e ectively in two ways: to accumulate the
information about misplacements until rank i, to compare two or more rankings
given a reference ranking (for example, the ideal one). Moreover, we can also
study when the same type of misplacements occur in one ranking or another
(earlier or later in the ranking), for example when two adjacent relevance degrees
are exchanged.</p>
      <p>It may be convenient to normalise the value returned by the area-wise
between 0 and 1. The normalisation can be done by dividing the area of a relevance
list at rank i by the largest obtainable area given by the worst possible ranking,
that is the ideal ranking taken in the inverse order. The normalisation of the
area-wise measure is de ned as:
nA ; =</p>
      <p>
        A ;
A ;
(4)
(5)
where A ; is the area of the worst possible ranking. The normalized area nA ;
de nes the distance between and where nA ; = 0 means that and are
the same ranking and nA ; = 1 means that they are inverse rankings one of
the other. From this it is straightforward to derive the correlation coe cient as:
A-corr ; = 1 nA ; .
In this paper, we have introduced two new measures of distance among rankings:
the point-wise and the area-wise measure. The point-wise measure was derived
as a modi cation of the original Spearman foot rule, while the area-wise measure
is built on top of the point-wise. The normalisation of the area-wise measure and
the correlation coe cient derived from it { i.e. A-corr { is a measure of correlation
between rankings. We have discussed some properties of these two measures
in terms of qualitative analysis and quantitative analysis. For space reasons,
we could not give a complete formalisation of the fact that both measures are
metrics { i.e. the re exivity, non-negativity, symmetry and triangle inequality
properties hold. The area-wise measure already incorporates a parameter h that
can be used to weight misplacements that happen at top ranks more than at low
ranks; a possibility is to de ne h as the inverse of the index h(i) = 1=i. In this
way, we would obtain a non-conjoint weighted distance measure. In addition, the
parameter h can be used to model users search intents; for example, we could
adjust the height h as a tuning parameter to represent a particular aspect of
user behaviour such as the level of patience in the RBO measure [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
Furthermore, we want to investigate the use of the point-wise curve and the
A-corr measure as a full- edged e ectiveness measures. The behavior of the
point-wise curve resembles the Cumulative Relative Position (CRP) [
        <xref ref-type="bibr" rid="ref1 ref4">1, 4</xref>
        ] one,
but it presents several key di erences such as the fact that the point-wise curve
does not cross the x-axis whereas the CRP one does and that CRP is de ned as
an e ort-oriented measure and not as a correlation one.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>M.</given-names>
            <surname>Angelini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          , K. Jarvelin,
          <string-name>
            <given-names>H.</given-names>
            <surname>Keskustalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pirkola</surname>
          </string-name>
          , G. Santucci, and
          <string-name>
            <given-names>G.</given-names>
            <surname>Silvello. Cumulated Relative</surname>
          </string-name>
          <article-title>Position: A Metric for Ranking Evaluation</article-title>
          .
          <source>In Proc. of the 3rd Int. Conf. of the CLEF Initiative (CLEF</source>
          <year>2012</year>
          ), pages
          <fpage>112</fpage>
          {
          <fpage>123</fpage>
          . LNCS 7488, Springer, Heidelberg, Germany,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>M.</given-names>
            <surname>Angelini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          , G. Santucci, and
          <string-name>
            <given-names>G.</given-names>
            <surname>Silvello. VIRTUE</surname>
          </string-name>
          :
          <article-title>A Visual Tool for Information Retrieval Performance Evaluation and Failure Analysis</article-title>
          .
          <source>Journal of Visual Languages &amp; Computing</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kumar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Sivakumar</surname>
          </string-name>
          .
          <article-title>Comparing Top k Lists</article-title>
          .
          <source>In Proc. of the 14th annual ACM-SIAM symposium on Discrete algorithms, SODA '03</source>
          , pages
          <fpage>28</fpage>
          {
          <fpage>36</fpage>
          .
          <source>Society for Industrial and Applied Mathematics</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          , G. Silvello,
          <string-name>
            <given-names>H.</given-names>
            <surname>Keskustalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pirkola</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Ja</surname>
          </string-name>
          <article-title>rvelin. The Twist Measure for IR Evaluation: Taking User's E ort Into Account</article-title>
          .
          <source>Journal of the American Society for Information Science and Technology (JASIST)</source>
          ,
          <article-title>(in print).</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Kendall</surname>
          </string-name>
          .
          <article-title>A New Measure of Rank Correlation</article-title>
          . Biometrika,
          <volume>30</volume>
          (
          <issue>1</issue>
          /2):
          <volume>81</volume>
          {
          <fpage>93</fpage>
          ,
          <year>1938</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>C.</given-names>
            <surname>Spearman</surname>
          </string-name>
          .
          <article-title>The Proof and Measurement of Association Between Two Things</article-title>
          .
          <source>American Journal of Psychology</source>
          ,
          <volume>15</volume>
          :
          <fpage>88</fpage>
          {
          <fpage>103</fpage>
          ,
          <year>1904</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>W.</given-names>
            <surname>Webber</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.</surname>
          </string-name>
          <article-title>Mo at, and</article-title>
          <string-name>
            <given-names>J.</given-names>
            <surname>Zobel</surname>
          </string-name>
          .
          <article-title>A Similarity Measure for Inde nite Rankings</article-title>
          .
          <source>ACM Trans. Inf</source>
          . Syst.,
          <volume>28</volume>
          (
          <issue>4</issue>
          ):
          <fpage>20</fpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>