<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Run</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Markov Precision: Modelling User Behaviour over Rank and Time</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marco Ferrante</string-name>
          <email>ferrante@math.unipd.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicola Ferro</string-name>
          <email>ferro@dei.unipd.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Maistro</string-name>
          <email>maistro@dei.unipd.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Dept. of Information Engineering, University of Padua</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dept. of Mathematics, University of Padua</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2000</year>
      </pub-date>
      <volume>1</volume>
      <issue>2</issue>
      <abstract>
        <p>We propose a family of new evaluation measures, called Markov Precision (MP), which exploits continuous-time and discrete-time Markov chains and we conduct a thorough experimental evaluation providing also an example of calibration of its time parameters. We propose a family of measures of retrieval e ectiveness, called Markov Precision (MP) [1] where we exploit Markov chains [2] to inject di erent user models into precision. We represent each position in a ranked result list with a state in a Markov chain and the di erent transition probabilities among the states allow us to model the perhaps complex user paths in scanning a ranked result list. The framework we propose is actually more general and it is based on continuoustime Markov chains in order to take into account also the time a user spends in visiting a single document. This gives us a two-fold opportunity: when we consider the discrete-time Markov chain, we are basically reasoning as traditional evaluation measures which assess the utility for the user in scanning the ranked result list; when we consider the continuous-time Markov chain, we also embed the information about the time spent by the user in visiting a document. Finally, we conduct a thorough experimental evaluation of the MP measure both using standard Text REtrieval Conference (TREC)3 collections and clicklogs with assessed queries made available by Yandex [3]. The results show that MP is comparable to other measures, while the Yandex click-logs allow us to estimate the time spent by the users on the documents.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Firstly we will introduce some notation that we will use through the whole paper.
Let us consider a ranked list of T documents and let R be the set of the ranks
of the relevant documents. We will denote by RB the recall base, i.e. the total
number of judged relevant documents.</p>
      <sec id="sec-1-1">
        <title>3 http://trec.nist.gov/</title>
        <p>We will assume that each user starts from a chosen document, at rank X0 in
the list, and considers this document for a random time T0, that is distributed
according to a known positive random variable. Then he/she decides to move to
another document, at rank X1, and he/she considers this new document for a
random time T1. Successively, he/she moves, independently, to a third document
and so on. Hence, we will denote by X0; X1; X2; : : : the (random) sequence of
document ranks visited by the user and by T0, T1, T2 the random times spent
visiting each considered document.</p>
        <p>
          We mathematically model the user behaviour in the framework of the
Markovian processes [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. First of all, we will assume that X0 is a random variable on
T = f1; 2; : : : ; T g with a given distribution = ( 1; : : : ; T ); so for any i 2 T ,
P[X0 = i] = i. Then, we will assume that the probability to pass from the
document at rank i to the document at rank j will only depend on the starting
rank i and not on the whole list of documents visited before. Thanks to this
condition and xing a starting distribution , the random variables (Xn)n2N
dene a time homogeneous discrete time Markov Chain, with state space T , initial
distribution and transition matrix P .
        </p>
        <p>To obtain a continuous-time Markov Chain, we have to assume that the
holding times Tn have all exponential distribution, and conditioned on the fact
that Xn = i, the law of Tn will be exponential with parameter i, where i is
a positive real number. When our interest is only on the jump chain (Xn)n2N,
we simply assume that all these variables are exponential with parameter = 1;
while when we are also interested in the time dimension, we have to provide a
calibration for these exponential variables.</p>
        <p>Let us assume hereafter that the matrix P will be irreducible and that after
visiting n documents in the list the user will stop his/her search. In order to
measure his/her satisfaction, we will evaluate the average of the precision of the
ranks of the judged relevant documents visited by the user during their search
as
k=0
n 1
1 X Prec(Yk) :
n
where (Yn)n2N denotes the sub-chain of (Xn)n2N that considers just the visits
to the judged relevant documents at ranks R. Note that this sub-chain has in
general a transition matrix di erent form P , that we will denote with Pe.</p>
        <p>Clearly, the previous quantity is of little use if evaluated at an unknown nite
step n. However, the Ergodic Theorem for the Markov processes approximates
this quantity with</p>
        <p>M P = X</p>
        <p>iPrec(i) ;
i2R
where is the (unique) invariant distribution of the Markov chain (Yn)n2N.
Hence, we have de ned a new family of user oriented retrieval e ectiveness
measures, called Markov Precision (MP). Note that MP is de ned without knowing
the recall base RB, but just the ranks of the judged relevant documents in the
given run.
bpref</p>
        <p>Rprec
bpref</p>
        <p>Rprec</p>
        <p>In order to include the time dimension, we can replicate the previous
computations and de ne a new measure</p>
        <p>M P cont = X eiPrec(i):</p>
        <p>i2R
where ei = Pj2 Ri( ij)( 1j) 1 ,
Markov chain (Yn)n2N and
i.
3</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Evaluation</title>
      <p>
        denotes again the (unique) distribution of the
i is the parameter of the holding time in state
In order to assess MP and compare it to the other evaluation measures (Average
Precision (AP), P@10, Rprec, Rank-Biased Precision (RBP), and Binary
Preference (bpref)), we conducted a correlation analysis and we studied its robustness
to pool downsampling. We used the following data sets: TREC 7 Ad Hoc, TREC
8 Ad Hoc, TREC 10 Web, and TREC 14 Robust. As far as calibration of time
is concerned, we used click logs made available by Yandex [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The full source
code of the software used to conduct the experiments is available for download4
in order to ease comparison and veri cation of the results.
      </p>
      <p>Firstly, we computed the Kendall correlation between the di erent models
for MP and the performance measures of direct comparison, for all the considered
collections; Figure 1 reports the results in the case of TREC 8. We indicate by
[OR] the model where the user moves only on relevant documents, [GL] means that
the user moves among all the documents, [LO] only among adjacent documents.
[ID] denotes that the transition probabilities are proportional to the inverse of
the distance, or to the inverse of the logarithm of the distance, indicated with
[LID].</p>
      <p>As a general trend MP tends not to have high correlations with the other
evaluation measures, even if the correlation never drops below 0.70, indicating</p>
      <sec id="sec-2-1">
        <title>4 http://matters.dei.unipd.it/</title>
        <p>that it takes a di erent angle from them (users move forward and backward in
the result list). With regard to the rescaled version of MP by recall (@Rec), the
correlation with AP increases in almost all cases. Moreover, if we assume that
a user moves with a constant, xed transition probability, the number of which
depends on the number of retrieved relevant document, we will obtain that MP
is equal to AP, once rescaled by the recall.</p>
        <p>Then we analyse the e ect of reducing the pool size on the absolute average
performances, over all the topics and runs. MP shows a consistent behaviour over
all the collections and for various models: its absolute average values decrease
as the pool reduction rate increases.</p>
        <p>Furthermore, we study the e ect of reducing the pool size on the Kendall
correlation between each measure on the full pool and the pool at a given
reduction rate. MP models tend to perform comparably to AP and, when provided
with the same information about the recall base, they consistently improve their
performances. For all models, using the log of the inverse of the distance [LID]
provides more robustness to pool reduction.</p>
        <p>Finally, on the basis of the click logs, we can state that 21% of the observed
transitions are backward, a fact that validates our assumption that a user moves
forward and backward along the ranked list. Moreover, we compute the values of
continuous-time MP and we compare them with the discrete-time MP, reported
in Table 1, concluding that the continuous-time version depends heavily on the
calibration of the holding times.
4</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Future Work</title>
      <p>Future works concern the investigation of alternative user models able to
account also for the number of relevant/not relevant documents visited so far, the
possibility of learning the transition probabilities of the Markov chain directly
from click-logs, the calibration of time into MP, the adjustment of MP to fact,
entity, or attributes focused searches, and the investigation of the robustness of
MP.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>M.</given-names>
            <surname>Ferrante</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maistro</surname>
          </string-name>
          .
          <article-title>Injecting User Models and Time into Precision via Markov Chains</article-title>
          .
          <source>In SIGIR 2014</source>
          , pages
          <fpage>597</fpage>
          -
          <lpage>606</lpage>
          . ACM Press, USA. (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Norris</surname>
          </string-name>
          . Markov chains. Cambridge University Press, UK,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>P.</given-names>
            <surname>Serdyukov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Craswell</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Dupret</surname>
          </string-name>
          .
          <source>WSCD2012: Workshop on Web Search Click Data</source>
          <year>2012</year>
          . In WSDM, pages
          <volume>771</volume>
          {
          <fpage>772</fpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>