<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Improving Learning to Rank By Leveraging User Dynamics and Continuation Methods</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nicola Ferro</string-name>
          <email>ferro@dei.unipd.it</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudio Lucchese</string-name>
          <email>claudio.lucchese@unive.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maria Maistro</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Raffaele Perego</string-name>
          <email>raffaele.perego@isti.cnr.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ca' Foscari University of Venice</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ISTI CNR Pisa</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Copenhagen</institution>
          ,
          <country country="DK">Denmark</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Padua</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Learning to Rank (LtR) techniques leverage assessed samples of query-document relevance to learn ranking functions able to exploit the noisy signals hidden in the features used to represent queries and documents. In this paper, we explore how to enhance the state-of-the-art LambdaMart algorithm by integrating into the training process an explicit knowledge of the underlying user-interaction model and the possibility of targeting different objective functions, which can effectively drive the algorithm towards promising areas of the search space. We enrich the learning algorithm in two ways: (i) by considering complex query-based user dynamics rather than simply discounting the gain by the rank position; (ii) by designing a learning path across different loss functions that can capture different signals in the training data. Our extensive experiments, conducted on publicly available datasets, show that the proposed solution improves various ranking quality measures by statistically significant margins.</p>
      </abstract>
      <kwd-group>
        <kwd>Learning to Rank</kwd>
        <kwd>User Dynamics</kwd>
        <kwd>Continuation Methods</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], we explored whether the explicit knowledge of the underlying user-interaction model [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and the possibility of targeting different objective functions, by means of Continuation Methods (CM) [
        <xref ref-type="bibr" rid="ref1 ref5 ref7">1, 5, 7</xref>
        ], make it possible to improve the ranking models learned by Learning to Rank (LtR) algorithms. In particular, our investigation focused on improving the state-of-the-art LtR algorithm LambdaMart [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This extended abstract summarizes the main contributions discussed in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], relying on an extensive assessment conducted on publicly available datasets:
      </p>
      <p>
        ⋆ Extended abstract of the paper originally published in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). This volume is published and copyrighted by its editors. SEBD 2020, June 21-24, 2020, Villasimius, Italy.
      </p>
      <p>
        – Modelling user dynamics into LambdaMart: instead of proposing new features to account for user behavior and then training the LtR model on this extended set of features, we: (1) explicitly model the user dynamics in scanning a ranked result list with Markov chains trained on query log data; (2) define a novel measure which embeds this trained Markov chain, the Normalized Markov Cumulated Gain (nMCG), and modify the LambdaMart loss function to rely on this measure instead of the usual Normalized Discounted Cumulated Gain (nDCG) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ];
      </p>
      <p>
        – User interaction-based curriculum learning for LambdaMart: designing a curriculum of objective functions of increasing complexity has been shown to be a promising research direction. We therefore explore whether our nMCG measure provides further gains when building a curriculum based on CM techniques.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Methodology</title>
      <p>
        An LtR algorithm exploits a ground-truth set of training examples in order to learn a document scoring function [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Such a training set is composed of a large collection of queries Q, where each query q ∈ Q is associated with a set of assessed documents D = {d_0, d_1, …}. Each document d_i is labeled with a relevance judgment l_i according to its relevance to the query q. These labels induce a partial ordering over the assessed documents, thus defining an ideal ranking which the LtR algorithm aims at approximating. Each query-document pair (q, d_i) is represented by a vector of features x, able to describe the query (e.g., its length), the document (e.g., the in-link count) and their relationship (e.g., the number of occurrences of each query term in the document).
      </p>
      <p>
        LambdaRank can be summarized as follows. Given a query q and two candidate documents d_i and d_j in the training set, with relevance labels l_i and l_j respectively, let s_i and s_j be the scores currently predicted for the two documents. The lambda gradient of any given Information Retrieval (IR) quality function Q, as defined in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], is:
      </p>
      <p>
        λ_ij = sgn(l_i − l_j) · |ΔQ_ij| · 1 / (1 + e^(s_i − s_j))
      </p>
      <p>
        where the sign is determined by the document labels only, the first factor |ΔQ_ij| is the quality variation when swapping scores s_i and s_j, and the second factor is the derivative of the RankNet cost [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which minimizes the number of disordered pairs. When l_i &gt; l_j, the quality Q increases with the score of document d_i. The larger the quality variation |ΔQ_ij|, the higher the document d_i should be scored. Note that the RankNet multiplier fades |ΔQ_ij| if documents are scored correctly, i.e. s_i &gt; s_j, and boosts |ΔQ_ij| otherwise. The lambda gradient for a document d_i is computed by marginalizing over all possible pairs in the result list: λ_i = Σ_j λ_ij. LambdaRank uses nDCG as Q, so ΔQ is the variation in nDCG caused by the swap of two documents.
      </p>
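      <p>
        For illustration, the lambda-gradient computation described above can be sketched as follows. This is a minimal sketch, not the authors' implementation: the helper ndcg_delta, the toy labels and the toy scores are all hypothetical, and nDCG is used as the quality function Q purely for concreteness.
      </p>

```python
import numpy as np

def lambda_gradients(labels, scores, quality_delta):
    """Per-document lambda gradients for one query.

    quality_delta(i, j) must return |deltaQ_ij|, the quality variation
    caused by swapping documents i and j in the current ranking.
    """
    n = len(labels)
    lam = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                continue  # sgn(l_i - l_j) = 0: the pair contributes nothing
            sign = np.sign(labels[i] - labels[j])
            rank_net = 1.0 / (1.0 + np.exp(scores[i] - scores[j]))
            lam[i] += sign * quality_delta(i, j) * rank_net
    return lam

def ndcg_delta(labels, scores):
    """|deltaQ_ij| when Q is nDCG (hypothetical helper for this example)."""
    order = np.argsort(-scores)              # current ranking by score
    pos = np.empty_like(order)
    pos[order] = np.arange(len(order))       # pos[d] = rank of document d
    gains = 2.0 ** np.sort(labels)[::-1] - 1.0
    ideal = np.sum(gains / np.log2(np.arange(2, len(labels) + 2)))
    def delta(i, j):
        di = 1.0 / np.log2(pos[i] + 2.0)
        dj = 1.0 / np.log2(pos[j] + 2.0)
        return abs((2.0 ** labels[i] - 2.0 ** labels[j]) * (di - dj)) / ideal
    return delta

labels = np.array([2, 0, 1])
scores = np.array([0.1, 0.4, 0.2])  # the most relevant document is under-scored
lam = lambda_gradients(labels, scores, ndcg_delta(labels, scores))
```

      <p>
        In this toy setting lam[0] comes out positive (document 0 should be pushed up the ranking) and lam[1] negative (document 1 should be pushed down), consistent with the labels.
      </p>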
      <p>
        We summarize here our methodology [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for including the user dynamics into the above discussed LambdaRank algorithm. We model the user dynamics with a Markovian process [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], where the user scans the ranked documents in the Search Engine Result Page (SERP) according to possibly complex paths, moving both forward and backward. Let us denote by X_1, X_2, … the sequence of random variables representing the rank positions in R = {1, 2, …, R} visited by the user, where X_n = j means that the n-th document visited by the user is at rank j. Moreover, we assume that the probability to move from the document at rank i to the document at rank j depends on the document at rank i only and is independent of all the previously visited documents. Finally, we denote by P the transition matrix whose entries represent the transition probabilities, P = (p_ij : i, j ∈ R), where p_ij = P[X_{n+1} = j | X_n = i]. The sequence of random variables (X_n)_{n&gt;0} defines a discrete-time homogeneous Markov chain.
      </p>
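      <p>
        A minimal sketch of this estimation step, on hypothetical click paths standing in for a real query log: the transition matrix P is estimated by counting observed moves between rank positions, and the stationary distribution is the left eigenvector of P for eigenvalue 1.
      </p>

```python
import numpy as np

# Hypothetical click paths: each list is the sequence of 1-based rank
# positions one user visited on a SERP with R = 3 results (toy data).
paths = [[1], [1, 2, 1], [1, 2, 3], [1, 3, 2], [1], [1, 2]]
R = 3

# Estimate the transition matrix P by counting the observed moves i -> j.
counts = np.zeros((R, R))
for path in paths:
    for a, b in zip(path, path[1:]):
        counts[a - 1, b - 1] += 1
row_sums = counts.sum(axis=1, keepdims=True)
P = np.divide(counts, row_sums,
              out=np.full((R, R), 1.0 / R),  # uniform row for unseen states
              where=row_sums > 0)

# Stationary distribution: left eigenvector of P with eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmax(np.real(vals))])
pi = pi / pi.sum()
```

      <p>
        On a real log, pi would be estimated per query class; here it merely illustrates the mechanics.
      </p>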
      <p>Figure 1(a) plots the stationary distributions obtained from the Yandex query log described in Section ??. We focus on two distinct macroscopic behavior types: for the sake of simplicity, we call navigational the queries where users concentrate on just the first item, and we consider all the other queries as informational, since users tend to visit more documents. On the basis of the above experimental observations, we claim that the user dynamics can be described as a mixture of the navigational and informational behaviors. The navigational component is represented by the inverse of the rank position, 1/i, while the informational component is linear with respect to the rank position i. Therefore, we model the user dynamics as</p>
      <p>ε_c(i) = α · (1/i) + β · i + γ,
where the parameters α, β and γ are calibrated in order to fit the estimated stationary distributions computed on the Yandex dataset.</p>
      <p>Figures 1(b) and 1(c) show the stationary distributions together with the fitted curves for the navigational and informational cases, respectively. In Figure 1(b) the stationary distribution is the same reported in the red line of Figure 1(a), i.e. queries with just one relevant document, while to compute the stationary distribution reported in Figure 1(c) we aggregate all the user dynamics corresponding to the other queries, i.e. queries without relevant documents or with more than one relevant document.</p>
      <p>We enhance the existing LambdaMart algorithm by replacing the above Q with a new quality measure which integrates the proposed user dynamics. This new measure is called Normalized Markov Cumulated Gain (nMCG) and it is defined as follows:</p>
      <p>nMCG@k = [ Σ_{i ≤ k} (2^{l_i} − 1) · ε_c(i) ] / [ Σ_{h ≤ k, ideal ranking} (2^{l_h} − 1) · ε_c(h) ]</p>
      <p>where l_i is the relevance label of the i-th ranked document and ε_c(i) is the user dynamics function at rank i relative to the query class c, either navigational or informational. Basically, nMCG can be seen as an extension of nDCG, where the discount function is defined by the user dynamics and depends on the query class. Moreover, since ε_c depends on the query class, i.e. on the query q, we are optimizing two different variants of the same quality measure nMCG across the training dataset. Finally, the variation ΔnMCG@k_ij caused by swapping the documents at ranks i and j, both within the cut-off k, can be computed efficiently as ΔnMCG@k_ij = (2^{l_i} − 2^{l_j}) · (ε_c(j) − ε_c(i)) / Σ_{h ≤ k, ideal ranking} (2^{l_h} − 1) · ε_c(h). Hereinafter, we use nMCG-MART to refer to the described variant of LambdaMart aimed at maximizing nMCG.</p>
      <sec id="sec-2-1">
        <title>Continuation Methods for LtR</title>
        <p>Applying a Continuation Method (CM) to a cost function C means defining a sequence of cost functions C_γ with γ ∈ [0, 1], such that the complexity of the function increases as γ increases. Therefore, C_0 represents a highly smoothed version of the original cost function, which corresponds to C_1.</p>
        <p>We implement CMs in the context of forests of decision trees as a two-step learning process, where the initial trees are trained by minimizing a cost function C_0 and the remaining trees are trained by optimizing C_1. We did not explore more complex multi-stage scenarios in this work; they would trivially extend it. We denote continuation methods as C_0 T_0 → C_1 T_1, meaning that the first T_0 trees of the ensemble minimize C_0, while the remaining T_1 trees minimize C_1, additively refining the prediction of the first T_0 trees.</p>
        <p>
          In order to apply a CM to LambdaMart we need to smooth the LambdaMart loss function. Smoother variants of nDCG have been proposed before, e.g. SoftNDCG [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]; however, their performance did not prove to be significantly better than the original LambdaMart loss function. Therefore, we look at two different smoother replacements of the nDCG measure.
        </p>
        <p>The first option we consider is to use as C_0 the Mean Square Error (MSE), as a smooth variant of the target nDCG measure. Even if MSE is easily differentiable and an MSE equal to 0 corresponds to the maximum nDCG value, the MSE cost cannot be considered a smooth approximation of nDCG. Nevertheless, Gradient Boosted Regression Trees that minimize MSE are known to perform well, even at optimizing nDCG. The rationale of using MSE as C_0 is to use a smooth function that alone exhibits good performance as a good seed for the refinement that the optimization of nDCG or nMCG may provide.</p>
        <p>The second option we consider is to use Recall@k as C_0. Recall is not a differentiable function either, as it suffers from the same issues as nDCG. Still, it is an easier function to optimize, because it considers binary relevance instead of graded relevance, and it discriminates between documents above or below the cut-off k without further discounting according to document ranks. The rationale is to train the first portion of the forest to place the most relevant documents above the cut-off, and then to complete the training with the target metric, nDCG or nMCG, to adjust their relative ranking.</p>
        <p>Since Recall is not differentiable, we devised a LambdaMart-based approach similar to nMCG-MART. We define nRecall@k_ij as the normalized change in Recall@k when swapping the i-th and j-th documents in the current ranking. Given a relevance binarization threshold τ, nRecall@k_ij is defined as follows:</p>
        <p>nRecall@k_ij = (1_{l_i ≥ τ} − 1_{l_j ≥ τ}) · (1_{i ≤ k} − 1_{j ≤ k}) / Σ_h 1_{l_h ≥ τ}</p>
        <p>where 1_p is the indicator function evaluating to 1 if predicate p is true and 0 otherwise.</p>
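        <p>
          A minimal sketch of this quantity, with hypothetical toy labels. Note that in LambdaMart-style training only sgn(l_i − l_j) · |ΔQ| enters the gradient, so the sign convention of the delta is immaterial; the sketch checks that its magnitude equals the actual change in Recall@k after the swap.
        </p>

```python
def nrecall_swap_delta(labels, i, j, k, tau):
    # nRecall@k_ij for 0-based ranks i, j, cut-off k, binarization threshold tau.
    rel = [1 if l >= tau else 0 for l in labels]
    total = sum(rel)
    if total == 0:
        return 0.0
    in_i = 1 if i < k else 0
    in_j = 1 if j < k else 0
    return (rel[i] - rel[j]) * (in_i - in_j) / total

def recall_at_k(labels, k, tau):
    # Full recomputation, used only to sanity-check the delta.
    rel = [1 if l >= tau else 0 for l in labels]
    return sum(rel[:k]) / sum(rel) if sum(rel) else 0.0
```

        <p>
          The delta is zero whenever both documents are on the same side of the cut-off, which is exactly the coarser, easier-to-optimize structure the text describes.
        </p>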
        <p>Finally, in addition to C_1 being equal to nDCG, we also consider the case where nMCG-MART is used as C_1. Our goal is to evaluate the added value of exploiting the user dynamics in producing the final ranking, even when the first trees of the forest were built by optimizing a different metric.</p>
        <p>In conclusion, we compare the effectiveness of LambdaMart and nMCG-MART and we assess the benefit they can receive from pre-training on a different metric, which allows the two algorithms to initiate their search process from a different point in the solution space, possibly leading to a better final solution.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experimental Evaluation</title>
      <p>
        We remark that, since there is no publicly available dataset providing, at the same time, user session data, document relevance and query-document pair features, we have to use two different datasets in our analysis. Therefore, we calibrate the user model on the click log dataset provided by Yandex [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], while the training, validation and test of the LtR model are performed on MSLR-WEB30K and MSLR-WEB10K, provided by Microsoft [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
      </p>
      <p>We used LambdaMart as the reference LtR algorithm. Specifically, we used (and modified) the open source implementation of LambdaMart provided in the LightGBM gradient boosting framework (available at https://github.com/Microsoft/LightGBM) and swept the hyper-parameters with cross-validation so as to maximize the average performance over the validation folds.</p>
      <p>
        As significance test, we computed Fisher's randomization test with 100,000 random permutations and α = 0.05, which is the most appropriate statistical test to evaluate whether two approaches differ significantly [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
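      <p>
        For reference, a paired (sign-flip) randomization test of this kind can be sketched as follows. This is a generic two-sided implementation on per-query scores, not the authors' evaluation code; the function name and toy data are hypothetical.
      </p>

```python
import numpy as np

def randomization_test(a, b, n_perm=100_000, seed=0):
    """Two-sided paired randomization test on per-query score differences.

    a, b: per-query scores of two systems over the same queries, same order.
    Returns an (add-one smoothed) p-value.
    """
    rng = np.random.default_rng(seed)
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    observed = abs(diff.mean())
    # Randomly flip the sign of each per-query difference (i.e. swap the
    # "system A"/"system B" labels on that query) and recompute the statistic.
    signs = rng.choice([-1.0, 1.0], size=(n_perm, len(diff)))
    perm_means = np.abs((signs * diff).mean(axis=1))
    return (np.count_nonzero(perm_means >= observed) + 1) / (n_perm + 1)
```

      <p>
        Sign-flipping enumerates (a sample of) the label reassignments under the null hypothesis that the two systems are exchangeable on every query.
      </p>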
      <p>
        As a naming convention, all the CM approaches are referred to with the name of the first objective function C_0 and the corresponding number of trees T_0 used during the first training phase, followed by the name of the second objective function C_1 with the corresponding number of trees T_1. Therefore, recallT300→ndcgT200 means that the model is trained with Recall@10 for the first 300 trees and then with nDCG@10 for the following 200 trees. Moreover, we exploit as reference models for CM approaches nMCG-MART and mseT200→ndcgT300, which is the best performing model proposed in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <sec id="sec-3-1">
        <title>Experimental Results</title>
        <p>
          In this section we summarize the main findings without reporting any figure or table; the complete results can be found in the original paper [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>First, we evaluate the performance of the proposed models against the baselines on MSLR-WEB10K and MSLR-WEB30K:</p>
        <p>– With nDCG@10, CM approaches combining Recall and nMCG, namely recallT300→nmcgT200 and recallT400→nmcgT100, achieve the best performance on each fold. They are significantly different from both the baseline and the corresponding CM approaches which combine Recall followed by nDCG. This supports our hypothesis that the combination of CM and the user dynamics integrated by nMCG can boost LambdaMart performance.</p>
        <p>– With nDCG@100, CM approaches using nMCG are less effective and mseT200→ndcgT300 is the best performing model, significantly improving over the baseline both on each fold and on the average across folds. Moreover, CM approaches using nMCG are not significantly different from those using nDCG. However, they still significantly improve over the baseline.</p>
        <p>– With Recall@10, the results are somewhere in between nDCG@10 and nDCG@100. mseT200→ndcgT300 is the best performing model when the average across folds is considered. However, if we break down the results by fold, recallT400→nmcgT100 is the best performing model on three folds out of five for MSLR-WEB30K. Moreover, recallT400→nmcgT100 is significantly better than both the baseline and recallT400→ndcgT100.</p>
        <p>– With Recall@100, CM approaches using nDCG, i.e. mseT200→ndcgT300, recallT300→ndcgT200 and recallT400→ndcgT100, perform better than those using nMCG. When considering the score across all the folds, mseT200→ndcgT300 is still the best performing model. Thus, when using Recall@100 as evaluation measure, we reach conclusions similar to those inferred with nDCG@100.</p>
        <p>– With Expected Reciprocal Rank (ERR)@10, recallT300→nmcgT200 is the best performing model and it is the only approach which is significantly better than the baseline. Moreover, CM approaches using nMCG are also significantly better than those using nDCG, with respect to both each separate fold and the average across folds.</p>
        <p>To summarize, CM approaches exploiting the user dynamics perform better at lower rank positions, while CM approaches using nDCG, such as mseT200→ndcgT300, are somewhat better at retrieving a higher number of relevant documents in the first 100 rank positions.</p>
        <p>Second, we repeated the previous analysis by considering the query type, i.e. we compared the proposed models on navigational and informational queries separately. At a high level, the experimental results show that:</p>
        <p>– With navigational queries, there are just a few models which are significantly better than the baselines, independently of the measure or the fold. Furthermore, none of the models performs markedly better than the others; there is instead a lot of variability depending on the measure and the fold under consideration. We argue that with navigational queries the number of relevant documents is limited and systems have little room for manoeuvre: they need to place the most relevant document at the top of the ranked list and, when doing this properly, they all perform equally well in achieving this task.</p>
        <p>– The experimental comparison of models on informational queries provides more insights on the performance of CM approaches combined with the user dynamics. The exploitation of the user dynamics together with CM boosts the ranking of queries for which several relevant documents are available. Indeed, it helps in identifying those documents that are more relevant than others and in placing them at the beginning of the ranked list.</p>
        <p>Finally, we analyse the tree-wise performance of the proposed models: we compare the performance of different models with a varying number of trees. All CM approaches leveraging the user dynamics turn out to be consistently better than the baselines as the number of trees increases. Moreover, each boost in performance corresponds exactly to the switching point T_0, showing that replacing the objective function from Recall to nMCG is beneficial for the learning algorithm at any switching point.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Future Work</title>
      <p>This paper investigated whether integrating in LtR the knowledge of the user-interaction model and the possibility of targeting different objective functions is profitable and allows us to train more effective ranking functions. The results of the experiments, measured with different metrics and cut-offs, gave us a clearer understanding of the problem addressed and confirmed our initial intuitions. We showed, by also breaking down the analysis to different classes of queries, that the proposed continuation methods exploiting our user-aware nMCG measure consistently outperform the baselines by statistically significant margins.</p>
      <p>
        As future work we will study different methods for applying click models to LtR datasets. A possibility suggested in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] could be to exploit editorial relevance labels as a link between different datasets and use a trained click model to generate new click-based features for datasets not providing them.
      </p>
      <p>
        Acknowledgements: This paper is partially supported by the BIGDATAGRAPES (EU H2020 RIA, grant agreement N. 780751) and the OK-INSAID (MIUR-PON 2018, grant agreement N. ARS01 00917) projects. The work is also partially funded by the "DAta BenchmarK for Keyword-based Access and Retrieval" (DAKKAR) Starting Grants project, sponsored by University of Padua and Fondazione Cassa di Risparmio di Padova e di Rovigo, and by AMAOS (Advanced Machine Learning for Automatic Omni-Channel Support), funded by Innovationsfonden, Denmark.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Allgower</surname>
            ,
            <given-names>E.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Georg</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <source>Numerical Continuation Methods. An Introduction</source>
          . Springer-Verlag, Heidelberg, Germany (
          <year>1980</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Burges</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shaked</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Renshaw</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lazier</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Deeds</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hamilton</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hullender</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Learning to Rank using Gradient Descent</article-title>
          .
          <source>In: ICML 2005</source>
          . pp.
          <volume>89</volume>
          –
          <fpage>96</fpage>
          . ACM Press, New York, USA (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Burges</surname>
            ,
            <given-names>C.J.C.</given-names>
          </string-name>
          :
          <article-title>From RankNet to LambdaRank to LambdaMART: An Overview</article-title>
          .
          <source>Tech. rep., Microsoft Research</source>
          ,
          MSR-TR-2010-82
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Chuklin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Markov</surname>
          </string-name>
          , I., de Rijke, M.:
          <article-title>Click Models for Web Search</article-title>
          . Morgan &amp; Claypool Publishers, USA (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Coleman</surname>
            ,
            <given-names>T.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          :
          <article-title>Parallel Continuation-based Global Optimization for Molecular Conformation and Protein Folding</article-title>
          .
          <source>Journal of Global Optimization</source>
          <volume>8</volume>
          (
          <issue>1</issue>
          ),
          <volume>49</volume>
          –
          <fpage>65</fpage>
          (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lucchese</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maistro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perego</surname>
          </string-name>
          , R.:
          <article-title>On Including the User Dynamic in Learning to Rank</article-title>
          .
          <source>In: SIGIR 2017</source>
          . pp.
          <volume>1041</volume>
          –
          <fpage>1044</fpage>
          . ACM Press, New York, USA (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lucchese</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maistro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perego</surname>
          </string-name>
          , R.:
          <article-title>Continuation Methods and Curriculum Learning for Learning to Rank</article-title>
          .
          <source>In: CIKM 2018</source>
          . pp.
          <volume>1523</volume>
          –
          <fpage>1526</fpage>
          . ACM Press, New York, USA (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lucchese</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Maistro</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perego</surname>
          </string-name>
          , R.:
          <article-title>Boosting Learning to Rank with User Dynamics and Continuation Methods</article-title>
          .
          <source>Information Retrieval Journal</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] Jarvelin,
          <string-name>
            <surname>K.</surname>
          </string-name>
          , Kekalainen, J.:
          <source>Cumulated Gain-Based Evaluation of IR Techniques. TOIS</source>
          <volume>20</volume>
          (
          <issue>4</issue>
          ),
          <volume>422</volume>
          –
          <fpage>446</fpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Liu</surname>
          </string-name>
          , T.Y.:
          <article-title>Learning to Rank for Information Retrieval</article-title>
          .
          <source>FnTIR</source>
          <volume>3</volume>
          (
          <issue>3</issue>
          ),
          <volume>225</volume>
          –
          <fpage>331</fpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Norris</surname>
            ,
            <given-names>J.R.</given-names>
          </string-name>
          :
          <article-title>Markov chains</article-title>
          . Cambridge University Press, UK (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Qin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
          </string-name>
          , T.Y.:
          <source>Introducing LETOR 4.0 Datasets</source>
          . arXiv.org,
          <source>Information Retrieval (cs.IR) arXiv:1306.2597</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Serdyukov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Craswell</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dupret</surname>
          </string-name>
          , G.:
          <source>WSCD2012: Workshop on Web Search Click Data</source>
          . In: WSDM
          <year>2012</year>
          . pp.
          <volume>771</volume>
          –
          <fpage>772</fpage>
          . ACM Press, New York, USA (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>Smucker</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carterette</surname>
            ,
            <given-names>B.A.</given-names>
          </string-name>
          :
          <article-title>A Comparison of Statistical Significance Tests for Information Retrieval Evaluation</article-title>
          .
          <source>In: CIKM 2007</source>
          . pp.
          <volume>623</volume>
          –
          <fpage>632</fpage>
          . ACM Press, New York, USA (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Taylor</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guiver</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robertson</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Minka</surname>
          </string-name>
          , T.:
          <article-title>SoftRank: Optimizing NonSmooth Rank Metrics</article-title>
          .
          <source>In: WSDM 2008</source>
          . pp.
          <volume>77</volume>
          –
          <fpage>86</fpage>
          . ACM Press, New York, USA (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>