Ranking Learning-to-Rank Methods∗

Djoerd Hiemstra, University of Twente, Enschede, The Netherlands, hiemstra@cs.utwente.nl
Niek Tax, Eindhoven University of Technology, Eindhoven, The Netherlands, n.tax@tue.nl
Sander Bockting, KPMG Netherlands, Amstelveen, The Netherlands, bockting.sander@kpmg.nl

ABSTRACT
We present a cross-benchmark comparison of learning-to-rank methods using two evaluation measures: the Normalized Winning Number and the Ideal Winning Number. Evaluation results of 87 learning-to-rank methods on 20 datasets show that ListNet, SmoothRank, FenchelRank, FSMRank, LRUF and LARF are Pareto optimal learning-to-rank methods, listed in increasing order of Normalized Winning Number and decreasing order of Ideal Winning Number.

CCS CONCEPTS
• Information systems → Learning to rank;

1 INTRODUCTION
Like most information retrieval methods, learning-to-rank methods are evaluated on benchmark datasets, such as the many datasets provided by Microsoft and the datasets provided by Yahoo and Yandex. These learning-to-rank datasets offer feature set representations of the to-be-ranked documents instead of the documents themselves. Therefore, any difference in ranking performance is due to the ranking algorithm and not to the features used. This opens up a unique opportunity for a cross-benchmark comparison of learning-to-rank methods. In this paper, we compare learning-to-rank methods based on a sparse set of evaluation results on many benchmark datasets.
2 DATASETS AND METHODS
Evaluation results of 87 learning-to-rank methods on 20 well-known benchmark datasets were collected using a systematic literature review [1]. We included papers that report mean average precision or nDCG at 3, 5 or 10 documents retrieved. Papers that used different or additional features, or that reported no baseline performance that allowed us to check the validity of the results, were excluded from the analysis.

The Winning Number of a learning-to-rank method is defined as the number of other methods that the method beats over the set of datasets. So, a method with a high Winning Number beats many other methods on many datasets. For every method, we find a different set of datasets on which the method was evaluated. The Ideal Winning Number is the maximum Winning Number that the method could achieve on the datasets on which it was evaluated. The Normalized Winning Number is the Winning Number divided by the Ideal Winning Number. The Normalized Winning Number gives insight into the ranking accuracy of the learning-to-rank method; the Ideal Winning Number gives insight into the degree of certainty concerning that ranking accuracy. We report the best performing methods by Normalized Winning Number and by Ideal Winning Number.
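To make these definitions concrete, the Python sketch below computes the three numbers from a sparse table of evaluation results, assuming one score per method per dataset for a single evaluation measure. The method names, dataset names and scores are hypothetical, and the sketch reflects our reading of the definitions above rather than the exact procedure of the full study [1].

def winning_numbers(scores):
    """Return {method: (WN, IWN, NWN)} from a sparse score table."""
    numbers = {}
    for method, mine in scores.items():
        wn = iwn = 0
        for other, theirs in scores.items():
            if other == method:
                continue
            # Only datasets on which both methods were evaluated count.
            for dataset in mine.keys() & theirs.keys():
                iwn += 1                      # one more winnable comparison
                if mine[dataset] > theirs[dataset]:
                    wn += 1                   # method actually beats other
        numbers[method] = (wn, iwn, wn / iwn if iwn else 0.0)
    return numbers

# Hypothetical nDCG@10 scores; the real tables cover 87 methods,
# 20 datasets and four measures (MAP and nDCG at 3, 5, 10).
scores = {
    "ListNet":    {"OHSUMED": 0.441, "MQ2007": 0.444, "TD2003": 0.348},
    "SmoothRank": {"OHSUMED": 0.447, "MQ2007": 0.442},
    "RankSVM":    {"OHSUMED": 0.414, "TD2003": 0.346},
}
for method, (wn, iwn, nwn) in winning_numbers(scores).items():
    print(f"{method}: WN={wn}, IWN={iwn}, NWN={nwn:.2f}")

Because the score table is sparse, the Ideal Winning Number differs per method: a method evaluated in many papers can be compared more often, which is exactly the certainty that the Ideal Winning Number expresses.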
3 RESULTS
Figure 1 shows the Normalized Winning Number as a function of the Ideal Winning Number for 87 learning-to-rank methods over 20 datasets and all investigated evaluation measures: mean average precision and nDCG at 3, 5 and 10. The figure labels the Pareto optimal algorithms, as well as the Rank-2 Pareto optima in a smaller font: the algorithms for which exactly one other algorithm has a higher value on both axes. In addition, Linear Regression and the ranking method of simply sorting on the best single feature are labeled as baselines.
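As an illustration of this labelling, the following sketch assigns each method a Pareto rank from its (Ideal Winning Number, Normalized Winning Number) point: rank 0 methods are Pareto optimal and rank 1 methods are the Rank-2 Pareto optima. The points are made up for illustration and are not the values underlying Figure 1.

def pareto_rank(points):
    """Count the methods that beat a method on both axes (0 = Pareto optimal)."""
    return {
        method: sum(
            1 for other, (ox, oy) in points.items()
            if other != method and ox > x and oy > y
        )
        for method, (x, y) in points.items()
    }

# Hypothetical (IWN, NWN) points, not the values behind Figure 1.
points = {
    "ListNet": (990, 0.65),
    "LRUF":    (480, 0.92),
    "LARF":    (470, 0.93),
    "RankSVM": (850, 0.55),
}
ranks = pareto_rank(points)
print("Pareto optimal:", [m for m, r in ranks.items() if r == 0])
print("Rank-2 Pareto: ", [m for m, r in ranks.items() if r == 1])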
Figure 1: Winning numbers of 87 learning-to-rank methods.

The figure shows that LRUF beats almost all other methods, with an Ideal Winning Number of almost 500 dataset-measure combinations. If we move to the right of the figure, we increase our confidence in the results. That is, we are more confident about the results of ListNet, as its Ideal Winning Number is close to 1000 dataset-measure combinations. However, ListNet is outperformed on about half, roughly 500, of those combinations.

4 CONCLUSION
Based on a cross-benchmark comparison of 87 learning-to-rank methods on 20 datasets, we conclude that ListNet, SmoothRank, FenchelRank, FSMRank, LRUF and LARF are Pareto optimal learning-to-rank methods, listed in increasing order of Normalized Winning Number and decreasing order of Ideal Winning Number [1].

∗ The full version of this work was published by Tax, Bockting and Hiemstra [1].

LEARNER’17, October 1, 2017, Amsterdam, The Netherlands
Copyright ©2017 for this paper by its authors. Copying permitted for private and academic purposes.

REFERENCES
[1] Niek Tax, Sander Bockting, and Djoerd Hiemstra. 2015. A cross-benchmark comparison of 87 learning to rank methods. Information Processing & Management 51, 6 (2015), 757–772. (Awarded IPM Best Paper of 2015)