<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>IIR</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Towards an Information Retrieval Evaluation Library</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Discussion Paper</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elias Bassani</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Consorzio per il Trasferimento Tecnologico - C2T</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Milano-Bicocca</institution>
          ,
          <addr-line>Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>12</volume>
      <fpage>0000</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>This manuscript discusses our ongoing work on ranx, a Python evaluation library for Information Retrieval. First, we introduce our work, summarize the already available functionalities, show the user-friendly nature of our tool through code snippets, and briefly discuss the technologies we relied on for the implementation and their advantages. Then, we present the upcoming features, such as several Metasearch algorithms, and introduce the long-term goals of our project.</p>
      </abstract>
      <kwd-group>
        <kwd>Information Retrieval</kwd>
        <kwd>Evaluation</kwd>
        <kwd>Comparison</kwd>
        <kwd>Metasearch</kwd>
        <kwd>Fusion</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Overview</title>
      <p>In this section, we present the main functionalities ranx provides, show its user-friendly nature
through some code snippets, and discuss its implementation and the advantages brought by the
employed technologies. More details and examples are available in the official repository.</p>
      <sec id="sec-2-1">
        <title>2.1. Qrels and Run</title>
        <p>
          First, ranx provides a convenient way of managing the data needed for evaluating and
comparing different retrieval models: the query relevance judgments (qrels) and the ranked lists of
documents retrieved for those queries by the systems (runs). ranx implements two custom
classes for these kinds of data: Qrels and Run. In particular, data can be loaded from Python
dictionaries and Pandas DataFrames [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] or read from TREC-style files and JSON files. Moreover,
ranx integrates seamlessly with ir-datasets [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], allowing the users to load qrels for several
Information Retrieval datasets, such as those from TREC’s challenges<sup>2</sup>, BEIR [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ], and MS
MARCO [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. Figure 1 shows the standard way of creating Qrels and Run instances. ranx
takes care of sorting the result lists so that the user does not have to think about it. To learn
more about Qrels and Run, we invite the reader to follow our online Jupyter Notebook<sup>3</sup>.
        </p>
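        <p>The data layout described above can be sketched in plain Python. This is a minimal illustration that does not call ranx itself: the nested dictionary shape, the identifiers, and the sort_run helper are assumptions made for the example, and the helper only mimics the sorting that ranx is said to perform on the user's behalf.</p>
        <preformat>
```python
# Query relevance judgments: query id -> {document id -> relevance grade}.
qrels_dict = {
    "q_1": {"d_12": 5, "d_25": 3},
    "q_2": {"d_11": 6, "d_22": 1},
}

# Run: query id -> {document id -> retrieval score}, in arbitrary order.
run_dict = {
    "q_1": {"d_25": 0.5, "d_12": 0.9, "d_23": 0.8},
    "q_2": {"d_22": 0.3, "d_12": 0.9, "d_11": 0.7},
}


def sort_run(run):
    """Return each query's results ordered by descending score.

    Hypothetical helper: it stands in for the sorting step that the
    library takes care of, so that ranks can be read off directly.
    """
    return {
        query_id: dict(
            sorted(results.items(), key=lambda item: item[1], reverse=True)
        )
        for query_id, results in run.items()
    }


sorted_run = sort_run(run_dict)
```
        </preformat>
        <p>After sorting, the position of a document in each inner dictionary corresponds to its rank, which is the only ordering information the evaluation metrics need.</p>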
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Metrics, Evaluation, and Comparison</title>
        <p>
          ranx provides the most commonly used ranking evaluation metrics<sup>4</sup>, such as Reciprocal Rank,
Average Precision, and Normalized Discounted Cumulative Gain [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ]. These metrics can be used
to evaluate a run in a single line of code, as depicted in Figure 2. As the figure shows, ranx
allows the user to provide one or multiple metrics and define cut-offs using a convenient syntax.
Additional information can be found online<sup>5</sup>.
        </p>
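        <p>To make the three metrics named above concrete, the following sketch computes them in plain Python and averages them over queries. It is not ranx's implementation: the function names, the metric labels ("mrr", "map", "ndcg"), and the "name@k" cut-off notation are assumptions chosen for illustration, and the Average Precision denominator follows the common convention of dividing by the number of relevant documents.</p>
        <preformat>
```python
import math


def parse_metric(spec):
    """Split a spec such as 'ndcg@10' into ('ndcg', 10); no cut-off gives None."""
    name, _, cutoff = spec.partition("@")
    return name, (int(cutoff) if cutoff else None)


def ranked_docs(results, k=None):
    """Document ids by descending score, truncated at k when a cut-off is given."""
    ranking = sorted(results, key=results.get, reverse=True)
    return ranking if k is None else ranking[:k]


def reciprocal_rank(rels, results, k=None):
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, doc in enumerate(ranked_docs(results, k), start=1):
        if rels.get(doc, 0) > 0:
            return 1.0 / rank
    return 0.0


def average_precision(rels, results, k=None):
    """Mean of the precision values observed at each relevant document."""
    n_relevant = sum(1 for grade in rels.values() if grade > 0)
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked_docs(results, k), start=1):
        if rels.get(doc, 0) > 0:
            hits += 1
            total += hits / rank
    return total / n_relevant if n_relevant else 0.0


def ndcg(rels, results, k=None):
    """Discounted cumulative gain divided by that of the ideal ranking."""
    def dcg(gains):
        return sum(g / math.log2(i + 1) for i, g in enumerate(gains, start=1))

    gains = [rels.get(doc, 0) for doc in ranked_docs(results, k)]
    ideal = sorted(rels.values(), reverse=True)
    ideal = ideal if k is None else ideal[:k]
    best = dcg(ideal)
    return dcg(gains) / best if best else 0.0


METRICS = {"mrr": reciprocal_rank, "map": average_precision, "ndcg": ndcg}


def evaluate_sketch(qrels, run, specs):
    """Average each requested metric over all judged queries."""
    scores = {}
    for spec in specs:
        name, k = parse_metric(spec)
        per_query = [METRICS[name](qrels[q], run.get(q, {}), k) for q in qrels]
        scores[spec] = sum(per_query) / len(per_query)
    return scores


qrels = {"q1": {"d1": 1, "d3": 1}, "q2": {"d2": 1}}
run = {"q1": {"d1": 0.9, "d2": 0.8, "d3": 0.7}, "q2": {"d1": 0.9, "d2": 0.5}}
scores = evaluate_sketch(qrels, run, ["mrr", "map", "ndcg@2"])
```
        </preformat>
        <p>Keeping the cut-off inside the metric string means a single list can mix metrics with and without cut-offs, which is the convenience the paragraph above refers to.</p>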
        <p>ranx also offers functionalities to compare runs and perform statistical tests. As shown in
Figure 3, given the query relevance judgments, a list of runs, and the desired
metrics, the compare function performs a comparison of the runs. It returns a Report instance,
which stores the information produced by the compare function and can be printed as in Figure
3 or exported as a LaTeX table, ready for a scientific publication. The LaTeX code underlying Table 1
was generated by ranx. To learn more about comparing different runs, we invite the reader to
follow our online Jupyter Notebook<sup>6</sup>.</p>
        <p><sup>2</sup> https://trec.nist.gov</p>
        <p><sup>3</sup> https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/2_qrels_and_run.ipynb</p>
        <p><sup>4</sup> A complete list of the implemented metrics can be found here: https://github.com/AmenRa/ranx#metrics</p>
        <p><sup>5</sup> https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/3_evaluation.ipynb</p>
        <p><sup>6</sup> https://colab.research.google.com/github/AmenRa/ranx/blob/master/notebooks/4_comparison_and_report.ipynb</p>
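        <p>The text does not say which statistical tests are performed, so the sketch below assumes one test that is commonly used for comparing retrieval systems: a two-sided paired randomization test on per-query scores. The function name, the number of permutations, and the example scores are hypothetical and are not taken from ranx.</p>
        <preformat>
```python
import random


def randomization_test(scores_a, scores_b, n_permutations=10000, seed=42):
    """Two-sided paired randomization test on per-query metric scores.

    Returns the fraction of random sign assignments whose mean difference
    is at least as large (in absolute value) as the observed one.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the two systems are exchangeable,
        # so the sign of each per-query difference can be flipped freely.
        permuted = sum(d if rng.random() > 0.5 else -d for d in diffs)
        if abs(permuted / len(diffs)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Hypothetical per-query scores of two systems on the same ten queries.
system_a = [0.9, 0.8, 0.7, 0.9, 0.6, 0.8, 0.7, 0.9, 0.8, 0.7]
system_b = [0.5, 0.4, 0.3, 0.5, 0.2, 0.4, 0.3, 0.5, 0.4, 0.3]

p_different = randomization_test(system_a, system_b)
p_identical = randomization_test(system_a, system_a)
```
        </preformat>
        <p>A comparison report would typically pair each metric score with the outcome of such a test against every other run, flagging the differences whose p-value falls below a chosen threshold.</p>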
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Backend</title>
        <p>
          In addition to its user-friendly interface, ranx is also very efficient thanks to its Numba-based
implementation. Numba [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] is a just-in-time [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] compiler for Python and NumPy [
          <xref ref-type="bibr" rid="ref16 ref17 ref18">16, 17, 18</xref>
          ]
that translates and compiles for-loop-based code to high-speed vector operations and allows
for automatic parallelization, which is very handy on modern multi-core CPUs. Almost every
operation performed by ranx relies on Numba-compiled code. The internal data structures used
by Qrels and Run and all the evaluation metrics provided by ranx are built on top of Numba.
Our implementation allows for conducting evaluations and comparisons much faster than other
popular Python evaluation libraries for Information Retrieval. Table 2 reports the execution
time of different metrics in ranx and pytrec_eval, a Python wrapper for trec_eval, the
standard Information Retrieval evaluation library.
        </p>
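        <p>The kind of for-loop-based code that Numba compiles can be illustrated with a single metric. The sketch below is not taken from ranx's source: the function and helper names are hypothetical, and the fallback decorator is added only so that the example still runs as ordinary Python when Numba or NumPy is not installed.</p>
        <preformat>
```python
try:
    import numpy as np
    from numba import njit

    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False

    def njit(func):
        """Fallback decorator: leave the function as ordinary Python."""
        return func


@njit
def reciprocal_rank(relevance):
    """relevance[i] is the judged relevance of the document at rank i + 1."""
    for i in range(len(relevance)):
        if relevance[i] > 0:
            return 1.0 / (i + 1)
    return 0.0


def as_backend_array(values):
    """Use a typed NumPy array when the compiled path is available."""
    if HAVE_NUMBA:
        return np.array(values, dtype=np.int64)
    return list(values)


# The first call triggers compilation; later calls reuse the compiled code.
score = reciprocal_rank(as_backend_array([0, 0, 1, 0]))
```
        </preformat>
        <p>The explicit loop is written once and stays readable, while the decorator decides whether it runs interpreted or compiled. Parallel execution across queries is not shown here.</p>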
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Upcoming Features</title>
      <p>
        We are currently implementing several Metasearch [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] algorithms, such as comb_min [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ],
comb_max [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], comb_med [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], comb_anz [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], comb_mnz [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], comb_sum [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], comb_gmnz
[
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], RRF [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], MAPFuse [27], ISR [28], Log_ISR [28], LogN_ISR [28], and many more. Our
goal is to offer a Python implementation of all those methods with a standardized interface.
We also want to provide working, easy-to-use implementations
that can serve as baselines for researchers working on Metasearch algorithms. Moreover,
we argue that young researchers in the Deep Learning-based Information Retrieval era have little
knowledge of Metasearch methods, as they often rely on a weighted sum to fuse
lexical matching scores, such as those computed by BM25 [29], and semantic matching scores
computed by Transformer-based [30] rankers [31]. We hope that our work can stimulate
researchers to explore different fusion approaches. As many Metasearch algorithms require
tuning, we are also working on an auto-tune functionality that tries different
hyper-parameter configurations and finds the best-performing one with no user effort.
      </p>
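      <p>Three of the fusion methods listed above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the upcoming ranx interface: the function names are hypothetical, scores are min-max normalized per result list before summation, and the RRF constant defaults to k = 60.</p>
      <preformat>
```python
from collections import defaultdict


def min_max_normalize(results):
    """Rescale the scores of one result list to the [0, 1] interval."""
    lowest, highest = min(results.values()), max(results.values())
    span = highest - lowest
    return {
        doc: ((score - lowest) / span if span else 1.0)
        for doc, score in results.items()
    }


def comb_sum(result_lists):
    """comb_sum: sum of the normalized scores a document receives."""
    fused = defaultdict(float)
    for results in result_lists:
        for doc, score in min_max_normalize(results).items():
            fused[doc] += score
    return dict(fused)


def comb_mnz(result_lists):
    """comb_mnz: comb_sum times the number of lists retrieving the document."""
    summed = comb_sum(result_lists)
    hits = defaultdict(int)
    for results in result_lists:
        for doc in results:
            hits[doc] += 1
    return {doc: summed[doc] * hits[doc] for doc in summed}


def rrf(result_lists, k=60):
    """Reciprocal Rank Fusion: sum of 1 / (k + rank) over the result lists."""
    fused = defaultdict(float)
    for results in result_lists:
        ranking = sorted(results, key=results.get, reverse=True)
        for rank, doc in enumerate(ranking, start=1):
            fused[doc] += 1.0 / (k + rank)
    return dict(fused)


# Hypothetical result lists for one query from a lexical and a neural ranker.
lexical = {"d1": 12.0, "d2": 8.0, "d3": 4.0}
neural = {"d2": 0.9, "d4": 0.5, "d1": 0.1}

fused_sum = comb_sum([lexical, neural])
fused_mnz = comb_mnz([lexical, neural])
fused_rrf = rrf([lexical, neural])
```
      </preformat>
      <p>The score-based methods need the two score scales to be comparable, hence the normalization step, whereas RRF uses only ranks and needs no normalization. All three share the same input and output shape, which is what a standardized interface would rely on.</p>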
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion and Long-term Goals</title>
      <p>To conclude our discussion, we introduce the long-term goals of our library. Besides adding more
metrics and other Metasearch methods, we plan to build a companion repository for storing
runs of state-of-the-art models accompanied by rich metadata for searching and indexing.
By integrating this online repository with ranx, we aim to allow researchers to download
pre-computed runs and compare the results of their models with those of state-of-the-art
approaches in just a few seconds. We think such functionality could help accelerate research in
Information Retrieval, allowing researchers to rapidly find appropriate baselines and avoiding
time-consuming and error-prone tasks entirely, such as re-implementing or re-training complex
retrieval models from scratch. Moreover, sharing runs of state-of-the-art models could promote
virtuous behaviors and transparency and reduce electricity consumption and pollution.
</p>
      <p>[27] D. Lillis, L. Zhang, F. Toolan, R. W. Collier, D. Leonard, J. Dunnion, Estimating probabilities for effective data fusion, in: F. Crestani, S. Marchand-Maillet, H. Chen, E. N. Efthimiadis, J. Savoy (Eds.), Proceeding of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2010, Geneva, Switzerland, July 19-23, 2010, ACM, 2010, pp. 347–354. URL: https://doi.org/10.1145/1835449.1835508. doi:10.1145/1835449.1835508.</p>
      <p>[28] A. Mourão, F. Martins, J. Magalhães, Multimodal medical information retrieval with unsupervised rank fusion, Comput. Medical Imaging Graph. 39 (2015) 35–45. URL: https://doi.org/10.1016/j.compmedimag.2014.05.006. doi:10.1016/j.compmedimag.2014.05.006.</p>
      <p>[29] S. E. Robertson, S. Walker, Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval, in: Proceedings of the 17th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval. Dublin, Ireland, 3-6 July 1994 (Special Issue of the SIGIR Forum), ACM/Springer, 1994. doi:10.1007/978-1-4471-2099-5_24.</p>
      <p>[30] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, in: Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, 2017.</p>
      <p>[31] J. Lin, R. Nogueira, A. Yates, Pretrained Transformers for Text Ranking: BERT and Beyond, Synthesis Lectures on Human Language Technologies, Morgan &amp; Claypool Publishers, 2021. URL: https://doi.org/10.2200/S01123ED1V01Y202108HLT053. doi:10.2200/S01123ED1V01Y202108HLT053.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Harman</surname>
          </string-name>
          , Information Retrieval Evaluation,
          <source>Synthesis Lectures on Information Concepts</source>
          , Retrieval, and Services, Morgan &amp; Claypool Publishers,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanderson</surname>
          </string-name>
          ,
          <article-title>Test collection based evaluation of information retrieval systems</article-title>
          ,
          <source>Found. Trends Inf. Retr</source>
          .
          <volume>4</volume>
          (
          <year>2010</year>
          )
          <fpage>247</fpage>
          -
          <lpage>375</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Järvelin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kekäläinen</surname>
          </string-name>
          ,
          <article-title>Cumulated gain-based evaluation of IR techniques</article-title>
          ,
          <source>ACM Trans. Inf. Syst</source>
          .
          <volume>20</volume>
          (
          <year>2002</year>
          )
          <fpage>422</fpage>
          -
          <lpage>446</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Voorhees</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Harman</surname>
          </string-name>
          ,
          <article-title>Experiment and evaluation in information retrieval</article-title>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
          <article-title>Declarative experimentation in information retrieval using pyterrier</article-title>
          , in: ICTIR, ACM,
          <year>2020</year>
          , pp.
          <fpage>161</fpage>
          -
          <lpage>168</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          , S. MacAvaney, I. Ounis, Pyterrier:
          <article-title>Declarative experimentation in python from BM25 to dense retrieval</article-title>
          , in: CIKM, ACM,
          <year>2021</year>
          , pp.
          <fpage>4526</fpage>
          -
          <lpage>4533</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C. V.</given-names>
            <surname>Gysel</surname>
          </string-name>
          , M. de Rijke,
          <article-title>Pytrec_eval: An extremely fast python interface to trec_eval</article-title>
          , in: SIGIR, ACM,
          <year>2018</year>
          , pp.
          <fpage>873</fpage>
          -
          <lpage>876</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J. R. M.</given-names>
            <surname>Palotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Scells</surname>
          </string-name>
          , G. Zuccon,
          <article-title>Trectools: an open-source python library for information retrieval practitioners involved in trec-like campaigns</article-title>
          , in: SIGIR, ACM,
          <year>2019</year>
          , pp.
          <fpage>1325</fpage>
          -
          <lpage>1328</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Breuer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maistro</surname>
          </string-name>
          , P. Schaer,
          <article-title>repro_eval: A python interface to reproducibility measures of system-oriented IR experiments</article-title>
          ,
          <source>in: ECIR (2)</source>
          , volume
          <volume>12657</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2021</year>
          , pp.
          <fpage>481</fpage>
          -
          <lpage>486</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lucchese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. I.</given-names>
            <surname>Muntean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Nardini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Perego</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Trani</surname>
          </string-name>
          ,
          <article-title>Rankeval: Evaluation and investigation of ranking models</article-title>
          ,
          <source>SoftwareX</source>
          <volume>12</volume>
          (
          <year>2020</year>
          )
          <fpage>100614</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lucchese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. I.</given-names>
            <surname>Muntean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Nardini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Perego</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Trani</surname>
          </string-name>
          ,
          <string-name>
            <surname>Rankeval:</surname>
          </string-name>
          <article-title>An evaluation and analysis framework for learning-to-rank solutions</article-title>
          , in: SIGIR, ACM,
          <year>2017</year>
          , pp.
          <fpage>1281</fpage>
          -
          <lpage>1284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E.</given-names>
            <surname>Bassani</surname>
          </string-name>
          ,
          <article-title>ranx: A blazing-fast python library for ranking evaluation and comparison</article-title>
          , in: M.
          <string-name>
            <surname>Hagen</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Verberne</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Macdonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Seifert</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Balog</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Nørvåg</surname>
          </string-name>
          , V. Setty (Eds.),
          <source>Advances in Information Retrieval - 44th European Conference on IR Research</source>
          , ECIR
          <year>2022</year>
          , Stavanger, Norway,
          <source>April 10-14</source>
          ,
          <year>2022</year>
          , Proceedings,
          <string-name>
            <surname>Part</surname>
            <given-names>II</given-names>
          </string-name>
          , volume
          <volume>13186</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2022</year>
          , pp.
          <fpage>259</fpage>
          -
          <lpage>264</lpage>
          . URL: https://doi.org/10.1007/978-3-030-99739-7_30. doi:10.1007/978-3-030-99739-7_30.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Abras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Maloney-Krichmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Preece</surname>
          </string-name>
          , et al.,
          <article-title>User-centered design</article-title>
          , Bainbridge,
          <string-name>
            <surname>W.</surname>
          </string-name>
          <article-title>Encyclopedia of Human-Computer Interaction</article-title>
          .
          <source>Thousand Oaks: Sage Publications</source>
          <volume>37</volume>
          (
          <year>2004</year>
          )
          <fpage>445</fpage>
          -
          <lpage>456</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Lam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pitrou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Seibert</surname>
          </string-name>
          ,
          <article-title>Numba: a llvm-based python JIT compiler, in: LLVM@SC</article-title>
          ,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          ,
          <year>2015</year>
          , pp.
          <volume>7</volume>
          :
          <fpage>1</fpage>
          -
          <issue>7</issue>
          :
          <fpage>6</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Aycock</surname>
          </string-name>
          ,
          <article-title>A brief history of just-in-time</article-title>
          ,
          <source>ACM Comput. Surv</source>
          .
          <volume>35</volume>
          (
          <year>2003</year>
          )
          <fpage>97</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T. E.</given-names>
            <surname>Oliphant</surname>
          </string-name>
          , A guide to NumPy, volume
          <volume>1</volume>
          ,
          <string-name>
            <given-names>Trelgol</given-names>
            <surname>Publishing</surname>
          </string-name>
          <string-name>
            <surname>USA</surname>
          </string-name>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S. van der</given-names>
            <surname>Walt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Colbert</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. Varoquaux,</surname>
          </string-name>
          <article-title>The numpy array: A structure for efficient numerical computation</article-title>
          ,
          <source>Comput. Sci. Eng</source>
          .
          <volume>13</volume>
          (
          <year>2011</year>
          )
          <fpage>22</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>C. R.</given-names>
            <surname>Harris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. J.</given-names>
            <surname>Millman</surname>
          </string-name>
          ,
          <string-name>
            <surname>S. van der Walt</surname>
          </string-name>
          , R. Gommers,
          <string-name>
            <given-names>P.</given-names>
            <surname>Virtanen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Wieser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Taylor</surname>
          </string-name>
          , S. Berg,
          <string-name>
            <given-names>N. J.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Picus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hoyer</surname>
          </string-name>
          ,
          <string-name>
            <surname>M. H. van Kerkwijk</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Brett</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Haldane</surname>
            ,
            <given-names>J. F.</given-names>
          </string-name>
          <string-name>
            <surname>del Río</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Wiebe</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Peterson</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Gérard-Marchant</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <string-name>
            <surname>Sheppard</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Reddy</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          <string-name>
            <surname>Weckesser</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Abbasi</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Gohlke</surname>
            ,
            <given-names>T. E.</given-names>
          </string-name>
          <string-name>
            <surname>Oliphant</surname>
          </string-name>
          ,
          <article-title>Array programming with numpy</article-title>
          ,
          <source>Nat</source>
          .
          <volume>585</volume>
          (
          <year>2020</year>
          )
          <fpage>357</fpage>
          -
          <lpage>362</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>W.</given-names>
            <surname>McKinney</surname>
          </string-name>
          , et al.,
          <article-title>pandas: a foundational python library for data analysis and statistics, Python for high performance and scientific computing 14 (</article-title>
          <year>2011</year>
          )
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>S.</given-names>
            <surname>MacAvaney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Yates</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Feldman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Downey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cohan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goharian</surname>
          </string-name>
          ,
          <article-title>Simplified data wrangling with ir_datasets</article-title>
          , in: SIGIR, ACM,
          <year>2021</year>
          , pp.
          <fpage>2429</fpage>
          -
          <lpage>2436</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>N.</given-names>
            <surname>Thakur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rücklé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Gurevych</surname>
          </string-name>
          ,
          <article-title>BEIR: A heterogeneous benchmark for zero-shot evaluation of information retrieval models</article-title>
          , in: J. Vanschoren, S. Yeung (Eds.),
          <source>Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, NeurIPS Datasets and Benchmarks 2021</source>
          , December 2021, virtual,
          <year>2021</year>
          . URL: https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/65b9eea6e1cc6bb9f0cd2a47751a186f-Abstract-round2.html.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>T.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rosenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Tiwary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Majumder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <article-title>MS MARCO: A human generated machine reading comprehension dataset</article-title>
          , in: T. R. Besold, A. Bordes, A. S. d'Avila Garcez, G. Wayne (Eds.),
          <source>Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016 co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, December 9, 2016</source>
          , volume
          <volume>1773</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2016</year>
          . URL: http://ceur-ws.org/Vol-1773/CoCoNIPS_2016_paper9.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Aslam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Montague</surname>
          </string-name>
          ,
          <article-title>Models for metasearch</article-title>
          , in: W. B. Croft, D. J. Harper, D. H. Kraft, J. Zobel (Eds.),
          <source>SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, September 9-13, 2001, New Orleans, Louisiana, USA</source>
          , ACM,
          <year>2001</year>
          , pp.
          <fpage>275</fpage>
          -
          <lpage>284</lpage>
          . URL: https://doi.org/10.1145/383952.384007. doi:10.1145/383952.384007.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Shaw</surname>
          </string-name>
          ,
          <article-title>Combination of multiple searches</article-title>
          , in:
          <source>TREC</source>
          , volume
          <volume>500-215</volume>
          of NIST Special Publication, National Institute of Standards and Technology (NIST)
          ,
          <year>1993</year>
          , pp.
          <fpage>243</fpage>
          -
          <lpage>252</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Analyses of multiple evidence combination</article-title>
          , in: SIGIR, ACM,
          <year>1997</year>
          , pp.
          <fpage>267</fpage>
          -
          <lpage>276</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>G. V.</given-names>
            <surname>Cormack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L. A.</given-names>
            <surname>Clarke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Büttcher</surname>
          </string-name>
          ,
          <article-title>Reciprocal rank fusion outperforms condorcet and individual rank learning methods</article-title>
          , in: SIGIR, ACM,
          <year>2009</year>
          , pp.
          <fpage>758</fpage>
          -
          <lpage>759</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>