<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using CHERCHE to Empower Newcomers into Neural Information Retrieval - Extended Abstract</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Raphaël Sourty</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jose G. Moreno</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lynda Tamine</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
<string-name>François-Paul Servant</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Renault</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Toulouse, IRIT, UMR 5505 CNRS</institution>
          ,
          <addr-line>F-31000, Toulouse</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
<p>In this paper, we briefly present CHERCHE, a new open-source Python module for building information retrieval pipelines with transformers. Our aim is to propose an easy-to-plug tool capable of executing simple but strong, state-of-the-art information retrieval models and of showing their capabilities on small collections. To do so, we have integrated classical models based on lexical matching as well as recent models based on semantic matching. CHERCHE is oriented to newcomers to neural IR who want to use transformer-based models on small collections without struggling with heavy tools. The code and documentation of CHERCHE are publicly available at https://github.com/raphaelsty/cherche</p>
        <p>INTRODUCTION Most of the existing tools for neural IR focus on the training of new models and give little attention to the use of existing ones, which makes their integration harder in out-of-the-box Information Retrieval (IR) systems. Indeed, even if a pre-trained model for IR is available, its integration into a portable IR system or into a larger system (e.g., question answering, summarization, entity linking, etc.) is not a straightforward task. Additionally, an important number of IR users may not be interested in training their own models while still being interested in using recent, publicly available advances in the field. Currently, more than 29,000 public models are available on the Hugging Face model hub, with more than 8,000 based on BERT and more than 300 fine-tuned for sentence similarity1. Our tool, called CHERCHE [1], was developed as an option to fill this gap, and here we test its capabilities on small IR collections. We also aim to empower newcomers to neural IR to explore pipelines with pretrained models without extra effort. Our full architecture is depicted in Figure 1. As intended, when using CHERCHE, only a few lines are needed to read, process, and evaluate a neural model.
We expect that by reducing the load in the use of these models, more IR users will be motivated to integrate neural IR models into larger systems. The main contributions of CHERCHE, and thus of this abstract paper, are:</p>
      </abstract>
      <kwd-group>
<kwd>Neural Information Retrieval</kwd>
        <kwd>Python Library</kwd>
        <kwd>Information Retrieval Pipelines</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
• A new tool that reduces the load when integrating transformer-based models
• A new option to explore new (expert-driven) pipelines within larger systems by using neural IR models
RELATED TOOLS Multiple Python-based IR tools are publicly available nowadays. Some of the most popular, pyterrier [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and pyserini, are backed by their Java counterparts, Terrier and Anserini [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which in turn are based on Lucene2. Both are standard, well-established alternatives when considering a Python-based IR tool. Although both are developed by strong communities, both tools are “heavy”3 to install and use, as extra steps are needed (e.g., starting a Java virtual machine is required in both cases even when only Python code is used). This contrasts with the NLP alternatives for similar downstream tasks.
CHERCHE Here, we summarize CHERCHE. An extended presentation of CHERCHE can be found in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
Installation We opted for a light installation from the pip repository, as many other Python libraries do, so the single line “pip install cherche” performs the full installation of CHERCHE.
Retriever Retrievers speed up a neural search pipeline by filtering out documents that are likely not relevant. We implemented the most common retrievers based on lexical matching between the query and the documents. However, recent models also use semantic similarity combined with approximate search, based on faiss [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], to speed up the process. Here is the list of retrievers available in CHERCHE: TfIdf, BM25L and BM25Okapi, Elastic, Lunr, Flash, Encoder, DPR, and Fuzz. Currently, only the Elastic retriever is recommended for large corpora in CHERCHE; the other retrievers are suited to small corpora.
Reranker Rerankers then pull up documents based on semantic similarity. Rerankers are models that measure the semantic similarity between a document and a query.
      </p>
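<p>To make the retriever idea concrete, here is a minimal, self-contained sketch of lexical retrieval in the spirit of the TfIdf retriever listed above. It is illustrative only: the class name, fields, and scoring details are ours, not CHERCHE's actual implementation, which delegates to dedicated libraries.</p>

```python
import math
from collections import Counter

# Toy lexical retriever: scores documents by tf-idf overlap with the query
# and keeps the top-k. Illustrative sketch only, not CHERCHE's TfIdf class.
class TfIdfRetriever:
    def __init__(self, documents, on="article", k=3):
        self.documents = documents
        self.k = k
        tokenized = [doc[on].lower().split() for doc in documents]
        n = len(tokenized)
        # Document frequency of each term, then smoothed idf.
        df = Counter(term for tokens in tokenized for term in set(tokens))
        self.idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
        self.tf = [Counter(tokens) for tokens in tokenized]

    def __call__(self, q):
        terms = q.lower().split()
        scored = []
        for doc, tf in zip(self.documents, self.tf):
            score = sum(tf[t] * self.idf.get(t, 0.0) for t in terms)
            if score > 0:  # filter out documents with no lexical match
                scored.append((score, doc))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[: self.k]]

documents = [
    {"id": 0, "article": "Paris is the capital of France"},
    {"id": 1, "article": "Toulouse is a city in France"},
    {"id": 2, "article": "Montreal is in Canada"},
]
retriever = TfIdfRetriever(documents)
print([doc["id"] for doc in retriever("france capital")])  # prints [0, 1]
```

<p>Note how the retriever drops the non-matching document entirely: this filtering role is what makes a cheap lexical stage a good front-end for a more expensive reranker.</p>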
    </sec>
    <sec id="sec-2">
      <title>2https://lucene.apache.org/</title>
<p>3It is clear that the current installation process is clean; we say “heavy” to refer to the fact that a Java virtual machine is needed.
The reranker reorders the documents returned by the retriever based on the semantic similarity between the query and the retrieved documents. Rerankers are compatible with all the retrievers in CHERCHE. To ease the integration of new models, CHERCHE supports SentenceTransformers models available in the Hugging Face model hub. This opens up a multitude of models that can be used. Additionally, local models can also be specified, as the system supports the standard class loader of Hugging Face models. Here is the list of rerankers available in CHERCHE: Encoder and DPR.</p>
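<p>As a sketch of what a reranker does, the following self-contained example reorders candidate documents by cosine similarity between query and document vectors. A bag-of-words encoder stands in for the SentenceTransformers encoders that CHERCHE actually plugs in; all names and signatures here are illustrative, not the library's API.</p>

```python
import math
from collections import Counter

# Stand-in encoder: bag-of-words counts instead of a transformer embedding.
def encode(text):
    return Counter(text.lower().split())

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Reranker: reorder the retriever's candidates by query-document similarity.
def rerank(query, documents, on="article", k=2):
    q = encode(query)
    scored = [(cosine(q, encode(doc[on])), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

candidates = [
    {"id": 0, "article": "the capital of France is Paris"},
    {"id": 1, "article": "France is famous for cheese"},
]
print([doc["id"] for doc in rerank("capital of France", candidates)])
# prints [0, 1]
```

<p>Swapping the bag-of-words encoder for a sentence encoder is exactly the point where a pretrained transformer enters the pipeline.</p>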
<p>Pipelines CHERCHE overloads the operators ‘+’ (plus), ‘|’ (union) and ‘&amp;’ (intersection) to build pipelines efficiently.</p>
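<p>The overloaded operators can be sketched with a toy Stage class in which ‘+’ composes a retriever with a ranker and ‘|’ merges two result lists. This is a conceptual illustration under names of our own (Stage, fn), not CHERCHE's actual classes.</p>

```python
# Toy pipeline stages composed with overloaded operators.
class Stage:
    def __init__(self, fn):
        self.fn = fn

    def __call__(self, query, documents):
        return self.fn(query, documents)

    def __add__(self, other):
        # '+' chains stages: the output of self feeds into other.
        return Stage(lambda query, documents: other(query, self(query, documents)))

    def __or__(self, other):
        # '|' takes the union of two stages' results, keeping order,
        # without duplicates.
        def union(query, documents):
            left = self(query, documents)
            return left + [doc for doc in other(query, documents) if doc not in left]
        return Stage(union)

retriever = Stage(lambda q, docs: [d for d in docs if q.split()[0] in d])
ranker = Stage(lambda q, docs: sorted(docs, key=len))
pipeline = retriever + ranker  # retrieve, then rerank

docs = ["france wins", "tour de france race", "canada"]
print(pipeline("france", docs))  # prints ['france wins', 'tour de france race']
```

<p>The same pattern extends to intersection: a third dunder method would keep only documents returned by both stages.</p>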
<p>Evaluation A pipeline can be evaluated using pairs of queries and answers. Pipeline objects support evaluation with several metrics, including F1, Precision, Recall, and P-Recall. However, we used external evaluation libraries in our experiments.</p>
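<p>The evaluation loop over query-answer pairs can be sketched as follows; the function names and the stub search pipeline are ours, shown only to make the metric definitions concrete.</p>

```python
# Evaluate a search pipeline on (query, relevant ids) pairs:
# macro-averaged precision, recall, and their F1.
def evaluate(search, pairs):
    precisions, recalls = [], []
    for query, relevant in pairs:
        retrieved = set(search(query))
        hits = len(retrieved.intersection(relevant))
        precisions.append(hits / len(retrieved) if retrieved else 0.0)
        recalls.append(hits / len(relevant) if relevant else 0.0)
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return {"precision": round(p, 3), "recall": round(r, 3), "f1": round(f1, 3)}

# A stub standing in for a retriever/reranker pipeline.
index = {"paris": [0, 1], "toulouse": [2]}
search = lambda q: index.get(q, [])

pairs = [("paris", {0}), ("toulouse", {2})]
print(evaluate(search, pairs))
# prints {'precision': 0.75, 'recall': 1.0, 'f1': 0.857}
```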
<p>EXPERIMENTS WITH CHERCHE In order to highlight the characteristics of CHERCHE, we performed a set of experiments using two small IR datasets. A simple pipeline strategy was used, i.e., a pipeline composed of a retriever followed by a reranker.</p>
      <p>
Datasets We used the vaswani4 and scifact5 datasets in all experiments. In both cases, we opted for the Python libraries that offer easy access to the datasets, BEIR [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and ir_datasets [6], to highlight the flexibility of opting for CHERCHE.
      </p>
<p>Details of both datasets are presented in Table 1. Note that both datasets are small, as experiments were performed without Elastic6.</p>
<p>Metrics Although CHERCHE proposes an internal evaluation module, we opted for standard IR evaluation metrics, including MAP, NDCG@5, NDCG@10, and NDCG@20. The publicly available pytrec_eval7 [7] implementation was used as the evaluation tool.</p>
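<p>For reference, ndcg@k, the metric reported in the results below, can be computed as in this short sketch. It is a plain illustration of what the number measures; pytrec_eval was used for the actual experiments.</p>

```python
import math

# Discounted cumulative gain of a list of relevance gains.
def dcg(gains):
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

# ndcg@k: dcg of the ranking's top-k, normalized by the ideal dcg.
def ndcg_at_k(ranking, relevance, k=10):
    gains = [relevance.get(doc, 0) for doc in ranking[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    return dcg(gains) / dcg(ideal) if dcg(ideal) else 0.0

relevance = {"d1": 1, "d3": 1}  # binary relevance judgments for one query
print(round(ndcg_at_k(["d1", "d2", "d3"], relevance, k=3), 3))
```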
<p>Models We used a set of the most downloaded models on the Hugging Face hub for the sentence encoders. The full list of models is presented in Table 2.</p>
      <p>RESULTS</p>
      <p>Table 3 presents the summary of our results on both datasets. Note that we included pyterrier</p>
    </sec>
    <sec id="sec-3">
      <title>4http://ir.dcs.gla.ac.uk/resources/test_collections/npl/ 5https://scifact.apps.allenai.org/ 6Elastic is needed for larger datasets. 7https://github.com/cvangysel/pytrec_eval</title>
<p>Table 3 (ndcg@10):
                       scifact    vaswani
pyterrier PL2          0.2060*    0.4245
pyterrier LambdaMART   0.2043*    -
retriever only         0.2590     0.4316
CHERCHE AVG (A:H)      0.1827     0.3456
CHERCHE BEST           0.2508     0.4741</p>
      <p>as a baseline and, when available, we reported the results from its official repository8; otherwise, we performed experiments to obtain them. In both datasets, we used Lunr as the retriever, as it performs similarly to pyterrier in terms of ndcg@10 on the vaswani dataset9. As a main result, note that, based on Table 3, the best reranker strongly outperforms the retriever, but this is not the case</p>
    </sec>
    <sec id="sec-4">
      <title>8https://github.com/terrier-org/pyterrier</title>
<p>9Also note that, as mentioned before, CHERCHE is clearly an option for small datasets, but standard libraries, such as pyterrier, are better adapted to large datasets.
when averaging the performance of all models (AVG). This result highlights the importance of selecting an adapted transformer model, which can easily be done when using CHERCHE.
vaswani Results for the eight rerankers listed in Table 2 are presented in Figure 2. Note that model A outperforms the other alternatives and clearly outperforms the single retriever. Indeed, models A, B, and C outperform the retriever's performance, while the other models underperform it. This trend was observed for most of the metrics used.
scifact Results on scifact are presented in Figure 3 and follow the vaswani results, i.e., the retriever performance is outperformed by a clear margin by models A and B. However, model C did not manage to obtain a good performance, but model E does across multiple metrics. Finally, model G is clearly not an option for this dataset. This shows one of the features of CHERCHE: the rapid identification of candidate models to integrate into a production pipeline.
CONCLUSION This paper briefly presents CHERCHE, a library for defining neural search pipelines. Our library was developed to be light and portable to new environments, as a tool to quickly evaluate and integrate neural IR models into multiple text-related tasks, including question answering. Although it can be used to develop new models, CHERCHE targets newcomers to neural search who want to try transformer-based pipelines on their own collections. Our results show that state-of-the-art performance can easily be achieved using CHERCHE.
on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2), 2021.</p>
      <p>URL: https://openreview.net/forum?id=wCu6T5xFjeJ.
[6] S. MacAvaney, A. Yates, S. Feldman, D. Downey, A. Cohan, N. Goharian, Simplified data
wrangling with ir_datasets, in: SIGIR, 2021.
[7] C. Van Gysel, M. de Rijke, Pytrec_eval: An extremely fast python interface to trec_eval, in:
SIGIR, ACM, 2018.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sourty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tamine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.-P.</given-names>
            <surname>Servant</surname>
          </string-name>
          ,
          <article-title>Cherche: A new tool to rapidly implement pipelines in information retrieval</article-title>
          ,
<source>in: Proceedings of SIGIR 2022</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
<article-title>Declarative experimentation in information retrieval using pyterrier</article-title>
          ,
<source>in: Proceedings of ICTIR 2020</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>Anserini: Enabling the use of lucene for information retrieval research</article-title>
          ,
          <source>in: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR '17</source>
          ,
          <year>2017</year>
          , p.
          <fpage>1253</fpage>
          -
          <lpage>1256</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Johnson</surname>
          </string-name>
,
          <string-name>
            <given-names>M.</given-names>
            <surname>Douze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jégou</surname>
          </string-name>
          ,
          <article-title>Billion-scale similarity search with GPUs</article-title>
          ,
          <source>IEEE Transactions on Big Data</source>
          <volume>7</volume>
          (
          <year>2019</year>
          )
          <fpage>535</fpage>
          -
          <lpage>547</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>N.</given-names>
            <surname>Thakur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rücklé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
<string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>BEIR: A heterogeneous benchmark for zero-shot evaluation of information retrieval models</article-title>
          , in: Thirty-fifth Conference
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>