<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Team OpenWebSearch at CLEF 2024: QuantumCLEF</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maik Fröbe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daria Alexander</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gijs Hendriksen</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ferdinand Schlatt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthias Hagen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Martin Potthast</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Friedrich-Schiller-Universität Jena</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Radboud Universiteit Nijmegen</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Kassel</institution>
          ,
          <addr-line>hessian.AI, ScaDS.AI</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>We describe the OpenWebSearch group's participation in the CLEF 2024 QuantumCLEF IR feature selection task. Our submitted runs build on the observation that the importance of features in learning-to-rank models can vary and contradict itself when the training setup changes. To address this problem and identify a subset of features that is robust across diverse downstream training procedures, we bootstrap feature importance scores by repeatedly training models on randomly selected subsets of features and measuring the features' importance in the trained models. We indeed observe that feature importance varies widely across different bootstraps and also contradicts itself. We hypothesized that quantum annealers could explore this complex optimization landscape better than simulated annealers. However, we find that quantum annealers do not find substantially better solutions, nor ones that yield substantially more effective learning-to-rank models.</p>
      </abstract>
      <kwd-group>
        <kwd>learning-to-rank</kwd>
        <kwd>bootstrapping</kwd>
        <kwd>feature selection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Learning-to-Rank aims to identify a combination of features that produces an effective ranking [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Even
in the era of pre-trained transformers [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], feature-based learning-to-rank remains important as it can
integrate features not available in transformers, compensating for knowledge to which transformers
have no access [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ]. Especially commercial search engines might combine many features; e.g., a recent
leak claims that Google search incorporates more than 14 000 features into its ranking.1
      </p>
      <p>
        Such scenarios highlight the importance of proper feature selection, as different search systems
(even if they might be bundled behind a single UI) might target different tasks (expressed via an
evaluation scenario, e.g., an evaluation measure with a test dataset) that require different sets of features.
In the scenario of the QuantumCLEF task [
        <xref ref-type="bibr" rid="ref5 ref6 ref7">5, 6, 7</xref>
        ], we start from the original quadratic unconstrained
binary optimization problem prepared in the official tutorial [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and contrast the components of this optimization
problem with bootstrapped alternatives. Bootstrapping is a frequently used approach in statistics that
draws repeated samples of some data, e.g., when the mean of a population is not meaningful or cannot
be calculated (e.g., for categorical values) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. We use bootstrapping for feature selection by repeatedly
training LambdaMART models on randomly sampled feature subsets of the training data. Thereby, we follow the intuition
that the original optimization problem, which uses the mutual information and the conditional mutual
information, cannot capture all potentially interesting dependencies that might impact which features
are important. Our code and the bootstrapped feature-importance scores are available online.2
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>We will review related work on bootstrapping and feature selection in information retrieval that inspired
our work.</p>
      <p>Algorithm 1 Bootstrapping Feature Importance Scores
Require: F, y: features for learning to rank with target predictions y</p>
      <p>
N: number of desired bootstrapped feature importance scores
lightGBM: LightGBM training procedure
sample: a sampling approach
1: I ← []
2: while |I| &lt; N do
3: F′, y′ ← sample(F, y)
4: model ← lightGBM.train(F′, y′)
5: I ← I + [model.calculateFeatureImportance()]
6: end while
7: return I
Bootstrapping in Information Retrieval Bootstrapping, i.e., the process of repeatedly sampling
from the same distribution, has been used previously in information retrieval, e.g., to sample from the
relevance judgments, from the topics, or from the document corpus [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The leave-out-uniques test is
a form of re-sampling of relevance judgments used to estimate the reusability of test collections [
        <xref ref-type="bibr" rid="ref11 ref12">11,
12, 13</xref>
        ]. Bootstrapping topics has been used for significance tests [14, 15] and for assessing the
discriminatory power of evaluation measures [16, 17, 18]. Analogously, bootstrapping the document
corpus can help to simulate different corpora [18], to estimate whether results transfer to other corpora [19],
or, again, to meta-evaluate evaluation measures [18]. Given the wide applicability of bootstrapping
in the field of information retrieval, we now intend to apply it to learning to rank. Contrary to the
approaches discussed above, our approach mainly focuses on re-sampling the set of features that
subsequent learning-to-rank models can access.
      </p>
      <p>
        Feature Selection Feature selection approaches are either filter methods, wrapper methods, or
embedded methods [20], distinguished by how deeply (if at all) they integrate with the learning
algorithm [21]. Filter methods have no integration with the learning algorithm [21] (i.e., they run before the
learning starts); e.g., the original quadratic unconstrained binary optimization prepared in the official
QuantumCLEF tutorial [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] falls into this category. Wrapper methods use a search algorithm to select the
features [22], whereas embedded methods integrate the selection into the actual learning phase [21]. Our
approach falls into the category of wrapper methods. There is already a high number of existing feature
selection approaches for learning to rank [22, 21, 23, 24, 25, 26]; comparing them with, or integrating
them into, bootstrapping could be an interesting direction for future work.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Selecting Important Features with Bootstrapping</title>
      <p>
        This section describes our bootstrapping approach for feature selection. Conceptually, we formulate
a quadratic unconstrained binary optimization problem [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] that can be optimized via simulated
annealing and via quantum annealing. The number of features that our feature selection selects is a
hyperparameter that one could optimize, but we leave this for future work and always select the top-25
features (our focus was on the MQ2007 dataset, which has around 50 features, so we intuitively selected
25 as the number of features to target). We create three optimization formulations for our bootstrapping
feature selection that differ in whether they incorporate mutual information optimization objectives. We
submitted our three approaches within the qCLEF platform [27] for simulated annealers and quantum
annealers, yielding six runs overall.
      </p>
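      <p>The QUBO formulations we describe below are minimized with exactly 25 selected features. As an illustration only, the following is a toy simulated-annealing loop for such a cardinality-constrained QUBO; all names are our own, and this is not the annealer of the qCLEF platform:</p>

```python
import math
import random

def anneal_qubo(Q, k, steps=2000, t0=1.0, seed=0):
    """Toy simulated annealing for a QUBO given as a dense matrix Q
    (list of lists): minimize x.Q.x over binary x with exactly k ones.
    Swap moves keep the cardinality fixed at k."""
    rng = random.Random(seed)
    n = len(Q)
    selected = set(rng.sample(range(n), k))

    def energy(sel):
        s = sorted(sel)
        # upper-triangular evaluation: diagonal = linear part, i < j = quadratic part
        return sum(Q[i][j] for pos, i in enumerate(s) for j in s[pos:])

    e = energy(selected)
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9  # linear cooling schedule
        i = rng.choice(sorted(selected))
        j = rng.choice(sorted(set(range(n)) - selected))
        candidate = (selected - {i}) | {j}
        ce = energy(candidate)
        # accept improvements always, worsenings with Boltzmann probability
        if ce < e or rng.random() < math.exp((e - ce) / t):
            selected, e = candidate, ce
    return sorted(selected)
```

      <p>With a diagonal that strongly rewards two of the features, the loop converges to exactly those two; a production run would instead use the platform's simulated or quantum annealer.</p>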
      <p>Algorithm 1 shows our bootstrapping algorithm. The algorithm has the features F, the target label y,
the number of bootstraps N, a LightGBM training procedure, and a sampling approach as input.
Subsequently, each bootstrapping iteration first samples a subset of features F′ together with their
corresponding ground truth labels y′. With this sampled set of features, a LambdaMART model is
trained, for which the feature importance is calculated and added to the return vector I. For the training
of the LambdaMART models, we use the LightGBM [28] implementation in PyTerrier [29]. We do
not tune the hyperparameters of LambdaMART but use the hyperparameters from a different project
without adaptation [30]. We sample the features F′ by randomly shuffling the feature records and selecting
a random subset of 25 features.</p>
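      <p>A minimal sketch of Algorithm 1 in Python; the train callable and its feature_importance interface are placeholders of our own for the LightGBM/PyTerrier training, not a real API:</p>

```python
import random

def bootstrap_importances(features, y, n_bootstraps, train, subset_size=25, seed=0):
    """Algorithm 1 as a sketch: repeatedly sample a feature subset F',
    train a model on it, and collect per-feature importance scores.

    `features` maps feature name -> column of values; `train(subset, y)`
    must return an object with a `feature_importance()` dict (a stand-in
    for the LambdaMART/LightGBM interface)."""
    rng = random.Random(seed)
    names = sorted(features)
    importances = []  # the result vector I
    for _ in range(n_bootstraps):
        subset = rng.sample(names, min(subset_size, len(names)))  # F'
        model = train({name: features[name] for name in subset}, y)  # y' = y
        importances.append(model.feature_importance())
    return importances
```

      <p>Each entry of the returned list only covers the features sampled in that bootstrap, which is why the QUBO construction later skips bootstraps that did not sample a feature.</p>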
      <p>
        To incorporate the bootstrapped feature importance scores into the feature selection, we include
them in an optimization criterion that can be optimized by quantum annealers and by simulated
annealers. To this end, we use the quadratic unconstrained binary optimization (QUBO) formulation
that minimizes the following objective [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]:
      </p>
      <p>x⃗ · Q · x⃗ = ∑_i q_i · x_i + ∑_{i&lt;j} q_{i,j} · x_i · x_j</p>
      <p>
        Where ∑_i q_i · x_i is the linear part of the QUBO and ∑_{i&lt;j} q_{i,j} · x_i · x_j is the quadratic part. The
official starting point of the shared task fills the linear part of the QUBO with the negative mutual
information between a feature and the ground truth label and the quadratic part with the negative
conditional mutual information between two features and the ground truth label [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. To incorporate
our bootstrapped feature importance, we use the following formulation for the linear part:
      </p>
      <p>q_i · x_i = ∑_{k=1}^{N} imp_k(i) / |imp_k|</p>
      <p>Where N is the number of bootstraps, imp_k(i) is the importance of feature i in the k-th bootstrapped model,
and |imp_k| is the overall importance in the k-th bootstrapped model. Analogously, we implement the quadratic part of the bootstrapping
QUBO via:</p>
      <p>q_{i,j} · x_i · x_j = ∑_{k=1}^{N} (imp_k(i) + imp_k(j)) / |imp_k|</p>
      <p>Where N is the number of bootstraps, imp_k(i) is the importance of feature i in the k-th bootstrapped model,
imp_k(j) is the importance of feature j in the k-th bootstrapped model, and |imp_k| is the overall importance. In
both bootstrapping equations, we skip, for a feature i or a feature combination i, j, bootstraps that do
not include the feature because it was not sampled.</p>
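      <p>The two bootstrapping equations can be sketched as follows; the names are ours, and the negative sign is our assumption so that minimizing the QUBO favors important features, mirroring the negative (conditional) mutual information of the official formulation:</p>

```python
def bootstrap_qubo(importances):
    """Fill a QUBO dict from bootstrapped importance scores.

    `importances` is a list of dicts (one per bootstrap k) mapping the
    sampled features to their importance imp_k(i). Bootstraps that did
    not sample a feature are skipped for that entry, as in the text."""
    q = {}  # (i, i) -> linear weight, (i, j) with i < j -> quadratic weight
    for imp in importances:
        total = sum(imp.values()) or 1.0  # |imp_k|, the overall importance
        feats = sorted(imp)
        for a, i in enumerate(feats):
            q[(i, i)] = q.get((i, i), 0.0) - imp[i] / total
            for j in feats[a + 1:]:
                q[(i, j)] = q.get((i, j), 0.0) - (imp[i] + imp[j]) / total
    return q
```

      <p>A dict keyed by index pairs is also the shape that common QUBO solvers accept, with the diagonal entries carrying the linear part.</p>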
      <p>To summarize the points above, we have four parts to build QUBO formulations, two from the original
mutual information formulation and two from our new bootstrapping formulation. We combine them
to produce three systems that we run on simulated and quantum annealers:
mi-linear-bootstrapped-boost-3 This QUBO uses the linear part of our bootstrapping formulation
and the quadratic part from the original conditional mutual information. We multiply the
bootstrapping scores by 3, as this factor provided results on a similar scale as the previous mutual
information (identified by manual inspection).
mi-linear-and-quadratic-bootstrapped-boost-3 This QUBO uses the linear and quadratic part of
our bootstrapping formulation. We again multiply the bootstrapping scores by 3.
mi-bootstrap-mixture This QUBO uses the average of the mutual information and our bootstrapping
variant for the linear and quadratic part.</p>
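      <p>Assuming both the mutual-information parts and the bootstrapping parts are available as QUBO dicts keyed by (i, j) pairs, the three variants above can be sketched as follows (the keys mirror the run names; the helper itself is ours):</p>

```python
def combine_qubos(mi, boot, variant):
    """Combine the mutual-information QUBO `mi` and the bootstrapping
    QUBO `boot` (both dicts keyed by (i, j); i == j is the linear part)
    into one of the three submitted formulations."""
    keys = set(mi) | set(boot)
    if variant == "mi-linear-bootstrapped-boost-3":
        # bootstrapped linear part (boosted by 3), MI quadratic part
        return {k: 3.0 * boot.get(k, 0.0) if k[0] == k[1] else mi.get(k, 0.0)
                for k in keys}
    if variant == "mi-linear-and-quadratic-bootstrapped-boost-3":
        # both parts from bootstrapping, boosted by 3
        return {k: 3.0 * boot.get(k, 0.0) for k in keys}
    if variant == "mi-bootstrap-mixture":
        # average of both formulations for linear and quadratic parts
        return {k: (mi.get(k, 0.0) + boot.get(k, 0.0)) / 2.0 for k in keys}
    raise ValueError(variant)
```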
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>
        We provide evaluations of our methods compared to the baseline of using all features on the MQ2007
and Istella [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] datasets. We report the results in terms of nDCG@10, reporting the 25th, the 50th,
and the 75th quantile (Q.25, Q.50, and Q.75, respectively) and the mean of the nDCG@10 for all our three
approaches for simulated annealing and quantum annealing.
      </p>
      <p>Table 1 shows the results for the MQ2007 dataset. We observe that all feature selection approaches
slightly improve upon the baseline of selecting all features, with the bootstrapping variants
outperforming the mixed variant; the QUBO that uses the linear and quadratic bootstrapping parts is the most
effective one, for both simulated and quantum annealing.</p>
      <p>Table 2 shows the results for the Istella dataset. We observe that all feature selection approaches
are substantially less effective than the baseline of using all features. Investigating how this can be
resolved is interesting future work.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>We presented the OpenWebSearch (OWS) team's submission to the QuantumCLEF shared task at
CLEF 2024. The motivation behind our approach was that LambdaMART models trained on shuffled
datasets might choose different features as important ones. Therefore, we repeatedly train LambdaMART
models on randomized feature sets and measure the importance of the features in the trained models. For
the MQ2007 dataset, our approach slightly outperforms the baseline, while for the Istella dataset,
simply selecting all features is substantially more effective than our feature selection. For future work,
we believe that accurately determining the number of to-be-selected features is an important next step,
as this would help to not reduce the effectiveness in the Istella scenario.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work has received funding from the European Union’s Horizon Europe research and innovation
program under grant agreement No 101070014 (OpenWebSearch.EU, https://doi.org/10.3030/101070014).</p>
      <p>[13] J. Zobel, How reliable are the results of large-scale information retrieval experiments?, in: W. B.</p>
      <p>Croft, A. Moffat, C. J. van Rijsbergen, R. Wilkinson, J. Zobel (Eds.), SIGIR ’98: Proceedings of the
21st Annual International ACM SIGIR Conference on Research and Development in Information
Retrieval, August 24-28 1998, Melbourne, Australia, ACM, 1998, pp. 307–314. URL: https://doi.org/
10.1145/290941.291014. doi:10.1145/290941.291014.
[14] J. Savoy, Statistical inference in retrieval effectiveness evaluation, Inf. Process. Manag. 33 (1997)
495–512. URL: https://doi.org/10.1016/S0306-4573(97)00027-7. doi:10.1016/S0306-4573(97)
00027-7.
[15] M. D. Smucker, J. Allan, B. Carterette, A comparison of statistical significance tests for information
retrieval evaluation, in: M. J. Silva, A. H. F. Laender, R. A. Baeza-Yates, D. L. McGuinness, B. Olstad,
Ø. H. Olsen, A. O. Falcão (Eds.), Proceedings of the Sixteenth ACM Conference on Information
and Knowledge Management, CIKM 2007, Lisbon, Portugal, November 6-10, 2007, ACM, 2007, pp.
623–632. URL: https://doi.org/10.1145/1321440.1321528. doi:10.1145/1321440.1321528.
[16] T. Sakai, Evaluating evaluation metrics based on the bootstrap, in: E. N. Efthimiadis, S. T. Dumais,
D. Hawking, K. Järvelin (Eds.), SIGIR 2006: Proceedings of the 29th Annual International ACM
SIGIR Conference on Research and Development in Information Retrieval, Seattle, Washington,
USA, August 6-11, 2006, ACM, 2006, pp. 525–532. URL: https://doi.org/10.1145/1148170.1148261.
doi:10.1145/1148170.1148261.
[17] T. Sakai, On the reliability of information retrieval metrics based on graded relevance, Inf. Process.</p>
      <p>Manag. 43 (2007) 531–548. URL: https://doi.org/10.1016/j.ipm.2006.07.020. doi:10.1016/J.IPM.
2006.07.020.
[18] J. Zobel, L. Rashidi, Corpus bootstrapping for assessment of the properties of effectiveness measures,
in: M. d’Aquin, S. Dietze, C. Hauff, E. Curry, P. Cudré-Mauroux (Eds.), CIKM ’20: The 29th ACM
International Conference on Information and Knowledge Management, Virtual Event, Ireland,
October 19-23, 2020, ACM, 2020, pp. 1933–1952. URL: https://doi.org/10.1145/3340531.3411998.
doi:10.1145/3340531.3411998.
[19] G. V. Cormack, T. R. Lynam, Statistical precision of information retrieval evaluation, in: E. N.</p>
      <p>Efthimiadis, S. T. Dumais, D. Hawking, K. Järvelin (Eds.), SIGIR 2006: Proceedings of the 29th
Annual International ACM SIGIR Conference on Research and Development in Information
Retrieval, Seattle, Washington, USA, August 6-11, 2006, ACM, 2006, pp. 533–540. URL: https:
//doi.org/10.1145/1148170.1148262. doi:10.1145/1148170.1148262.
[20] I. Guyon, A. Elisseeff, An introduction to variable and feature selection, J. Mach. Learn. Res. 3
(2003) 1157–1182. URL: http://jmlr.org/papers/v3/guyon03a.html.
[21] M. B. Shirzad, M. R. Keyvanpour, A systematic study of feature selection methods for learning to
rank algorithms, Int. J. Inf. Retr. Res. 8 (2018) 46–67. URL: https://doi.org/10.4018/IJIRR.2018070104.
doi:10.4018/IJIRR.2018070104.
[22] A. Gigli, C. Lucchese, F. M. Nardini, R. Perego, Fast feature selection for learning to rank, in:
B. Carterette, H. Fang, M. Lalmas, J. Nie (Eds.), Proceedings of the 2016 ACM on International
Conference on the Theory of Information Retrieval, ICTIR 2016, Newark, DE, USA, September
12-16, 2016, ACM, 2016, pp. 167–170. URL: https://doi.org/10.1145/2970398.2970433. doi:10.1145/
2970398.2970433.
[23] M. F. Dacrema, F. Moroni, R. Nembrini, N. Ferro, G. Faggioli, P. Cremonesi, Towards feature
selection for ranking and classification exploiting quantum annealers, in: E. Amigó, P. Castells,
J. Gonzalo, B. Carterette, J. S. Culpepper, G. Kazai (Eds.), SIGIR ’22: The 45th International ACM
SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11-
15, 2022, ACM, 2022, pp. 2814–2824. URL: https://doi.org/10.1145/3477495.3531755. doi:10.1145/
3477495.3531755.
[24] X. Geng, T. Liu, T. Qin, H. Li, Feature selection for ranking, in: W. Kraaij, A. P. de Vries, C. L. A.</p>
      <p>Clarke, N. Fuhr, N. Kando (Eds.), SIGIR 2007: Proceedings of the 30th Annual International
ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam,
The Netherlands, July 23-27, 2007, ACM, 2007, pp. 407–414. URL: https://doi.org/10.1145/1277741.
1277811. doi:10.1145/1277741.1277811.
[25] G. Hua, M. Zhang, Y. Liu, S. Ma, L. Ru, Hierarchical feature selection for ranking, in: M. Rappa,
P. Jones, J. Freire, S. Chakrabarti (Eds.), Proceedings of the 19th International Conference on
World Wide Web, WWW 2010, Raleigh, North Carolina, USA, April 26-30, 2010, ACM, 2010, pp.
1113–1114. URL: https://doi.org/10.1145/1772690.1772830. doi:10.1145/1772690.1772830.
[26] K. D. Naini, I. S. Altingövde, Exploiting result diversification methods for feature selection in
learning to rank, in: M. de Rijke, T. Kenter, A. P. de Vries, C. Zhai, F. de Jong, K. Radinsky, K. Hofmann
(Eds.), Advances in Information Retrieval - 36th European Conference on IR Research, ECIR 2014,
Amsterdam, The Netherlands, April 13-16, 2014. Proceedings, volume 8416 of Lecture Notes in
Computer Science, Springer, 2014, pp. 455–461. URL: https://doi.org/10.1007/978-3-319-06028-6_41.
doi:10.1007/978-3-319-06028-6\_41.
[27] A. Pasin, M. F. Dacrema, P. Cremonesi, N. Ferro, qclef: A proposal to evaluate quantum
annealing for information retrieval and recommender systems, in: A. Arampatzis, E. Kanoulas,
T. Tsikrika, S. Vrochidis, A. Giachanou, D. Li, M. Aliannejadi, M. Vlachos, G. Faggioli, N. Ferro
(Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction - 14th International
Conference of the CLEF Association, CLEF 2023, Thessaloniki, Greece, September 18-21, 2023,
Proceedings, volume 14163 of Lecture Notes in Computer Science, Springer, 2023, pp. 97–108. URL:
https://doi.org/10.1007/978-3-031-42448-9_9. doi:10.1007/978-3-031-42448-9\_9.
[28] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, T.-Y. Liu, LightGBM: A Highly Efficient</p>
      <p>Gradient Boosting Decision Tree, Advances in Neural Information Processing Systems 30 (2017).
[29] C. Macdonald, N. Tonellotto, Declarative experimentation in information retrieval using pyterrier,
in: Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information
Retrieval, 2020, pp. 161–168.
[30] D. Alexander, M. Fröbe, G. Hendriksen, F. Schlatt, M. Hagen, D. H. and Martin Potthast, A. P. de Vries,
Team OpenWebSearch at CLEF 2024: LongEval, in: G. Faggioli, N. Ferro, P. Galuščáková, A. G. S.
de Herrera (Eds.), Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum
(CLEF 2024), Grenoble, France, September 9th to 12th, 2024, CEUR Workshop Proceedings, 2024.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] T. Liu, Learning to Rank for Information Retrieval, Springer, 2011. URL: https://doi.org/10.1007/978-3-642-14267-3. doi:10.1007/978-3-642-14267-3.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] J. Lin, R. F. Nogueira, A. Yates, Pretrained Transformers for Text Ranking: BERT and Beyond, Synthesis Lectures on Human Language Technologies, Morgan &amp; Claypool Publishers, 2021. URL: https://doi.org/10.2200/S01123ED1V01Y202108HLT053. doi:10.2200/S01123ED1V01Y202108HLT053.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] D. Dato, S. MacAvaney, F. M. Nardini, R. Perego, N. Tonellotto, The Istella22 dataset: Bridging traditional and neural learning to rank evaluation, in: E. Amigó, P. Castells, J. Gonzalo, B. Carterette, J. S. Culpepper, G. Kazai (Eds.), SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11-15, 2022, ACM, 2022, pp. 3099-3107. URL: https://doi.org/10.1145/3477495.3531740. doi:10.1145/3477495.3531740.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] M. Fröbe, S. Günther, M. Probst, M. Potthast, M. Hagen, The Power of Anchor Text in the Neural Retrieval Era, in: M. Hagen, S. Verberne, C. Macdonald, C. Seifert, K. Balog, K. Nørvåg, V. Setty (Eds.), Advances in Information Retrieval. 44th European Conference on IR Research (ECIR 2022), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 2022.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] A. Pasin, M. F. Dacrema, P. Cremonesi, N. Ferro, QuantumCLEF - Quantum Computing at CLEF, in: N. Goharian, N. Tonellotto, Y. He, A. Lipani, G. McDonald, C. Macdonald, I. Ounis (Eds.), Advances in Information Retrieval - 46th European Conference on Information Retrieval, ECIR 2024, Glasgow, UK, March 24-28, 2024, Proceedings, Part V, volume 14612 of Lecture Notes in Computer Science, Springer, 2024, pp. 482-489. URL: https://doi.org/10.1007/978-3-031-56069-9_66. doi:10.1007/978-3-031-56069-9_66.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] A. Pasin, M. Ferrari Dacrema, P. Cremonesi, N. Ferro, QuantumCLEF 2024: Overview of the Quantum Computing Challenge for Information Retrieval and Recommender Systems at CLEF, in: Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), Grenoble, France, September 9th to 12th, 2024, 2024.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] A. Pasin, M. Ferrari Dacrema, P. Cremonesi, N. Ferro, Overview of QuantumCLEF 2024: The Quantum Computing Challenge for Information Retrieval and Recommender Systems at CLEF, in: Experimental IR Meets Multilinguality, Multimodality, and Interaction - 15th International Conference of the CLEF Association, CLEF 2024, Grenoble, France, September 9-12, 2024, Proceedings, 2024.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Dacrema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pasin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <article-title>Quantum computing for information retrieval and recommender systems</article-title>
          , in:
          <string-name>
            <given-names>N.</given-names>
            <surname>Goharian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lipani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>McDonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          (Eds.),
          <source>Advances in Information Retrieval - 46th European Conference on Information Retrieval, ECIR 2024, Glasgow, UK, March 24-28, 2024, Proceedings, Part V</source>
          , volume
          <volume>14612</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2024</year>
          , pp.
          <fpage>358</fpage>
          -
          <lpage>362</lpage>
          . URL: https://doi.org/10.1007/978-3-031-56069-9_47. doi:10.1007/978-3-031-56069-9_47.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Efron</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tibshirani</surname>
          </string-name>
          ,
          <source>An Introduction to the Bootstrap</source>
          , Springer,
          <year>1993</year>
          . URL: https://doi.org/10.1007/978-1-4899-4541-9. doi:10.1007/978-1-4899-4541-9.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Gienapp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <article-title>Bootstrapped nDCG Estimation in the Presence of Unjudged Documents</article-title>
          ,
          in:
          <source>Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023)</source>
          , volume
          <volume>13980</volume>
          of Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2023</year>
          , pp.
          <fpage>313</fpage>
          -
          <lpage>329</lpage>
          . doi:10.1007/978-3-031-28244-7_20.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>Buckley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dimmick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Soboroff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Voorhees</surname>
          </string-name>
          ,
          <article-title>Bias and the limits of pooling for large collections</article-title>
          ,
          <source>Inf. Retr</source>
          .
          <volume>10</volume>
          (
          <year>2007</year>
          )
          <fpage>491</fpage>
          -
          <lpage>508</lpage>
          . URL: https://doi.org/10.1007/s10791-007-9032-x. doi:10.1007/s10791-007-9032-x.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Voorhees</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Craswell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>Too many relevants: Whither Cranfield test collections?</article-title>
          , in:
          <string-name>
            <given-names>E.</given-names>
            <surname>Amigó</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Castells</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Carterette</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Culpepper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kazai</surname>
          </string-name>
          (Eds.),
          <source>SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11-15, 2022</source>
          , ACM,
          <year>2022</year>
          , pp.
          <fpage>2970</fpage>
          -
          <lpage>2980</lpage>
          . URL: https://doi.org/10.1145/3477495.3531728. doi:10.1145/3477495.3531728.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>