<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Adaptive Re-Ranking as an Information-Seeking Agent</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sean MacAvaney</string-name>
          <email>sean.macavaney@glasgow.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicola Tonellotto</string-name>
          <email>nicola.tonellotto@unipi.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Craig Macdonald</string-name>
          <email>craig.macdonald@glasgow.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Glasgow</institution>
          ,
          <country country="UK">United Kingdom</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Pisa</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
<p>Re-ranking systems are typically limited by the recall of the initial retrieval function. A recent work proposed adaptive re-ranking, which modifies the re-ranking loop to progressively prioritise documents likely to receive high scores based on the highest scoring ones thus far. The original work framed this process as an incarnation of the well-established clustering hypothesis. In this work, we argue that the approach can also be framed as an information-seeking agent. From this perspective, we explore several variations of the graph-based adaptive re-ranking algorithm and find that there is substantial room for improvement by modifying the agent. However, the agents that we explore are more sensitive to the new parameters they introduce than the simple-yet-effective approach proposed in the original adaptive re-ranking work.</p>
      </abstract>
      <kwd-group>
        <kwd>Neural re-ranking</kwd>
        <kwd>Clustering hypothesis</kwd>
        <kwd>Retrieval agents</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        the incremental nature of re-ranking documents, typically done in batches to make the best use
of specialised hardware like GPUs and TPUs. After each batch, information is gained about
which documents get the highest relevance score, which can be leveraged to make a more
informed decision about which documents to rank next (a process the authors call Adaptive
Re-Ranking). By making use of the clustering hypothesis [
        <xref ref-type="bibr" rid="ref10 ref11 ref9">9, 10, 11</xref>
        ] and pre-computed similarity
scores between documents (via a corpus graph), documents that are similar to the highest-ranked
documents can be identified and prioritised, thereby allowing documents missed in the first
stage to be retrieved. This particular formulation of Adaptive Re-Ranking is referred to as
Graph-based Adaptive Re-ranking (G A R ). The variant of G A R proposed by MacAvaney et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]
used a simple strategy for balancing (a) the scoring of batches of existing candidate documents
(identified by the first stage retriever) and (b) discovering new candidates (by scoring documents
similar to the top ranked results so far). Their approach simply alternates between (a) and (b),
batch by batch. We refer to this strategy as A l t e r n a t e .
      </p>
      <p>We argue that the approach for re-ranking can be thought of as an information-seeking agent
and explained in terms of user browsing strategies. For instance, an agent using the traditional
re-ranking approach simply scans down a list of search results looking for items that are
relevant to its information need. Meanwhile, an agent using the Alternate adaptive approach
alternates between exploring documents from the search results and documents similar to the
most relevant ones it has found so far. We observe that this may not be the process by which a
reasonable human user would traverse search results; for instance, a user may only look at
documents similar to those that are sufficiently relevant. Therefore, in this work we explore a
variety of agent-based strategies for adaptive re-ranking, inspired by behaviours that a human
may take when seeking information.</p>
      <p>In summary, we explore new agent-based strategies for the recently-proposed Graph-based
Adaptive Re-Ranking approach. We establish an upper-bound performance for smarter agents
and explore several other strategies. In doing so, we demonstrate the potential breadth of
possible strategies that can be applied to GAR.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background and Related Work</title>
      <p>
        Cascading architectures are commonly deployed in information retrieval. In principle, these
are driven by an efficiency-effectiveness tradeoff – the most effective models cannot be applied
to all documents in the index in a timely manner [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Hence, cheap ranking stages (with high
recall) are typically combined with more expensive ranking stages (to obtain higher precision).
Ranking cascades are common in web search using learning-to-rank, where an initial retrieval
stage uses a cheap ranking function, commonly BM25, to identify a candidate set of documents.
This is followed by a feature computation stage, and then application of the learned model [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Limiting the number of documents retrieved in the first stage – which we call the budget in this
paper – has the effect of reducing recall, but enhances overall efficiency [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Later, Tonellotto
et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] proposed a selective approach that could adjust the budget on a per-query basis for
efficient &amp; effective retrieval.
      </p>
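      <p>The cascade described above can be sketched as follows; the <code>overlap</code> and <code>length_score</code> functions below are hypothetical stand-ins for the cheap first-stage and expensive second-stage scorers, used only to make the sketch runnable.</p>

```python
from typing import Callable, Dict, List

def cascade_rank(query: str,
                 docs: Dict[str, str],
                 first_stage: Callable[[str, str], float],
                 second_stage: Callable[[str, str], float],
                 budget: int) -> List[str]:
    # Cheap first stage selects a candidate set of size `budget`;
    # the expensive second stage re-ranks only those candidates.
    candidates = sorted(docs, key=lambda d: first_stage(query, docs[d]),
                        reverse=True)[:budget]
    return sorted(candidates, key=lambda d: second_stage(query, docs[d]),
                  reverse=True)

# Hypothetical scorers: term overlap as the cheap stage, and a dummy
# "expensive" stage that favours longer documents.
def overlap(query: str, doc: str) -> float:
    return float(len(set(query.split()) & set(doc.split())))

def length_score(query: str, doc: str) -> float:
    return float(len(doc.split()))

docs = {"d1": "cheap ranking stage",
        "d2": "expensive neural ranking model stage",
        "d3": "unrelated text"}
ranking = cascade_rank("ranking stage", docs, overlap, length_score, budget=2)
# "d3" is pruned by the first stage and never reaches the expensive scorer.
```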
      <p>
        More recently, multi-stage ranking has been commonly used for applying neural re-ranking
models. For instance, cross-encoder neural models [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] must be applied jointly on the query
and the document to compute a relevance score. Since the cross-encoder scoring function can
be computationally expensive, re-ranking is often limited to a predefined maximum number of
documents that the system is willing to re-rank for each query (i.e., a re-ranking budget, such
as 1000). Hence, the application of a first (so-called sparse) retriever to obtain a candidate
set is necessary for such models. However, this limits the recall of the candidate set, in that
relevant documents not matching the query terms (and which may obtain a high score from the
re-ranking cross-encoder) will not be retrieved.
      </p>
      <p>To address this gap, several recent neural ranking approaches have been proposed: (i)
dense retrieval models based on bi-encoders, and (ii) learned sparse models.</p>
      <p>
        In bi-encoder-based dense retrieval models [
        <xref ref-type="bibr" rid="ref4 ref5 ref6 ref7">4, 5, 6, 7</xref>
        ], the documents are pre-encoded into
embedded representations, and nearest neighbour lookup is performed based on an embedded
representation of the query [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. However, online nearest neighbour search is more expensive
than use of an inverted index-based sparse retrieval, hence inverted indexes are adopted for
ifrst-stage retrieval. Of note, ColBERT [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] uses a multi-stage dense retrieval architecture – an
approximate nearest neighbour search, followed by an exact neural scoring.
      </p>
      <p>Learned sparse models encompass the use of pre-trained language models (PLMs) on the document representation to
make a refined representation that can be indexed by existing sparse inverted indexing tools. For
instance, doc2query [18, 19] enriches the sparse document representation with possible queries
for which the document could be retrieved. DeepCT [20], uniCOIL [21, 22], DeepImpact [23, 24]
and SPLADE [25, 26] all use a PLM to generate a new sparse representation of the document.</p>
      <p>
        This work focuses on a recently-proposed complementary approach: Graph-Based Adaptive
Re-Ranking (GAR) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Rather than limiting the candidate pool of documents for re-ranking to
those from the first stage, GAR augments the re-ranking loop to identify documents that are
similar to the highest-scoring ones identified so far. In the following sections, we cover this
process in more detail (Section 3) and identify that this process can be seen as an information-seeking
agent (Section 4). We use this formulation to propose alternative adaptive re-ranking
agents and compare their effectiveness (Section 6).
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Graph-based Adaptive Re-Ranking</title>
      <p>GAR is formulated as a re-ranking process. In a re-ranking process, the documents from an
initial candidate set, R0, are re-ranked by a scoring function, Score(), to obtain a final ranking
of R1 documents. The scoring function can be expensive to apply, for instance a cross-encoder
neural model such as monoT5. This means that efficiency constraints may limit the number
of applications of Score() that can be undertaken, which is called the re-ranking budget, c. In
practice, batches of documents are scored at the same time, making effective use of GPU or
TPU hardware.</p>
      <p>A typical non-adaptive re-ranking process is shown in Algorithm 1, where an initial ranking
(prioritised by rank) is scored in batches until the re-ranking budget is fully consumed.</p>
      <p>Algorithm 1 Non-Adaptive Re-Ranking
Input: Initial ranking R0, batch size b, budget c
Output: Re-Ranked pool R1
R1 ← ∅
do
 B ← Score(top b documents from R0, subject to c)
 R1 ← R1 ∪ B
 R0 ← R0 ⧵ B
while |R1| &lt; c</p>
      <p>In contrast, GAR proposed an adaptive approach to re-ranking, where highly scored
documents are used as seeds for nearest neighbour retrieval, rather than continuing down the ranking.
To perform this efficiently, a graph of document-document similarity (G) is created at index
time, which records the nearest documents for each given document. Since this graph is precomputed
and small, it can be used efficiently at ranking time to identify more documents to score.</p>
      <p>In particular, after each batch B of documents is scored, the nearest documents to those in
the batch are added to a “frontier” set of documents F. The frontier is prioritised by the scores
of the re-ranked documents, so the neighbours of the highest-scored documents are explored
first. The process then alternates between scoring the top documents from the initial ranking
R0 and scoring the top documents in the frontier F. Algorithm 2 shows this process.</p>
      <p>We call this Alternate, due to the manner in which it alternates between the initial ranking
R0 and the frontier F. In the following, we link GAR with intelligent agents, and show how
this framing allows further agent-based implementations.</p>
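      <p>As a concrete illustration, the alternating loop can be sketched as follows. This is a minimal sketch, not the authors' implementation: <code>score_batch</code> and <code>neighbours</code> are hypothetical stand-ins for the expensive scorer and the corpus-graph lookup, and the heap is one possible realisation of the score-prioritised frontier F.</p>

```python
import heapq
from typing import Callable, Dict, List, Tuple

def alternate_rerank(initial: List[str],
                     score_batch: Callable[[List[str]], Dict[str, float]],
                     neighbours: Callable[[str], List[str]],
                     batch_size: int,
                     budget: int) -> Dict[str, float]:
    # Alternate between scoring a batch from the initial ranking R0 and a
    # batch from the frontier F of corpus-graph neighbours, which is
    # prioritised by the score of the document each neighbour came from.
    r0 = list(initial)                        # remaining initial ranking
    frontier: List[Tuple[float, str]] = []    # min-heap of (-score, doc)
    r1: Dict[str, float] = {}                 # final scored pool
    use_initial = True
    while len(r1) < budget and (r0 or frontier):
        if use_initial and r0:
            batch, r0 = r0[:batch_size], r0[batch_size:]
        else:
            batch = []
            while frontier and len(batch) < batch_size:
                _, doc = heapq.heappop(frontier)
                if doc not in r1 and doc not in batch:
                    batch.append(doc)
        use_initial = not use_initial
        if not batch:
            continue
        batch_scores = score_batch(batch[: budget - len(r1)])
        r1.update(batch_scores)
        for doc, s in batch_scores.items():
            for n in neighbours(doc):          # expand the frontier
                if n not in r1:
                    heapq.heappush(frontier, (-s, n))
    return r1

# Toy run: "x" and "y" are graph neighbours absent from the initial ranking.
toy_scores = {"a": 0.9, "b": 0.2, "c": 0.1, "x": 0.8, "y": 0.3}
toy_graph = {"a": ["x"], "b": ["y"], "c": [], "x": [], "y": []}
result = alternate_rerank(["a", "b", "c"],
                          lambda batch: {d: toy_scores[d] for d in batch},
                          lambda d: toy_graph[d],
                          batch_size=1, budget=4)
# "c" is never scored: the budget went to the discovered neighbours instead.
```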
    </sec>
    <sec id="sec-4">
      <title>4. Re-Ranking Agents</title>
      <p>We argue that GAR re-ranking approaches can be interpreted as information-seeking agents,
which peruse retrieved documents, and decide how to act based on the relevance of the
documents found (such actions could be selecting more documents from the candidate set, or
selecting documents similar to those already scored highly). In doing so, they act like a user
deciding how to progress through a ranked list.</p>
      <p>Algorithm 2 Alternate Re-Ranking [8]
Input: Initial ranking R0, batch size b, budget c, corpus graph G
Output: Re-Ranked pool R1
R1 ← ∅; F ← ∅
do
 B ← Score(top b documents from R0, subject to c)
 R1 ← R1 ∪ B; R0 ← R0 ⧵ B
 F ← F ∪ (Neighbours(B, G) ⧵ R1)
 B ← Score(top b documents from F, subject to c)
 R1 ← R1 ∪ B; F ← F ⧵ B
 F ← F ∪ (Neighbours(B, G) ⧵ R1)
while |R1| &lt; c</p>
      <p>We treat existing re-ranking strategies as information-seeking agents, and in the following
we introduce new alternative agents. In all cases, we have the following: an environment, which
consists of an initial ranking of candidate documents, denoted as R0, a graph of document-document
similarities G, and the final retrieved documents list R1; and a set of available actions,
namely scoring documents from R0 and placing them in R1, and scoring documents from the frontier F and
placing them into R1. Some agents may also maintain additional state, such as a prioritised frontier F of
documents collected from the graph. We argue that such a fully-capable agent can be classified
as a model-based reflex agent within the classes of [27]. Indeed, a scoring model (e.g., monoT5)
estimates relevance, which is only partially observable. It can choose actions based on the state,
i.e., the current frontier.</p>
      <p>In the following, we list four possible agents based on the GAR formulation.</p>
      <sec id="sec-4-1">
        <title>4.1. TwoPhase</title>
        <p>A simple alternative to Alternate is to operate in two sequential phases. In the first phase,
all documents up to rank k &lt; c are scored. In the second phase, the remaining re-ranking
budget c − k is utilised by prioritising the neighbours of the highest-scoring documents. This
behaviour models a user who first investigates numerous pages of search results to find the most
relevant content, and then goes back to check for documents similar to the best ones identified.
Algorithm 3 shows this process. We consider two variations of this TwoPhase strategy: one
where the frontier continues to be updated based on the scores of the documents in the second
phase (TwoPhase-Refine), and one where the frontier only considers the first-phase documents
(TwoPhase-Fixed).</p>
        <p>Algorithm 3 TwoPhase Re-Ranking
Input: Initial ranking R0, batch size b, budget c, corpus graph G, first phase size k
Output: Re-Ranked pool R1
R1 ← ∅
do
 B ← Score(top b from R0, subject to c)
 R1 ← R1 ∪ B; R0 ← R0 ⧵ B
while |R1| &lt; k
F ← Neighbours(R1, G)
do
 B ← Score(top b from F, subject to c)
 R1 ← R1 ∪ B; F ← F ⧵ B
 F ← F ∪ (Neighbours(B, G) ⧵ R1) ▷ If “Refine” variant, otherwise skip this step
while |R1| &lt; c</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Threshold</title>
        <p>The Threshold agent only explores the corpus graph for sufficiently relevant documents. In
this setting, the initial ranking pool is updated to include the neighbours of scored documents
that exceed a relevance score threshold t. By adding these documents to the pool with infinite
priority, they are always scored before other documents in the pool. This agent is akin to a
searcher who, as they traverse a ranked list, investigates documents similar to ones that they
deem sufficiently relevant as they are encountered. Importantly, this agent considers the absolute
relevance score of documents, rather than the relative ordering of retrieved documents so far,
which is used by techniques like Alternate and TwoPhase. Algorithm 4 shows this process.</p>
        <p>Algorithm 4 Threshold Re-Ranking
Input: Initial ranking R0, batch size b, budget c, corpus graph G, threshold t
Output: Re-Ranked pool R1
R1 ← ∅
do
 B ← Score(top b from R0, subject to c)
 R1 ← R1 ∪ B; R0 ← R0 ⧵ B
 R0 ← R0 ∪ (Neighbours(documents from B that satisfy t, G) ⧵ R1), with priority ∞
while |R1| &lt; c</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Greedy</title>
        <p>We next explore a strategy that attempts to identify which pool of documents – the initial
ranking or the frontier – has recently given the most relevant document. We approach this
as a Greedy strategy, by always choosing the pool with the largest maximum score from the
most recent batch. This process is akin to a searcher who switches between searching through
a ranked list and the documents close to the best ones they found so far, based on the most
relevant document they identified recently from each pool. Algorithm 5 shows this process.</p>
        <p>Algorithm 5 Greedy Re-Ranking
Input: Initial ranking R0, batch size b, budget c, corpus graph G
Output: Re-Ranked pool R1
R1 ← ∅; F ← ∅; P ← R0
do
 B ← Score(top b documents from P, subject to c)
 R1 ← R1 ∪ B; P ← P ⧵ B
 F ← F ∪ (Neighbours(B, G) ⧵ R1)
 P ← the pool (R0 or F) whose most recent batch had the larger MaxScore(B)
while |R1| &lt; c</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Oracle</title>
        <p>Finally, to explore whether there is further room for improvement in adaptive re-ranking, we
apply an oracle strategy. This strategy makes use of the relevance assessments (qrels) to choose
which batch of documents to score. At each step, both the initial ranking and the frontier
are scored. Then, the algorithm proceeds by adding only the batch that improves the ranking
effectiveness (e.g., through a measure like nDCG) the most at that step. Naturally, this algorithm
is not useful in practice, given that it relies on ground-truth relevance knowledge.1 However, it
helps us establish whether there is further room for improvement in adaptive re-ranking agents.
Algorithm 6 shows this process.</p>
      </sec>
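      <p>To make the priority-queue mechanics of the Threshold agent concrete, the following is one plausible sketch (not the authors' implementation): <code>score</code> and <code>neighbours</code> are hypothetical stand-ins for the scoring model and the corpus-graph lookup, and "infinite priority" is realised with a negative-infinity heap key.</p>

```python
import heapq
from itertools import count
from typing import Callable, Dict, List

def threshold_rerank(initial: List[str],
                     score: Callable[[str], float],
                     neighbours: Callable[[str], List[str]],
                     budget: int,
                     t: float) -> Dict[str, float]:
    # Documents are scored in pool-priority order; neighbours of any
    # document whose score meets the threshold t are queued with
    # "infinite" priority, so they are scored before everything else.
    order = count()                # tie-breaker preserving insertion order
    pool = []                      # min-heap of (priority, order, doc)
    for rank, doc in enumerate(initial):
        heapq.heappush(pool, (float(rank), next(order), doc))
    r1: Dict[str, float] = {}
    while pool and len(r1) < budget:
        _, _, doc = heapq.heappop(pool)
        if doc in r1:
            continue
        s = score(doc)
        r1[doc] = s
        if s >= t:
            for n in neighbours(doc):
                if n not in r1:
                    heapq.heappush(pool, (float("-inf"), next(order), n))
    return r1

# Toy example: "a" passes the threshold, so its neighbour "x" jumps the queue.
graph = {"a": ["x"], "b": [], "c": [], "x": []}
scores = {"a": 0.9, "b": 0.5, "c": 0.4, "x": 0.7}
out = threshold_rerank(["a", "b", "c"], scores.__getitem__,
                       graph.__getitem__, budget=3, t=0.8)
```

<p>Note how, unlike Alternate, the decision to explore the graph here depends on an absolute score rather than relative rank.</p>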
    </sec>
    <sec id="sec-5">
      <title>5. Experimental Setup</title>
      <p>We conduct experiments using various adaptive re-ranking agents to answer the following
research questions:
RQ1 Is there room for further improvement in adaptive re-ranking strategies?
RQ2 Can smarter adaptive re-ranking strategies outperform the Alternate approach?
RQ3 How sensitive are alternative adaptive re-ranking strategies to their parameters?</p>
      <p>We conduct experiments using the TREC Deep Learning 2019 and 2020 passage ranking
datasets [28, 29]. We use 2019 as our “development” data, leaving 2020 as a held-out test set.
For both datasets, we report nDCG and Recall@1000 (with a minimum relevance threshold
of 2) performance. We fix our study to a MonoT5 (base) scoring function [30], a re-ranking
budget c = 1000, and a batch size b = 16. We explore four initial ranking functions: BM25,
TCT-ColBERT [31, 32], Doc2Query [19], and SPLADE++ (CoCondenser-EnsembleDistil version) [26].
1Of course, a perfect agent would ignore the scoring function entirely and just return documents by their human
assessments to get a perfect ranking. We limit the usage of the assessments for this agent to the decision of which
batch to score, to help us identify whether better decisions about which batches to score can further improve
adaptive re-ranking effectiveness.</p>
      <p>Table 1: nDCG and Recall@1000 on TREC DL 2019 (dev) and TREC DL 2020 (test) for the BM25»MonoT5, TCT»MonoT5, D2Q»MonoT5 and SPLADE»MonoT5 pipelines, each with the Non-Adaptive, Oracle, Alternate, TwoPhase-Fixed, TwoPhase-Refine, Threshold and Greedy agents, using the GARBM25 and GARTCT corpus graphs. The best-performing (non-oracle) agent in each section is listed in bold. Significant differences compared to the Non-Adaptive, Oracle, and Alternate systems are marked, respectively (paired t-test, p &lt; 0.05).</p>
      <p>Following the original GAR paper, we use two corpus graphs, based on BM25 similarity and
TCT-ColBERT.</p>
      <p>As we will show in Section 6, TwoPhase and Threshold are rather sensitive to their parameters,
so the results we present use the parameters that yield the highest nDCG performance on TREC
DL 2019. For TwoPhase, we test first-stage cutoff k values ranging from 100 to 900 (by 100). For
Threshold, we first identify the distribution of scores using the MS MARCO dev (small) dataset,
re-ranking 100 BM25 results. We then pick threshold t values based on percentiles from this
distribution, from 0.022 to 0.20 (by 0.02).3</p>
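      <p>The exact mapping from percentiles of the score distribution to threshold values is not spelled out above; one plausible sketch, defining the top-p threshold as the ⌈p·n⌉-th highest observed score, is:</p>

```python
import math
from typing import Iterable, List

def percentile_thresholds(observed: Iterable[float],
                          top_fractions: Iterable[float]) -> List[float]:
    # For each fraction p (e.g. 0.02 = the top 2% of scores), return the
    # score threshold met or exceeded by roughly a fraction p of the
    # observed scores.
    desc = sorted(observed, reverse=True)
    n = len(desc)
    thresholds = []
    for p in top_fractions:
        k = max(1, math.ceil(p * n))  # how many scores fall in the top fraction
        thresholds.append(desc[k - 1])
    return thresholds

# Example: with scores 1..100, the top-2% threshold is the 2nd-highest score.
sample = list(range(1, 101))
ts = percentile_thresholds(sample, [0.02, 0.10, 0.20])
```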
    </sec>
    <sec id="sec-6">
      <title>6. Results and Analysis</title>
      <p>2Here, 0.02 represents the top 2% of scores.
3This range was chosen after an initial study of orders of magnitude from 0.001 to 0.3 identified it as the most
promising.</p>
      <p>We first explore whether there exists room for improvement by exploring alternative agents
(RQ1). In all cases, the Oracle agent, which has knowledge of the relevance assessments
when choosing which batches of documents to score, has a higher nDCG performance than
all other agents. The difference in performance can often be considerable; for instance, the
best-performing non-oracle agent over the TCT pipeline using a BM25 graph has an nDCG of
0.733, while the oracle achieves an nDCG of 0.793. Differences between non-oracle agents and
the Oracle are not always significant, however. We note that while nDCG is often improved
using either oracle technique, recall often suffers compared to alternative agents. Perhaps this
is unsurprising, given that the oracles aim to directly optimise nDCG (recall is only considered
insofar as it would improve the nDCG). Significant improvements compared to the Oracle are
found more frequently when measuring nDCG using the TCT corpus graph (GARTCT) than the BM25
graph (GARBM25).</p>
      <p>In summary, to answer RQ1, by exploring an oracle agent, we find that there is considerable
room for improvement in adaptive re-ranking. In particular, exploration of the semantic graph
(GARTCT) by non-oracle agents is likely sub-optimal and has room for considerable improvement.</p>
      <p>Next, we look to answer RQ2 and RQ3. We find each of the proposed alternative agents to
be the most effective in at least one measurement. In one case, an alternative agent significantly
outperforms the existing Alternate agent (Threshold when re-ranking BM25 results using
GARTCT). Further, there are 11 cases where alternative strategies outperform the non-adaptive baseline,
while Alternate does not. However, there are sometimes significant reductions in effectiveness
when using the alternative agents (18 cases, all of which are on the held-out DL20 data). There is
also no clear pattern in which strategy is generally most effective. These results answer RQ2:
while some strategies can outperform Alternate, we do not find a clearly superior strategy
among the agents we test.</p>
      <p>
        Given that significant reductions are present when using the held-out DL20 data, we dive
deeper into the performance of TwoPhase and Threshold by plotting their performance for the
parameters they introduce (TwoPhase adds the first stage cutoff k and Threshold adds the relevance
threshold t). Figure 1 presents the nDCG performance. We see that in all cases, performance
varies considerably as the parameters are adjusted. Further, the optimal parameters for DL19
are often different than the optimal parameters for DL20, suggesting that the alternative agents
are not very robust to parameter selection. This is in particular contrast with the
robustness of the Alternate approach to its parameters, shown by MacAvaney et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. These
results answer RQ3: TwoPhase and Threshold introduce parameters that are more sensitive than
those present in the Alternate strategy.
      </p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusions &amp; Outlook</title>
      <p>In this work we investigated the recently-proposed Adaptive Re-Ranking process from the
perspective of information-seeking agents. We find that there is considerable room to improve
adaptive re-ranking by changing the agent strategy. We further proposed several alternative
strategies, and found that although they are promising, they are more sensitive to their
parameters than the simple A l t e r n a t e approach. This work sets the stage for future work that
investigates additional agent strategies, such as those inspired by multi-armed bandit models.
</p>
      <p>Figure 1: nDCG of the TwoPhase-Fixed, TwoPhase-Refine and Threshold agents as their introduced parameters vary (cutoff k from 100 to 900 for the TwoPhase variants; threshold t for Threshold), on DL19 and DL20 with the GARBM25 and GARTCT corpus graphs.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgements</title>
      <p>Sean and Craig acknowledge EPSRC grant EP/R018634/1: Closed-Loop Data Science for Complex,
Computationally- &amp; Data-Intensive Analytics.</p>
      <p>[18] R. Nogueira, W. Yang, J. Lin, K. Cho, Document expansion by query prediction, Preprint: arXiv:1904.08375 (2019).
[19] R. Nogueira, J. Lin, From doc2query to docTTTTTquery, Online preprint (2019).
[20] Z. Dai, J. Callan, Context-aware sentence/passage term importance estimation for first stage retrieval, Preprint: arXiv:1910.10687 (2019).
[21] L. Gao, Z. Dai, J. Callan, COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List, in: Proc. NAACL-HLT, 2021, pp. 3030–3042.
[22] J. Lin, X. Ma, A few brief notes on DeepImpact, COIL, and a conceptual framework for information retrieval techniques, Preprint: arXiv:2106.14807 (2021).
[23] A. Mallia, O. Khattab, T. Suel, N. Tonellotto, Learning passage impacts for inverted indexes, in: Proc. SIGIR, 2021.
[24] A. Mallia, J. Mackenzie, T. Suel, N. Tonellotto, Faster Learned Sparse Retrieval with Guided Traversal, in: Proc. SIGIR, 2022, p. 5.
[25] T. Formal, B. Piwowarski, S. Clinchant, SPLADE: Sparse Lexical and Expansion Model for First Stage Ranking, in: Proc. SIGIR, 2021, pp. 2288–2292.
[26] T. Formal, C. Lassance, B. Piwowarski, S. Clinchant, SPLADE v2: Sparse lexical and expansion model for information retrieval, Preprint (2021).
[27] S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, 3rd ed., Prentice Hall, 2010.
[28] N. Craswell, B. Mitra, E. Yilmaz, D. Campos, E. Voorhees, Overview of the TREC 2019 deep learning track, in: TREC, 2019.
[29] N. Craswell, B. Mitra, E. Yilmaz, D. Campos, Overview of the TREC 2020 deep learning track, in: TREC, 2020.
[30] R. Nogueira, Z. Jiang, R. Pradeep, J. Lin, Document ranking with a pretrained sequence-to-sequence model, in: Proc. EMNLP, 2020.
[31] S.-C. Lin, J.-H. Yang, J. Lin, Distilling dense representations for ranking using tightly-coupled teachers, Preprint: arXiv:2010.11386 (2020).
[32] S.-C. Lin, J.-H. Yang, J. Lin, In-batch negatives for knowledge distillation with tightly-coupled teachers for dense retrieval, in: Proc. RepL4NLP-2021 Workshop, 2021, pp. 163–173.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nogueira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Yates</surname>
          </string-name>
          ,
          <article-title>Pretrained Transformers for Text Ranking: BERT and Beyond</article-title>
          ,
          <source>Synthesis Lectures on Human Language Technologies</source>
          , Morgan &amp; Claypool Publishers,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Ou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ju</surname>
          </string-name>
          ,
          <article-title>Weakly supervised co-training of query rewriting andsemantic matching for e-commerce</article-title>
          ,
          <source>in: Proc. WSDM</source>
          ,
          <year>2019</year>
          , p.
          <fpage>402</fpage>
          -
          <lpage>410</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          ,
          <article-title>Efficient &amp; effective selective query rewriting with efficiency predictions</article-title>
          ,
          <source>in: Proc. SIGIR</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>495</fpage>
          -
          <lpage>504</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>O.</given-names>
            <surname>Khattab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zaharia</surname>
          </string-name>
          ,
          <article-title>ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT</article-title>
          ,
          <source>in: Proc. SIGIR</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.</given-names>
            <surname>Karpukhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Oğuz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Min</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Edunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.-t.</given-names>
            <surname>Yih</surname>
          </string-name>
          ,
          <article-title>Dense passage retrieval for open-domain question answering</article-title>
          ,
          <year>2020</year>
          . URL: https://arxiv.org/abs/2004.04906. doi:10.48550/ARXIV.2004.04906
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.-F.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bennett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Overwijk</surname>
          </string-name>
          ,
          <article-title>Approximate nearest neighbor negative contrastive learning for dense text retrieval</article-title>
          ,
          <source>in: Proceedings of ICLR</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <article-title>Optimizing Dense Retrieval Model Training with Hard Negatives</article-title>
          ,
          <source>in: Proc. SIGIR</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1503</fpage>
          -
          <lpage>1512</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>MacAvaney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <article-title>Adaptive re-ranking with a corpus graph</article-title>
          ,
          <source>in: Proc. CIKM</source>
          ,
          <year>2022</year>
          . URL: https://arxiv.org/abs/2208.08942. doi:10.1145/3511808.3557231.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N.</given-names>
            <surname>Jardine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>van Rijsbergen</surname>
          </string-name>
          ,
          <article-title>The use of hierarchic clustering in information retrieval</article-title>
          ,
          <source>Information Storage and Retrieval</source>
          <volume>7</volume>
          (
          <year>1971</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>O.</given-names>
            <surname>Kurland</surname>
          </string-name>
          ,
          <article-title>The cluster hypothesis in information retrieval</article-title>
          ,
          <source>in: Proc. SIGIR</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E. M.</given-names>
            <surname>Voorhees</surname>
          </string-name>
          ,
          <article-title>The cluster hypothesis revisited</article-title>
          ,
          <source>in: Proc. SIGIR</source>
          ,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          ,
          <article-title>Efficient query processing for scalable web search</article-title>
          ,
          <source>Foundations and Trends in Information Retrieval</source>
          <volume>12</volume>
          (
          <year>2018</year>
          )
          <fpage>319</fpage>
          -
          <lpage>492</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>T.-Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>Learning to rank for information retrieval</article-title>
          ,
          <source>Foundations and Trends in Information Retrieval</source>
          <volume>3</volume>
          (
          <year>2009</year>
          )
          <fpage>225</fpage>
          -
          <lpage>331</lpage>
          . URL: https://doi.org/10.1561/1500000016. doi:10.1561/1500000016.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. L. T.</given-names>
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          ,
          <article-title>The whens and hows of learning to rank for web search</article-title>
          ,
          <source>Information Retrieval</source>
          <volume>16</volume>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          ,
          <article-title>Efficient and effective retrieval using selective pruning</article-title>
          ,
          <source>in: Proc. WSDM</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>R.</given-names>
            <surname>Nogueira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <article-title>Passage Re-ranking with BERT</article-title>
          , arXiv preprint arXiv:1901.04085 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Douze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jégou</surname>
          </string-name>
          ,
          <article-title>Billion-scale similarity search with GPUs</article-title>
          ,
          <source>IEEE Trans. Big Data</source>
          <volume>7</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>