<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Accelerating Learned Sparse Indexes Via Term Impact Decomposition</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Joel Mackenzie</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Antonio Mallia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alistair Moffat</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthias Petri</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Amazon Alexa</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Melbourne</institution>
          ,
          <country country="AU">Australia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Queensland</institution>
          ,
          <country country="AU">Australia</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Novel inverted index-based learned sparse ranking models provide more effective, but less efficient, retrieval performance compared to traditional ranking models like BM25. In this paper, we introduce a technique we call postings clipping to improve the query efficiency of learned representations. Our technique amplifies the benefit of dynamic pruning query processing techniques by accounting for changes in the term importance distributions of learned ranking models. The new clipping mechanism accelerates top-k retrieval without any loss in effectiveness.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Sparse term importance representations such as DeepImpact [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and uniCOIL [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ] have enabled
the use of effective transformer-based text representations that can match the effectiveness of
recent dense text representations [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ] while still being supported by inverted indexes and their
query operations. This is important because inverted indexes have been optimized through 40 years
of research to provide search functionality in distributed settings at web scale, offering
a variety of time, space, and retrieval quality trade-offs, while also supporting efficient updates,
advanced querying modes such as phrase matching or filtering, and good scalability, all of
which are crucial in real-world settings [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ].
      </p>
      <p>
        One of the key indexing techniques that enables efficient top-k query processing is storing
additional metadata about index term importance scores (also referred to as impacts), seeking
to facilitate the bypassing of the majority of the matching documents, and thus allow faster
retrieval than would be possible via exhaustive disjunctive processing. For example, dynamic
pruning algorithms such as MaxScore [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and WAND [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] store the index-wide maximum impact
of each term; at query time, these impacts can be used to rapidly estimate document scores,
allowing documents that have no prospect of entering the current min-heap of k results to be
bypassed.
      </p>
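      <p>The bypass test that these index-wide maxima enable can be sketched as follows (a minimal illustration only: the impact values, the threshold, and the function name are ours, not part of the MaxScore or WAND implementations):</p>

```python
# Sketch of upper-bound bypassing as used by MaxScore/WAND-style dynamic
# pruning. The impact values and heap threshold below are illustrative.

def can_bypass(upper_bounds, threshold):
    """Documents matching only these terms may be skipped when even the
    sum of the terms' index-wide maximum impacts cannot beat the current
    heap threshold (the score of the k-th best document seen so far)."""
    return threshold > sum(upper_bounds)

# Example: two query terms with maximum impacts 3.1 and 1.7. Once the
# heap threshold exceeds their sum (4.8), any document containing only
# these two terms can be bypassed without being scored.
print(can_bypass([3.1, 1.7], threshold=5.0))  # True: skip the document
print(can_bypass([3.1, 1.7], threshold=4.0))  # False: must be scored
```

The more slowly the per-term maxima grow relative to the heap threshold, the more documents this test eliminates, which is exactly what the high maxima of learned models undermine.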
      <p>[Figure 1: Term impact distributions across list buckets [b], for BM25, DocT5Query, DeepImpact, and uniCOIL.]</p>
      <p>
        Traditional similarity models such as BM25 guarantee that frequent terms have low
importance scores, a symbiotic relationship that allows fast query processing. On the other hand,
recent transformer-based learned term importance techniques such as DeepImpact [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] are not
constrained by term occurrence frequency when assigning importance scores to terms in
documents.
      </p>
      <p>Indeed, learned representations assign high term importance to even very frequent terms,
whereas BM25 always assigns low importance to such terms. This divergent behavior
substantially reduces the ability of MaxScore and WAND to bypass low-impact documents during
querying, with both techniques relying on maximum list-wise impact scores to prune the search
space. Figure 1 highlights the pervasive nature of this issue.</p>
      <p>We present a new form of impact decomposition that we call postings clipping, and adapt
dynamic pruning mechanisms to enable efficient query processing for learned term importance
schemes. When integrated into the retrieval engine, postings clipping allows faster top-k
term-based querying, with negligible increases to index storage costs, and no effect on result
quality.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Postings Clipping</title>
      <p>
        Several authors have proposed explicitly or implicitly splitting postings lists into two (or
more) parts, a high-impact segment ℋ(t) and a low-impact segment ℒ(t), to facilitate efficient
processing; see, for example, Strohman and Croft [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], Ding and Suel [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], Daoud et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ],
Daoud et al. [14], Kane and Tompa [15] and Mackenzie et al. [16].
      </p>
      <p>Our proposal – denoted postings clipping – is illustrated in Figure 2. Rather than partitioning
the postings of each term t across ℒ(t) and ℋ(t), every posting remains in ℒ(t), and we “clip” the
high-impact postings by slicing them into two parts, forming a posting pair. The base part
remains in ℒ(t) as a posting with an impact equal to Uℒ(t), the maximum score contribution
permitted in ℒ(t); and the second component of the pair becomes a new posting in ℋ(t), to
account for the “trimmed” part of the original impact value, and retain the same total.</p>
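      <p>The slicing step can be sketched as follows (a simplified model with integer impacts; the function name, the example postings, and the clip level are illustrative, not taken from any implementation):</p>

```python
# Sketch of postings clipping: every posting stays in the low list L(t);
# a posting whose impact exceeds the clip level u_low keeps a base part
# equal to u_low in L(t), and its residual becomes a new posting in the
# high list H(t). Impacts and the clip level are illustrative.

def clip_postings(postings, u_low):
    """postings: list of (doc_id, impact) pairs; u_low: maximum score
    contribution permitted in the low list. Returns (low, high)."""
    low, high = [], []
    for doc_id, impact in postings:
        if impact > u_low:
            low.append((doc_id, u_low))            # base part
            high.append((doc_id, impact - u_low))  # clipped residual
        else:
            low.append((doc_id, impact))
    return low, high

low, high = clip_postings([(3, 2), (7, 9), (12, 4), (19, 11)], u_low=4)
# Every document remains in the low list; only documents 7 and 19 gain a
# residual posting, and each base/residual pair sums to the original impact.
print(low)   # [(3, 2), (7, 4), (12, 4), (19, 4)]
print(high)  # [(7, 5), (19, 7)]
# Note: u_low plus the largest residual (4 + 7 = 11) recovers the original
# maximum impact of the list.
```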
      <p>[Figure 2: (a) list splitting assigns each posting of term t to either the low list ℒ(t) or the high list ℋ(t); (b) postings clipping retains a single base list ℒ(t), with the residuals of high-impact postings placed in ℋ(t).]</p>
      <sec id="sec-2-3">
        <p>This arrangement has the singular advantage that no additional upper-bounds management
(smart bounds) is needed, nor any equivalent run-time manipulation of score estimates. Such
adjustments are required in the list splitting approach of Kane and Tompa [15] to correct for the
constraint that no document can appear in both ℒ(t) and ℋ(t), and hence that Uℒ(t) + Uℋ(t) is an
overestimate (by an addend of Uℒ(t)) of t’s true upper bound Ut. With postings clipping, Uℋ(t)
is instead set to the maximum residual amount across all of t’s postings, and hence we have
Ut = Uℒ(t) + Uℋ(t). In turn, that means that when queries are being processed the lists ℒ(t)
and ℋ(t) can be treated as if they were derived from completely independent terms, with all
interactions between them handled by the underlying processing logic, be that MaxScore, WAND,
or BMW. That is, while there are more total postings to be stored and processed, the change
from list splitting with smart bounds to postings clipping substantially simplifies the query-time
processing logic.</p>
        <p>As an orthogonal enhancement, priming can be applied whenever the high-impact list of
some term of the query q contains k or more postings, |ℋ(t)| ≥ k. If that holds, then
θ0 = max{ Uℒ(t) | t ∈ q ∧ |ℋ(t)| ≥ k }
(1)
can be used as an initial value for the heap bound, without risking top-k integrity. In combination,
the result is that – as we demonstrate in Section 3 – quite dramatic reductions in query processing
times for learned sparse retrieval models can be achieved.</p>
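        <p>The intuition behind Equation (1) is that every document with a posting in ℋ(t) scores at least Uℒ(t), so if ℋ(t) holds k or more postings, the k-th best score is already known to be at least Uℒ(t). A minimal sketch (our own function name and example values, for illustration only):</p>

```python
# Sketch of the priming heuristic of Equation (1): if some query term's
# high list H(t) holds at least k postings, then at least k documents
# score at least u_low(t), so the heap bound can start there rather
# than at zero. Query composition, list sizes, and bounds are illustrative.

def prime_threshold(query_terms, k):
    """query_terms: list of (u_low, high_list_length) pairs, one per
    query term. Returns the initial heap bound theta_0."""
    candidates = [u_low for u_low, h_len in query_terms if h_len >= k]
    return max(candidates, default=0)

# Two of the three terms have |H(t)| of at least k = 10 postings; the
# larger of their low-list bounds primes the heap.
terms = [(4, 25), (6, 12), (9, 3)]
print(prime_threshold(terms, k=10))  # 6
```

A non-zero starting bound lets the bypass test of the dynamic pruning algorithm fire from the very first document processed, rather than only after k candidates have been scored.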
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Experiments</title>
      <p>We now describe experiments that quantify the benefits arising from the postings clipping
approach. Our experiments make use of both the MSMARCO-v1 (8.8 million passages) and
MSMARCO-v2 (138.4 million passages) collections, four representative ranking algorithms, and the PISA [17]
query processing system which was recently shown to outperform the commonly used Anserini
system for document-at-a-time retrieval over learned sparse indexes [18].</p>
      <p>Index Size. Since clipping is applied only to postings lists with more than 256 elements, and even
then adds only 1/64 as many new postings, the space overhead compared to the default index is
negligible. For instance, the largest overhead, 600 MiB added to the ≈ 33 GiB index for the uniCOIL
model on MSMARCO-v2, represents an increase of only 1.8%.</p>
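      <p>The quoted overhead can be checked directly, assuming binary units (1 GiB = 1024 MiB):</p>

```python
# Sanity check of the quoted space overhead: 600 MiB added to an
# approximately 33 GiB uniCOIL index on MSMARCO-v2.
index_gib = 33
overhead_mib = 600
overhead_pct = 100 * overhead_mib / (index_gib * 1024)
print(round(overhead_pct, 1))  # 1.8
```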
      <p>Query Speed. Table 1 presents query processing times recorded for the MSMARCO-v1
collection and DeepImpact retrieval, with response latency measured as average milliseconds per
query, and with the three blocks of values corresponding to dynamic query pruning approaches.</p>
      <p>The last row in each block shows the combination of postings clipping, with 1/64 of postings
taken into ℋ(t), in conjunction with priming (and length-based ordering for MaxScore). The
fastest query time in each of the six sections is highlighted in blue.</p>
      <p>As can be seen, for DeepImpact retrieval, the fastest querying is achieved by MaxScore pruning
with postings clipping. That combination takes less than half the time of standard MaxScore
processing. The gains from postings clipping are smaller for WAND and VBMW, in part because both
algorithms exhibit greater sensitivity to doubling the number of query terms.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Conclusion</title>
      <p>To keep up with increasingly large volumes of data, search practitioners require sophisticated
structures and processing algorithms, so that response times can remain plausible. In this paper,
we have demonstrated the speed benefits that arise through the use of a novel technique we
call postings clipping. We have established new benchmarks for querying speed, with minimal
cost overheads, for both shallow k = 10 and deep k = 1000 retrieval.</p>
      <p>[14] C. M. Daoud, E. S. de Moura, D. Fernandes, A. S. da Silva, C. Rossi, A. Carvalho, Waves: A
fast multi-tier top-k query processing algorithm, Information Retrieval 20 (2017) 292–316.
doi:10.1007/s10791-017-9298-6.</p>
      <p>[15] A. Kane, F. W. Tompa, Split-lists and initial thresholds for WAND-based search, in: Proc.
ACM Int. Conf. on Research and Development in Information Retrieval (SIGIR), 2018, pp.
877–880. doi:10.1145/3209978.3210066.</p>
      <p>[16] J. Mackenzie, M. Petri, A. Moffat, Anytime ranking on document-ordered indexes, ACM
Trans. on Information Systems 40 (2022) 13:1–13:32. doi:10.1145/3467890.</p>
      <p>[17] A. Mallia, M. Siedlaczek, J. Mackenzie, T. Suel, PISA: Performant indexes and search
for academia, in: Proc. OSIRRC at SIGIR 2019, 2019, pp. 50–56. URL: http://ceur-ws.org/Vol-2409/docker08.pdf.</p>
      <p>[18] J. Mackenzie, A. Trotman, J. Lin, Wacky weights in learned sparse representations and
the revenge of score-at-a-time query evaluation, arXiv:2110.11540 (2021). URL: https://arxiv.org/abs/2110.11540.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Mackenzie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mallia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moffat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Petri</surname>
          </string-name>
          ,
          <article-title>Accelerating learned sparse indexes via term impact decomposition</article-title>
          ,
          <source>in: Findings of the Association for Computational Linguistics: EMNLP</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>2830</fpage>
          -
          <lpage>2842</lpage>
          . URL: https://aclanthology.org/2022.findings-emnlp.205.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mallia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Khattab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Suel</surname>
          </string-name>
          ,
          <article-title>Learning passage impacts for inverted indexes</article-title>
          ,
          <source>in: Proc. ACM Int. Conf. on Research and Development in Information Retrieval (SIGIR)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1723</fpage>
          -
          <lpage>1727</lpage>
          . doi:10.1145/3404835.3463030.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Callan</surname>
          </string-name>
          ,
          <article-title>COIL: Revisit exact lexical match in information retrieval with contextualized inverted list</article-title>
          ,
          <source>in: Proc. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>3030</fpage>
          -
          <lpage>3042</lpage>
          . doi:10.18653/v1/2021.naacl-main.241.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <article-title>A few brief notes on DeepImpact, COIL, and a conceptual framework for information retrieval techniques</article-title>
          ,
          <source>arXiv:2106.14807</source>
          (
          <year>2021</year>
          ). doi:10.48550/arXiv.2106.14807.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V.</given-names>
            <surname>Karpukhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Oguz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Min</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Edunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          , W. Yih,
          <article-title>Dense passage retrieval for open-domain question answering</article-title>
          ,
          <source>in: Proc. Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>6769</fpage>
          -
          <lpage>6781</lpage>
          . doi:10.18653/v1/2020.emnlp-main.550.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ding</surname>
          </string-name>
          , J. Liu, K. Liu,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. X.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering</article-title>
          ,
          <source>in: Proc. Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>5835</fpage>
          -
          <lpage>5847</lpage>
          . doi:10.18653/v1/2021.naacl-main.466.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Risvik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chilimbi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Tan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kalyanaraman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Anderson</surname>
          </string-name>
          ,
          <article-title>Maguro, a system for indexing and searching over very large text collections</article-title>
          ,
          <source>in: Proc. Conf. on Web Search and Data Mining (WSDM)</source>
          ,
          <year>2013</year>
          , pp.
          <fpage>727</fpage>
          -
          <lpage>736</lpage>
          . doi:10.1145/2433396.2433486.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N.</given-names>
            <surname>Tonellotto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Macdonald</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Ounis</surname>
          </string-name>
          ,
          <article-title>Efficient query processing for scalable web search</article-title>
          ,
          <source>Foundations &amp; Trends in Information Retrieval</source>
          <volume>12</volume>
          (
          <year>2018</year>
          )
          <fpage>319</fpage>
          -
          <lpage>500</lpage>
          . doi:10.1561/1500000057.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>H. R.</given-names>
            <surname>Turtle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Flood</surname>
          </string-name>
          ,
          <article-title>Query evaluation: Strategies and optimizations</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>31</volume>
          (
          <year>1995</year>
          )
          <fpage>831</fpage>
          -
          <lpage>850</lpage>
          . doi:10.1016/0306-4573(95)00020-H.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A. Z.</given-names>
            <surname>Broder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Carmel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Herscovici</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zien</surname>
          </string-name>
          ,
          <article-title>Efficient query evaluation using a two-level retrieval process</article-title>
          ,
          <source>in: Proc. ACM Int. Conf. on Information and Knowledge Management (CIKM)</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>426</fpage>
          -
          <lpage>434</lpage>
          . doi:10.1145/956863.956944.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Strohman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. B.</given-names>
            <surname>Croft</surname>
          </string-name>
          ,
          <article-title>Efficient document retrieval in main memory</article-title>
          ,
          <source>in: Proc. ACM Int. Conf. on Research and Development in Information Retrieval (SIGIR)</source>
          ,
          <year>2007</year>
          , pp.
          <fpage>175</fpage>
          -
          <lpage>182</lpage>
          . doi:10.1145/1277741.1277774.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Suel</surname>
          </string-name>
          ,
          <article-title>Faster top-k document retrieval using block-max indexes</article-title>
          ,
          <source>in: Proc. ACM Int. Conf. on Research and Development in Information Retrieval (SIGIR)</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>993</fpage>
          -
          <lpage>1002</lpage>
          . doi:10.1145/2009916.2010048.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Daoud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. S.</given-names>
            <surname>de Moura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>da Silva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fernandes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rossi</surname>
          </string-name>
          ,
          <article-title>Fast top-k preserving query processing using two-tier indexes</article-title>
          ,
          <source>Information Processing &amp; Management</source>
          <volume>52</volume>
          (
          <year>2016</year>
          )
          <fpage>855</fpage>
          -
          <lpage>872</lpage>
          . doi:10.1016/j.ipm.2016.03.005.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>