<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Developing the Quantum Probability Ranking Principle</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Guido Zuccon</string-name>
          <email>guido@dcs.gla.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leif Azzopardi</string-name>
          <email>leif@dcs.gla.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computing Science, University of Glasgow</institution>
          ,
          <addr-line>Scotland</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computing Science, University of Glasgow</institution>
          ,
          <addr-line>Scotland</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2010</year>
      </pub-date>
      <fpage>27</fpage>
      <lpage>28</lpage>
      <abstract>
        <p>In this work, we summarise the development of a ranking principle based on quantum probability theory, called the Quantum Probability Ranking Principle (QPRP), and we also provide an overview of the initial experiments performed employing the QPRP. The main di erence between the QPRP and the classic Probability Ranking Principle, is that the QPRP implicitly captures the dependencies between documents by means of \quantum interference". Subsequently, the optimal ranking of documents is not based solely on documents' probability of relevance but also on the interference with the previously ranked documents. Our research shows that the application of quantum theory to problems within information retrieval can lead to consistently better retrieval e ectiveness, while still being simple, elegant and tractable.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        The idea of using quantum theory in information retrieval
(IR) was formally put forward by van Rijsbergen [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] in 20041.
In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], the main thesis of this seminal book is to use
quantum theory as a bridge between the three mainstream IR
approaches; i.e. vector space models, logic models and
probability models. While this direction has been largely
unexplored, recently there has been a spate of work which aims
to develop quantum inspired or quantum based information
retrieval models [
        <xref ref-type="bibr" rid="ref1 ref11 ref13 ref2 ref3 ref4 ref5 ref6 ref7 ref8">1, 2, 3, 4, 5, 6, 8, 7, 13, 11</xref>
        ].
      </p>
      <p>
        In this work, we report on the the development the
Quantum Probability Ranking Principle [
        <xref ref-type="bibr" rid="ref12 ref14">14, 12</xref>
        ]. The ranking
principle is derived by developing an analogy between the
famous double-slit experiment and document ranking. The
double slit experiment was conducted to demonstrate that
kolmogorovian probability fails to adequately describe the
outcome of physical phenomena, and this motivated the
development of quantum probability theory which
incorporates the quantum interference between events.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], it is hypothesized that this quantum interference
can be used to account for the interdependence between
documents and their associated judgements. In certain tasks,
1Prior to this, van Rijsbergen gave talks as early as 1996 on
the topic.
the relevance of a document may depend on the previous
documents already assessed, for example in the novelty and
diversity tracks. In sub-topic retrieval, the IR system has
to provide a document ranking which covers all the
possible facets (subtopics) relevant to the user's information
need as soon as possible in the ranking. Consequently,
following the traditional Probability Ranking Principle, where
document dependence is ignored, leads to sub-optimal
performance [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]2. In [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], we perform a series of experiments
that also indicate this is the case, and further show that the
QPRP leads to better empirical performance. This is
because within the QPRP the interdependence between
documents is naturally accounted for through the quantum
interference, and the QPRP suggests that documents ranked
until position n 1 interfere with the degree of relevance of
the document ranked at position n. Intuitively, documents
expressing diverse information have higher degree of
interference than documents that are similar. For the same reason,
documents containing novel information might strongly
interfere with documents ranked in previous positions. Even
contrary information might be captured by the interference
term: documents containing content contrary to the one
presented at the previous rank positions might trigger a revision
of user's beliefs about the topic.
      </p>
      <p>The remainder of this paper is as follows: the next
section brie y outlines the QPRP. Section 3 presents the main
results from the study recently performed on sub-topic
retrieval. Finally, we conclude in Section 4 by outlining the
directions for further work using the quantum based ranking
principle.
2.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], the Quantum Probability Ranking Principle is
proposed and it's derivation is based on an analogy with the
famous double slit experiment. The resultant of this work
was the following formulation: when ranking documents, the
IR system has to maximise the total satisfaction of the user
given the document ranking, achievable by maximising the
total probability of the ranking. Using the quantum law of
total probability, the resultant ranking strategy impose to
select at each rank position a document d such that
d = arg max P (di) +
(1)
      </p>
      <p>X
dx2RA</p>
      <p>Idx;di
where RA is the list of documents already ranked and Idx;di
is the interference between documents dx and di. Note that
2This has led to arguments for the development of a new
ranking principle.
the traditional PRP is equivalent to the QPRP when the
interference is null, i.e. Idx;di = 0; 8dx; di 2 C, the
documents corpus. In physics, the interference indicates the
amount and kind of interaction between waves. If two waves
strongly interact with each other, then the absolute value of
their interference is high, and vice versa low. The
interaction can generate two di erent outcomes: either increase the
e ect generated by the sum of the two waves (constructive
interference, I &gt; 0) or decrease it (destructive interference,
I &lt; 0). In IR, the interference Idx;di could be negative or
positive, and thus demote or promote a document in the
ranking depending on the context. For instance in sub-topic
retrieval it would be sensible if documents related to the
same subtopics negatively interfere, lowering the chances to
rank both of them at high positions. This scenario is
discussed in the next section.</p>
      <p>
        THE QPRP IN SUBTOPIC RETRIEVAL
In [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], the QPRP is empirically tested and validated on
the subtopic retrieval task. The ranking under the QPRP
was compared with the rankings of models which upholds
the PRP, and also against state-of-the-art strategies for subtopic
retrieval, i.e. MMR and Portfolio Theory (PT) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        The main point of this experimentation was to
determine whether the inherent document interdependence could
be accounted for by the interference component within the
QPRP. Intuitively, the interference component depends upon
both the inter-document dependencies and the document's
relevance probabilities. Since it is not possible to estimate
the interference component directly from the text statistics,
for the experiments reported in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], we have used the
Pearson's correlation between interfering documents to compute
the interference. We performed the empirical investigation
over the TREC subtopic retrieval track, which includes
documents from the Financial Times of London contained in
TREC 6,7 and 8 collections and 20 ad-hoc retrieval topics,
composed of subtopics, from the TREC interactive tracks.
We retrieved documents and generated the initial
probability distribution using Okapi BM25: this represented the
PRP ranking. Afterwards we re-ranked the documents
according to three di erent strategies: our QPRP method
and two state-of-the-art techniques for subtopic retrieval,
i.e. MMR and PT, which required parameters tuning. The
experiments were repeated varying the level of retrieval
cuto and the length of the queries.
      </p>
      <p>From the experimental results3, we found that (1) the
QPRP improves upon PRP baselines for all levels of
Sprecision and S-recall, (2) the QPRP outperforms MMR and
PT across most levels, (3) the QPRP consistently
outperforms other strategies across all topics when considering
SMRR@100%, meaning that on each topic the QPRP returns
complete coverage of all subtopics at a rank lower than all
the other strategies. And, unlike MMR and PT, no tuning
or training is required!</p>
    </sec>
    <sec id="sec-2">
      <title>CONCLUSIONS</title>
      <p>In this paper we have reported about the recent
introduction of a novel ranking strategy, the QPRP, based on
quantum probability and inspired by an analogy with the double
slits experiment in physics. The QPRP naturally encodes
3Experimental results are available online at http://www.
dcs.gla.ac.uk/~guido/qprpresults.html
the interdependence between documents through quantum
interference. The new ranking strategy has been
empirically investigated, showing that the QPRP consistently
outperforms both the PRP and state-of-the-art approaches, i.e.
MMR and PT, without requiring parameter tuning. This
suggests that the use of Quantum Theory to model
processes within information retrieval can lead to substantial
improvements in retrieval e ectiveness.</p>
      <p>Future work examining the utility and applications of the
Quantum Probability Ranking Principle will be directed
towards:
impact of the Pearson's correlation coe cient as a mean
to approximate interference;
alternative estimations of the interference;
how to derive a complex amplitude distribution from
the document corpus;
the relationships between interference in the quantum
probability framework and conditional probabilities in
Kolmogorovian probability theory; and,
how to apply the QPRP paradigm to other retrieval
tasks, e.g. ad-hoc retrieval.</p>
      <p>Acknowledgements: We would like to thank Keith van
Rijsbergen for his collaboration, support and mentoring, and Massimo
Melucci for his comments and suggestions. This work has been
funded by the EPSRC Renaissance project (EP/F014384/1) and
the Royal Society International Joint Project JP080734.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Arafat</surname>
          </string-name>
          .
          <article-title>Foundations research in information retrieval inspired by quantum theory</article-title>
          .
          <source>PhD thesis</source>
          , University of Glasgow,
          <year>December 2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Arafat</surname>
          </string-name>
          and
          <string-name>
            <surname>C. J. van Rijsbergen.</surname>
          </string-name>
          <article-title>Quantum theory and the nature of search</article-title>
          .
          <source>In QI '07</source>
          , pages
          <fpage>114</fpage>
          {
          <fpage>121</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Flender</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kitto</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Bruza</surname>
          </string-name>
          .
          <article-title>Beyond ontology in information systems</article-title>
          .
          <source>In QI 2009</source>
          , pages
          <fpage>276</fpage>
          {
          <fpage>288</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hou</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Song</surname>
          </string-name>
          .
          <article-title>Characterizing pure high-order entanglements in lexical semantic spaces via information geometry</article-title>
          .
          <source>In QI 2009</source>
          , pages
          <fpage>237</fpage>
          {
          <fpage>250</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A. F.</given-names>
            <surname>Huertas-Rosero</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          , and
          <string-name>
            <surname>C. J. van Rijsbergen.</surname>
          </string-name>
          <article-title>Eraser lattices and semantic contents</article-title>
          .
          <source>In QI 2009</source>
          , pages
          <fpage>266</fpage>
          {
          <fpage>275</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Melucci</surname>
          </string-name>
          .
          <article-title>A basis for information retrieval in context</article-title>
          .
          <source>ACM TOIS</source>
          ,
          <volume>26</volume>
          (
          <issue>3</issue>
          ):1{
          <fpage>41</fpage>
          ,
          <year>June 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Piwowarski</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Lalmas</surname>
          </string-name>
          .
          <article-title>A quantum-based model for interactive information retrieval</article-title>
          .
          <source>In ICTIR '09</source>
          , pages
          <fpage>224</fpage>
          {
          <fpage>231</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>B.</given-names>
            <surname>Piwowarski</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Lalmas</surname>
          </string-name>
          .
          <article-title>Structured information retrieval and quantum theory</article-title>
          .
          <source>In QI '09</source>
          , pages
          <fpage>289</fpage>
          {
          <fpage>298</fpage>
          ,
          <string-name>
            <surname>March</surname>
          </string-name>
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>C. J. van Rijsbergen</surname>
          </string-name>
          .
          <source>The Geometry of Information Retrieval</source>
          . Cambridge University Press,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <article-title>Portfolio theory of information retrieval</article-title>
          .
          <source>In SIGIR '09</source>
          , pages
          <fpage>115</fpage>
          {
          <fpage>122</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Zuccon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          , and
          <string-name>
            <surname>C. J. van Rijsbergen.</surname>
          </string-name>
          <article-title>Revisiting logical imaging for information retrieval</article-title>
          .
          <source>In SIGIR '09</source>
          , pages
          <fpage>766</fpage>
          {
          <fpage>767</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Zuccon</surname>
          </string-name>
          and
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          .
          <article-title>Using the Quantum Probability Ranking Principle to Rank Interdependent Documents</article-title>
          .
          <source>In ECIR</source>
          <year>2010</year>
          ,
          <year>2010</year>
          . to appear.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Zuccon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          , and
          <string-name>
            <surname>C. J. van Rijsbergen.</surname>
          </string-name>
          <article-title>A formalization of logical imaging for information retrieval using quantum theory</article-title>
          .
          <source>In DEXA '08</source>
          , pages
          <issue>3{8</issue>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>G.</given-names>
            <surname>Zuccon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. A.</given-names>
            <surname>Azzopardi</surname>
          </string-name>
          , and
          <string-name>
            <surname>K. van Rijsbergen.</surname>
          </string-name>
          <article-title>The quantum probability ranking principle for information retrieval</article-title>
          .
          <source>In ICTIR '09</source>
          , pages
          <fpage>232</fpage>
          {
          <fpage>240</fpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>