<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Representing, Tracking and Revising the User's Knowledge: A Search Result Filter Framework</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dima El-Zein</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Célia da-Costa-Pereira</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>This paper presents a framework for a cognitive agent in information retrieval that personalizes the list of returned documents based on what it believes about the user's knowledge. Throughout the interactions between the IR system and the user, the agent builds its beliefs about the user's knowledge by extracting keywords from the content of the documents read by the user. The agent's belief base, which corresponds to the user model, contains also “contextual rules” that allow deriving new beliefs about the user's knowledge. The agent is therefore able to compare its own beliefs with the content conveyed by a to-be-proposed document, and thus understand if the document really contains useful information for the user or not. Finally, in case of beliefs' inconsistency, the agent revises its belief base to restore consistency.</p>
      </abstract>
      <kwd-group>
        <kwd>Search Filter</kwd>
        <kwd>Information Retrieval</kwd>
        <kwd>Cognitive Agent</kwd>
        <kwd>Knowledge Extraction</kwd>
        <kwd>Belief Revision</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In the domain of information retrieval, it is not always sufficient to return information
responding only to the query. It is believed that users can be considered as cognitive agents
having their own beliefs and knowledge about the world [1]. They try to fulfill their information
needs by submitting queries and acquire new information by examining the results. Consequently,
the search results must also respond to the user’s beliefs, knowledge, and search
goals. Considering the user’s cognitive components in the domain of Information Retrieval was
identified as one of the “major challenges” by the IR community in 2018 [2].</p>
      <p>In this paper, we propose an Information Retrieval filter framework that uses the content of
the documents read by the user to learn about his/her knowledge. This cognitive awareness is
employed to personalize the returned documents with respect to what the user already knows.
To our knowledge, no existing research treats the content of the documents read by the user as
his/her acquired knowledge.</p>
      <p>The framework we have proposed in [3] works as follows. For every submitted query: (i) the
system sends the user’s query to the search engine and receives a list of documents relevant
to the query; (ii) the agent examines the content of the documents in the list and measures the
similarity between each document and its set of beliefs; (iii) the agent returns a filtered list
according to the similarity results; (iv) the user reads a proposed document; (v) the agent adds the
keywords representing the read document as new beliefs; (vi) a reasoning cycle is performed to
derive new beliefs and revise the belief base if needed.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Proposed Framework</title>
      <p>The agent is modeled as a rule-based agent that consists of beliefs (ground literals) and rules
(Horn clauses). The beliefs represent what the agent believes about its user’s knowledge. When
the agent has φ in its belief base, it believes that the user knows that φ is true. If the belief
base contains ¬φ, then the agent believes the user knows that φ is not true. Rules, on the other
hand, are relationships between beliefs that are used to derive new beliefs from the
agent’s existing ones. Rules have the form b₁ &amp; b₂ &amp; . . . &amp; bₙ → b, where b₁, b₂, . . . , bₙ
(n ≥ 1) and b are literals; b is called the derived belief, and each belief bᵢ is a premise of the
rule. The &amp; symbol represents the logical “and” operator. During an agent’s reasoning cycle, if
all the premises of a rule are satisfied (each premise exists in the belief base), the rule is fired
and b is added to the belief base. The rules are considered static; their extraction/origin is
not discussed in this paper.</p>
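      <p>The reasoning cycle described above amounts to simple forward chaining. The following is a minimal sketch (the function and variable names are ours, not part of the framework):</p>
      <preformat>
```python
# Minimal forward-chaining sketch: beliefs are ground literals (strings),
# rules are (premises, conclusion) pairs in Horn-clause form.
def reasoning_cycle(beliefs, rules):
    """Fire every rule whose premises are all present in the belief base,
    adding the derived belief, until a fixed point is reached."""
    beliefs = set(beliefs)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in beliefs and all(p in beliefs for p in premises):
                beliefs.add(conclusion)
                changed = True
    return beliefs

# A rule is fired only when all of its premises are in the belief base.
base = reasoning_cycle({"stars", "planets"}, [(["stars", "planets"], "galaxies")])
# base now also contains the derived belief "galaxies"
```
</preformat>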
      <p>The agent acquires its beliefs about the user’s knowledge from the documents the user has
read. When the user reads a document d, the agent extracts the content of the document and
considers it as knowledge acquired by the user. We propose applying RAKE – the Rapid
Automatic Keyword Extraction algorithm [4] – as an easy and understandable method, to extract
the set of scored keywords representing the document. Those keywords are associated
with the agent’s extracted beliefs. The belief base then comprises both the extracted and
derived beliefs, and the rules.</p>
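      <p>As a rough illustration of RAKE-style scoring (a simplified sketch of the algorithm described in [4], with a deliberately tiny stopword list; an actual RAKE implementation should be used in practice):</p>
      <preformat>
```python
import re

# Toy stopword list; RAKE uses a much larger one.
STOPWORDS = {"the", "of", "a", "an", "and", "is", "in", "to", "by", "for"}

def rake_like_keywords(text):
    """Split the text into candidate phrases at stopwords, then score each
    phrase as the sum of its word scores (word degree / word frequency),
    following the RAKE scheme."""
    words = re.findall(r"[a-z]+", text.lower())
    phrases, current = [], []
    for w in words:
        if w in STOPWORDS:
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(w)
    if current:
        phrases.append(current)
    freq, degree = {}, {}
    for phrase in phrases:
        for w in phrase:
            freq[w] = freq.get(w, 0) + 1
            degree[w] = degree.get(w, 0) + len(phrase)  # co-occurrence degree
    word_score = {w: degree[w] / freq[w] for w in freq}
    return {" ".join(p): sum(word_score[w] for w in p) for p in phrases}

kw = rake_like_keywords("belief revision for adaptive information retrieval")
# yields the scored phrases "belief revision" and "adaptive information retrieval"
```
</preformat>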
      <p>We consider that a belief is gradual: an agent might hold some beliefs more entrenched (or
accepted) than others. We define a “degree” for beliefs to measure this entrenchment:
Definition 1. The degree of a belief φ is the degree to which the agent believes the user is
knowledgeable about φ. It is represented by a decimal ranging between 0 and 1, where 0 is the
lowest degree – the agent believes the user has absolutely no knowledge about φ – and 1 is the
highest degree – the agent believes the user has the maximum knowledge about φ.</p>
      <p>
        Let us define a document d = {(k₁; s₁), . . . , (kₙ; sₙ)} as a set of tuples where kᵢ is a keyword
extracted by RAKE and sᵢ is its related score; kᵢ will be associated with an extracted belief φᵢ
whose degree is calculated as follows:
deg(φᵢ) = α · sᵢ / max{s₁, . . . , sₙ}. (1)
      </p>
      <p>In Equation 1, the RAKE score of an extracted keyword is normalized and then multiplied by an
adjustment factor α ∈ [0, 1] that weakens the magnitude of the degrees. The adjustment factor
may vary based on different characteristics, such as the trust in the document’s source.</p>
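      <p>Equation 1 can be computed directly; in this minimal sketch, the value α = 0.8 is an arbitrary choice of adjustment factor:</p>
      <preformat>
```python
def extracted_degree(score, all_scores, alpha=0.8):
    """Degree of an extracted belief (Equation 1): the RAKE score is
    normalized by the maximum score in the document, then scaled by the
    adjustment factor alpha in [0, 1]."""
    return alpha * score / max(all_scores)

doc = {"galaxies": 8.0, "stars": 4.0}  # keyword: RAKE score
deg = {k: extracted_degree(s, doc.values()) for k, s in doc.items()}
# deg["galaxies"] == 0.8, deg["stars"] == 0.4
```
</preformat>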
      <p>This equation allows the calculation of the degree for extracted beliefs only. The degrees of
derived beliefs depend on the degrees of the premises that derived them. For that reason,
we track the dependency between beliefs by following the approach proposed by Alechina
et al. [5], in which the dependency between beliefs is tracked as follows. For every fired rule
instance, a Justification J records: (i) the derived belief and (ii) a support list, s, which
contains the premises of the rule. The dependency information of a belief has the form of two
lists: dependencies and justifications. A dependencies list records the justifications of a belief,
and a justifications list contains all the Justifications where the belief is a member of a support.</p>
      <p>The degree value of a derived belief φ, deg(φ), is equal to that of its highest-quality
justification.</p>
      <p>Definition 2.</p>
      <p>
        deg(φ) = max{Q(J₀), . . . , Q(Jₙ)} (2)
Definition 3. The quality of a justification J, Q(J), is equal to the degree of the least entrenched
belief in its support list.
      </p>
      <p>
        Q(J) = min{deg(φ) : φ ∈ support of J} (3)
      </p>
      <p>For example, suppose an agent has two beliefs, stars and planets, with degrees equal to
0.5 and 0.7 respectively. The belief base also has a rule stars &amp; planets → galaxies. It
means that if the agent “believes” in stars and planets, it will believe in galaxies. When the
rule is fired, a Justification J₁ denoted as (galaxies, [stars, planets]) will be added;
galaxies is the derived belief and [stars, planets] is the support list. The quality of J₁ is equal to
min{deg(stars), deg(planets)} = 0.5. J₁ is in the dependencies list of galaxies and
in the justifications lists of both stars and planets.</p>
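      <p>Definitions 2 and 3, applied to this example, can be sketched as follows (the function names are ours, and a justification is represented simply by its support list):</p>
      <preformat>
```python
def justification_quality(support, deg):
    """Definition 3: the quality of a justification is the degree of the
    least entrenched belief in its support list."""
    return min(deg[b] for b in support)

def derived_degree(supports, deg):
    """Definition 2: the degree of a derived belief is the quality of its
    best justification."""
    return max(justification_quality(s, deg) for s in supports)

deg = {"stars": 0.5, "planets": 0.7}
# one justification for "galaxies", with support list [stars, planets]
q = derived_degree([["stars", "planets"]], deg)  # min(0.5, 0.7) = 0.5
```
</preformat>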
      <p>While the agent acquires more information about the user, it adds more beliefs to its
belief base. These beliefs might be new, already existing, or contradict existing ones;
this calls for revising beliefs to ensure the belief base remains consistent.</p>
      <p>Belief revision is the process of modifying the belief base to maintain its consistency whenever
new information becomes available. We follow the AGM belief revision theory [6], which defines
the postulates a rational agent should satisfy when performing belief revision. We consider a belief
base K and a new piece of information φ. K is inconsistent when both φ and ¬φ are in
Cn(K), or Cn(K) = ⊥, or both φ and ¬φ are logical consequences of K. Three operators are
considered: Expansion K + φ: adds a new belief φ that does not contradict the existing
beliefs. Contraction K ÷ φ: removes a belief φ and all other beliefs that logically imply/entail
it. Revision K * φ: adds a belief φ as long as it does not cause a contradiction in K.</p>
      <p>In our framework, if the addition of a belief φ would cause inconsistencies in K (because of the
existence of ¬φ), priority/preference is given to the belief with the higher degree. In case
φ has the higher degree, the revision operation starts with minimal changes in K to make it
consistent with φ: it contracts ¬φ, then adds φ. If ¬φ was a derived belief, we do not contract
the other beliefs that derived ¬φ, as long as they are consistent with the remaining beliefs (minimal
change) – the coherence approach [7]. In other words, we only contract the belief in question with
its related justification(s), without contracting either the rule’s premises or the rule itself. In
case ¬φ has the higher degree, the addition of φ is discarded.</p>
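      <p>A minimal sketch of this degree-based revision, assuming the belief base is stored as a literal-to-degree map and negation is written with a “~” prefix (this encoding is our own illustration, not the paper’s):</p>
      <preformat>
```python
def revise(base, belief, degree):
    """Add `belief` with `degree` to `base` (a dict literal -> degree),
    giving priority to the higher-degree belief when the negation is
    already present; on a tie the existing belief is kept (an assumption)."""
    neg = belief[1:] if belief.startswith("~") else "~" + belief
    if neg in base:
        if degree > base[neg]:
            del base[neg]          # contract the weaker, contradicting belief
            base[belief] = degree  # then add the new one
        # otherwise the addition of `belief` is discarded
    else:
        base[belief] = max(degree, base.get(belief, 0.0))
    return base

revised = revise({"~galaxies": 0.3}, "galaxies", 0.6)
# 0.6 beats 0.3: "~galaxies" is contracted and "galaxies" is added
```
</preformat>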
      <p>The filtering process is based on the similarity Sim(K, d) between the agent’s set of beliefs
K = {(φ₁; deg(φ₁)), . . . , (φₙ; deg(φₙ))} and the content of a document d =
{(k₁; s₁), . . . , (kₘ; sₘ)} to be proposed to the user. We propose a similarity measure that
considers the degrees of the intersected beliefs and the knowledge in the document. The
formula is inspired by the similarity function proposed by Lau et al. in [8]. Let us consider C,
the set of keywords appearing both in K and in d, defined by C = {k ∈ d : deg(k, K) &gt;
0 ∨ deg(¬k, K) &gt; 0}.</p>
      <p>Sim(K, d) = . . . , and 0 otherwise. (4)</p>
      <p>Here, deg(k, K) = deg(φ) if φ ∈ K is the belief associated with k, and 0 otherwise. The similarity
formula “rewards” the documents containing keywords in common with the set K and penalizes
those containing keywords whose corresponding negated beliefs are in K.</p>
      <p>We set a cutoff value for Sim(K, d) that allows the agent to decide whether the knowledge inside
a document is similar to the set of beliefs or not. The filter is used according to the intended
application: when the purpose of the framework is reinforcing the user’s knowledge, the
documents that are “close” to the agent’s beliefs will be returned, i.e., those having a similarity
score greater than the cutoff. Contrarily, when the framework is
employed for novelty, the documents having similarity below the cutoff will be returned.</p>
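      <p>The filtering step can be sketched as follows; the similarity used here is a simple reward/penalty sum standing in for Equation 4 (our own simplification, not the exact formula from [8]), again with “~” marking negated beliefs:</p>
      <preformat>
```python
def similarity(beliefs, doc_keywords):
    """Reward keywords whose beliefs are in the base, penalize keywords
    whose negated beliefs ("~k") are in the base (stand-in for Eq. 4)."""
    return sum(beliefs.get(k, 0.0) - beliefs.get("~" + k, 0.0)
               for k in doc_keywords)

def filter_results(beliefs, docs, cutoff, novelty=False):
    """Keep documents scoring above the cutoff (reinforcement), or at or
    below it when the framework is employed for novelty."""
    keep = []
    for name, keywords in docs.items():
        above = similarity(beliefs, keywords) > cutoff
        if above != novelty:  # the novelty setting inverts the test
            keep.append(name)
    return keep

beliefs = {"galaxies": 0.8, "~comets": 0.5}
docs = {"d1": ["galaxies"], "d2": ["comets"]}
# reinforcement keeps d1; novelty keeps d2
```
</preformat>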
    </sec>
    <sec id="sec-3">
      <title>3. Conclusion</title>
      <p>
        This paper proposed an innovative framework for a rule-based information retrieval agent
that relies on its cognitive abilities to learn about the user’s knowledge. This information is
used to propose new/relevant documents accordingly. The components of the framework are: (1)
a rule-based module, modeling the agent’s beliefs and rules; it performs inference reasoning about
the user’s knowledge, calculates the entrenchment degrees, tracks the dependencies between
beliefs, and revises the beliefs if needed to maintain consistency; (2) a knowledge extractor
module, extracting knowledge from the documents read by the user; and (3) a result filtering module,
which compares the content of the potential to-be-proposed documents to the user’s knowledge and
selects the “useful” ones to be returned to the user.
      </p>
      <p>For future work, we aim to take into account the confidence in the sources of the documents,
which will likely affect the degree of entrenchment of a belief. Another possible extension is
to integrate semantic analysis to deal with semantically similar content.</p>
      <p>[4] S. Rose, D. Engel, N. Cramer, W. Cowley, Automatic keyword extraction from individual
documents, Text Mining: Applications and Theory 1 (2010) 1–20.
[5] N. Alechina, M. Jago, B. Logan, Preference-based belief revision for rule-based agents,
Synthese 165 (2008) 159–177.
[6] C. E. Alchourrón, P. Gärdenfors, D. Makinson, On the logic of theory change: Partial meet
contraction and revision functions, The Journal of Symbolic Logic 50 (1985) 510–530.
[7] P. Gärdenfors, Belief revision: An introduction, Cambridge Tracts in Theoretical Computer
Science, Cambridge University Press, 1992, pp. 1–28.
[8] R. Y. Lau, P. D. Bruza, D. Song, Belief revision for adaptive information retrieval, in:
Proceedings of the 27th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval, 2004, pp. 130–137.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. da Costa</given-names>
            <surname>Móra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. G. P.</given-names>
            <surname>Lopes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Vicari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Coelho</surname>
          </string-name>
          ,
          <article-title>BDI models and systems: Bridging the gap</article-title>
          ,
          <source>in: ATAL</source>
          ,
          <year>1998</year>
          , pp.
          <fpage>11</fpage>
          -
          <lpage>27</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Culpepper</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Diaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Smucker</surname>
          </string-name>
          ,
          <article-title>Research frontiers in information retrieval: Report from the third strategic workshop on information retrieval in Lorne (SWIRL 2018)</article-title>
          ,
          <source>SIGIR Forum 52</source>
          (
          <year>2018</year>
          )
          <fpage>34</fpage>
          -
          <lpage>90</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>El Zein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>da Costa Pereira</surname>
          </string-name>
          ,
          <article-title>A cognitive agent framework in information retrieval: Using user beliefs to customize results</article-title>
          ,
          <source>in: The 23rd International Conference on Principles and Practice of Multi-Agent Systems</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>