<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>RULK: A Framework for Representing User Knowledge in Search-as-Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Arthur Câmara</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dima El-Zein</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Célia da-Costa-Pereira</string-name>
          <email>Celia.DA-COSTA-PEREIRA@univ-cotedazur.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <!-- Keywords: Search-As-Learning, Interactive IR, Retrieval system, User Knowledge -->
        <aff id="aff0">
          <label>0</label>
          <institution>Delft University of Technology</institution>
          ,
          <addr-line>Delft</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Université Côte d'Azur</institution>
          ,
          <addr-line>Laboratoire I3S, CNRS, UMR 7271</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
        <p>When learning something new, users generally resort to search systems. By issuing queries and examining search results, users acquire their knowledge from the content of result pages - or documents - until they satisfy their information needs on a given subject. The search system, in return, should be able to support the user during this session by retrieving documents that are, at the same time, relevant to the user query and suitable according to what the user already knows about a topic. Especially for the latter, having a method that can accurately estimate user knowledge is crucial. To tackle that, we propose RULK, a framework for representing the user's knowledge throughout their search sessions. The intuition behind RULK is simple. By keeping an internal representation of the user's knowledge that gets updated as the user progresses on their search session, the framework estimates how much the user knows (or still does not know) about a given topic. We implement two variations of RULK, one based on keywords and one using large language models, and show that their estimations of user knowledge are correlated with actual user knowledge, as measured on a real learning search system. Therefore, RULK clears the path to future learning-focused search systems to provide an even better experience for users.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction and Related Work</title>
      <p>
        In recent years, it has become increasingly common for users to use search systems, such as
Web search engines, as learning tools. More often than not, a user with a learning goal (e.g.,
learning about ethics) resorts to such systems to find documents that will help them satisfy
their learning goals [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. By submitting queries, evaluating returned rankings, and reading
documents, users interactively explore documents retrieved by the system, acquiring knowledge
until they become satisfied with their knowledge of that subject. This interactive process of
search and knowledge acquisition is also known as Search-As-Learning (SAL) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        Many recent information retrieval systems consider these learning-oriented goals when
retrieving documents [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6, 7</xref>
        ]. One of the main influences on these works comes from the
concept of the “Anomalous State of Knowledge” introduced by Belkin et al. [8]. According to
this concept, people have an internal knowledge model of the world that is constantly evolving
and changing as they acquire more knowledge. If they perceive an anomaly in that model (e.g.,
missing or contradictory information) during this process, they are compelled to seek
information, perhaps using a search system, to resolve it.
      </p>
      <p>However, most current search systems are optimised to retrieve documents relevant to a single
query rather than considering a more comprehensive view of the user session and knowledge [9].
Nevertheless, a user with a learning goal will likely need multiple rounds of interactions with
the system, leading to considerably longer sessions. Moreover, as their session progresses, the
users’ cognitive state changes, impacting their behaviour [10, 11, 12].</p>
      <p>Therefore, representing the users’ cognitive (or knowledge) state and updating it during a
search session is gaining attention from researchers. By estimating a user’s knowledge gain,
a search system can help a user reach their learning goals faster by, for instance, retrieving
documents relevant not only to a single query but also to their current knowledge on the
topic [13, 14].</p>
      <p>To help future systems address that need, we propose RULK, a novel framework for
Representing User Learning and Knowledge. We combine ideas already present in some SAL
systems to represent and update a user’s state [15, 16, 17, 18] and add a novel piece to these
systems: an Estimator, capable of, given the user’s state, predicting their current knowledge
on the topic.</p>
      <p>Usually, existing systems that implement some knowledge representation have (at least)
two components: a Feature Extractor, which transforms clicked documents into features, and an
Updater, which updates an internal representation of the users’ knowledge state. These components
are typically implemented by either extracting weighted keywords from documents [15, 16, 7]
or embedding the documents into some semantic space [13, 18].</p>
      <p>In RULK, we add a third component, the Estimator, which estimates the users’ knowledge gains
in their session (i.e., how much they learned). It does so by comparing the knowledge state
maintained by the Updater to a “target” knowledge (e.g., a list of keyphrases relevant to a topic).
While predicting users’ knowledge is not new, our framework proposes not only to predict
but also to track users’ knowledge state throughout their sessions. While not considered here,
we encourage future instantiations to expand on RULK by including previous works on user
knowledge prediction, like Yu et al. [19] and Liu et al. [20], which used users’ behaviour features,
and Yu et al. [21], which also included web features.</p>
      <p>To demonstrate how one could implement RULK in SAL systems, we implement two variations
of it: RULK_KW, using keyword representations inspired by El Zein and da Costa Pereira [15, 16],
and RULK_LM, using embeddings generated by a large language model, inspired by Câmara et al.
[18]. Using logs from real-world users’ interactions with a learning-focused search system and
associated user learning scores, we study how each implementation behaves, especially when
estimating user knowledge. Therefore, we aim to answer the following research questions:
RQ1 Are the estimations produced by RULK correlated with actual users’ knowledge?
RQ2 How do RULK_KW and RULK_LM behave as users’ sessions grow longer?</p>
      <p>[Figure 1: The RULK Framework. A clicked document d is encoded into d⃗ by the Feature
Extractor F. Next, the Updater U updates the current state K⃗ with d⃗. Finally, the Estimator E
compares K⃗ to a target knowledge vector T⃗ to get an estimation of the user’s knowledge gain
in the session (g̃).]</p>
      <p>By answering these questions, we show that RULK can estimate a user’s knowledge gain, with
RULK_KW demonstrating a higher (6%) correlation with assessed learning gains. We also show that
RULK_KW can better handle longer sessions, making it a more robust and stable option.</p>
    </sec>
    <sec id="sec-3">
      <title>2. The RULK Framework</title>
      <sec id="sec-3-1">
        <title>2.1. RULK Components</title>
        <p>Our framework is composed of three main components: the Feature Extractor (F), the Updater
(U) and the Estimator (E). These components interact with each other in multiple situations.
For instance, when estimating the user’s current knowledge, E uses the user’s state provided by
U, which, in turn, uses the documents’ features provided by F to keep the user’s state.</p>
        <p>Another way they interact is when estimating the user’s knowledge gain. E requires reference
knowledge to estimate how much the user’s knowledge of a topic t has evolved throughout
their session. This reference should represent the knowledge state at which we would consider a
learning task as “achieved”. We refer to this reference knowledge target as the “target knowledge
state” (T⃗), which is also generated by F from a reference document that ideally matches the
knowledge level the user wants to reach.</p>
        <p>In this section, we set out each of the components that make up the RULK framework and
leave to Section 2.2 the details of two possible instantiations of RULK, RULK_KW and RULK_LM, and how
they implement each component. We also show an overview of RULK in Figure 1.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Feature Extractor (F)</title>
        <p>During users’ interactions with search systems, the content of the pages they read is the
main contributor to their knowledge and learning gains. These pages can be any content the
user interacts with during their search session (e.g., Web pages, textbooks, videos, or courses).
Without loss of generality, here we focus on the textual content of the pages read by the user,
referring to it as a document d. Then, given d, a Feature Extractor F encodes it into d⃗, a vector
with fixed size n:</p>
        <p>d⃗ = F(d). (1)</p>
        <p>By fixing the size n of the vectors and encoding all documents in the same embedding space
(e.g., BERT or TF-IDF), F allows RULK to compare and combine documents easily.</p>
        <p>As mentioned before, F also plays a role when modelling the “target knowledge state” (T⃗).
However, selecting what the target knowledge is is not trivial. Therefore, for simplicity of
discussion, we assume that our SAL system has access to a “reference document” that covers
essential subtopics and themes for t (e.g., a textbook on t), used as the target knowledge. We
can then generate T⃗ by encoding this document according to Equation 1.</p>
        <p>Updater (U) RULK tracks the user’s knowledge through an internal state represented by a
vector K⃗ having the same length n as the d⃗ embeddings produced by F. Following El Zein and
da Costa Pereira [15], U updates K⃗ after the user reads a document, assuming that they were
able to absorb the content of that document:</p>
        <p>K⃗′ = U(K⃗, d⃗), (2)</p>
        <p>where K⃗′ is the updated K⃗ vector (i.e., the updated state of the user’s knowledge) after reading a
document d. The document is represented by d⃗, generated by Equation 1; and U is a function
that takes K⃗ and d⃗ and combines them into an updated representation of the user’s knowledge.</p>
        <p>Estimator (E) RULK then estimates the user’s knowledge gain on the topic during the session
by comparing the user’s current knowledge state, K⃗, to the target T⃗:</p>
        <p>g̃ = E(K⃗, T⃗), (3)</p>
        <p>where g̃ is an estimation of the user’s knowledge gain in the session and E is implemented as a
similarity function (e.g., cosine similarity). The intuition behind E is that the user, by progressing
in their session, “moves” their knowledge state (K⃗) towards the target (T⃗). As both vectors are
in the same embedding space, we interpret the similarity between K⃗ and T⃗ as an estimate of
how close the user is to acquiring the knowledge contained in T⃗.¹</p>
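        <p>To make the interplay of the three components concrete, the following sketch runs the full loop on toy data. It is only an illustration: the keyword-count extractor, the element-wise-sum updater, and the three-word vocabulary are hypothetical stand-ins, not the instantiations described in Section 2.2.</p>

```python
# Toy end-to-end run of the RULK loop. The keyword-count extractor,
# sum updater, and three-word vocabulary are hypothetical stand-ins
# for the real instantiations described in Section 2.2.
import math

def feature_extractor(doc, vocab):
    """F: encode a document as keyword-occurrence counts (Eq. 1)."""
    tokens = doc.lower().split()
    return [float(tokens.count(w)) for w in vocab]

def updater(state, d_vec):
    """U: element-wise sum, assuming the user absorbs the whole document (Eq. 2)."""
    return [k + d for k, d in zip(state, d_vec)]

def estimator(state, target):
    """E: cosine similarity between knowledge state and target (Eqs. 3-4)."""
    dot = sum(k * t for k, t in zip(state, target))
    norm = math.sqrt(sum(k * k for k in state)) * math.sqrt(sum(t * t for t in target))
    return dot / norm if norm else 0.0

vocab = ["ethics", "morality", "virtue"]                  # hypothetical topic vocabulary
target = feature_extractor("ethics morality virtue ethics", vocab)
state = [0.0] * len(vocab)                                # no prior knowledge assumed
for doc in ["ethics is the study of morality", "virtue ethics"]:
    state = updater(state, feature_extractor(doc, vocab))
gain = estimator(state, target)                           # estimated knowledge gain
```

<p>Swapping in a different F (e.g., a sentence encoder) changes only the embedding space; the update and estimation steps stay the same.</p>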
      </sec>
      <sec id="sec-3-3">
        <title>2.2. Implementing RULK</title>
        <p>We show two possible implementations for deploying RULK in a search system. The first,
called RULK_KW, relies on keyword-based feature extraction, while the second, RULK_LM, uses Large
Language Models, like BERT. The main difference between the two implementations is the
semantic space they use when representing documents and the user’s knowledge. Importantly,
both share the same Estimator (E), implemented as the cosine similarity between K⃗ and T⃗:</p>
        <p>g̃ ≈ (K⃗ ⋅ T⃗) / (|K⃗| |T⃗|). (4)</p>
        <p>¹Note that T⃗ is a reference for the knowledge a user can acquire based on the target document chosen. While some
users may not be interested in reaching all of the knowledge from the target, others may be interested in going
beyond the knowledge contained in the target document.</p>
        <p>RULK_KW:</p>
        <p>We adopt the user learning model named vocabulary learning [22], which takes
place at the lower levels of Bloom’s taxonomy [23]. This model considers that a user achieves
their learning need for a topic t when they learn a set of related vocabulary keywords.</p>
        <p>For a topic t, we define the target keyword set W = {w1, …, wn}. The keywords are extracted
from a reference document. Once the keywords to be learned are defined, the number of
occurrences of every keyword the user has to read must also be decided. To define
the target knowledge state, F embeds the reference document as an occurrence vector. The
target knowledge state is then represented as T⃗ = {c1, …, cn}, where ci is the number of
occurrences of the keyword wi that the user must read before the framework considers the user to
have learned it.</p>
        <p>Similarly, a document d is embedded by F by counting the occurrences of the keywords wi in
it, resulting in d⃗ = {o1, …, on}, where oi is the number of occurrences of wi in d.</p>
        <p>
          As for the Updater U, in line with [
          <xref ref-type="bibr" rid="ref6">6, 16</xref>
          ], we consider that the user’s knowledge increases
monotonically. That is, as they read documents d represented by d⃗, U adds the counts of the
keywords in d⃗ to the current state K⃗.
        </p>
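        <p>A minimal sketch of this occurrence-count bookkeeping, with a hypothetical keyword list and target counts:</p>

```python
# Sketch of the RULK_KW state update: occurrence counts of the target
# keywords, accumulated monotonically as documents are read. The
# keyword list and target counts below are illustrative, not from the
# actual study topics.
def embed_keywords(text, keywords):
    """F for RULK_KW: count each target keyword's occurrences in the text."""
    tokens = text.lower().split()
    return [tokens.count(w) for w in keywords]

def update(state, d_vec):
    """U for RULK_KW: monotonic update; counts only grow as documents are read."""
    return [s + d for s, d in zip(state, d_vec)]

keywords = ["radiation", "isotope", "decay"]   # hypothetical target keywords
target = [3, 2, 2]                             # occurrences to read per keyword (toy values)
state = [0, 0, 0]
for doc in ["radiation and decay", "isotope decay radiation"]:
    state = update(state, embed_keywords(doc, keywords))
# state now holds the accumulated keyword counts across the session
```
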
        <p>
          RULK_LM: Many recent works have shown that Large Language Models (LLMs) perform
extraordinarily well in capturing the semantic meaning of texts [
          <xref ref-type="bibr" rid="ref7">24</xref>
          ]. Furthermore, in the Information
Retrieval domain, transformer-based language models [
          <xref ref-type="bibr" rid="ref8">25</xref>
          ], mainly based on BERT [
          <xref ref-type="bibr" rid="ref9">26</xref>
          ], have
been shown to excel in multiple tasks [
          <xref ref-type="bibr" rid="ref10">27</xref>
          ], even when not fine-tuned on that specific domain.
        </p>
        <p>Given their success, we assess whether such models can act as a Feature Extractor (F) for
RULK. Thus, we implement a BERT-based variant of the framework, called RULK_LM, inspired by
the method proposed by Câmara et al. [13] to track user exploration of a topic.</p>
        <p>Both the target knowledge T⃗ and the clicked document’s embedding d⃗ are represented by an
embedding of fixed length n, as generated by the same language model. Given a document d (or,
conversely, a reference document) with m sentences {s1, s2, …, sm}, F generates, for each sentence si,
an embedding of size n given by:</p>
        <p>s⃗i = F([CLS]; si,1∶ℓ; [SEP]), (5)</p>
        <p>where ; is a concatenation, ℓ the maximum input size of the model, and [CLS] and [SEP]
special BERT tokens. d⃗ (conversely, T⃗) is then given by an element-wise sum over all s⃗i.</p>
        <p>The Updater U is then a simple element-wise sum over all elements of K⃗ and d⃗. As K⃗ and
T⃗ are vectors in the same embedding space, E, similarly to RULK_KW, is also the cosine similarity
between K⃗ and T⃗, as shown in Equation 4.²</p>
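        <p>The split-encode-sum scheme can be sketched as follows; here, toy_encode is a hypothetical, deterministic stand-in for the actual BERT-based encoder, and the tiny dimension is for illustration only:</p>

```python
# Sketch of RULK_LM feature extraction: split a document into sentences,
# encode each sentence, and sum the embeddings element-wise. toy_encode
# is a hypothetical stand-in for the MiniLM model; in practice one would
# use a real sentence encoder.
import hashlib

DIM = 8  # the paper uses n = 384; a small toy dimension here

def toy_encode(sentence):
    """Deterministic stand-in for a BERT-style sentence encoder."""
    digest = hashlib.sha256(sentence.encode()).digest()
    return [b / 255.0 for b in digest[:DIM]]

def document_embedding(sentences):
    """d-vector (or T-vector): element-wise sum over all sentence embeddings."""
    vecs = [toy_encode(s) for s in sentences]
    return [sum(col) for col in zip(*vecs)]

d_vec = document_embedding(["Ethics studies morality.", "It dates to antiquity."])
```
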
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Validating RULK</title>
      <p>In this section, we discuss how we validated RULK by describing our dataset, metrics and
implementation details for our two instantiations of RULK, RULK_KW and RULK_LM.³</p>
      <p>²We also experimented with other options for U (e.g., averaging the embeddings instead of summing) and E (e.g.,
using Euclidean distance on normalised vectors), and all options yielded very similar results.
³Our implementations can be found at https://github.com/ArthurCamara/RULK_SAL</p>
      <p>[Table 1: Dataset statistics: number of users per topic, number of topics, number of queries,
number of documents clicked, number of snippets seen, documents clicked per query, session
duration (minutes), document dwell time (seconds), pre-test and post-test scores, Actual
Learning Gain (ALG), and Realised Potential Learning (RPL).]</p>
      <p>
        Using our proposed approaches, we apply the RULK framework to estimate the user’s
knowledge gain, g̃. We analyse the logs of real users from a previous study to represent each user’s
current knowledge state K⃗ as it is updated along their search session. We initialise K⃗ as a vector
of zeros for all users (cf. Section 5 for a discussion on our assumptions and caveats).
Dataset To analyse how our implementations of RULK behave, especially when estimating
users’ knowledge in a search session, we test RULK_KW and RULK_LM on a publicly available dataset of
SAL sessions built from the logs of the study by Câmara et al. [13].⁴ The authors collected this
dataset with a search system implemented on top of SearchX [
        <xref ref-type="bibr" rid="ref13">30</xref>
        ], a framework for Interactive
Information Retrieval research. We also show some statistics about the dataset in Table 1.
      </p>
      <p>The dataset contains the interaction logs of 126 crowd-workers. To collect it, the authors asked
workers to search about a given topic for at least 45 minutes. The logged interactions contain
behavioural features, issued queries, and clicked documents. At the start of the study, the system
measured each participant’s previous knowledge on two randomly selected topics and
selected the topic the user demonstrated lower knowledge of as their target topic t. Finally, at
the end of the search session, the user’s knowledge was measured again, allowing us to calculate
their actual learning gain during the session.</p>
      <p>
        As the dataset does not contain the textual contents of the 1107 unique clicked documents,
we used the Wayback Machine API⁵ to fetch the documents as they were when the study was conducted
(August 2020). Of all the documents, 33 did not have a snapshot available and were discarded
from our experiments (i.e., we do not consider their impact on the user’s knowledge).
Actual Knowledge Gain Measurement The dataset contains the self-reported users’
knowledge before and after the search session, k_pre and k_post respectively. The authors measured
these values with a Vocabulary Knowledge Scale (VKS) test [
        <xref ref-type="bibr" rid="ref14 ref15">31, 32</xref>
        ], a commonly used method
to measure user knowledge [
        <xref ref-type="bibr" rid="ref16 ref17 ref18">33, 34, 35, 7</xref>
        ]. For that, the researchers presented the users with a
4-point scale questionnaire, asking about their familiarity with ten keywords selected by the
authors from the Wikipedia article related to the topic.
⁴The data is available at https://github.com/ArthurCamara/CHIIR21-SAL-Scaffolding
⁵https://web.archive.org/
      </p>
      <p>We can measure a user’s learning during their session by computing the difference between
k_pre and k_post. We follow the authors of the original study by using the Realised Potential
Learning (RPL) as the primary metric to measure user knowledge gain. It is defined as follows:
RPL = ALG / MLG, with ALG = (1/10) ∑_{i=1..10} max(0, k_post(w_i) − k_pre(w_i)) and
MLG = (1/10) ∑_{i=1..10} (2 − k_pre(w_i)), (6)
where ALG is the Absolute Learning Gain of a user, MLG the Maximum Learning Gain (i.e., the
maximum amount of new knowledge a user can acquire, given what they already know), and
k(w_i) the score of the user for the i-th term.</p>
      <p>The score of a given term, k(w_i), ranges from 0 to 2. It is measured by asking the user to
rate their familiarity with w_i on a 4-point scale. On this scale, selecting values 1 or 2 means that
the user does not know the term (k(w_i) = 0), selecting value 3 means they partially
know it (k(w_i) = 1) and, by selecting 4, the user indicates that they fully understand what
the term means (k(w_i) = 2). Therefore, RPL represents the fraction of knowledge the user
acquired from the total knowledge they could obtain in their session. Here we use RPL and
reported knowledge gain interchangeably.</p>
      <p>Consider a user unfamiliar with all terms from the topic who, after their session, indicates
that they partially learned all terms (i.e., k_pre(w_i) = 0 and k_post(w_i) = 1 for all terms). In this case,
their Absolute Learning Gain ALG (i.e., their average added knowledge per term) is 1, while
their Maximum Learning Gain MLG is 2 (i.e., they could raise their knowledge by 2 points on all
terms). Therefore, their Realised Potential Learning is given by RPL = ALG / MLG = 0.5.</p>
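      <p>The RPL computation of Equation 6 can be written directly; the two score lists below reproduce the worked example above:</p>

```python
# Realised Potential Learning (Equation 6), computed from per-term VKS
# scores in {0, 1, 2}. The score lists below reproduce the worked
# example: all terms unknown before the session, partially known after.
def rpl(pre, post):
    n = len(pre)
    alg = sum(max(0, a - b) for b, a in zip(pre, post)) / n  # Absolute Learning Gain
    mlg = sum(2 - b for b in pre) / n                        # Maximum Learning Gain
    return alg / mlg

example = rpl([0] * 10, [1] * 10)  # ALG = 1, MLG = 2
```
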
      <sec id="sec-4-1">
        <title>Target knowledge</title>
        <p>
          The topics used in the original user study came from the list of topics
used in the CAR track from TREC 2018 [
          <xref ref-type="bibr" rid="ref19">36</xref>
          ]. In that track, each topic is the title of a Wikipedia
article from a 2018 dump. Therefore, both of our implementations use these Wikipedia texts,
from the same 2018 dump as the original paper, as “reference documents” for generating T⃗.
It is also worth mentioning that, in the user study that originated this dataset, the authors
filtered Wikipedia and clones of Wikipedia from the search results. Therefore, no document
proposed to the user in the result pages of the experiment came from Wikipedia or a similar page.
        </p>
        <sec id="sec-4-1-1">
          <title>RULK_KW implementation</title>
          <p>
            We implement RULK_KW’s F by extracting the keywords for T⃗
using the Yet Another Keyword Extraction (YAKE) method [
            <xref ref-type="bibr" rid="ref20">37</xref>
            ] on the Wikipedia texts. This
method is a lightweight, unsupervised automatic keyword extraction method that relies on
statistical features extracted from documents to select the most important keywords of a text.
We set the maximum n-gram size at 3; we noticed, however, that all the keyphrases extracted
by YAKE, for all topics, were 1-grams. We also choose a value of n = 10 as the size of T⃗, d⃗, and
K⃗; hence the top 10 keywords in the Wikipedia pages are considered for the vectors. To avoid
keywords that are too similar, we stem the keywords after they are extracted by YAKE (using the
Porter Stemmer of the NLTK Python library [
            <xref ref-type="bibr" rid="ref21">38</xref>
            ]). As this can lead to duplicate keywords (e.g.,
water and waters have the same stem, “water”), we remove duplicated stems and add the next
most relevant keyword until ten keywords per topic are left. The same stemming process was
also performed on the clicked documents when generating their embeddings d⃗.
RULK_LM implementation We implement RULK_LM’s F as a BERT-based language model.
Specifically, we use a MiniLM [
            <xref ref-type="bibr" rid="ref22">39</xref>
            ] model with 6 layers and a hidden dimension of 384. The
model was also fine-tuned on the MS MARCO dataset [
            <xref ref-type="bibr" rid="ref23">40</xref>
            ], as made available in the SBERT
framework [
            <xref ref-type="bibr" rid="ref12">29</xref>
            ].⁶ The embeddings d⃗ and T⃗ have a fixed length of n = 384. We split the documents
into sentences using NLTK’s implementation of the Punkt Sentence Tokenizer, feed each
sentence individually into F and sum their respective embeddings.⁷
          </p>
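          <p>The stem-based de-duplication step can be sketched as follows. Note that simple_stem is a crude, hypothetical stand-in for NLTK’s Porter Stemmer, and the ranked list is illustrative rather than actual YAKE output:</p>

```python
# Sketch of the keyword de-duplication step: keywords ranked by the
# extractor (YAKE in the paper; a pre-ranked toy list here) are stemmed,
# duplicate stems dropped, and the list refilled from the ranking until
# k keywords remain. simple_stem is a crude stand-in for a real stemmer.
def simple_stem(word):
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: len(word) - len(suffix)]
    return word

def dedup_by_stem(ranked_keywords, k=10):
    seen, kept = set(), []
    for kw in ranked_keywords:
        stem = simple_stem(kw)
        if stem not in seen:          # e.g. "water" and "waters" collide on "water"
            seen.add(stem)
            kept.append(kw)
        if len(kept) == k:
            break
    return kept

ranked = ["water", "waters", "river", "rivers", "flooding"]  # illustrative ranking
top = dedup_by_stem(ranked, k=3)
```
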
          <p>Mixing RULK_KW and RULK_LM RULK can be easily extended. To show that, and that our proposed
implementations are not the only ones possible, but rather examples, we propose to combine both
methods. The intuition is simple: while RULK_KW may be useful for capturing some characteristics
of the read documents, RULK_LM can be useful for other types of characteristics. Therefore, we
implement an interpolated estimator E, parameterised by λ, defined as:</p>
          <p>g̃_RULK_LM+KW = λ g̃_RULK_LM + (1 − λ) g̃_RULK_KW, (7)</p>
          <p>where g̃_RULK is the estimated knowledge gain of the respective RULK implementation.
Experimentally, we found that λ = 0.38 yields the best results.</p>
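          <p>The interpolated estimator of Equation 7 reduces to a one-line mix; the two input estimates below are illustrative values, not measurements:</p>

```python
# Interpolated estimator (Equation 7): mix the two implementations'
# knowledge-gain estimates with weight lam (0.38 in our experiments).
# The input estimates below are illustrative.
def interpolate(g_lm, g_kw, lam=0.38):
    return lam * g_lm + (1 - lam) * g_kw

mixed = interpolate(0.6, 0.4)  # g~ from RULK_LM and RULK_KW, respectively
```
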
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Results</title>
      <p>Recall that we proposed two research questions, measuring different aspects of RULK:</p>
      <sec id="sec-5-1">
        <title>RQ1: Are the estimations produced by RULK correlated with actual users’ knowledge?</title>
        <p>To answer our first research question, we test the validity of RULK by computing the Pearson
correlation between the actual knowledge gains of a user and the estimated knowledge gain g̃,
as measured by each implementation’s E. As our main goal in this paper is to propose RULK, this
section is mainly descriptive of the observed results rather than an in-depth analysis. Therefore,
we leave to future work the question of why these results arise and how to improve them.</p>
        <p>
          As shown in Table 2, on the set of all users, RULK_KW has a considerable correlation with both
RPL and ALG. Additionally, as shown in the last three columns, the estimated knowledge gains
generated by all our implementations are highly correlated among themselves. This implies
that, regardless of implementation, RULK is a viable option for representing user knowledge. It
is also interesting to note that this repeats for both ALG and RPL. Finally, as a testament to the
flexibility of RULK, and reinforcing the idea that the LM and KW models capture different characteristics
of texts, RULK_KW+LM shows an even higher correlation to actual learning gains when compared to
either implementation in isolation.
⁶https://huggingface.co/sentence-transformers/msmarco-MiniLM-L6-cos-v5
⁷Other common approaches, such as truncating the document, averaging the sentences, and using only the sentence
most similar to the user’s query (MaxP [
          <xref ref-type="bibr" rid="ref24">41</xref>
          ]), were also tested, but they all resulted in worse or similar performance.
        </p>
        <p>[Table 2: Pearson correlation between each implementation’s (RULK_KW, RULK_LM, RULK_KW+LM)
estimated knowledge gain and the assessed learning gains (RPL and ALG), together with the
pairwise correlations between the implementations’ estimates.]</p>
        <sec id="sec-5-1-1">
          <title>RQ2: How do RULK_KW and RULK_LM behave as users’ sessions grow longer?</title>
          <p>To assess the second research question, we split users into quartiles based on the length of their sessions as
given by three measurements: the number of queries issued, the number of documents clicked, and the
session duration in minutes. We show the results of this analysis in Figure 2.</p>
          <p>We can better understand when each implementation fails by measuring how RULK_KW and
RULK_LM behave for different users. From Figure 2, we can see that RULK_KW and RULK_LM perform
differently for different types of users. The most interesting result, however, is that the
knowledge estimations for users with the highest number of interactions are generally
considerably worse in the language-model implementation, RULK_LM. For instance, considering the
number of queries, RULK_KW outperforms RULK_LM by 75% in the highest split. We also observe this
advantage for RULK_KW when considering the number of documents clicked (37% difference)
and session duration (70% in the second-to-last split and 17% in the last). This indicates that, as
the sessions grow longer and the users click on more documents, RULK_KW can better handle the
increasing magnitude of K⃗. However, the gap diminishes greatly when employing our proposed
RULK_KW+LM method. This shows that, while one approach is better in some scenarios, an approach
that considers multiple features leads to a more robust implementation of RULK.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Conclusions and Future Work</title>
      <p>This paper introduced RULK, a framework for Representing User Learning and Knowledge,
containing three main components: the Feature Extractor F generates embeddings from the
documents clicked by users, the Updater U maintains a vector representing the user’s knowledge,
and the Estimator E estimates how much knowledge the user gained during their search session.
We hope that RULK can pave the way for applications that benefit from this information, like
ranking functions that can better find documents according to the user’s current knowledge.</p>
      <p>
        This work used simplified assumptions regarding the user’s knowledge acquisition and
learning progress. We want to acknowledge these so that future work can use them as a starting
point. First, we assumed that users can assimilate all of the knowledge encoded in d⃗ and that
no forgetting takes place. However, factors like the user’s learning rate, familiarity with the
subject, or the time spent reading the content can impact how well the user absorbs knowledge
from documents. Future work can incorporate these factors into RULK by changing Equation 2
accordingly. Another simplifying assumption we made is to consider that the user
has no previous knowledge about the topic (i.e., we initialise K⃗ as a vector of zeros), although it
has been observed that users have at least some knowledge about the topic they are searching
for [
        <xref ref-type="bibr" rid="ref17">16, 13, 34, 42</xref>
        ].
      </p>
      <p>To demonstrate how other works can implement RULK, we implemented two variants ourselves:
RULK_KW and RULK_LM, each relying on a different method for knowledge representation.
While the former uses extracted keywords, the latter uses a transformer-based language model
to generate semantic representations of texts. Through our experiments, we show that both
implementations can, to a certain degree, estimate the actual user knowledge gains, with RULK_KW
leading to a slightly higher correlation with self-reported knowledge gains. We hope that our
framework can be helpful for researchers aiming to incorporate users’ knowledge in search
systems, primarily when focusing on learning (Search-as-Learning), and that our work can spark
discussion on how to estimate and track user knowledge.
Proceedings of the 40th international ACM SIGIR conference on research and development
in information retrieval, 2017, pp. 555–564.
[7] R. Syed, K. Collins-Thompson, Optimizing search results for human learning goals,</p>
      <p>Information Retrieval Journal 20 (2017) 506–523.
[8] N. J. Belkin, R. N. Oddy, H. M. Brooks, Ask for information retrieval: Part i. background
and theory, J. Documentation 38 (1982) 61–71.
[9] A. Hassan Awadallah, R. White, P. Pantel, S. Dumais, Y.-M. Wang, Supporting complex
search tasks, in: Proc. 23rd ACM CIKM, 2014, pp. 829–838.
[10] N. J. Belkin, The cognitive viewpoint in information science, Journal of information
science 16 (1990) 11–15.
[11] M. J. Bates, The design of browsing and berrypicking techniques for the online search
interface, Online review (1989).
[12] P. Ingwersen, Information retrieval interaction, volume 246, Taylor Graham London, 1992.
[13] A. Câmara, N. Roy, D. Maxwell, C. Hauff, Searching to learn with instructional scaffolding,
Proceedings of the 2021 Conference on Human Information Interaction and Retrieval
(2021).
[14] N. Roy, A. Câmara, D. Maxwell, C. Hauff, Incorporating widget positioning in interaction
models of search behaviour, Proceedings of the 2021 ACM SIGIR International Conference
on Theory of Information Retrieval (2021).
[15] D. El Zein, C. da Costa Pereira, A cognitive agent framework in information retrieval:
Using user beliefs to customize results, in: International Conference on Principles and
Practice of Multi-Agent Systems, Springer, 2020, pp. 325–333.
[16] D. El Zein, C. da Costa Pereira, User’s knowledge and information needs in information
retrieval evaluation, in: Proceedings of the 30th ACM Conference on User Modeling,
Adaptation and Personalization, 2022, pp. 325–333.
[17] R. Syed, K. Collins-Thompson, Exploring document retrieval features associated with
improved short-and long-term vocabulary learning outcomes, in: Proceedings of the 2018
conference on human information interaction &amp; retrieval, 2018, pp. 191–200.
[18] A. Câmara, D. Maxwell, C. Hauff, Searching, learning, and subtopic ordering: A
simulation-based analysis, in: ECIR, 2022, pp. 142–156.
[19] R. Yu, U. Gadiraju, P. Holtz, M. Rokicki, P. Kemkes, S. Dietze, Predicting user knowledge
gain in informational search sessions, The 41st International ACM SIGIR Conference on
Research &amp; Development in Information Retrieval (2018).
[20] J. Liu, C. Liu, N. J. Belkin, Predicting information searchers’ topic knowledge at different
search stages, J. Assoc. Inf. Sci. Technol. 67 (2016) 2652–2666.
[21] R. Yu, R. Tang, M. Rokicki, U. Gadiraju, S. Dietze, Topic-independent modeling of user
knowledge in informational search sessions, Inf. Retr. J. 24 (2021) 240–268.
[22] P. Bailey, L. Jiang, User task understanding: a web search engine
perspective, 2012. URL: https://www.microsoft.com/en-us/research/publication/
user-task-understanding-a-web-search-engine-perspective/, presentation
delivered at the NII Shonan: Whole-Session Evaluation of Interactive Information Retrieval
Systems workshop. 8-11 October 2012, Shonan, Japan.
[23] B. S. Bloom, Taxonomy of educational objectives: The classification of educational goals,</p>
      <p>Cognitive domain (1956).
[42] U. Gadiraju, R. Yu, S. Dietze, P. Holtz, Analyzing knowledge gain of users in informational
search sessions on the web, Proceedings of the 2018 Conference on Human Information
Interaction &amp; Retrieval (2018).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Selwyn</surname>
          </string-name>
          ,
          <article-title>An investigation of differences in undergraduates' academic use of the internet</article-title>
          ,
          <source>Active Learning in Higher Education</source>
          <volume>9</volume>
          (
          <year>2008</year>
          )
          <fpage>11</fpage>
          -
          <lpage>22</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Biddix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <article-title>Convenience or credibility? a study of college student online research behaviors</article-title>
          ,
          <source>The Internet &amp; Higher Education</source>
          <volume>14</volume>
          (
          <year>2011</year>
          )
          <fpage>175</fpage>
          -
          <lpage>182</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Collins-Thompson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hansen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hauff</surname>
          </string-name>
          ,
          <article-title>Search as learning</article-title>
          ,
          <source>Dagstuhl Reports</source>
          <volume>7</volume>
          (
          <year>2017</year>
          )
          <fpage>135</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Byström</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Järvelin</surname>
          </string-name>
          ,
          <article-title>Task complexity affects information seeking and use</article-title>
          ,
          <source>Inf. Process. Manag</source>
          .
          <volume>31</volume>
          (
          <year>1995</year>
          )
          <fpage>191</fpage>
          -
          <lpage>213</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Harbarth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Delsing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Richtscheid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Yücepur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Feldmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Akhavanfarm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Manske</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Othlinghaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. U.</given-names>
            <surname>Hoppe</surname>
          </string-name>
          ,
          <article-title>Learning by tagging - supporting constructive learning in video-based environments</article-title>
          , in: DeLFI, volume P-
          <volume>284</volume>
          of LNI, Gesellschaft für Informatik e.V.,
          <year>2018</year>
          , pp.
          <fpage>105</fpage>
          -
          <lpage>116</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>R.</given-names>
            <surname>Syed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Collins-Thompson</surname>
          </string-name>
          ,
          <article-title>Retrieval algorithms optimized for human learning</article-title>
          , in:
          <source>Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>555</fpage>
          -
          <lpage>564</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A.</given-names>
            <surname>Tamkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brundage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Clark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Ganguli</surname>
          </string-name>
          ,
          <article-title>Understanding the capabilities, limitations, and societal impact of large language models</article-title>
          ,
          <source>CoRR abs/2102.02503</source>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>
          ,
          <article-title>Attention is all you need</article-title>
          ,
          <source>in: NIPS</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>5998</fpage>
          -
          <lpage>6008</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          , arXiv abs/1810.04805 (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nogueira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Yates</surname>
          </string-name>
          ,
          <article-title>Pretrained transformers for text ranking: BERT and beyond</article-title>
          , CoRR abs/2010.06467 (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>N.</given-names>
            <surname>Thakur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rücklé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>BEIR: A heterogeneous benchmark for zero-shot evaluation of information retrieval models</article-title>
          ,
          <source>in: Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 2)</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>N.</given-names>
            <surname>Reimers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>Sentence-BERT: Sentence embeddings using siamese BERT-networks</article-title>
          ,
          <source>in: EMNLP/IJCNLP (1), Association for Computational Linguistics</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>3980</fpage>
          -
          <lpage>3990</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Putra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Moraes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hauff</surname>
          </string-name>
          ,
          <article-title>SearchX: Empowering collaborative search research</article-title>
          , in: SIGIR, ACM,
          <year>2018</year>
          , pp.
          <fpage>1265</fpage>
          -
          <lpage>1268</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>M. B.</given-names>
            <surname>Wesche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. S.</given-names>
            <surname>Paribakht</surname>
          </string-name>
          ,
          <article-title>Assessing second language vocabulary knowledge: Depth versus breadth</article-title>
          .,
          <source>Canadian Modern Language Review-revue Canadienne Des Langues Vivantes</source>
          <volume>53</volume>
          (
          <year>1996</year>
          )
          <fpage>13</fpage>
          -
          <lpage>40</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>K. A. D.</given-names>
            <surname>Stahl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Bravo</surname>
          </string-name>
          ,
          <article-title>Contemporary classroom vocabulary assessment for content areas</article-title>
          ,
          <source>The Reading Teacher</source>
          <volume>63</volume>
          (
          <year>2010</year>
          )
          <fpage>566</fpage>
          -
          <lpage>578</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>S.</given-names>
            <surname>Salimzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Gadiraju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hauff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>van Deursen</surname>
          </string-name>
          ,
          <article-title>Exploring the feasibility of crowd-powered decomposition of complex user questions in text-to-SQL tasks</article-title>
          , in: HT, ACM,
          <year>2022</year>
          , pp.
          <fpage>154</fpage>
          -
          <lpage>165</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>N.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Moraes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hauff</surname>
          </string-name>
          ,
          <article-title>Exploring users' learning gains within search sessions</article-title>
          ,
          <source>Proceedings of the 2020 Conference on Human Information Interaction and Retrieval</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>H. L.</given-names>
            <surname>O'Brien</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kampen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. W.</given-names>
            <surname>Cole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Brennan</surname>
          </string-name>
          ,
          <article-title>The role of domain knowledge in search as learning</article-title>
          , in: CHIIR, ACM,
          <year>2020</year>
          , pp.
          <fpage>313</fpage>
          -
          <lpage>317</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>L.</given-names>
            <surname>Dietz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gamari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dalton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Craswell</surname>
          </string-name>
          ,
          <article-title>TREC complex answer retrieval overview</article-title>
          , in: TREC, volume
          <volume>500</volume>
          -331 of NIST Special Publication,
          <source>National Institute of Standards and Technology (NIST)</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>R.</given-names>
            <surname>Campos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Mangaravite</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pasquali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Jorge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Nunes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>YAKE! Collection-independent automatic keyword extractor</article-title>
          ,
          <source>in: European Conference on Information Retrieval</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>806</fpage>
          -
          <lpage>810</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bird</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Klein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Loper</surname>
          </string-name>
          ,
          <article-title>Natural language processing with Python: analyzing text with the natural language toolkit</article-title>
          ,
          <source>O'Reilly Media, Inc.</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>MiniLM: Deep self-attention distillation for task-agnostic compression of pre-trained transformers</article-title>
          ,
          <source>in: Advances in Neural Information Processing Systems</source>
          , volume
          <volume>33</volume>
          ,
          <year>2020</year>
          , pp.
          <fpage>5776</fpage>
          -
          <lpage>5788</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>N.</given-names>
            <surname>Craswell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mitra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yilmaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Campos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>MS MARCO: Benchmarking ranking models in the large-data regime</article-title>
          , in: SIGIR, ACM,
          <year>2021</year>
          , pp.
          <fpage>1566</fpage>
          -
          <lpage>1576</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>Z. A.</given-names>
            <surname>Yilmaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <article-title>Applying BERT to document retrieval with birch</article-title>
          ,
          <source>in: EMNLP/IJCNLP (3)</source>
          ,
          <source>Association for Computational Linguistics</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>19</fpage>
          -
          <lpage>24</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>