<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Integrating domain knowledge differences into modeling user clicks on search result pages</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Saraschandra Karanam</string-name>
          <email>s.karanam@uu.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Herre van Oostendorp</string-name>
          <email>h.vanoostendorp@uu.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Utrecht University</institution>
          ,
          <addr-line>Utrecht</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Computational cognitive models developed so far do not incorporate any effect of individual differences in users' domain knowledge when predicting user clicks on search result pages. We address this problem using a cognitive model of information search that enables us to use two semantic spaces, one containing a low (general semantic space) and one a high (special semantic space) amount of medical and health-related information, to represent respectively the low and high knowledge of users in this domain. Simulations on six difficult information search tasks and subsequent matching with actual behavioural data from 48 users (divided into low and high domain knowledge groups based on a domain knowledge test) were conducted. Results showed that the efficacy of modeling user selections on search results (in terms of the number of matches between users and the model and the mean semantic similarity values of the matched search results) is higher with the special semantic space than with the general semantic space for high domain knowledge participants, while for low domain knowledge participants it is the other way around. Implications for support tools that can be built based on these models are discussed.</p>
      </abstract>
      <kwd-group>
        <kwd>Information systems → Personalization</kwd>
        <kwd>Relevance assessment</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>Search as Learning (SAL), July 21, 2016, Pisa, Italy. The copyright for this paper
remains with its authors. Copying permitted for private and academic purposes.</p>
      <p>
        Search systems are typically characterized as tools that
retrieve relevant information on a target page from the
Internet in response to a user query. These systems are efficient
only for certain types of tasks, such as look-up tasks or
factoid questions ("What is the distance between Mars and
Earth?" or "Which is the highest mountain in Europe?"), and
are not optimal for other kinds of tasks that involve
knowledge discovery, comprehension and learning. Often,
important information that is needed to solve the
main search problem is present in the intermediate pages
leading to the target page [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. In such cases, it is
important to evaluate the information on each page and decide
which hyperlink or search result to click next based
on the information that has already been processed. The process
of information search can therefore be conceived as a
process that involves learning, or at least knowledge acquisition.
Users acquire new knowledge not only at the end of an
information search process, after reaching the target page, but
also while processing intermediate search results and
webpages before they reach the target page. Learning from such
contextual information as users perform search and
navigation tasks on the web involves complex cognitive processes
that dynamically influence the evaluation of link texts and
web contents [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Search engines place no emphasis
on these intermediate steps and focus largely on
the step involving retrieval of relevant information. They
also ignore the influence of cognitive factors such as domain
knowledge [
        <xref ref-type="bibr" rid="ref17 ref3">17, 3</xref>
        ] on the cognitive processes underlying
information search and navigation, and follow a one-size-fits-all
model.
      </p>
      <p>
        In this paper, we focus on the differences in
information search behavior that arise from individual differences in the
domain knowledge of users. It is known that users with
high domain knowledge have more appropriate mental
representations, higher activation degrees of concepts and
stronger connections between different concepts in the
conceptual space than users with low domain
knowledge [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. A number of experiments investigating the role
of domain knowledge on information search and navigation
performance have been conducted in the cognitive
psychology community. For example, in a recent study by [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ],
domain experts were found to find more correct answers in
a shorter time and via a path closer to the optimum path than
non-experts. This difference grew stronger as the difficulty of
the task increased. Higher domain knowledge enables a user
to formulate more appropriate queries and to comprehend the
search results and the content of websites better, which
in turn enables them to make informed decisions about
which hyperlink or search result to click next. Domain
experts are also known to evaluate search results more
thoroughly and to click more often on relevant search results
than non-experts. This is because their higher domain
knowledge enables them to differentiate better between a relevant
and a non-relevant search result [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        However, understanding behavioral differences through
laboratory experiments is expensive, time consuming and
hard to scale. Simulation of user interactions
with information retrieval systems has therefore been an
active area of research. Among the many click models
developed by researchers from the information retrieval
community [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], only a few take into account cognitive aspects [
        <xref ref-type="bibr" rid="ref22 ref24 ref7">24,
22, 7</xref>
        ]. Moreover, they provide only a limited process
description. We therefore employ computational cognitive models
in our research. They are relevant in this context because they
enable us to model differences in cognitive factors (such as
domain knowledge) underlying cognitive functions (such
as comprehending search results, arriving at a relevance
estimate of search results and selecting one of the search
results to click). Also, computational cognitive
models focus on the process that leads to the target information
and therefore offer more opportunities to
incorporate behavioral differences due to variations in
cognitive factors.
      </p>
      <p>
        The main research question of the current study was: how can
the differences in the domain knowledge
levels of users be incorporated into computational cognitive models that
predict click behaviour on search results? Would such a model
predict user clicks on search engine result pages (SERPs,
henceforth) better than a model that does not incorporate
differentiated domain knowledge levels of users? The outcomes
of this study have implications for the support tools
for enhancing information search performance that can be
built based on computational cognitive models [
        <xref ref-type="bibr" rid="ref11 ref23">23, 11</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. OUR APPROACH</title>
      <p>We briefly introduce the computational cognitive model
called CoLiDeS that we use in our research and then
explain our approach to incorporating differentiated domain
knowledge into CoLiDeS.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Cognitive model</title>
      <p>
        CoLiDeS, or Comprehension-based Linked Model of
Deliberate Search, developed by Kitajima et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] explains
user navigation behaviour on websites. It divides user
navigation behavior into four stages of cognitive processing:
parsing the webpage into high-level schematic regions,
focusing on one of those schematic regions, elaboration /
comprehension of the screen objects (e.g. hypertext links) within
that region, and evaluating and selecting the most
appropriate screen object (e.g. hypertext link) in that region.
CoLiDeS is based on Information Foraging Theory [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] and
connects to the Construction-Integration reading model of
Kintsch [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. The notion of information scent, defined as
the estimate of the value or cost of information sources
represented by proximal cues (such as hyperlinks), is central to
CoLiDeS. It is operationalized as the semantic similarity
between the user goal and each of the hyperlinks. The model
predicts that the user is most likely to click on the
hyperlink that has the highest semantic similarity value with the
user goal, i.e., the highest information scent. This process
is repeated for every new page until the user reaches the
target page. CoLiDeS uses Latent Semantic Analysis (LSA,
henceforth) introduced by [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] to compute the semantic
similarities. LSA is an unsupervised machine learning technique
that employs singular value decomposition to build a
high-dimensional semantic space from a large corpus of
documents that is representative of the knowledge of the target
user group. The semantic space contains representations of the
terms from the corpus in a comparatively low number of dimensions,
typically between 250 and 350, that are orthogonal, abstract and
latent [
        <xref ref-type="bibr" rid="ref16 ref18">16, 18</xref>
        ]. CoLiDeS has been successful in simulating
and predicting user link selections, though the websites and
web-pages used were very restricted. The model has also
been successfully applied to finding usability problems by
predicting links that would be unclear to users [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The CoLiDeS
model has recently been extended to predict user clicks on
search result pages [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. Please note that CoLiDeS
modeling so far does not incorporate any effect of individual
differences in the domain knowledge of users; that is what
we study in the current paper.
      </p>
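      <p>As a minimal sketch of how information scent could be computed and used to pick a hyperlink, the Python fragment below takes the cosine between a goal vector and each link vector and selects the link with the highest value. The two-dimensional term vectors, the helper names (cosine, text_vector, pick_link) and the choice to represent a text as the unweighted sum of its term vectors are assumptions made for illustration; they are not the CoLiDeS implementation, and a real LSA space would supply vectors with a few hundred dimensions.</p>
      <preformat><![CDATA[
```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def text_vector(text, term_vectors, dims):
    """Assumed text representation: sum of the vectors of the known terms."""
    vec = [0.0] * dims
    for word in text.lower().split():
        for i, x in enumerate(term_vectors.get(word, [])):
            vec[i] += x
    return vec


def pick_link(goal, links, term_vectors, dims):
    """Return (link, scent) for the link with the highest similarity to the goal."""
    goal_vec = text_vector(goal, term_vectors, dims)
    scored = [
        (cosine(goal_vec, text_vector(link, term_vectors, dims)), link)
        for link in links
    ]
    scent, best = max(scored)
    return best, scent


# Hypothetical toy space with 2 dimensions, for illustration only.
TOY_SPACE = {
    "heart": [0.9, 0.1],
    "artery": [0.8, 0.2],
    "disease": [0.6, 0.4],
    "football": [0.1, 0.9],
    "scores": [0.0, 1.0],
}

best, scent = pick_link(
    "heart disease", ["artery disease", "football scores"], TOY_SPACE, 2
)
```
]]></preformat>
      <p>In this toy space the goal "heart disease" lies much closer to the link "artery disease" than to "football scores", so the first link is selected. Swapping TOY_SPACE for vectors from a different semantic space changes the similarity values, which is the mechanism the present study relies on.</p>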
    </sec>
    <sec id="sec-4">
      <title>2.2 Creation of Semantic Spaces</title>
      <p>
        When using LSA, it is known that the initial corpus of
documents used to create the semantic space influences the
final similarity values to a large extent [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Several
factors determine the choice of the corpus and the semantic
space. First and foremost is the language of the corpus.
In our case, since we are running our experiments in the
Netherlands with Dutch participants, we need a Dutch
semantic space. Secondly, the corpus of documents should
be representative of the knowledge levels of the target user
group. Since the focus of our research is modeling the
information search behaviour of older adults compared to younger
adults, we need two corpora that accurately
characterize the difference in the knowledge levels of younger and
older adults. We have seen already that older adults have
higher crystallized intelligence, or general knowledge and
vocabulary, than younger adults. Also, since older adults read
more health-related information and are more concerned
with their health, we assume that their health and medical
knowledge is more elaborate than that of younger adults.
Our goal is to build two semantic spaces that are as close as
possible to the above assumptions.
      </p>
      <p>
        We collated two different corpora (a general corpus and a
special corpus, each consisting of 70,000 articles in Dutch)
varying in the amount of medical and health-related information.
The general corpus, representing the knowledge of low
domain knowledge users, had 90% news articles and 10%
medical and health-related articles, whereas the special corpus,
representing the knowledge of high domain knowledge users,
had 60% news articles and 40% medical and health-related
articles. After removing all the stop words, these two
corpora were used to create two semantic spaces using Gallito
[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]: a general semantic space using the general corpus
(average article size: 435 words) and a special semantic space
using the special corpus (average article size: 403 words).
The following settings were used to create the semantic spaces:
300 dimensions, the entire article as the window and log-entropy
weighting. Also, a word was included in the final matrix only
if it occurred in at least 6 articles.
      </p>
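      <p>The weighting step can be sketched as follows. The fragment builds a term-by-document matrix, keeps only terms that occur in at least min_docs articles (6 in our setting) and applies one common form of log-entropy weighting: a local weight log(1 + tf) multiplied by a global weight 1 + sum_j p_ij log(p_ij) / log(n). The exact formula used inside Gallito is not stated here, so this form, the function name log_entropy_matrix and the four toy documents are assumptions for illustration only.</p>
      <preformat><![CDATA[
```python
import math
from collections import Counter


def log_entropy_matrix(docs, min_docs=6):
    """Return (vocab, rows); rows[i][j] is the weight of vocab[i] in document j.

    Terms found in fewer than min_docs documents are dropped. Documents are
    assumed to be whitespace-tokenised with stop words already removed.
    """
    n = len(docs)
    counts = [Counter(doc.lower().split()) for doc in docs]

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    vocab = sorted(t for t, df in doc_freq.items() if df >= min_docs)

    rows = []
    for term in vocab:
        tfs = [c[term] for c in counts]
        total = sum(tfs)
        # Global weight: 1 + sum_j p_ij * log(p_ij) / log(n), p_ij = tf_ij / total.
        entropy = sum((tf / total) * math.log(tf / total) for tf in tfs if tf > 0)
        global_w = (1.0 + entropy / math.log(n)) if n > 1 else 1.0
        # Local weight: log(1 + tf_ij).
        rows.append([global_w * math.log(1.0 + tf) for tf in tfs])
    return vocab, rows


# Hypothetical mini-corpus; the threshold is lowered to 2 so that terms survive.
docs = [
    "bladder pain bladder",
    "bladder infection",
    "news football",
    "news election pain",
]
vocab, rows = log_entropy_matrix(docs, min_docs=2)
```
]]></preformat>
      <p>With these toy documents only "bladder", "news" and "pain" pass the document-frequency threshold. In a full pipeline the weighted matrix would then be reduced by a truncated singular value decomposition to the chosen number of dimensions (300 in our setting); that step is omitted from the sketch.</p>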
    </sec>
    <sec id="sec-5">
      <title>2.3 Evaluation of Semantic Spaces</title>
      <p>
        We used two biomedical data sets [
        <xref ref-type="bibr" rid="ref20 ref6">6, 20</xref>
        ] commonly used
to evaluate measures for computing semantic relevance in
the medical information retrieval community. In the first
dataset [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], created in collaboration with Mayo Clinic
experts, similarity ratings on a set of 30
medical term pairs were averaged over a group of 3 physicians, who were
experts in rheumatology, and over 9 medical coders, who were
familiar with the concept of semantic similarity. Ratings were given on a scale
of 1 (low in similarity) to 4 (high in similarity). The
correlation between physician judgements was 0.68, and that
between the medical coders was 0.78. In the second dataset
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], a set of 36 word pairs extracted from the MeSH
repository was assessed on a scale of 0 (low in similarity) to 1
(high in similarity) by 8 medical experts. The word pairs
in both datasets were translated to Dutch by 3 experts,
and agreement among them was very high. We dropped
two word pairs from each dataset (antibiotic-allergy and
cholangiocarcinoma-colonoscopy from Pedersen's dataset and
meningitis-tricuspid atresia and measles-rubeola from
Hliaoutakis's dataset) as they did not occur in the two corpora
designed by us. This left us with 28 word pairs from
Pedersen's dataset and 34 word pairs from Hliaoutakis's dataset.
Next, we computed the semantic similarity between the
remaining word pairs from both datasets and computed the
correlation with the expert ratings. We expected the
similarity values from the special semantic space to be more highly
correlated with the expert ratings than the similarity values
from the general semantic space, as the former was designed
to contain more medical and health-related information.
The correlation values obtained are shown in Table 1.
      </p>
      <p>Analysing the correlation values from Table 1, we found
that the special semantic space gave a significantly higher
correlation with Hliaoutakis's dataset and Pedersen's coders
dataset, and a marginally higher correlation with Pedersen's
physicians dataset, than the general semantic space.
Based on these outcomes, we were able to confirm that
health and medical knowledge is better represented in the
special semantic space than in the general semantic space.</p>
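      <p>The comparison between a semantic space and the expert ratings can be illustrated with a small correlation routine. The type of correlation coefficient is not specified above; the sketch assumes Pearson's r. All numbers in the fragment are hypothetical and serve only to show the computation; they are not values from Table 1 or from either dataset.</p>
      <preformat><![CDATA[
```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for four word pairs (illustration only, not study data).
expert_ratings = [4.0, 3.5, 2.0, 1.0]        # averaged expert judgements
special_space = [0.80, 0.70, 0.30, 0.10]     # LSA cosines, special space
general_space = [0.40, 0.45, 0.35, 0.30]     # LSA cosines, general space

r_special = pearson(special_space, expert_ratings)
r_general = pearson(general_space, expert_ratings)
```
]]></preformat>
      <p>The same routine would be applied once per semantic space and per set of expert ratings; a rank-based coefficient could be substituted if the ratings are treated as ordinal.</p>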
    </sec>
    <sec id="sec-6">
      <title>2.4 Behavioral Data Collection</title>
      <p>Actual behavioural data was collected from 48
participants (18 females, 30 males, average age: 48.79) in a
laboratory experiment. Participants were first presented with a
domain knowledge test on the topic of health in which they
had to answer twelve multiple-choice questions. A correct
answer was scored 1 and a wrong answer was scored 0. They
were then presented, in random order, with six information search tasks
drawn specifically from the domain of health, in order to
examine any differences in the click behaviour of the
participants arising from individual differences in
their knowledge of the health domain. To solve these tasks,
participants had to formulate queries using their knowledge and
understanding of the task. The answer was not present in a single
location or website, and they often had to evaluate
information from multiple websites. For instance, for the task
"Elbert, 76 years old has been suffering for few years from
burning sensation while passing urine. He passes urine more
often than normal at night and complains of a feeling that the
bladder is not empty completely. Lately, he also developed
acute pain in the hip, lower back and pelvis region. He also
lost 12 kilos in the last 6 months. What problem could he
be suffering from?", users had to formulate multiple queries
such as "kidney stones pain in the back", "burning sensation
when urinating" and "urinary infection" to find the answer. The
answer to this task, "prostate cancer", was also not easy to find
in the snippets of the search results of the queries unless
the query was very specific.</p>
      <p>Participants were allowed to use only Google's search
engine. All the queries generated by the users, the
corresponding search engine result pages and the URLs opened by them
were logged in the backend. There were in total 738 queries
and 724 clicks.</p>
    </sec>
    <sec id="sec-7">
      <title>3. MODEL SIMULATIONS</title>
      <p>
        We followed the same methodology as authors in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
who extended the CoLiDeS model to predict user clicks on
SERPs. Simulations of CoLiDeS were run using both the
general and the special semantic spaces on each query and
its corresponding search results using the same methodology
followed by Karanam et al., [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] on navigating in a mock-up
website on the human body. We consider each SERP as a
page of a website and each of the search engine results as
a hyperlink within that page. The problem of
predicting which search engine result to click is then equivalent
to the problem of predicting which hyperlink to click within
a page of a website. Therefore, the process of computing
information scent and predicting which search result to click
remains the same as in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. For the time being, we used
the user-generated query as a representation of the local goal, or
the understanding of the user at any point in time, and
semantic similarity values were computed from it. The main
steps we followed in simulating CoLiDeS interacting with
the SERPs are the following: (a) the semantic similarity
between the query and the title and snippet combination
of a search result was computed; (b) this was repeated for
all the remaining titles and snippets on a SERP, and the title
and snippet combination with the highest semantic
similarity value with the query was selected by the model; and (c)
finally, this process was repeated for all the queries of a task,
for all the tasks of a participant and for all the
participants (see [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] for details of the procedure).
      </p>
      <p>After running the main simulation steps (a) to (c), we had
the model predictions available for all the queries of all the
tasks and could compare these with the actual
selections of real participants. Please note that the CoLiDeS
model can predict only one search result per query using
this methodology because CoLiDeS does not possess a
backtracking mechanism, whereas users in reality click on more
than one search result per query.</p>
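      <p>Steps (a) to (c) and the subsequent comparison with the logged clicks can be sketched as a short loop. The log structure (a list of entries with the keys query, results and clicked), the function names simulate and word_overlap, and the toy results and clicks are hypothetical. The similarity function is passed in as a parameter; the word-overlap measure used here is only a stand-in so that the sketch runs without a semantic space, whereas the simulations reported in this paper use LSA similarity.</p>
      <preformat><![CDATA[
```python
def word_overlap(a, b):
    """Stand-in similarity: Jaccard overlap of word sets (NOT the LSA measure)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def simulate(log, similarity):
    """Predict one result per query and count matches with the logged clicks.

    Each log entry is a dict with 'query' (str), 'results' (list of
    (title, snippet) pairs) and 'clicked' (set of clicked result indices).
    """
    predictions = []
    matches = 0
    for entry in log:
        # (a) + (b): score every title and snippet combination against the query.
        scores = [
            similarity(entry["query"], title + " " + snippet)
            for title, snippet in entry["results"]
        ]
        # The model selects the single result with the highest similarity.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        predictions.append(predicted)
        if predicted in entry["clicked"]:
            matches += 1
    # (c): the caller repeats this over all tasks and participants.
    return predictions, matches


# Hypothetical log for illustration; results and clicks are invented.
toy_log = [
    {
        "query": "burning sensation when urinating",
        "results": [
            ("Football results", "latest scores and tables"),
            ("Urinary infection", "burning sensation when urinating explained"),
        ],
        "clicked": {1},
    },
    {
        "query": "kidney stones pain in the back",
        "results": [
            ("Kidney stones", "pain in the back and side"),
            ("Back exercises", "stretching for the lower back"),
        ],
        "clicked": {1},
    },
]

predictions, matches = simulate(toy_log, word_overlap)
```
]]></preformat>
      <p>For the toy log the model selects the second result for the first query (a match with the logged click) and the first result for the second query (no match). Running the same loop once with a similarity function backed by the general space and once with one backed by the special space yields the two sets of predictions that are compared in the next section.</p>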
    </sec>
    <sec id="sec-8">
      <title>4. SIMULATION RESULTS</title>
      <p>We divided the participants into two groups of high (25
participants) and low (23 participants) prior domain
knowledge (PDK) by taking the median score on the prior domain
knowledge test. We used two metrics to evaluate the efficacy
of modeling: the number of matches per task between the model
and the actual participant behaviour, and the LSA value of
the matches. For both metrics, a 2 (Semantic
Space: General vs. Special) X 2 (Prior Domain Knowledge
(PDK): High vs. Low) mixed model ANOVA was conducted
with semantic space as within-subjects variable and prior
domain knowledge as between-subjects variable.</p>
      <p>[Figure 1. Panel (a): mean number of matches (per task) by domain knowledge level (Low, High) for the General and Special semantic spaces. Panel (b): same x-axis (domain knowledge level) and legend (semantic space).]</p>
    </sec>
    <sec id="sec-9">
      <title>4.1 Number of matches per task</title>
      <p>For each query and its corresponding SERP, the number of
matches between the model predictions and the actual
participant behavior was computed. This indicates
how many of the actual participant clicks
per task the model successfully predicted. The main effects
of semantic space and prior domain knowledge were not
significant (p&gt;.05). However, the interaction of semantic space
and prior domain knowledge was significant, F(1,46) = 7.5,
p&lt;.01 (Figure 1a).</p>
    </sec>
    <sec id="sec-10">
      <title>4.2 LSA value of matched search result</title>
      <p>For each match between the model and the actual
participant click, the LSA value of the match was determined using
the two different semantic spaces. Data of 2 participants
from the low domain knowledge group and 3 participants
from the high domain knowledge group had to be dropped
as there were no matches with the actual behaviour for these
participants. The main effect of semantic space was
significant, F(1,41) = 8.88, p&lt;.005. The main effect of prior
domain knowledge was not significant (p&gt;.05). The
interaction of semantic space and prior domain knowledge
tended towards significance, F(1,41) = 2.9, p&lt;.09 (Figure
1b).</p>
      <p>Taken together, Figure 1a shows that for participants
with high domain knowledge, the number of matches was
significantly higher with the special semantic space, whereas
for participants with low domain knowledge, the number of
matches was significantly higher with the general
semantic space. From Figure 1b, we can see that the special
semantic space matched user behaviour with a significantly
higher LSA value, especially for participants with high
domain knowledge.</p>
    </sec>
    <sec id="sec-11">
      <title>5. CONCLUSIONS</title>
      <p>The results show that the modeling should take
into account individual differences in domain knowledge and
adapt the semantic space to these differences: with high
domain knowledge participants, the efficacy of the modeling (in
terms of the number of matches and the LSA values of the
matched search results) is higher with the special semantic
space than with the general semantic space, while for low
domain knowledge participants it is the other way around. A
possible explanation for the interaction effect is that the
special and the general semantic spaces give similarity values
that correspond to those assessed by users with high (more precise)
and low (less precise) domain knowledge respectively. It is
important to note that these interaction effects are lost when
semantic space is not used as a factor in the analysis. That
is, had we not used semantic space as a factor,
we would have concluded that there is no difference in the
model's performance between the participants with high and
low domain knowledge levels. This would have been a hasty
conclusion because, when we included semantic space as a
factor in the analysis, there was an effect of PDK, but it
depended on the type of semantic space.</p>
      <p>
        Overall, our outcomes suggest that using appropriate
semantic spaces (a semantic space with high domain
knowledge represented for high domain knowledge users and a
semantic space with low domain knowledge represented for
low domain knowledge users) gives better prediction
outcomes. Improved predictive capacity of these models would
lead to more accurate model-generated support for search
and navigation which, in turn, would lead to enhanced
information seeking performance, as two studies have already
shown [
        <xref ref-type="bibr" rid="ref11 ref23">11, 23</xref>
        ]. In those studies, navigation support for each task was
generated by recording the step-by-step decisions made by the
cognitive model, which in turn are based on the semantic
relatedness of hyperlinks to the user goal (given by a task
description). The model predictions were presented to the
user in the form of visually highlighted hyperlinks. In both
studies, the navigation performance of participants who
received such support was found to be more structured and
less disoriented than that of participants who did not
receive such support. This was especially true for
participants with a particular cognitive deficit, such as low
spatial ability.
      </p>
      <p>
        Model-generated support for information search and
navigation contributes to the knowledge acquisition process as
it helps users to efficiently filter out unnecessary
information. It gives them more time to process and evaluate
relevant information during the intermediate stages of
clicking on search results and web-pages within websites before
reaching the target page. This reduces the user's effort,
which in turn lessens cognitive load. This can lead to better
comprehension and retention of relevant material (because
contextual information relevant to the user's goal is emphasized
by model-generated support), thereby leading to higher
incidental learning outcomes. To refine the
modeling itself, we are currently running experiments with the
more advanced model CoLiDeS+ [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] which was found to be
more efficient than CoLiDeS in locating the target page on
real websites [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. CoLiDeS+ incorporates contextual
information in addition to information scent and implements
backtracking strategies, and can therefore predict more than
one click on a SERP. Lastly, the domain of health has been
used only as an example, and we expect that these results
would generalize to other domains.
      </p>
    </sec>
    <sec id="sec-12">
      <title>ACKNOWLEDGMENTS</title>
      <p>This research was supported by the Netherlands Organization
for Scientific Research (NWO), ORA-Plus project MISSION
(464-13-043), and carried out in collaboration with the
University of Toulouse and the University of Illinois.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Blackmon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Mandalia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Polson</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Kitajima</surname>
          </string-name>
          .
          <article-title>Automating usability evaluation: Cognitive walkthrough for the web puts LSA to work on real-world HCI design problems</article-title>
          . In T. K. Landauer,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>McNamara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dennis</surname>
          </string-name>
          , and W. Kintsch, editors,
          <source>Handbook of Latent Semantic Analysis</source>
          , pages
          <volume>345</volume>
          –
          <fpage>375</fpage>
          . Lawrence Erlbaum Associates Mahwah, NJ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chuklin</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Markov</surname>
          </string-name>
          , and M. de Rijke.
          <article-title>Click models for web search</article-title>
          .
          <source>Synthesis Lectures on Information Concepts</source>
          ,
          <source>Retrieval, and Services</source>
          ,
          <volume>7</volume>
          (
          <issue>3</issue>
          ):1–
          <fpage>115</fpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Cole</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , C. Liu,
          <string-name>
            <given-names>N. J.</given-names>
            <surname>Belkin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Gwizdka</surname>
          </string-name>
          .
          <article-title>Knowledge effects on document selection in search results pages</article-title>
          .
          <source>In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , pages
          <volume>1219</volume>
          –
          <fpage>1220</fpage>
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>W.-T.</given-names>
            <surname>Fu</surname>
          </string-name>
          .
          <article-title>From Plato to the World Wide Web: Information foraging on the Internet</article-title>
          . In M. T. Peter,
          <string-name>
            <given-names>T. H.</given-names>
            <surname>Thomas</surname>
          </string-name>
          , and W. R. Trevor, editors,
          <source>Cognitive Search</source>
          , pages
          <volume>283</volume>
          –
          <fpage>299</fpage>
          . MIT Press,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W.-T.</given-names>
            <surname>Fu</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Dong</surname>
          </string-name>
          .
          <article-title>Collaborative indexing and knowledge exploration: A social learning model</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          , (
          <issue>1</issue>
          ):
          <fpage>39</fpage>
          –
          <lpage>46</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hliaoutakis</surname>
          </string-name>
          .
          <article-title>Semantic similarity measures in MeSH ontology and their application to information retrieval on Medline</article-title>
          .
          <source>Master's thesis</source>
          ,
          <source>Technical Univ. of Crete</source>
          , Dept. of Electronic and Computer Engineering, Crete, Greece,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , W. Chen,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <article-title>Characterizing search intent diversity into click models</article-title>
          .
          <source>In Proceedings of the 20th International Conference on World Wide Web</source>
          , pages
          <fpage>17</fpage>
          –
          <lpage>26</lpage>
          . ACM,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>G.</given-names>
            <surname>Jorge-Botana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Leon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Olmos</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Escudero</surname>
          </string-name>
          .
          <article-title>Latent semantic analysis parameters for essay evaluation using small-scale corpora</article-title>
          .
          <source>Journal of Quantitative Linguistics</source>
          ,
          <volume>17</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          –
          <lpage>29</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>I.</given-names>
            <surname>Juvina</surname>
          </string-name>
          and H. van Oostendorp.
          <article-title>Modeling semantic and structural knowledge in web navigation</article-title>
          .
          <source>Discourse Processes</source>
          ,
          <volume>45</volume>
          (
          <issue>4-5</issue>
          ):
          <fpage>346</fpage>
          –
          <lpage>364</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Karanam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>van Oostendorp</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. T.</given-names>
            <surname>Fu</surname>
          </string-name>
          .
          <article-title>Performance of computational cognitive models of web-navigation on real websites</article-title>
          .
          <source>Journal of Information Science</source>
          ,
          <volume>42</volume>
          (
          <issue>1</issue>
          ):
          <fpage>94</fpage>
          –
          <lpage>113</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Karanam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>van Oostendorp</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Indurkhya</surname>
          </string-name>
          .
          <article-title>Towards a fully computational model of web-navigation</article-title>
          .
          <source>In Modern Approaches in Applied Intelligence</source>
          , pages
          <fpage>327</fpage>
          –
          <lpage>337</lpage>
          . Springer,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Karanam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>van Oostendorp</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Indurkhya</surname>
          </string-name>
          .
          <article-title>Evaluating CoLiDeS + Pic: The role of relevance of pictures in user navigation behaviour</article-title>
          .
          <source>Behaviour &amp; Information Technology</source>
          ,
          <volume>31</volume>
          (
          <issue>1</issue>
          ):
          <fpage>31</fpage>
          –
          <lpage>40</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Karanam</surname>
          </string-name>
          , H. van Oostendorp,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanchiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chevalier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W. T.</given-names>
            <surname>Fu</surname>
          </string-name>
          .
          <article-title>Modeling and predicting information search behavior</article-title>
          .
          <source>In Proceedings of the 5th International Conference on Web Intelligence, Mining and Semantics</source>
          , page
          <fpage>7</fpage>
          . ACM,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>W.</given-names>
            <surname>Kintsch</surname>
          </string-name>
          .
          <source>Comprehension: A paradigm for cognition</source>
          . Cambridge University Press,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Kitajima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. H.</given-names>
            <surname>Blackmon</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P. G.</given-names>
            <surname>Polson</surname>
          </string-name>
          .
          <article-title>A comprehension-based model of web navigation and its application to web usability analysis</article-title>
          .
          <source>People and Computers</source>
          , pages
          <fpage>357</fpage>
          –
          <lpage>374</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>T. K.</given-names>
            <surname>Landauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>McNamara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dennis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Kintsch</surname>
          </string-name>
          .
          <source>Handbook of latent semantic analysis</source>
          . Mahwah, NJ: Erlbaum,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>S.</given-names>
            <surname>Monchaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Amadieu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chevalier</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Marine</surname>
          </string-name>
          .
          <article-title>Query strategies during information searching: Effects of prior domain knowledge and complexity of the information problems to be solved</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>51</volume>
          (
          <issue>5</issue>
          ):
          <fpage>557</fpage>
          –
          <lpage>569</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Olmos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Jorge-Botana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Leon</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Escudero</surname>
          </string-name>
          .
          <article-title>Transforming selected concepts into dimensions in latent semantic analysis</article-title>
          .
          <source>Discourse Processes</source>
          ,
          <volume>51</volume>
          (
          <issue>5-6</issue>
          ):
          <fpage>494</fpage>
          –
          <lpage>510</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>C.</given-names>
            <surname>Olston</surname>
          </string-name>
          and
          <string-name>
            <given-names>E. H.</given-names>
            <surname>Chi</surname>
          </string-name>
          .
          <article-title>ScentTrails: Integrating browsing and searching on the web</article-title>
          .
          <source>ACM Transactions on Computer-Human Interaction (TOCHI)</source>
          ,
          <volume>10</volume>
          (
          <issue>3</issue>
          ):
          <fpage>177</fpage>
          –
          <lpage>197</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>T.</given-names>
            <surname>Pedersen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. V.</given-names>
            <surname>Pakhomov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Patwardhan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. G.</given-names>
            <surname>Chute</surname>
          </string-name>
          .
          <article-title>Measures of semantic similarity and relatedness in the biomedical domain</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          ,
          <volume>40</volume>
          (
          <issue>3</issue>
          ):
          <fpage>288</fpage>
          –
          <lpage>299</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>P.</given-names>
            <surname>Pirolli</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Card</surname>
          </string-name>
          .
          <article-title>Information foraging</article-title>
          .
          <source>Psychological Review</source>
          ,
          <volume>106</volume>
          (
          <issue>4</issue>
          ):
          <fpage>643</fpage>
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          .
          <article-title>Personalized click model through collaborative filtering</article-title>
          .
          <source>In Proceedings of the fifth ACM International Conference on Web Search and Data Mining</source>
          , pages
          <fpage>323</fpage>
          –
          <lpage>332</lpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>H.</given-names>
            <surname>van Oostendorp</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Juvina</surname>
          </string-name>
          .
          <article-title>Using a cognitive model to generate web navigation support</article-title>
          .
          <source>International Journal of Human-Computer Studies</source>
          ,
          <volume>65</volume>
          (
          <issue>10</issue>
          ):
          <fpage>887</fpage>
          –
          <lpage>897</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Xing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , S. Ma, and
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhang</surname>
          </string-name>
          .
          <article-title>Incorporating user preferences into click models</article-title>
          .
          <source>In Proceedings of the 22nd ACM International Conference on Information &amp; Knowledge Management</source>
          , pages
          <fpage>1301</fpage>
          –
          <lpage>1310</lpage>
          . ACM,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>