=Paper= {{Paper |id=Vol-1647/SAL2016_paper_6 |storemode=property |title=Integrating Domain Knowledge Differences into Modeling User Clicks on Search Result Pages |pdfUrl=https://ceur-ws.org/Vol-1647/SAL2016_paper_6.pdf |volume=Vol-1647 |authors=Saraschandra Karanam,Herre van Oostendorp |dblpUrl=https://dblp.org/rec/conf/sigir/KaranamO16 }} ==Integrating Domain Knowledge Differences into Modeling User Clicks on Search Result Pages== https://ceur-ws.org/Vol-1647/SAL2016_paper_6.pdf

Integrating domain knowledge differences into modeling
user clicks on search result pages

Saraschandra Karanam, Utrecht University, Utrecht, The Netherlands, s.karanam@uu.nl
Herre van Oostendorp, Utrecht University, Utrecht, The Netherlands, h.vanoostendorp@uu.nl

Search as Learning (SAL), July 21, 2016, Pisa, Italy. The copyright for this paper remains with its authors. Copying permitted for private and academic purposes.

ABSTRACT
Computational cognitive models developed so far do not incorporate any effect of individual differences in the domain knowledge of users when predicting user clicks on search result pages. We address this problem using a cognitive model of information search that enables us to use two semantic spaces, containing a low (general semantic space) and a high (special semantic space) amount of medical and health related information, to represent respectively the low and high knowledge of users in this domain. Simulations on six difficult information search tasks and subsequent matching with actual behavioural data from 48 users (divided into low and high domain knowledge groups based on a domain knowledge test) were conducted. Results showed that the efficacy of modeling user selections on search results (in terms of the number of matches between users and the model and the mean semantic similarity values of the matched search results) is higher with the special semantic space than with the general semantic space for high domain knowledge participants, while for low domain knowledge participants it is the other way around. Implications for support tools that can be built based on these models are discussed.

CCS Concepts
•Information systems → Personalization; Relevance assessment;

Keywords
Cognitive Modeling, Information Search, Domain Knowledge, Corpus, Semantic Space.

1. INTRODUCTION
Search systems are typically characterized as tools to retrieve relevant information on a target page from the Internet in response to a user query. These systems are efficient only for certain types of tasks, such as look-up tasks or factoid questions (“What is the distance between Mars and Earth” or “Which is the highest mountain in Europe”), and are far from optimal for other kinds of tasks that involve knowledge discovery, comprehension and learning. Often, important information needed to solve the main search problem is present in the intermediate pages leading to the target page [19]. In such cases, it is important to evaluate information on each page and decide which hyperlink or search result to click next based on the information that has already been processed. The process of information search can therefore be conceived as a process that involves learning, or at least knowledge acquisition. Users acquire new knowledge not only at the end of an information search process, after reaching the target page, but also while processing intermediate search results and webpages before they reach the target page. Learning from such contextual information as users perform search and navigation tasks on the web involves complex cognitive processes that dynamically influence the evaluation of link texts and web contents [4, 5]. Search engines do not lay any emphasis on these intermediate steps and are largely focused only on the step involving retrieval of relevant information. They also ignore the influence of cognitive factors such as domain knowledge [17, 3] on the cognitive processes underlying information search and navigation, and follow a one-size-fits-all model.

In this paper, we focus on the differences in information search behavior due to individual differences in the domain knowledge of users. It is known that users with high domain knowledge have more appropriate mental representations, higher activation degrees of concepts and stronger connections between different concepts in the conceptual space than users with low domain knowledge [14]. A number of experiments investigating the role of domain knowledge in information search and navigation performance have been conducted in the cognitive psychology community. For example, in a recent study [17], domain experts found more correct answers in a shorter time and via a path closer to the optimum path than non-experts. This difference grew stronger as the difficulty of the task increased. Higher domain knowledge enables users to formulate more appropriate queries and to comprehend the search results and the content of the websites better, which in turn enables them to take informed decisions regarding which hyperlink or search result to click next. Domain experts are also known to evaluate search results more thoroughly and to click more often on relevant search results than non-experts. This is because their higher domain knowledge enables them to differentiate better between a relevant and a non-relevant search result [3].
However, understanding behavioral differences through laboratory experiments is expensive, time consuming and not scalable. Simulation of user interactions with information retrieval systems has therefore been an active area of research. Among the many click models developed by researchers from the information retrieval community [2], only a few take into account cognitive aspects [24, 22, 7]. Moreover, they provide only a limited process description. We therefore employ computational cognitive models in our research. They are relevant in this context because they enable us to model differences in cognitive factors (such as domain knowledge) underlying any cognitive function (such as comprehending search results, arriving at a relevance estimate of search results and selecting one of the search results to click). Also, computational cognitive models focus on the process that leads to the target information and are therefore better suited to incorporating behavioral differences due to variations in cognitive factors.

The main research question of the current study was: how can differences in the domain knowledge levels of users be incorporated into computational cognitive models that predict click behaviour on search results? Would such a model predict user clicks on search engine result pages (SERPs, henceforth) better than a model that does not incorporate differentiated domain knowledge levels of users? Outcomes of this study would have implications for the support tools for enhancing information search performance that can be built based on computational cognitive models [23, 11].

2. OUR APPROACH
We briefly introduce the computational cognitive model called CoLiDeS that we use in our research and then explain our approach to incorporating differentiated domain knowledge into CoLiDeS.

2.1 Cognitive model
CoLiDeS, or Comprehension-based Linked Model of Deliberate Search, developed by Kitajima et al. [15], explains user navigation behaviour on websites. It divides user navigation behavior into four stages of cognitive processing: parsing the webpage into high-level schematic regions, focusing on one of those schematic regions, elaboration / comprehension of the screen objects (e.g. hypertext links) within that region, and evaluating and selecting the most appropriate screen object (e.g. hypertext link) in that region. CoLiDeS is based on Information Foraging Theory [21] and connects to the Construction-Integration reading model of Kintsch [14]. The notion of information scent, defined as the estimate of the value or cost of information sources represented by proximal cues (such as hyperlinks), is central to CoLiDeS. It is operationalized as the semantic similarity between the user goal and each of the hyperlinks. The model predicts that the user is most likely to click on the hyperlink that has the highest semantic similarity value with the user goal, i.e., the highest information scent. This process is repeated for every new page until the user reaches the target page. CoLiDeS uses Latent Semantic Analysis (LSA, henceforth), introduced by [16], to compute the semantic similarities. LSA is an unsupervised machine learning technique that employs singular value decomposition to build a high dimensional semantic space using a large corpus of documents that is representative of the knowledge of the target user group. The semantic space contains representations of terms from the corpus in a low number of dimensions, typically between 250 and 350, which are orthogonal, abstract and latent [16, 18]. CoLiDeS has been successful in simulating and predicting user link selections, though the websites and web-pages used were very restricted. The model has also been successfully applied to finding usability problems, by predicting links that would be unclear to users [1]. The CoLiDeS model has recently been extended to predict user clicks on search result pages [13]. Please note that CoLiDeS modeling so far does not incorporate any effect of individual differences in the domain knowledge of users, and that is what we study in the current paper.

2.2 Creation of Semantic Spaces
When using LSA, it is known that the initial corpus of documents used to create the semantic space influences the final similarity values obtained to a large extent [8]. Several factors determine the choice of the corpus and the semantic space. First and foremost is the language of the corpus. In our case, since we are running our experiments in The Netherlands with Dutch participants, we need a Dutch semantic space. Secondly, the corpus of documents should be representative of the knowledge levels of the target user group. Since the focus of our research is modeling information search behaviour of older adults compared to younger adults, we need two corpora that could accurately characterize the difference in the knowledge levels of younger and older adults. We have seen already that older adults have higher crystallized intelligence, or general knowledge and vocabulary, than younger adults. Also, since older adults read more health related information and are more concerned with their health, we assume that their health and medical knowledge would be more elaborated than that of younger adults. Our goal is to build two semantic spaces that are as close as possible to the above assumptions.

We collated two different corpora (a general corpus and a special corpus, each consisting of 70,000 articles in Dutch) varying in the amount of medical and health related information. The general corpus, representing the knowledge of low domain knowledge users, had 90% news articles and 10% medical and health related articles, whereas the special corpus, representing the knowledge of high domain knowledge users, had 60% news articles and 40% medical and health related articles. After removing all the stop words, these two corpora were used to create two semantic spaces using Gallito [18]: a general semantic space using the general corpus (average article size: 435 words) and a special semantic space using the special corpus (average article size: 403 words). The following settings were used to create the semantic spaces: 300 dimensions, the entire article as the window, and log-entropy weighting. Also, a word was included in the final matrix only if it occurred in at least 6 articles.

2.3 Evaluation of Semantic Spaces
We used two biomedical data sets [6, 20] commonly used to evaluate measures for computing semantic relevance in the medical information retrieval community. In the first dataset [20], created in collaboration with Mayo Clinic experts, we averaged the similarity ratings on a set of 30 medical term pairs assessed by a group of 3 physicians, who were experts in rheumatology, and 9 medical coders, who were familiar with the concept of semantic similarity, on a scale
of 1 (low in similarity) to 4 (high in similarity). The correlation between physician judgements was 0.68, and that between the medical coders was 0.78. In the second dataset [6], a set of 36 word pairs extracted from the MeSH repository were assessed on a scale of 0 (low in similarity) to 1 (high in similarity) by 8 medical experts. The word pairs in both datasets were translated to Dutch by 3 experts, and agreement among them was very high. We dropped two word pairs from each data set (antibiotic-allergy and cholangiocarcinoma-colonoscopy from Pedersen’s dataset and meningitis-tricuspid atresia and measles-rubeola from Hliaoutakis’s dataset) as they were not in the two corpora designed by us. So, we were left with 28 word pairs from Pedersen’s dataset and 34 word pairs from Hliaoutakis’s dataset. Next, we computed the semantic similarity between the remaining word pairs from both data sets and computed the correlation with the expert ratings. We expected the similarity values from the special semantic space to be more highly correlated with the expert ratings than the similarity values from the general semantic space, as the former was designed to contain more medical and health related information. The correlation values obtained are shown in Table 1.

Table 1: Correlation values obtained using special and general semantic spaces on Pedersen et al.’s and Hliaoutakis’s benchmarks (** significant at .01 level, * significant at .05 level).

Dataset                          Special Semantic Space   General Semantic Space
Pedersen’s Physicians            0.78**                   0.74**
Pedersen’s Coders                0.81**                   0.74**
Hliaoutakis’s Medical Experts    0.58**                   0.38*

Analysing the correlation values from Table 1, we found that the special semantic space gave a significantly higher correlation with Hliaoutakis’s dataset and Pedersen’s Coders data set, and a marginally higher correlation with Pedersen’s Physicians dataset, compared to the general semantic space. Based on these outcomes, we were able to confirm that the special semantic space has health and medical knowledge better represented than the general semantic space.

2.4 Behavioral Data Collection
Actual behavioural data was collected from 48 participants (18 females, 30 males, average age: 48.79) in a laboratory experiment. Participants were first presented with a domain knowledge test on the topic of health in which they had to answer twelve multiple choice questions. A correct answer was scored 1 and a wrong answer was scored 0. They were then presented with six information search tasks in random order, specifically from the domain of health, in order to examine the differences in click behaviour of the participants, if any, due to individual differences in their knowledge of the health domain. To solve these tasks, they had to formulate queries using their knowledge and understanding of the task; the answer was not present in one location or website, and often they had to evaluate information from multiple websites. For instance, for the task “Elbert, 76 years old has been suffering for few years from burning sensation while passing urine. He passes urine more often than normal at night and complains of a feeling that the bladder is not empty completely. Lately, he also developed acute pain in the hip, lower back and pelvis region. He also lost 12 kilos in the last 6 months. What problem could he be suffering from?”, users had to formulate multiple queries such as “kidney stones pain in the back”, “burning sensation when urinating”, “urinary infection” to find the answer. The answer to this task, “prostate cancer”, was also not easily found in the snippets of the search results of the queries, unless the query was very specific.

Participants were allowed to use only Google’s search engine. All the queries generated by the users, the corresponding search engine result pages and the URLs opened by them were logged in the backend. There were in total 738 queries and 724 clicks.

3. MODEL SIMULATIONS
We followed the same methodology as the authors of [13], who extended the CoLiDeS model to predict user clicks on SERPs. Simulations of CoLiDeS were run using both the general and the special semantic spaces on each query and its corresponding search results, using the same methodology followed by Karanam et al. [12] on navigating in a mock-up website on the human body. We consider each SERP as a page of a website, and each of the search engine results as a hyperlink within a page of a website. The problem of predicting which search engine result to click is now equivalent to the problem of predicting which hyperlink to click within a page of a website. Therefore, the process of computing information scent and predicting which search result to click remains the same as in [12]. For the time being, we used the user-generated query as a representation of the local goal, or the understanding of the user at any point of time, and semantic similarity values were computed from it. The main steps we followed in simulating CoLiDeS interacting with the SERPs are the following: (a) the semantic similarity between the query and the title and snippet combination of a search result was computed; (b) this was repeated for all the remaining titles and snippets on a SERP, and the title and snippet combination with the highest semantic similarity value with the query was selected by the model; and (c) this process was repeated for all the queries of a task, for all the tasks of a participant, and finally for all the participants (see [13] for details of the procedure).

After running the main simulation steps (a) to (c), we had available the model predictions on all the queries of all the tasks and we could compare these with the actual selections of real participants. Please note that the CoLiDeS model can predict only one search result per query using this methodology because CoLiDeS does not possess a backtracking mechanism, whereas users in reality click on more than one search result per query.

4. SIMULATION RESULTS
We divided the participants into two groups of high (25 participants) and low (23 participants) prior domain knowledge (PDK) by taking the median score on the prior domain knowledge test. We used two metrics to evaluate the efficacy of modeling: the number of matches per task between the model and the actual participant behaviour, and the LSA value of the matches. For both metrics, a 2 (Semantic Space: General vs. Special) X 2 (Prior Domain Knowledge (PDK): High vs. Low) mixed model ANOVA was conducted, with semantic space as the within-subjects variable.
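The model's per-query choice, the match count and the median split described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`cosine`, `predict_click`, `count_matches`, `median_split`) and the toy data are hypothetical, the vectors are assumed to come from one of the two semantic spaces, and the rule that scores equal to the median fall into the high group is an assumption, since the paper only states that the median score was used.

```python
from math import sqrt
from statistics import median


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_click(query_vec, result_vecs):
    """Return (index, similarity) of the search result most similar to the query.

    Each result vector stands for one title-plus-snippet combination,
    assumed to be already projected into a semantic space.
    """
    sims = [cosine(query_vec, r) for r in result_vecs]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]


def count_matches(predicted, clicked):
    """Count queries whose predicted result is among the results the user clicked.

    predicted: one predicted result index per query.
    clicked:   one set of clicked result indices per query.
    """
    return sum(1 for p, c in zip(predicted, clicked) if p in c)


def median_split(scores):
    """Label each participant 'high' or 'low' relative to the median test score.

    Assumption: scores equal to the median are labelled 'high'.
    """
    m = median(scores.values())
    return {pid: ("high" if s >= m else "low") for pid, s in scores.items()}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    query = [1.0, 0.0]
    results = [[0.0, 1.0], [0.9, 0.1], [0.5, 0.5]]
    print(predict_click(query, results))
    print(count_matches([1, 0, 2], [{1, 3}, {2}, {2}]))
    print(median_split({"p1": 3, "p2": 7, "p3": 9, "p4": 5}))
```

Because `predict_click` returns a single index per query, the sketch mirrors the one-prediction-per-query limitation noted for CoLiDeS, while `count_matches` accepts several clicks per query on the user side.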
[Figure 1 plots: x-axis Domain Knowledge Level (Low, High); separate lines for the General and Special semantic spaces; y-axes (a) mean number of matches (per task) and (b) mean LSA value (of matches).]

Figure 1: (a) Mean number of matches (per task) and (b) Mean LSA value (of matches) in relation to Semantic Space and Prior Domain Knowledge (PDK).

Prior domain knowledge was the between-subjects variable in both ANOVAs reported below.

4.1 Number of matches per task
For each query and its corresponding SERP, the number of matches between the model predictions and the actual participant behavior is computed. This gives us an indication of how many of the actual participant clicks per task the model successfully predicted. The main effects of semantic space and prior domain knowledge were not significant (p>.05). However, the interaction of semantic space and prior domain knowledge was significant, F(1,46) = 7.5, p<.01 (Figure 1a).

4.2 LSA value of matched search result
For each match between the model and the actual participant click, the LSA value of the match is determined using the two different semantic spaces. Data of 2 participants from the low domain knowledge group and 3 participants from the high domain knowledge group had to be dropped as there were no matches with the actual behaviour for these participants. The main effect of semantic space was significant, F(1,41) = 8.88, p<.005. The main effect of prior domain knowledge was not significant (p>.05). The interaction of semantic space and prior domain knowledge tended towards significance, F(1,41) = 2.9, p<.09 (Figure 1b).

Taken together, Figure 1a shows that for participants with high domain knowledge, the number of matches was significantly higher with the special semantic space, whereas for participants with low domain knowledge, the number of matches was significantly higher with the general semantic space. From Figure 1b, we can see that the special semantic space matched user behaviour with a significantly higher LSA value, especially for participants with high domain knowledge.

5. CONCLUSIONS
The results show that the modeling should take into account individual differences in domain knowledge and adapt the semantic space to these differences: with high domain knowledge participants the efficacy of the modeling (in terms of the number of matches and the LSA values of the matched search results) is higher with the special semantic space than with the general semantic space, while for low domain knowledge participants it is the other way around. A possible explanation for the interaction effect is that the special and the general semantic spaces give appropriate similarity values as assessed by users with high (more precise) and low (less precise) domain knowledge respectively. It is important to note that these interaction effects are lost when semantic space is not used as a factor in the analysis. That is, if we had not used semantic space as a factor, we would have concluded that there is no difference in the model’s performance between the participants with high and low domain knowledge levels. This would have been a hasty conclusion because when we included semantic space as a factor in the analysis, there was an effect of PDK, but it was dependent on the type of semantic space.

Overall, our outcomes suggest that using appropriate semantic spaces (a semantic space with high domain knowledge represented for high domain knowledge users and a semantic space with low domain knowledge represented for low domain knowledge users) gives better prediction outcomes. Improved predictive capacity of these models would lead to more accurate model-generated support for search and navigation which, in turn, would lead to enhanced information seeking performance, as two studies have already shown [11, 23]. In those studies, navigation support for each task was generated by recording the step-by-step decisions made by the cognitive model, which in turn are based on the semantic relatedness of hyperlinks to the user goal (given by a task description). The model predictions were presented to the user in the form of visually highlighted hyperlinks. In both studies, the navigation performance of participants who received such support was found to be more structured and less disoriented than that of participants who did not receive such support. This was especially true for participants with a particular cognitive deficit, such as low spatial ability.

Model generated support for information search and navigation contributes to the knowledge acquisition process as it helps users to filter out unnecessary information efficiently. It gives them more time to process and evaluate relevant information during the intermediate stages of clicking on search results and web-pages within websites before reaching the target page. This reduces the user’s effort, in turn lessening cognitive load. This can lead to better comprehension and retention of relevant material (because contextual information relevant to the user’s goal is emphasized by model generated support), thereby leading to higher incidental learning outcomes. As for refining the modeling itself, we are currently running experiments with the more advanced model CoLiDeS+ [9], which was found to be more efficient than CoLiDeS in locating the target page on real websites [10]. CoLiDeS+ incorporates contextual information in addition to information scent and implements backtracking strategies, and can therefore predict more than one click on a SERP. Lastly, the domain of health has been used only as an example and we think that these results would be generalizable to any domain.

6. ACKNOWLEDGMENTS
This research was supported by Netherlands Organization for Scientific Research (NWO), ORA-Plus project MISSION (464-13-043), and carried out in collaboration with University of Toulouse and University of Illinois.

7. REFERENCES
[1] M. H. Blackmon, D. R. Mandalia, P. G. Polson, and M. Kitajima. Automating usability evaluation: Cognitive walkthrough for the web puts lsa to work on real-world hci design problems. In T. K. Landauer, D. S. McNamara, S. Dennis, and W. Kintsch, editors, Handbook of Latent Semantic Analysis, pages 345–375. Lawrence Erlbaum Associates Mahwah, NJ, 2007.
[2] A. Chuklin, I. Markov, and M. de Rijke. Click models for web search. Synthesis Lectures on Information Concepts, Retrieval, and Services, 7(3):1–115, 2015.
[3] M. J. Cole, X. Zhang, C. Liu, N. J. Belkin, and J. Gwizdka. Knowledge effects on document selection in search results pages. In Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 1219–1220. ACM, 2011.
[4] W.-T. Fu. From plato to the world wide web: Information foraging on the internet. In M. T. Peter, T. H. Thomas, and W. R. Trevor, editors, Cognitive Search, pages 283–299. MIT Press, 2013.
[5] W.-T. Fu and W. Dong. Collaborative indexing and knowledge exploration: A social learning model. IEEE Intelligent Systems, (1):39–46, 2010.
[6] A. Hliaoutakis. Semantic similarity measures in mesh ontology and their application to information retrieval on medline. Master’s thesis, Technical Univ. of Crete, Dept. of Electronic and Computer Engineering, Crete, Greece, 2005.
[7] B. Hu, Y. Zhang, W. Chen, G. Wang, and Q. Yang. Characterizing search intent diversity into click models. In Proceedings of the 20th International Conference on World Wide Web, pages 17–26. ACM, 2011.
[8] G. Jorge-Botana, J. A. Leon, R. Olmos, and I. Escudero. Latent semantic analysis parameters for essay evaluation using small-scale corpora. Journal of Quantitative Linguistics, 17(1):1–29, 2010.
[9] I. Juvina and H. van Oostendorp. Modeling semantic and structural knowledge in web navigation. Discourse Processes, 45(4-5):346–364, 2008.
[10] S. Karanam, H. van Oostendorp, and W. T. Fu. Performance of computational cognitive models of web-navigation on real websites. Journal of Information Science, 42(1):94–113, 2016.
[11] S. Karanam, H. van Oostendorp, and B. Indurkhya. Towards a fully computational model of web-navigation. In Modern Approaches in Applied Intelligence, pages 327–337. Springer, 2011.
[12] S. Karanam, H. van Oostendorp, and B. Indurkhya. Evaluating colides+ pic: the role of relevance of pictures in user navigation behaviour. Behaviour & Information Technology, 31(1):31–40, 2012.
[13] S. Karanam, H. van Oostendorp, M. Sanchiz, A. Chevalier, J. Chin, and W. T. Fu. Modeling and predicting information search behavior. In Proceedings of the 5th International Conference on Web Intelligence, Mining and Semantics, page 7. ACM, 2015.
[14] W. Kintsch. Comprehension: A paradigm for cognition. Cambridge university press, 1998.
[15] M. Kitajima, M. H. Blackmon, and P. G. Polson. A comprehension-based model of web navigation and its application to web usability analysis. People and Computers, pages 357–374, 2000.
[16] T. K. Landauer, D. S. McNamara, S. Dennis, and W. Kintsch. Handbook of latent semantic analysis. Mahwah, NJ: Erlbaum, 2007.
[17] S. Monchaux, F. Amadieu, A. Chevalier, and C. Mariné. Query strategies during information searching: Effects of prior domain knowledge and complexity of the information problems to be solved. Information Processing & Management, 51(5):557–569, 2015.
[18] R. Olmos, G. Jorge-Botana, J. A. León, and I. Escudero. Transforming selected concepts into dimensions in latent semantic analysis. Discourse Processes, 51(5-6):494–510, 2014.
[19] C. Olston and E. H. Chi. Scenttrails: Integrating browsing and searching on the web. ACM Transactions on Computer-Human Interaction (TOCHI), 10(3):177–197, 2003.
[20] T. Pedersen, S. V. Pakhomov, S. Patwardhan, and C. G. Chute. Measures of semantic similarity and relatedness in the biomedical domain. Journal of Biomedical Informatics, 40(3):288–299, 2007.
[21] P. Pirolli and S. Card. Information foraging. Psychological review, 106(4):643, 1999.
[22] S. Shen, B. Hu, W. Chen, and Q. Yang. Personalized click model through collaborative filtering. In Proceedings of the fifth ACM International Conference on Web Search and Data Mining, pages 323–332. ACM, 2012.
[23] H. van Oostendorp and I. Juvina. Using a cognitive model to generate web navigation support. International Journal of Human-Computer Studies, 65(10):887–897, 2007.
[24] Q. Xing, Y. Liu, J.-Y. Nie, M. Zhang, S. Ma, and K. Zhang. Incorporating user preferences into click models. In Proceedings of the 22nd ACM International Conference on Conference on Information & Knowledge Management, pages 1301–1310. ACM, 2013.