=Paper= {{Paper |id=Vol-2699/paper20 |storemode=property |title=How to Support Search Activity of Users Without Prior Domain Knowledge When They are Solving Learning Tasks? |pdfUrl=https://ceur-ws.org/Vol-2699/paper20.pdf |volume=Vol-2699 |authors=Cheyenne Dosso,Aline Chevalier,Lynda Tamine |dblpUrl=https://dblp.org/rec/conf/cikm/DossoCT20 }} ==How to Support Search Activity of Users Without Prior Domain Knowledge When They are Solving Learning Tasks?== https://ceur-ws.org/Vol-2699/paper20.pdf
How to support search activity of users without prior
domain knowledge when they are solving learning tasks?
Cheyenne Dosso^a, Aline Chevalier^a and Lynda Tamine^b
^a University of Toulouse Jean-Jaurès, 5 allée Antonio Machado, Toulouse, 31058, France
^b University of Toulouse Paul Sabatier, Route de Narbonne, Toulouse, 31330, France



Abstract
This study focused on the impact of prior domain knowledge on the resolution of search tasks. More precisely, the study looked at the effect of procedural and semantic support on search strategies and performances during information search activity comparing different levels of learning tasks. Eighteen students with prior domain knowledge, fourteen unguided students without prior domain knowledge and fifteen guided students without prior domain knowledge had to solve six learning tasks (two “remember” tasks, two “understand” tasks, two “evaluate” tasks) related to psychology. Main results showed that procedural and semantic support improved the navigation of users without prior domain knowledge (i.e. fewer links opened from SERP, less time spent on URL and globally less time to find information, longer queries) and they got close to users having higher prior domain knowledge.

Keywords
Search strategies and performances, learning tasks, prior domain knowledge, support, searching as learning


Proceedings of the CIKM 2020 Workshops, October 19–20, Galway, Ireland
email: cheyenne.dosso@univ-tlse2.fr (C. Dosso); aline.chevalier@univ-tlse2.fr (A. Chevalier); lynda.lechani@irit.fr (L. Tamine)
© 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073, http://ceur-ws.org

1. Introduction

Information systems are no longer seen simply as a tool for retrieving content to meet a specific information need but as a tool for acquiring new knowledge in the course of searching, i.e. searching as learning [1]. According to [2], Searching as Learning (SAL) aims to determine the relationships between the information search activity (e.g., formulation of queries, search strategies, etc.) and learning activities (e.g., reading, note-taking, organizing information collected, etc.). The tasks in SAL can have different levels of learning goals, ranging from a simple fact-finding task (i.e. remember) to the production of a new set of information (i.e. create) [3, 4]. According to [5], search tasks in general can be modulated by other factors, such as prior knowledge related to the search domain, knowledge of the task procedures and knowledge in information search. These types of knowledge have been widely studied, but not in the context of Searching as Learning. Therefore, this study aimed to understand how these types of knowledge could support search activity when users dealt with search tasks of different levels of learning. In particular, we wanted to know whether users who have little or no knowledge in a domain could get closer to users with a higher level of prior domain knowledge if we provided them with procedural (i.e. regarding the procedure for optimal task solving) and semantic (i.e. regarding the specific vocabulary used in a domain) support, more precisely when they are solving learning tasks at different levels. In this paper, we present related work on information search, prior domain knowledge and learning search tasks. We then review the methodology used to test our hypotheses and describe our results.

2. Related Work

In the cognitive model of information search, [6] describe the role of cognitive abilities (e.g., verbal and vocabulary abilities, selective attention, etc.) in the three stages of this cyclical activity: (1) the stage of planning and formulating the query, (2) the stage of evaluating and selecting the information provided by the search engine, and (3) the stage of deep processing of the information contained in the web pages. Among these abilities, verbal and vocabulary abilities are important at all stages. They are directly related to prior domain knowledge [7]. Users with domain-specific vocabulary knowledge generally construct a more consistent mental representation of the task than users with lower prior domain knowledge, making it easier for them to assess the relevance of the SERP and to select more relevant sources [8]. Specifically, users with a high level of prior domain knowledge are able to focus their attention on relevant elements and inhibit others [9]. With regard to query (re)formulation
strategies, those of users having higher prior domain knowledge are often longer (i.e., composed of more words) [10] and these users are faster and perform better [6]. Beyond prior domain knowledge related to vocabulary, another type of knowledge that can impact information search activity is procedural knowledge related to the task [11]. If a task is well known and routine, the resolution procedure is easy to perform; users have less difficulty understanding the structure of the task [12]. According to [13], the interaction of these two types of knowledge (i.e., domain and procedure) supports search activity when users are solving search tasks.

In this study, we focus on the resolution of learning tasks at different levels. The first level is "Remember" [3]. It is a simple fact-finding task where the learning objective is unique. The keywords provided in the statement are clear, consistent and well-defined, and achieving the learning objective does not require high cognitive effort [14]. For these tasks, a plateau effect appeared between experts and novices on search performance [7]. The second level is "Understand" [3]. This task requires understanding and clarifying certain terms in order to access the answer. The statement here is poorly defined because the terms used are not clear and because they belong to a highly specific vocabulary. The production of new search terms is required [14]. These tasks can be solved by following a specific procedure (e.g. understanding the definition and terminology of the proposed terms so that inferences can be made to more relevant terms, understanding each search criterion and finding an answer that satisfies them all). If users know this procedure, they could solve the task more easily and obtain higher performance. For this task, users with a high level of prior domain knowledge formulated more queries than users without prior domain knowledge [7] and had less difficulty adding new search terms [8]. The third level used is "Evaluate" [3]. It requires a comparison of elements as proposed by [15]. These tasks involve text production but vary in structure: "parallel task" and "dependent task". For the parallel task, the elements to be compared are clearly defined and the specific vocabulary to be used to learn new information is provided in the task statement. Semantic support is therefore high. For the dependent task, the elements to be compared are not given in the statement and have to be inferred by users during their search. The dependent task requires a higher level of prior knowledge than the parallel task since the semantic support (i.e. specific vocabulary level related to the elements to be compared) is lower in the statement; users have to define and infer them. These differences are important because in the case of the parallel task, users are supported in their search activity. Based on [3], parallel and dependent tasks have been called "Evaluate" in the present study because the main objective of these tasks is to compare a set of elements. The parallel task corresponds to the guided evaluate task and the dependent task to the unguided evaluate task. In this way, the current study aimed to understand how semantic and procedural support can help users without prior domain knowledge to solve learning tasks compared to users with prior domain knowledge. Specifically, the present study’s objective is to determine the effect of knowledge of a resolution procedure (“understand” task) and a specific vocabulary (“evaluate” task) on search activity, with respect to the level of prior domain knowledge (with vs without in psychology).

3. Method

3.1. Hypotheses

• Hypothesis 1: Guided users without domain knowledge should spend less time solving tasks and have better scores (correct answers) than unguided users without domain knowledge, so that the former should be close to users with a high level of domain knowledge.
• Hypothesis 2: Guided users without domain knowledge should formulate more and longer queries than unguided users without domain knowledge, so that the former should be close to users with a high level of domain knowledge.
• Hypothesis 3: Guided users without domain knowledge should open fewer links from SERPs and spend less time exploring webpages than unguided users without domain knowledge, so that the former should be close to users with a high level of domain knowledge.

3.2. Independent Variables

• IV1: Level of prior domain knowledge as between-subject factor (high/without)
• IV2: Level of support as between-subject factor (guided/unguided), only for users without prior domain knowledge
• IV3: Level of learning task as a within-subject factor (Remember/Understand/Evaluate)

3.3. Dependent Variables

• DV1: Total time (in sec.) of search session. For each task (including total time spent on SERP and total time spent on webpages).
• DV2: Total score in percent. For remember and understand tasks, for which a single correct answer was possible, scores are “1” (correct) and “0” (wrong). For evaluate tasks, which were open-ended tasks (several answers were acceptable but
specific elements had to be found), 1 point was assigned for each expected element contained in the answer, with a score varying from 0 to 9 per task.
• DV3: Queries (number and length). For each search task, the total number of queries submitted to the search engine per search session was computed. The mean length of queries per search session corresponds to the total number of keywords used during the session divided by the total number of queries submitted to the system during that session.
• DV4: Number of links opened up from SERP. For each search task, total number of links selected and opened by users from the search results pages.
• DV5: Total time spent (in sec.) on webpages. For each task during the search session.

3.4. Participants

Eighteen users with a high level of domain knowledge aged from 22 to 30 years old (𝑀 = 24.6, 𝑆𝐷 = 2.20), fifteen guided users without domain knowledge aged from 22 to 28 years old (𝑀 = 24, 𝑆𝐷 = 1.77) and fourteen unguided users without domain knowledge aged from 22 to 27 years old (𝑀 = 24.1, 𝑆𝐷 = 1.44) took part in the experiment. All of them were native French speakers. The sample was composed of 12 males and 35 females, all enrolled in a master's degree (16 females with psychology knowledge, 8 guided and 11 unguided females without psychology knowledge). Concerning the self-assessment scale of psychology knowledge (4-point Likert scale), scores were significantly different (𝑡(45) = 7.69, 𝑝 < .001) between users with domain knowledge (𝑀 = 3.39, 𝑆𝐷 = 0.5) and users without (𝑀 = 1.86, 𝑆𝐷 = 0.74). In addition, the scores obtained on the multiple-choice test in the psychology domain differed significantly (𝑡(45) = 7.71, 𝑝 < .001). Users with domain knowledge had better scores (𝑀 = 8.94, 𝑆𝐷 = 2.46) than users without (𝑀 = 3.34, 𝑆𝐷 = 2.39).

3.5. Material

All participants used a Dell Latitude 5590 (17 inch) with Windows 10 Pro, an Intel Core i7 8th Gen processor and an external mouse. To record data, we used ad-hoc software, which recorded time, clicks, visited SERPs and documents. To test our hypotheses, we created six search tasks in guided and unguided versions, all related to psychology:

Remember task: What was the name of Chomsky, the author of generative theory?

Understand task - unguided: As part of his researches, Lionel conducts observations in various circumstances that he extrapolates to make previsions. In your view, what is the research method used by Lionel?

Understand task - guided: As part of [...] by Lionel? To solve this task, you have to produce new keywords and it is necessary that research method integrates the set of given criteria.

Evaluate task - unguided: You have an interest about social psychology domain and you want to write an article about social perceptions, in particular on ones which contribute to discrimination. To do that, you have to know what are the elements included in social perceptions, how they work, how they build themselves, what the sub-processes are and how they influence the discrimination. Specifically, you have to carry out these following activities: 1) to retrieve information about social perception elements, which contribute to discrimination. 2) To select three elements on which you are going to concentrate in this article. You want to present their specific characteristics which encourage you to select them among others elements. 3) To compare their functioning at the level of sub-processes.

Evaluate task - guided: You have [...] discrimination. You want to focus on three elements about social perception, which allow explaining the functioning of discrimination. These three elements are: 1) social categorization, 2) Stereotypes, 3) Prejudices. You wish to describe the set of these three elements, particularly: to present their functioning, how they build themselves, the sub-processes and how they influence discrimination. You have to integrate in the article the completeness of three descriptions, which correspond to the set of analysis criteria.

Understand and evaluate tasks test the support variable. In the unguided version, participants saw only the task statement. In the guided condition, the procedural support (Understand) took the form of an additional instruction that informed the participant about the procedure to follow to succeed in the task. The semantic support (Evaluate) informed participants about the items to compare. For the unguided evaluate task, the items to be evaluated and compared were not indicated in the statement. For the remember tasks, no support was provided because these were control tasks for which the literature does not show any significant difference in their resolution.

3.6. Procedure

The study took place at the University of Toulouse. Before starting the search sessions, participants had to complete four online questionnaires: demographic information, internet habits, a self-efficacy scale in information search (10 items) and an MCQ of psychology knowledge (16 questions). Once the pre-questionnaires were completed, the main instructions were presented and participants started to perform the six search tasks in randomized order. Participants had to provide a written response. Users with domain knowledge and one part of the users without domain knowledge saw the unguided tasks, and the other part of the users without domain knowledge performed the tasks in their guided version.

4. Results

For all the dependent variables, we carried out ANOVA (repeated measures) on two independent variables and contrasts to identify whether the support helps non-experts. We combined the "level of prior domain knowledge" (IV1) and the "support" (IV2) to obtain the independent variable "Group" with three modalities (with domain knowledge unguided, without domain knowledge guided and unguided). The independent variable "level of learning task" stayed the same.

Regarding the total time of the search session, statistical analyses did not reveal any significant effect of support (𝐹(2, 44) = 1.10, 𝑝 > .05). Nevertheless, contrasts indicated that users with domain knowledge (𝑀 = 340, 𝑆𝐷 = 60.16) needed less time to solve tasks than unguided users without domain knowledge (𝑀 = 464.58, 𝑆𝐷 = 68.42), with 𝐹(3, 42) = 9995106, 𝑝 < .001. No significant difference was obtained between users with knowledge and guided users without (𝑝 > .05), nor between the two groups of users without knowledge (𝑝 > .05). This part of hypothesis 1 was only partially verified.

Concerning the scores of correct answers, ANOVA did not show any significant effect of the support, with 𝐹(2, 44) = 2.76, 𝑝 > .05. Contrasts indicated that users with knowledge (𝑀 = 0.53, 𝑆𝐷 = 0.02) had better scores than guided users without (𝑀 = 0.49, 𝑆𝐷 = 0.03), with 𝐹(2, 43) = 487.71, 𝑝 < .001. No significant differences were observed between users with knowledge and unguided users without, nor between the two groups of users without knowledge (𝑝 > .05). The second part of hypothesis 1 was not confirmed.

Concerning the effect of support, the ANOVA did not show any significant difference on the total number of queries (𝐹(2, 44) = 0.62, 𝑝 > .05). Contrasts showed that users with knowledge (𝑀 = 7.12, 𝑆𝐷 = 1.05) produced fewer queries than unguided users without (𝑀 = 7.83, 𝑆𝐷 = 1.20), with 𝐹(2, 43) = 5657.23, 𝑝 < .001. No significant differences were obtained between users with knowledge and guided users without (𝑝 > .05), nor between the two groups of users without knowledge (𝑝 > .05). The first part of H2 was not validated.

The ANOVA did not reveal a significant effect of the support on query length (𝐹(2, 44) = 1.78, 𝑝 > .05). Contrasts indicated that users with knowledge (𝑀 = 4.09, 𝑆𝐷 = 0.35) produced longer queries than unguided users without (𝑀 = 3.75, 𝑆𝐷 = 0.38), with 𝐹(2, 43) = 6031.29, 𝑝 < .001. No significant differences were obtained between users with knowledge and guided users without, nor between the two groups of users without knowledge (𝑝 > .05). The second part of H2 was not completely verified.

No significant effect of support appeared for links opened up from SERPs (𝐹(2, 44) = 0.45, 𝑝 > .05). Contrasts indicated a significant difference between users with knowledge (𝑀 = 8.85, 𝑆𝐷 = 1.49) and unguided users without (𝑀 = 10.76, 𝑆𝐷 = 1.7), with 𝐹(2, 43) = 19232.32, 𝑝 < .001. Users with knowledge opened fewer links from SERPs. A significant difference appeared between guided (𝑀 = 9.96, 𝑆𝐷 = 1.64) and unguided users without knowledge (𝐹(2, 43) = 1543.31, 𝑝 < .001). Guided users opened fewer links from SERPs. No significant differences were obtained between users with knowledge and guided users without (𝑝 > .05). This part of hypothesis 3 was validated.

Regarding the total time spent on web pages, ANOVA did not show any significant effect of support (𝐹(2, 44) = 1.20, 𝑝 > .05). Contrasts indicated that users with knowledge (𝑀 = 316, 𝑆𝐷 = 58.47) spent less time on web pages than unguided users without (𝑀 = 441.72, 𝑆𝐷 = 66.26), with 𝐹(3, 42) = 11520492, 𝑝 < .001. Guided users without (𝑀 = 347.64, 𝑆𝐷 = 64.02) spent less time than unguided users without (𝐹(3, 42) = 585671.6, 𝑝 < .001). No significant difference appeared between users with knowledge and guided users without. The second part of hypothesis 3 was confirmed.

5. Conclusion

Users with knowledge and guided users without opened up fewer links and spent less time on web pages than unguided users without knowledge. These results suggest that the support allowed the users without knowledge who benefited from it to focus more on the relevant information contained in the SERPs and web pages. Users with knowledge scored better than guided users without. This result may in part raise questions about the relevance of the support used. First, for the understand task, procedural support was meant to help users when they were formulating queries. However, although this instruction was handled by the participants, guided users may have experienced difficulties completing this activity and understanding the information from the web content. As for the semantic support for the evaluate tasks, it made the task more closed than the unguided version. Guided users had to compare specific items, while the other two groups had more freedom in the items to be selected. To further understand the correct answer scores, qualitative
analysis of the answers would be interesting in order to determine whether the semantic level of the final productions (level of specificity of the terms used) as well as their structure (copy/paste, paraphrase, reformulation...) differ between users with knowledge and users without. With regard to the length of queries and the total search time, users having prior domain knowledge produced longer queries and needed less time to complete the search than the unguided users without knowledge. While the guided users were close to users with a high level of knowledge, their performance was not sufficient to outperform the unguided users without knowledge in terms of correct answers. For the queries, further analyses are currently in progress in collaboration with natural language processing researchers. The objective is to evaluate the specificity of the terms used in queries in order to know whether the support has had an impact on this variable. Regarding the total number of queries generated, users with knowledge generated fewer queries than unguided users without knowledge. One possible explanation is that unguided users without knowledge had to search for more information, such as definitions of specific terms provided in the task statement. Since users with prior domain knowledge were familiar with the specific psychological terms provided in the task statement, they could start their search directly. This hypothesis would be interesting to test in future studies on Searching as Learning. Indeed, a search engine should be able to adapt to different user backgrounds. Users with prior domain knowledge and users without cannot follow the same search and learning sub-objectives. The former can start their search at a higher learning level than the latter, who might need to access certain definitions and terminology to elaborate a mental representation of the task closer to that of users having high prior domain knowledge.

References

[1] J. Gwizdka, P. Hansen, C. Hauff, J. He, N. Kando, Search as learning (SAL) workshop, in: Proceedings of the 39th ACM Special Interest Group on Information Retrieval, SIGIR ’16, Association for Computing Machinery, Pisa, Italy, 2016, pp. 1249-1250.
[2] P. Vakkari, Searching as learning: A systematization based on literature, The Journal of Information Science (2016).
[3] L. W. Anderson, D. R. Krathwohl, A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives, Longman, New York, NY, 2001.
[4] B. J. Jansen, B. Smith, Using the taxonomy of cognitive learning to model online searching, Information Processing & Management (2009).
[5] C. L. Smith, Domain-independent search expertise: A description of procedural knowledge gained during guided instruction, The Journal of the Association for Information Science and Technology (2015).
[6] J. Sharit, J. Taha, R. W. Berkowsky, H. Profita, S. J. Czaja, Online information search performance and search strategies in a health problem-solving scenario, The Journal of Cognitive Engineering and Decision Making (2015).
[7] A. Dommes, A. Chevalier, S. Lia, The role of cognitive flexibility and vocabulary abilities of younger and older users in searching for information on the web, Applied Cognitive Psychology (2011).
[8] M. Sanchiz, A. Chevalier, F. Amadieu, How do older and young adults start searching for information? Impact of Age, Domain knowledge and problem complexity steps of information searching, Computers in Human Behavior (2017).
[9] A. Tricot, F. Amadieu, Navigation dans les hypertextes, in: J. Dinet, J. M. C. Bastien (Eds.), L’ergonomie des objets et des environnements physiques et numériques, Hermès, Paris, 2011, pp. 167-192.
[10] A. Aula, Query formulation in web information search, in: Proceedings of the IADIS International Conference, 2003, pp. 403-410.
[11] Y. Li, N. J. Belkin, A faceted approach to conceptualizing tasks in information seeking, Information Processing & Management (2008).
[12] K. Byström, K. Järvelin, Task complexity affects information seeking and use, Information Processing & Management (1995).
[13] G. Marchionini, S. Dwiggins, A. Katz, X. Lin, Information seeking in full-text end-user-oriented search systems: The role of domain and search expertise, Library and Information Science Research (1993).
[14] D. J. Bell, I. Ruthven, Searcher’s assessments of task complexity for web searching, in: S. McDonald, J. Trait (Eds.), Advances in information retrieval lecture notes in computer science, pp. 57-71.
[15] J. Liu, N. J. Belkin, Personalizing Information Retrieval for Multi-session Tasks: Examining the Roles of Task Stage, Task Type, and Topic Knowledge on the Interpretation on Dwell Times as an Indicator of Document Usefulness, The Journal of the Association for Information Science and Technology (2015).