=Paper=
{{Paper
|id=Vol-2699/paper20
|storemode=property
|title=How to Support Search Activity of Users Without Prior Domain Knowledge When They are Solving Learning Tasks?
|pdfUrl=https://ceur-ws.org/Vol-2699/paper20.pdf
|volume=Vol-2699
|authors=Cheyenne Dosso,Aline Chevalier,Lynda Tamine
|dblpUrl=https://dblp.org/rec/conf/cikm/DossoCT20
}}
==How to Support Search Activity of Users Without Prior Domain Knowledge When They are Solving Learning Tasks?==
Cheyenne Dosso<sup>a</sup>, Aline Chevalier<sup>a</sup> and Lynda Tamine<sup>b</sup>

<sup>a</sup> University of Toulouse Jean-Jaurès, 5 allée Antonio Machado, Toulouse, 31058, France

<sup>b</sup> University of Toulouse Paul Sabatier, Route de Narbonne, Toulouse, 31330, France

Proceedings of the CIKM 2020 Workshops, October 19–20, Galway, Ireland. Email: cheyenne.dosso@univ-tlse2.fr (C. Dosso); aline.chevalier@univ-tlse2.fr (A. Chevalier); lynda.lechani@irit.fr (L. Tamine). © 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

===Abstract===
This study focused on the impact of prior domain knowledge on the resolution of search tasks. More precisely, the study looked at the effect of procedural and semantic support on search strategies and performance during information search activity, comparing different levels of learning tasks. Eighteen students with prior domain knowledge, fourteen unguided students without prior domain knowledge and fifteen guided students without prior domain knowledge had to solve six learning tasks (two "remember" tasks, two "understand" tasks, two "evaluate" tasks) related to psychology. The main results showed that procedural and semantic support improved the navigation of users without prior domain knowledge (i.e., fewer links opened from SERPs, less time spent on URLs and globally less time to find information, longer queries), bringing them close to users with higher prior domain knowledge.

Keywords: Search strategies and performances, learning tasks, prior domain knowledge, support, searching as learning

===1. Introduction===
Information systems are no longer seen simply as a tool for retrieving content to meet a specific information need but as a tool for acquiring new knowledge in the course of searching, i.e., searching as learning [1]. According to [2], Searching as Learning (SAL) aims to determine the relationships between the information search activity (e.g., formulation of queries, search strategies, etc.) and learning activities (e.g., reading, note-taking, organizing information collected, etc.). The tasks in SAL can have different levels of learning goals, ranging from simple fact-finding tasks (i.e., remember) to the production of a new set of information (i.e., create) [3, 4]. According to [5], search tasks in general can be modulated by other factors, such as prior knowledge related to the search domain, knowledge of the task procedures and knowledge of information search. These types of knowledge have been widely studied, but not in the context of Searching as Learning. Therefore, this study aimed to understand how these types of knowledge could support search activity when users dealt with search tasks of different levels of learning. In particular, we want to know whether users who have little or no knowledge in a domain could get closer to users with a higher level of prior domain knowledge if we provided them with procedural (i.e., regarding the procedure for optimal task solving) and semantic (i.e., regarding the specific vocabulary used in a domain) support, more precisely when they are solving learning tasks at different levels.

In this paper, we present related work on information search, prior domain knowledge and learning search tasks. We review the methodology used to test our hypotheses and then describe our results.

===2. Related Work===
In the cognitive model of information search, [6] describe the role of cognitive abilities (e.g., verbal and vocabulary abilities, selective attention, etc.) in the three stages of this cyclical activity: (1) the stage of planning and formulating the query, (2) the stage of evaluating and selecting the information provided by the search engine, and (3) the stage of deep processing of the information contained in the web pages. Among these abilities, verbal and vocabulary abilities are important at all stages. These abilities are directly related to prior domain knowledge [7]. Users with domain-specific vocabulary knowledge generally construct a more consistent mental representation of the task than users with lower prior domain knowledge, making it easier for them to assess the relevance of the SERP and to select more relevant sources [8]. Specifically, users with a high level of prior domain knowledge are able to focus their attention on relevant elements and inhibit others [9]. With regard to query (re)formulation strategies, the queries of users having higher prior domain knowledge are often longer (i.e., composed of more words) [10], and these users are faster and perform better [6]. Beyond prior domain knowledge related to vocabulary, another type of knowledge that can impact information search activity is procedural knowledge related to the task [11]. If a task is well known and routine, the resolution procedure is easy to perform; users have less difficulty understanding the structure of the task [12]. According to [13], the interaction of these two types of knowledge (i.e., domain and procedure) supports search activity when users are solving search tasks.

In this study, we focus on the resolution of learning tasks at different levels. The first level is "Remember" [3]. It is a simple fact-finding task where the learning objective is unique. The key words provided in the statement are clear, consistent and well-defined, and achievement of the learning objective does not require high cognitive effort [14]. For these tasks, a plateau effect appeared between experts and novices on search performance [7].

The second level is "Understand" [3]. This is a task that requires the understanding and clarification of certain terms in order to access the answer. The statement here is poorly defined because the terms used are not clear and because they are linked to specific, high-level vocabulary. The production of new search terms is requisite [14]. These tasks can be solved by following a specific procedure (e.g., understanding the definition and terminology of the proposed terms so that inferences can be made to more relevant terms, understanding each search criterion and finding an answer that satisfies them). If users know this procedure, they could solve the task more easily and obtain higher performance. For this task, users with a high level of prior domain knowledge formulated more queries than users without prior domain knowledge [7] and had less difficulty adding new search terms [8].

The third level used is "Evaluate" [3]. It requires a comparison of elements as proposed by [15]. These tasks involve text production but vary in structure: "parallel task" and "dependent task". For the parallel task, the elements to be compared are clearly defined and the specific vocabulary to be used to learn new information is provided in the task statement. Semantic support is therefore high. For the dependent task, the elements to be compared are not given in the statement and have to be inferred by users during their search. The dependent task requires a higher level of prior knowledge than the parallel task since the semantic support (i.e., specific vocabulary level related to the elements to be compared) is lower in the statement; users have to define and infer them. These differences are important because in the case of the parallel task, users are supported in their search activity. Based on [3], parallel and dependent tasks have been called "Evaluate" in the present study because the main objective of these tasks is to compare a set of elements. The parallel task corresponds to the guided evaluate task and the dependent task to the unguided evaluate task.

In this way, the current study aimed to understand how semantic and procedural support can help users without prior domain knowledge to solve learning tasks compared to users with prior domain knowledge. Specifically, the present study's objective is to determine the effect of knowledge of a resolution procedure ("understand" task) and of a specific vocabulary ("evaluate" task) on search activity, with respect to the level of prior domain knowledge (with vs. without, in psychology).

===3. Method===
====3.1. Hypotheses====
* Hypothesis 1: Guided users without domain knowledge should spend less time solving tasks and have better scores (correct answers) than unguided users without domain knowledge, so that the former should be close to users with a high level of domain knowledge.
* Hypothesis 2: Guided users without domain knowledge should formulate more and longer queries than unguided users without domain knowledge, so that the former should be close to users with a high level of domain knowledge.
* Hypothesis 3: Guided users without domain knowledge should open fewer links from SERPs and spend less time exploring webpages than unguided users without domain knowledge, so that the former should be close to users with a high level of domain knowledge.

====3.2. Independent Variables====
* IV1: Level of prior domain knowledge as a between-subject factor (high/without).
* IV2: Level of support as a between-subject factor (guided/unguided), only for users without prior domain knowledge.
* IV3: Level of learning task as a within-subject factor (Remember/Understand/Evaluate).

====3.3. Dependent Variables====
* DV1: Total time (in sec.) of the search session, for each task (including total time spent on SERPs and total time spent on webpages).
* DV2: Total score in percent. For remember and understand tasks, when only one correct answer was possible, scores are "1" (correct) and "0" (wrong). For evaluate tasks, which were open-ended tasks (several answers were acceptable but specific elements had to be found), 1 point was assigned for each expected element contained in the answer, with a score varying from 0 to 9 per task.
* DV3: Queries (number and length). For each search task, the total number of queries submitted to the search engine per search session was computed. The mean length of queries per search corresponds to the total number of keywords used during a search session divided by the total number of queries submitted to the system during this search session.
* DV4: Number of links opened from SERPs. For each search task, the total number of links selected and opened by users from the search results pages.
* DV5: Total time spent (in sec.) on webpages, for each task during the search session.

====3.4. Participants====
Eighteen users with a high level of domain knowledge aged from 22 to 30 years old (𝑀 = 24.6, 𝑆𝐷 = 2.20), fifteen guided users without domain knowledge aged from 22 to 28 years old (𝑀 = 24, 𝑆𝐷 = 1.77) and fourteen unguided users without domain knowledge aged from 22 to 27 years old (𝑀 = 24.1, 𝑆𝐷 = 1.44) took part in the experiment. All of them were French native speakers. The sample was composed of 12 males and 35 females, all studying for a master's degree (16 females with psychology knowledge, 8 guided and 11 unguided females without psychology knowledge). Concerning the self-assessment scale of psychology knowledge (4-point Likert scale), scores were significantly different (𝑡(45) = 7.69, 𝑝 < .001) between users with domain knowledge (𝑀 = 3.39, 𝑆𝐷 = 0.5) and users without (𝑀 = 1.86, 𝑆𝐷 = 0.74). In addition, the scores obtained on the multiple-choice test in the psychology domain differed significantly (𝑡(45) = 7.71, 𝑝 < .001). Users with domain knowledge had better scores (𝑀 = 8.94, 𝑆𝐷 = 2.46) than users without (𝑀 = 3.34, 𝑆𝐷 = 2.39).

====3.5. Material====
All participants used a Dell Latitude 5590 (17 inch) with Windows 10 Pro, an Intel Core i7 8th Gen processor and an external mouse. To record data, we used ad hoc software, which recorded time, clicks, visited SERPs and documents. To test our hypotheses, we created six search tasks in guided and unguided versions, all related to psychology:

* Remember task: What was the name of Chomsky, the author of generative theory?
* Understand task - unguided: As part of his researches, Lionel conducts observations in various circumstances that he extrapolates to make previsions. In your view, what is the research method used by Lionel?
* Understand task - guided: As part of [...] by Lionel? To solve this task, you have to produce new keywords and it is necessary that research method integrates the set of given criteria.
* Evaluate task - unguided: You have an interest about social psychology domain and you want to write an article about social perceptions, in particular on ones which contribute to discrimination. To do that, you have to know what are the elements included in social perceptions, how they work, how they build themselves, what the sub-processes are and how they influence the discrimination. Specifically, you have to carry out these following activities: 1) to retrieve information about social perception elements, which contribute to discrimination. 2) To select three elements on which you are going to concentrate in this article. You want to present their specific characteristics which encourage you to select them among others elements. 3) To compare their functioning at the level of sub-processes.
* Evaluate task - guided: You have [...] discrimination. You want to focus on three elements about social perception, which allow explaining the functioning of discrimination. These three elements are: 1) social categorization, 2) Stereotypes, 3) Prejudices. You wish to describe the set of these three elements, particularly: to present their functioning, how they build themselves, the sub-processes and how they influence discrimination. You have to integrate in the article the completeness of three descriptions, which correspond to the set of analysis criteria.

The understand and evaluate tasks test the support variable. In the unguided version, participants saw only the task statement. In the guided condition, the procedural support (Understand) took the form of an additional instruction that informed the participant about the procedure to follow to succeed in the task. The semantic support (Evaluate) informed participants about the items to compare. For the unguided evaluate task, the items on which to perform the evaluation and comparison work were not indicated in the statement. For the remember tasks, no support was provided because these were control tasks for which the literature does not show any significant difference in their resolution.

====3.6. Procedure====
The study took place at the University of Toulouse. Before starting the search sessions, participants had to complete four online questionnaires: demographic information, internet habits, a self-efficacy scale in information search (10 items) and an MCQ on psychology knowledge (16 questions). Once the pre-questionnaires were completed, the main instructions were presented and participants started to perform the six search tasks in randomized order. Participants had to provide a written response. Users with domain knowledge and one part of the users without domain knowledge saw the unguided tasks, and the other part of the users without domain knowledge performed the tasks in their guided version.

===4. Results===
For all the dependent variables, we carried out repeated-measures ANOVAs on two independent variables, with contrasts to identify whether the support helps non-experts. We combined the "level of prior domain knowledge" (IV1) and the "support" (IV2) to obtain the independent variable "Group" with three modalities (with domain knowledge unguided, without domain knowledge guided, and without domain knowledge unguided). The independent variable "level of learning task" stayed the same.

Regarding the total time of the search session, statistical analyses did not reveal any significant effect of support (𝐹(2, 44) = 1.10, 𝑝 > .05). Nevertheless, contrasts indicated that users with domain knowledge (𝑀 = 340, 𝑆𝐷 = 60.16) needed less time to solve tasks than unguided users without domain knowledge (𝑀 = 464.58, 𝑆𝐷 = 68.42), with 𝐹(3, 42) = 9995106, 𝑝 < .001. No significant difference was obtained between users with knowledge and guided users without (𝑝 > .05), nor between the two groups of users without knowledge (𝑝 > .05). This part of hypothesis 1 was only partially verified.

Concerning the scores of correct answers, the ANOVA did not show any significant effect of the support, with 𝐹(2, 44) = 2.76, 𝑝 > .05. Contrasts indicated that users with knowledge (𝑀 = 0.53, 𝑆𝐷 = 0.02) had better scores than guided users without (𝑀 = 0.49, 𝑆𝐷 = 0.03), with 𝐹(2, 43) = 487.71, 𝑝 < .001. No significant differences were observed between users with knowledge and unguided users without, nor between the two groups of users without knowledge (𝑝 > .05). The second part of hypothesis 1 was not confirmed.

Concerning the effect of support, the ANOVA did not show any significant difference in the total number of queries (𝐹(2, 44) = 0.62, 𝑝 > .05). Contrasts showed that users with knowledge (𝑀 = 7.12, 𝑆𝐷 = 1.05) produced fewer queries than unguided users without (𝑀 = 7.83, 𝑆𝐷 = 1.20), with 𝐹(2, 43) = 5657.23, 𝑝 < .001. No significant differences were obtained between users with knowledge and guided users without (𝑝 > .05), nor between the two groups of users without knowledge (𝑝 > .05). The first part of H2 was not validated.

The ANOVA did not reveal a significant effect of the support on query length (𝐹(2, 44) = 1.78, 𝑝 > .05). Contrasts indicated that users with knowledge (𝑀 = 4.09, 𝑆𝐷 = 0.35) produced longer queries than unguided users without (𝑀 = 3.75, 𝑆𝐷 = 0.38), with 𝐹(2, 43) = 6031.29, 𝑝 < .001. No significant differences were obtained between users with knowledge and guided users without, nor between the two groups of users without knowledge (𝑝 > .05). The second part of H2 was not completely verified.

No significant effect of support appeared for links opened from SERPs (𝐹(2, 44) = 0.45, 𝑝 > .05). Contrasts indicated a significant difference between users with knowledge (𝑀 = 8.85, 𝑆𝐷 = 1.49) and unguided users without (𝑀 = 10.76, 𝑆𝐷 = 1.7), with 𝐹(2, 43) = 19232.32, 𝑝 < .001. Users with knowledge opened fewer links from SERPs. A significant difference appeared between guided (𝑀 = 9.96, 𝑆𝐷 = 1.64) and unguided users without knowledge (𝐹(2, 43) = 1543.31, 𝑝 < .001). Guided users opened fewer links from SERPs. No significant differences were obtained between users with knowledge and guided users without (𝑝 > .05). This part of hypothesis 3 was validated.

Regarding the total time spent on web pages, the ANOVA did not show any significant effect of support (𝐹(2, 44) = 1.20, 𝑝 > .05). Contrasts indicated that users with knowledge (𝑀 = 316, 𝑆𝐷 = 58.47) spent less time on web pages than unguided users without (𝑀 = 441.72, 𝑆𝐷 = 66.26), with 𝐹(3, 42) = 11520492, 𝑝 < .001. Guided users without (𝑀 = 347.64, 𝑆𝐷 = 64.02) spent less time than unguided users without (𝐹(3, 42) = 585671.6, 𝑝 < .001). No significant difference between users with knowledge and guided users without appeared. The second part of hypothesis 3 was confirmed.

===5. Conclusion===
Users with knowledge and guided users without opened fewer links and spent less time on web pages than unguided users without knowledge. These results suggest that the support allowed the users without knowledge who benefited from it to focus more on the relevant information contained in the SERPs and web pages. Users with knowledge scored better than guided users without. This result may in part raise questions about the relevance of the support used. First, for the understand task, procedural support was meant to help users when they were formulating queries. However, although this instruction was handled by the participants, guided users may have experienced difficulties completing this activity and understanding the information from the web content. As for the semantic support for the evaluate tasks, it made the task more closed than the unguided version. Guided users had to compare specific items, while the other two groups had more freedom in the items to be selected. To further understand the correct answer scores, a qualitative analysis of the answers would be interesting in order to determine whether the semantic level of the final productions (level of specificity of the terms used) as well as their structure (copy/paste, paraphrase, reformulation...) differ between users with knowledge and users without.

With regard to query length and total search time, users having prior domain knowledge produced longer queries and needed less time to complete the search than unguided users without knowledge. While the guided users were close to users with a high level of knowledge, their performance was not sufficient to outperform the unguided users without knowledge in terms of correct answers. For the queries, further analyses are currently in progress in collaboration with natural language processing researchers. The objective is to evaluate the specificity of the terms used in queries in order to know whether the support has had an impact on this variable.

Regarding the total number of queries generated, users with knowledge generated fewer queries than unguided users without knowledge. One possible explanation is that unguided users without knowledge had to search for more information, such as definitions of specific terms provided in the task statement. Because users with prior domain knowledge were familiar with the specific psychological terms provided in the task statement, they could start their search directly. This hypothesis would be interesting to test in future studies on Searching as Learning. Indeed, a search engine should be able to adapt to different user backgrounds. Users with prior domain knowledge and users without cannot follow the same search and learning sub-objectives. The former can start their search at a higher learning level than the latter, who might need to access certain definitions and terminology to elaborate a mental representation of the task closer to that of users having high prior domain knowledge.

===References===
[1] J. Gwizdka, P. Hansen, C. Hauff, J. He, N. Kando, Search as learning (SAL) workshop, in: Proceedings of the 39th ACM Special Interest Group on Information Retrieval, SIGIR '16, Association for Computing Machinery, Pisa, Italy, 2016, pp. 1249-1250.

[2] P. Vakkari, Searching as learning: A systematization based on literature, The Journal of Information Science (2016).

[3] L. W. Anderson, D. R. Krathwohl, A taxonomy for learning, teaching, and assessing: A revision of Bloom's taxonomy of educational objectives, Longman, New York, NY, 2001.

[4] B. J. Jansen, B. Smith, Using the taxonomy of cognitive learning to model online searching, Information Processing & Management (2009).

[5] C. L. Smith, Domain-independent search expertise: A description of procedural knowledge gained during guided instruction, The Journal of the Association for Information Science and Technology (2015).

[6] J. Sharit, J. Taha, R. W. Berkowsky, H. Profita, S. J. Czaja, Online information search performance and search strategies in a health problem-solving scenario, The Journal of Cognitive Engineering and Decision Making (2015).

[7] A. Dommes, A. Chevalier, S. Lia, The role of cognitive flexibility and vocabulary abilities of younger and older users in searching for information on the web, Applied Cognitive Psychology (2011).

[8] M. Sanchiz, A. Chevalier, F. Amadieu, How do older and young adults start searching for information? Impact of Age, Domain knowledge and problem complexity steps of information searching, Computers in Human Behavior (2017).

[9] A. Tricot, F. Amadieu, Navigation dans les hypertextes, in: J. Dinet, J. M. C. Bastien (Eds.), L'ergonomie des objets et des environnements physiques et numériques, Hermès, Paris, 2011, pp. 167-192.

[10] A. Aula, Query formulation in web information search, in: Proceedings of the IADIS International Conference, 2003, pp. 403-410.

[11] Y. Li, N. J. Belkin, A faceted approach to conceptualizing tasks in information seeking, Information Processing & Management (2008).

[12] K. Byström, K. Järvelin, Task complexity affects information seeking and use, Information Processing & Management (1995).

[13] G. Marchionini, S. Dwiggins, A. Katz, X. Lin, Information seeking in full-text end-user-oriented search systems: The role of domain and search expertise, Library and Information Science Research (1993).

[14] D. J. Bell, I. Ruthven, Searcher's assessments of task complexity for web searching, in: S. McDonald, J. Trait (Eds.), Advances in information retrieval lecture notes in computer science, pp. 57-71.

[15] J. Liu, N. J. Belkin, Personalizing Information Retrieval for Multi-session Tasks: Examining the Roles of Task Stage, Task Type, and Topic Knowledge on the Interpretation on Dwell Times as an Indicator of Document Usefulness, The Journal of the Association for Information Science and Technology (2015).
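
As an illustration of two computations described in the Method section, the following minimal Python sketch shows how the DV3 query measures (number of queries, and mean query length as total keywords divided by number of queries) and an independent-samples t statistic could be derived. It is not the authors' analysis code. The function names, the example queries and the whitespace-based keyword count are assumptions made for illustration, and the pooled-variance form of the t-test is also an assumption: the paper does not state which variant was used, although the reported 45 degrees of freedom are consistent with group sizes of 18 and 29 (14 unguided plus 15 guided users).

```python
import math


def query_metrics(queries):
    """Return (number of queries, mean query length in keywords) for one session.

    Assumption: a keyword is any whitespace-separated token of a query.
    """
    n_queries = len(queries)
    if n_queries == 0:
        return 0, 0.0
    total_keywords = sum(len(q.split()) for q in queries)
    return n_queries, total_keywords / n_queries


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent groups, computed from summary statistics.

    Assumption: equal-variance (pooled) t-test; returns (t, degrees of freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


if __name__ == "__main__":
    # Hypothetical queries for one search session (not taken from the study data).
    session = ["chomsky generative theory", "chomsky first name", "noam chomsky"]
    print(query_metrics(session))

    # Self-assessment scores reported in the Participants section.
    t_value, df = pooled_t(3.39, 0.50, 18, 1.86, 0.74, 29)
    print(round(t_value, 2), df)
```

Under these assumptions, the reported self-assessment means and standard deviations give a t statistic of roughly 7.7 on 45 degrees of freedom, which is in line with the reported 𝑡(45) = 7.69; the small gap is plausibly due to rounding of the published summary statistics.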