=Paper= {{Paper |id=Vol-1647/SAL2016_paper_20 |storemode=property |title= How does Domain Expertise Affect Users' Search Processes in Exploratory Searches? |pdfUrl=https://ceur-ws.org/Vol-1647/SAL2016_paper_20.pdf |volume=Vol-1647 |authors=Jiaxin Mao,Yiqun Liu,Min Zhang,Shaoping Ma |dblpUrl=https://dblp.org/rec/conf/sigir/MaoL0M16 }} == How does Domain Expertise Affect Users' Search Processes in Exploratory Searches? == https://ceur-ws.org/Vol-1647/SAL2016_paper_20.pdf
              How does Domain Expertise Affect Users’ Search
                   Processes in Exploratory Searches?

                                       Jiaxin Mao, Yiqun Liu, Min Zhang, and Shaoping Ma
             Tsinghua National Laboratory for Information Science and Technology, Department of Computer
                               Science & Technology, Tsinghua University, Beijing, China
                                                                yiqunliu@tsinghua.edu.cn


ABSTRACT                                                                               describe    information-seeking      processes    that    are
A huge number of users use Web search engines to learn new                            opportunistic, iterative, and multi-tactical”.
skills and knowledge every day. Understanding how the                                    While modern search engines are extremely good at
users search to learn is essential for making search engines                           helping users locate specific facts and information, how to
support these learning-related searches more effectively.                              better support exploratory search is still a challenging
Previous research categorizes these learning-related                                   problem.     One of the reasons that make supporting
searches as exploratory searches, because they are often                               exploratory search harder is that the search user plays an
open-ended and multi-faceted, in which the user usually                                even more important role in the interactive exploratory
submits multiple queries iteratively to explore a large                                search process. Therefore, the search system needs to go
information space.                                                                     beyond locating information relevant to the query, and
  In this position paper, we propose to conduct a user                                 provide further help and guidance in exploring unfamiliar
study to investigate whether and how users’ domain                                     information space for users.
expertise affects their search processes in exploratory                                  To make web search engines more effective in supporting
searches. We also set up a preliminary research framework,                             such tasks, we need to study and understand the process of
design the experiment protocol of the user study, and                                  exploratory search from the user’s perspective.            In
discuss the limitations of this study and the                                          particular, we want to know which user factors affect the
potential implications for improving Web search engines.                               search outcomes of the exploratory search. In this position
                                                                                       paper, we focus on one of the most important
                                                                                       factors, domain expertise, and design a user study to
Keywords                                                                               investigate whether and how the domain expertise of
Exploratory   Search;                 Domain          Expertise;           Query       search users affects the search outcomes.
Reformulation                                                                            In the rest of the paper, we will further discuss the
                                                                                       research framework and propose research questions in
                                                                                       Section 2, present the design of the user study in Section 3,
1.     INTRODUCTION                                                                    and finally discuss the limitations and potential
  Web search engines help people efficiently access                                    implications of this study in Section 4.
information on the Web, and fundamentally change the
way we learn new skills and knowledge [11]. When search
                                                                                       2.   RESEARCH FRAMEWORK
engine users search to learn new knowledge, their initial                                In this section, we introduce the research framework and
information needs are usually multi-faceted and                                        the research questions.
open-ended. While they digest new information by reading                                 The overall research framework is demonstrated in a
the search results, their knowledge structures in mind and                             concept map [9] shown in Figure 1. A closely related
their immediate information needs are evolving                                         conceptual framework was proposed by Vakkari [13].
simultaneously, which leads to highly interactive search                               Through a longitudinal empirical study, in which the
sessions with multiple iterative query reformulations.                                 subjects were college students who attended a 4-month
These characteristics match the definition of exploratory search                       seminar on preparing a research proposal for a master’s
adopted by White and Roth [15]: “Exploratory search can                                thesis, Vakkari studied the systematic relationship between
be used to describe an information-seeking problem context                             the stages of the task performance process, the information
that is open-ended, persistent, and multi-faceted; and to                              sought for, the search tactics adopted by the search users,
                                                                                       and the usefulness of the information retrieved.         He
                                                                                       differentiated the task performance process into 3 stages:
                                                                                       pre-focus, formulation, and post-focus, and analyzed
                                                                                       subjects’ searching behavior in each stage during the
                                                                                       4-month period. He showed that as the subjects’ domain
                                                                                       knowledge developed across these stages, the information
                                                                                       sought for became more specific, the number of search
Search as Learning (SAL), July 21, 2016, Pisa, Italy                                   terms increased, and the search tactics became more
The copyright for this paper remains with its authors. Copying permitted for private   diverse.    Our work differs from and further extends
and academic purposes                                                                  Vakkari’s study in two ways: 1) while Vakkari’s study and
                              Figure 1: The concept map for the research framework.
findings are associated with a specific academic IR system,        Previous research suggests that compared to users with
the LISA database, we build an experiment Web search            little domain knowledge, domain experts search differently
engine to study users’ search-to-learn behaviors on             and are generally more successful in in-domain search tasks
general-purpose Web search engines; and 2) in Vakkari’s         [14]. Because the exploratory search is a learning process,
study, the domain knowledge is a longitudinal,                  search users’ domain knowledge and expertise also change
within-subject variable determined by the stages in task        simultaneously during search sessions. In previous studies,
performance process, but in our study, besides measuring        Eickhoff et al. [3] use a few implicit search behavior
the within-subject learning process over a session, we set      metrics as evidence of users’ knowledge acquisition during
the domain knowledge level as a cross-subject independent       searching, and Egusa et al. [1] use concept maps to
variable (see Section 3 for how we design the experiment        explicitly measure the changes in users’ knowledge
search system to simulate Web search scenarios and how          structures after search. These previous studies developed
we manipulate domain expertise levels).                         methods to measure knowledge development during
                                                                exploratory search sessions.        However, they did not
                                                                investigate the effects of users’ initial domain expertise on
2.1   Search Outcome                                            the search processes and outcomes, which we will
   The search outcomes can be decomposed into two parts:        investigate by setting domain expertise as an independent
knowledge gain and user satisfaction. They can be measured      variable in the user study. While domain experts are
independently.                                                  expected to be more successful in in-domain search tasks,
   The most direct way to measure the knowledge gain is to      their success may be due to their background knowledge or
ask the user to answer questions about the search after she     their expertise in searching for pertinent information. To
finishes searching. In the user study, we will ask subjects     investigate which is the case, in addition to question
to use a search engine to find answers to a set of              answering, we will adopt the implicit behavior metrics and
pre-defined questions from different domains, and let the       the explicit concept map method to measure the changes
domain expert assessors with proficient domain knowledge        of users’ knowledge.
grade their answers.
   User satisfaction is a measure that “attempts to gauge
subjects’ feelings about their interactions with the system”
[6]. We plan to use a post-task questionnaire to get explicit
                                                                2.3   Query Reformulation Strategy
satisfaction feedback from subjects and to use implicit           Because the user mainly relies on query reformulations
user behavior metrics to estimate subjects’ satisfaction [8].   to convey her changing information needs to the search
   Measuring both knowledge gain and user satisfaction will     engines, the query reformulation strategy may be another
provide us with a more comprehensive view of the search         vital factor for the success of exploratory search. Previous
outcome. For example, domain experts are expected to be         works on query reformulation strategy study the
more successful in answering in-domain questions [14];          reformulation patterns [4], why the user adds or removes
however, they may be more sensitive to the non-relevant         terms in query reformulation [5], the sources of query
results [13], and therefore, more likely to feel unsatisfied.   terms [2, 12], and the relationship between query
                                                                reformulations and search success in struggling search tasks
                                                                [10]. These previous works establish methodologies and
2.2   Domain Expertise                                          measures to characterize and model users’ query
   The user’s background knowledge about the search task        reformulation strategies. In this work, we will adopt these
(i.e. the domain expertise) is the first user factor that we    methods to characterize the query reformulation strategy
want to investigate in this study.                              in exploratory search.
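
The term-level view of query reformulation discussed in Section 2.3 (which terms a user adds or removes between consecutive queries) can be illustrated with a minimal sketch. The label names, the whitespace tokenization, and the function name `classify_reformulation` are assumptions made for illustration only; they are not the taxonomy of [4] or a method specified by this paper, and queries in languages written without spaces would first need a word segmenter.

```python
def classify_reformulation(prev_query: str, next_query: str) -> dict:
    """Compare two consecutive queries term by term.

    Hypothetical, simplified labels (an assumption, not the paper's taxonomy):
      repeat          same term set (includes pure re-ordering)
      specialization  terms were only added
      generalization  terms were only removed
      substitution    some terms kept, some added, some removed
      new             no terms shared with the previous query
    """
    # Assumption: lowercase whitespace tokenization is good enough here.
    prev_terms = set(prev_query.lower().split())
    next_terms = set(next_query.lower().split())

    added = sorted(next_terms - prev_terms)
    removed = sorted(prev_terms - next_terms)

    if not added and not removed:
        label = "repeat"
    elif added and not removed:
        label = "specialization"
    elif removed and not added:
        label = "generalization"
    elif prev_terms & next_terms:
        label = "substitution"
    else:
        label = "new"
    return {"label": label, "added": added, "removed": removed}
```

Applied to every pair of consecutive queries in a session, the `added` lists give the candidate terms whose sources (background knowledge versus text read on SERPs and landing pages) would then have to be traced with other data such as eye-tracking logs.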
   A previous study also shows that domain expertise influences users' querying
behaviors [14]. In this work, we will study this influence for learning-related
search tasks, too. On the one hand, feedback on the search outcome is usually
hard to collect outside the laboratory user study environment. Therefore, the
relationship between query reformulation strategies and domain expertise may be
more important in identifying domain experts in practice. On the other hand,
understanding how the domain experts query differently from other users helps
us understand how the domain expertise influences the search processes and
outcomes. In a recent study, Odijk et al. [10] show that in struggling search
sessions, the pivotal query to a great extent determines whether the search
will succeed or not. We are interested in how the users come up with such
pivotal queries. Are the query terms mainly from users' background knowledge
(i.e., the domain expertise), or are they read and collected from the SERPs and
landing pages during the search processes? To answer these questions, we will
investigate the sources of the query terms, and their relationships with both
domain expertise and search outcomes.

Table 1: The search tasks from the environment domain, medicine domain, and
politics domain.
 Domain        Task Description
 Environment   What are the characteristics of particle pollution (also called
               particulate matter) in China? Your answer should cover its
               compositions, its time-varying patterns, and its geographical
               characteristics.
 Environment   Why can't Ultraviolet (UV) disinfection completely supplant
               chlorination in disinfecting the drinking water?
 Medicine      What are the most commonly-used treatments for cancer in
               clinical practice?
 Medicine      What are the potential applications of 3D printing for
               "Precision Medicine"?
 Politics      Political scientists have noted that the trend of political
               polarization during the US presidential election is increasingly
               evident. What are the reasons behind it? (polarization here
               refers to the divergence of political attitudes to ideological
               extremes.)
 Politics      In order to achieve their own interests, the US interest groups
               often take what kind of strategies?

2.4   Research Questions
  To summarize, in this study, we want to investigate the relationship between
the domain expertise, query reformulation strategy, and search outcome in
exploratory search. Therefore, we propose the following research
questions:
RQ1 Whether and how does users’ domain expertise
      influence the search outcomes in exploratory search?
RQ2 How does users’ query reformulation strategy
      influence the search outcomes of the exploratory
      search?
RQ3 Do domain experts have a different query
      reformulation strategy in exploratory search?

3.    USER STUDY DESIGN
  The procedure of the user study is shown in Figure 2. We choose 3 domains in
this work: environment, medicine, and politics. For each domain, we hired
senior graduate students in related majors as domain expert assessors. They are
responsible for designing the knowledge learning search tasks and assessing the
answers submitted by experiment subjects. With the help of the domain expert
assessors, 6 search tasks, 2 for each domain, were designed. Each search task
is an open-ended question that can be answered in about 60-100 words. The
descriptions for the search tasks are shown in Table 1. The domain expert
assessors also provided a reference answer for each task. These reference
answers will be used to assess the subjects' answers.

[Figure 2 (diagram): Subjects with different domain knowledge levels each
complete a task set of 2 in-domain and 4 out-of-domain tasks. Pre-experiment
stages: I.1 Pre-experiment Training, I.2 Pre-experiment Questionnaire, I.3
Eye-tracking Device Calibration. Per-task stages: II.1 Task Description Reading
and Rehearsal, II.2 Pre-task Questionnaire, II.3 Pre-task Concept Map Drawing,
II.4 Task Completion w/ the Experiment Search Engine, II.5 Question Answering,
II.6 Post-task Concept Map Drawing, II.7 Post-task Questionnaire. Data
collected: questionnaires, query logs, behavior logs, answers to questions, and
concept maps. Domain expert assessors design the task set and assess the
answers and concept maps.]

                     Figure 2: The user study procedure.

Table 2: The questions used in the pre-task questionnaire (II.2 in Figure 2).
 Domain knowledge      How much do you know about the topic of the task?
 Expected difficulty   How difficult do you think it will be to complete this
                       search task?
 Interest              How interested are you to learn more about the topic of
                       this task?

Table 3: The questions used in the post-task questionnaire (II.7 in Figure 2).
 Domain knowledge         How much did your knowledge increase as you searched?
 Experienced difficulty   How difficult was this task?
 Interest                 How much did your interest in the task increase as
                          you searched?
 Satisfaction             How satisfied were you with your search experience?

  To manipulate the domain expertise level of the subjects, for each domain we
will hire 10-15 senior undergraduate students in related majors. Each subject
will be asked to complete all 6 search tasks, which means that he or she will
complete 2 in-domain tasks and 4 out-of-domain tasks. The order of the tasks
will be rotated using the Latin square method. Before the experiment starts,
each subject will go through a pre-experiment training stage (I.1), a
pre-experiment questionnaire stage (I.2), and an eye-tracking device
calibration stage (I.3). In I.1 stage, we will use an example search task,
which is not from environment, medicine, or politics domain, to teach the
subject how to use the experiment search engine. We will                                          5-point Likert scale used for the pre-task questionnaire.
also teach the subject how to use concept map in I.1 stage.                                       The post-task questionnaire, which is shown in Table 3, is
In I.2 stage, we will collect the subject’s basic information,                                    expected to measure subjects’ satisfaction and perceived
such as age, gender, and experience in using Web search                                           knowledge gain.
engines. In a previous study, Eickhoff et al. [2] use an
eye-tracking device to study the sources of query terms. In                                       4.       DISCUSSION
this work, we will also use a Tobii X2-30 eye-tracker to log
                                                                                                    In this section, we discuss the limitations of this study
subjects’ eye fixations. Therefore, for each subject, we
                                                                                                  as well as the potential implications for the design of Web
need to calibrate the eye-tracker for her in I.3 stage.
                                                                                                  search engines.
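
The Latin square rotation of task order mentioned in the study design can be sketched as follows. The paper does not say which Latin square construction will be used, so the cyclic construction below is an assumption, and the task identifiers in `TASKS` are hypothetical labels for the six tasks in Table 1.

```python
# Hypothetical identifiers for the 6 tasks (2 per domain) listed in Table 1.
TASKS = ["ENV-1", "ENV-2", "MED-1", "MED-2", "POL-1", "POL-2"]


def latin_square(items):
    """Build a cyclic Latin square: row r is the item list rotated left by r.

    Every item appears exactly once in each row (each subject sees every task)
    and exactly once in each column (each task occupies every position equally
    often across a full set of rows).
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def task_order_for_subject(subject_index, items):
    """Assign a subject one row of the square, cycling through rows in order."""
    square = latin_square(items)
    return square[subject_index % len(items)]
```

With 10-15 subjects per domain, the rows are reused once all six have been assigned. A cyclic square balances task position only; if carry-over between neighboring tasks is a concern, a balanced (Williams) design is a common alternative.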
   For each search task, the subject will first read and
memorize the task description (i.e.            an open-ended                                      4.1         Limitations
question) in II.1 stage. After that, she will complete a
                                                                                                    We plan to collect data from a laboratory user study.
pre-task questionnaire (II.2) about the current domain
                                                                                                  Compared to a naturalistic log-based study (e.g. [14]), the
knowledge level, the expected difficulty, and the interest
                                                                                                  laboratory user study has limitations in its relatively small
level of the task [7].          The questions in pre-task
                                                                                                  scale and the questionable ecological validity of the
questionnaire are shown in Table 2. The subject will be
                                                                                                  collected data. To address the ecological validity problem,
required to answer these questions on a 5-point Likert scale
                                                                                                  we carefully design the experiment search system and user
(1: not at all, 2: slightly, 3: somewhat, 4: moderately, 5:
                                                                                                  study protocol to simulate a practical Web search scenario.
very). Then, in II.3 stage, the subject will draw a pre-task
                                                                                                    The only independent variable in this work is the
concept map on paper. This concept map is expected to
                                                                                                  domain expertise of users. We plan to control it by hiring
measure the subject’s background knowledge about the
                                                                                                  subjects among senior undergraduate students from the
current task.      In II.4 stage, the subject will use an
                                                                                                  corresponding majors. However, whether this manipulation
experiment search engine to complete the search task.
                                                                                                  can effectively control the domain expertise variable needs
When the experiment search engine receives a query, it will
                                                                                                  to be verified by the collected data. The reported domain
forward the query to a commercial Web search engine and
                                                                                                  expertise, measured by the pre-task questionnaire, can be
retrieve the corresponding SERP. To control the variability
                                                                                                  used to test the effectiveness of our manipulation.
in the SERPs, we will filter all the query suggestions,
sponsored search results, knowledge graph results, and                                            4.2         Potential Implications for System Design
vertical results out, and only return the organic results to                                         The investigations of the proposed research questions
the subject. We will inject JavaScript into this filtered                                         may lead to useful implications for improving the search
SERP to log all the query reformulations along with other                                         engines. For example: for RQ1, if the domain experts
user behaviors such as clicks, tab-switchings, scrolls, and                                       indeed have a higher knowledge gain during the search, the
mouse-movements. After completing the search task, the                                            results read by them are more likely to be of high quality,
subject will answer the task-related question in II.5 stage                                       and the search engine can identify these high-quality
and draw a post-task concept map on paper in II.6 stage.                                          results based on domain experts’ click logs; and if the
The answer and the concept maps will be assessed by the                                           domain experts are more likely to feel unsatisfied during
domain expert assessors to measure the subject’s                                                  the search, then maybe we should consider providing more
knowledge gain. Finally in II.7 stage, the subject will                                           specialized and authoritative information in the SERPs to
complete a post-task questionnaire about the knowledge                                            make them satisfied. For RQ2, if we can find the most
level after search, the perceived difficulty as well as interest                                  effective query reformulation strategies for knowledge
of the task, and the overall user satisfaction, in the same                                       learning tasks, we can teach users how to adopt these
strategies or make search engines provide better guidance          Development in Information Retrieval, pages 493–502.
during the search session via query suggestions. And for           ACM, 2015.
RQ3, if the domain experts have a different query              [9] J. D. Novak and D. B. Gowin. Learning how to learn.
reformulation strategy, we can identify them by observing          Cambridge University Press, 1984.
their query logs in exploratory search sessions, and then     [10] D. Odijk, R. W. White, A. Hassan Awadallah, and
provide personalized results for them; furthermore,                S. T. Dumais. Struggling and success in web search. In
understanding the relationship between the developing              Proceedings of the 24th ACM International on
domain expertise and the changing querying strategy will           Conference on Information and Knowledge
help us understand how information needs emerge and                Management, pages 1551–1560. ACM, 2015.
evolve during exploratory searches, which may provide new     [11] D. M. Russell. What do you need to know to use a
insights for constructing a better session-level user              search engine? why we still need to teach research
behavioral model.                                                  skills. AI Magazine, 36(4), 2015.
                                                              [12] M. Sloan, H. Yang, and J. Wang. A term-based
5.   ACKNOWLEDGMENTS                                               methodology for query reformulation understanding.
  This work was supported by Tsinghua University                   Information Retrieval Journal, 18(2):145–165, 2015.
Initiative Scientific Research Program (2014Z21032),          [13] P. Vakkari. Changes in search tactics and relevance
National Key Basic Research Program (2015CB358700),                judgments in preparing a research proposal: A
Natural Science Foundation (61532011, 61472206) of China           summary of findings of a longitudinal study.
and Tsinghua-Samsung Joint Laboratory for Intelligent              Information Retrieval, 4:295–310, 2001.
Media Computing.                                              [14] R. W. White, S. T. Dumais, and J. Teevan.
                                                                   Characterizing the influence of domain expertise on
6.   REFERENCES                                                    web search behavior. In Proceedings of the Second
 [1] Y. Egusa, H. Saito, M. Takaku, H. Terai, M. Miwa,             ACM International Conference on Web Search and
     and N. Kando. Using a concept map to evaluate                 Data Mining, pages 132–141. ACM, 2009.
     exploratory search. In Proceedings of the third          [15] R. W. White and R. A. Roth. Exploratory search:
     symposium on Information interaction in context,              Beyond the query-response paradigm. Synthesis
     pages 175–184. ACM, 2010.                                     Lectures on Information Concepts, Retrieval, and
 [2] C. Eickhoff, S. Dungs, and V. Tran. An eye-tracking           Services, 1(1):1–98, 2009.
     study of query reformulation. In Proceedings of the
     38th International ACM SIGIR Conference on
     Research and Development in Information Retrieval,
     pages 13–22. ACM, 2015.
 [3] C. Eickhoff, J. Teevan, R. White, and S. Dumais.
     Lessons from the journey: A query log analysis of
     within-session learning. In Proceedings of the 7th ACM
     international conference on Web search and data
     mining, pages 223–232. ACM, 2014.
 [4] J. Huang and E. N. Efthimiadis. Analyzing and
     evaluating query reformulation strategies in web
     search logs. In Proceedings of the 18th ACM
     conference on Information and knowledge
     management, pages 77–86. ACM, 2009.
 [5] J. Jiang and C. Ni. What affects word changes in
     query reformulation during a task-based search
     session? In Proceedings of the 2016 ACM on
     Conference on Human Information Interaction and
     Retrieval, pages 111–120. ACM, 2016.
 [6] D. Kelly. Methods for evaluating interactive
     information retrieval systems with users. Foundations
     and Trends in Information Retrieval, 3(1–2):1–224,
     2009.
 [7] D. Kelly, J. Arguello, A. Edwards, and W.-c. Wu.
     Development and evaluation of search tasks for iir
     experiments using a cognitive complexity framework.
     In Proceedings of the 2015 International Conference
     on The Theory of Information Retrieval, pages
     101–110. ACM, 2015.
 [8] Y. Liu, Y. Chen, J. Tang, J. Sun, M. Zhang, S. Ma,
     and X. Zhu. Different users, different opinions:
     Predicting search satisfaction with mouse movement
     information. In Proceedings of the 38th International
     ACM SIGIR Conference on Research and