<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Van Den Broek, E.L., Kisters, P.M., Vuurpijl, L.G.: Content-based image retrieval benchmarking: Utilizing
color categories and color distributions. Journal of imaging science and technology</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Validating the importance of work tasks as context for professional search</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Egon L. van den Broek</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>vandenbroek@acm.org</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Freudenthal Institute, Utrecht University</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Information and Computing Sciences, Utrecht University</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Thomas Schoegje</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2005</year>
      </pub-date>
      <volume>49</volume>
      <issue>3</issue>
      <fpage>24</fpage>
      <lpage>28</lpage>
      <abstract>
        <p>In professional search many work tasks share some structure and are likely to recur. We argue that retrieval in the context of such tasks can exploit prior knowledge about these tasks to serve more relevant results. Understanding why someone searches can help to distill and express the user's information desires. In order to validate the value of this approach, we asked users to judge search results with and without a new search filter that narrowed down the results to documents relevant to the work tasks. Initial results suggest the importance of exploiting work tasks as context for professional search, although future work should consider the extent of this effect.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Copyright © by the paper's authors. Copying permitted for private and academic purposes.</p>
      <p>[...] often shape typical information desires, and that enabling users to express these desires more accurately will
better support them in their tasks.</p>
      <p>Section 2 opens by reviewing (authentic) work tasks as concepts. Afterwards, we introduce the work tasks
investigated in their context at the municipality of Utrecht in Section 3. In Section 4 we describe an experiment
to quantify the benefits of filtering documents based on their originating work tasks. Finally, we present our
conclusions in Section 5 and expand on our plans for future work in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>Authentic work tasks</title>
      <p>
        Work tasks are defined as concrete sections of time that include actions towards a goal, the task outcome (e.g.
handling email) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. A work task may include multiple search tasks, which are sub-tasks that include one or more
queries towards an information need. Although work tasks will rarely be identical, many of them will share
characteristic aspects. This is especially the case for professional search, as work tasks are more recurring and
structurally defined than the information desires in typical web search.
      </p>
      <p>
        Some work tasks are shared between users within a team or between users with similar roles in different teams.
Others, such as reviewing recent information, are very common. The importance of one such ubiquitous work
task, reviewing recent documents, has been quantified [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. There it was shown that the number of times any
document was accessed in their professional setting followed a logarithmic function.
      </p>
      <p>
        The actual search process itself also affects the user's information desires [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. The context of one's stage in
the information seeking process could also be used to refine results.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Investigating work tasks</title>
      <p>The work tasks investigated are performed within the municipality of Utrecht in the Netherlands. This is an
organization where diverse teams look up information with diverse goals during their work tasks. This diversity
is ideal for investigating the types and nature of work tasks, as well as their effects on user interactions with the
information systems. Our experiment aims to show that work tasks can often shape typical information desires,
and that embedding searches with this context information will better support users in their tasks. First, we
identify typical information desires through exploratory interviews with a focus on the users' work tasks and
the difficulties they have in completing these. We then provide users with a filter that lets them express their
work task context in a search interface. This filter is based on meta-data that can be algorithmically annotated
(i.e. the documents can be classified and filtered by class). In order to validate the perceived impact of the new
interface we perform an experiment including a dummy filter that does not function as intended but instead
randomly filters out documents.</p>
      <sec id="sec-3-1">
        <title>Analysis of work tasks</title>
        <p>
          In our experiment we aim to support policy makers from diverse domains. Although their specific approach
may vary per domain and individual, they experienced similar challenges in retrieving information. The first
key challenge was retrieving specific documents that were stored without descriptive titles in a semi-structured
archive. In this case users tended to resort to systems other than the intended interface to find the documents (in
particular by checking email attachments and by asking colleagues). The second key challenge was sorting through
a large volume of retrieved documents unrelated to the intended domain. The first challenge was addressed by
allowing users to easily access recently viewed documents (as suggested in the literature [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]). The current focus
is on the second challenge: allowing users to more accurately express their information desires in a way that
reduces the number of unrelated results.
        </p>
        <p>Based on work tasks, two promising document categorizations were identified through explorative interviews.
The first was to consider the information needs at the various (global) stages of policy making. Here, different
teams continue building on the same information as the documents produced get less explorative and increasingly
specific. The second was to focus on the types of communication between these teams, as a policy maker is
required to produce or search for specific types of documents. Whereas the first case groups the work tasks
within a team, the second case groups on the communication tasks that policy makers from multiple teams
might encounter. These two categorizations will be discussed in order.</p>
        <p>The first categorization considers the global policy making process. There are three main steps involving
different teams of policy makers: gathering information, forming policy proposals and adapting them. The forming
of policy proposals is further separated by domain, with two significant ones shown in Table 1.</p>
        <sec id="sec-3-1-1">
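          <table-wrap id="tab1">
            <label>Table 1</label>
            <table>
              <thead>
                <tr><th>Step</th></tr>
              </thead>
              <tbody>
                <tr><td>Information gathering</td></tr>
                <tr><td>Debating (domain: city and space)</td></tr>
                <tr><td>Debating (domain: man and society)</td></tr>
                <tr><td>Adapting</td></tr>
              </tbody>
            </table>
          </table-wrap>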
          <p>The second categorization is based on the policy documents communicated between teams, which can generally
be divided into five categories based on their purpose. They are the products of mutually exclusive work tasks,
and their purpose is summarized in Table 2.</p>
          <p>Our initial solution to reduce the number of irrelevant search results is to allow users to filter on such a
categorization based on meta-data. This annotation was approached manually for new documents, where a
new interface was introduced to aid employees in selecting one of five appropriate templates (corresponding to
these classes). Although this ensures accurate annotation of new documents, users also want to search older
documents. This can be approached by exploiting the file location and document title where possible, and using
classification for the remaining cases. The algorithmic classification is out of the scope of this paper. This avoids
inaccuracies due to classification errors for the remainder of the paper. An experiment will now follow to test
the value of this work-task-based approach, where classification was verified manually.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Empirical study</title>
      <p>An empirical study was set up to verify whether filters in the search interface help the user to better express
the information desire by expressing the context of the work tasks. Users were asked to judge how well a set of
results fulfilled a specified information desire, both with and without the filter. In order to test the presence of
a placebo effect on the new filter (which was created as they hypothesized it would improve performance), we
also introduce a second, placebo filter which filters on a random subtype.</p>
      <sec id="sec-4-1">
        <title>Materials</title>
        <p>
          Based on the categorization of the policy making process (see Table 1), there are 4 document types. For each
of these, 2 search queries were formulated by a user familiar with the system, resulting in 8 queries. They were
chosen to represent authentic search tasks. The actual query used is hidden, and the user is instead presented
with a search question that represents the underlying information desire. Although such verbose questions yield poor
results when used as the query [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], these authentic search tasks reflect examples where users were interested in
filtering on the categories. Using each of the 8 queries, results were retrieved under the following three conditions:
        </p>
        <sec id="sec-4-1-1">
          <list list-type="order">
            <list-item><p>TextSearch: a full-text search using the queries.</p></list-item>
            <list-item><p>FilterSearch: the same search, but filtering out all results of the incorrect document types.</p></list-item>
            <list-item><p>PlaceboSearch: appears identical to FilterSearch, but instead of filtering on the desired (and indicated) document type it filters on another document type.</p></list-item>
          </list>
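          <p>The three conditions can be sketched in a few lines of Python. This is a minimal illustration rather than the system used in the study: the Result record, its doc_type field and the function names are hypothetical, and the full-text results are assumed to be already retrieved.</p>
          <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical search result carrying a document-type annotation."""
    title: str
    doc_type: str  # assumed meta-data field, e.g. "Information gathering"


def text_search(results):
    """TextSearch: return the full-text results unchanged."""
    return list(results)


def filter_search(results, desired_type):
    """FilterSearch: keep only results of the desired document type."""
    return [r for r in results if r.doc_type == desired_type]


def placebo_search(results, desired_type, placebo_type):
    """PlaceboSearch: filter on a type other than the one indicated."""
    if placebo_type == desired_type:
        raise ValueError("placebo type must differ from the desired type")
    return [r for r in results if r.doc_type == placebo_type]
```
          </preformat>
          <p>In this sketch the placebo condition differs from the proper filter only in the document type it selects, so both conditions can be shown through the same interface.</p>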
          <p>This document categorization was chosen over the alternative previously introduced because, with the alternative, the placebo filter would
be more obvious (the user would recognize that a memo is not actually a letter). The current PlaceboSearch
displays documents that were written for a different purpose or domain, but on the same topic.</p>
          <p>Five female and four male participants were asked to judge the results for the queries. They were presented with a static
search interface such as the one shown in Figure 1. Document type is indicated using the color and title. Users
were asked to highlight any relevant results by clicking on them (similar to previous work in gathering subjective
opinions [8]). Then they were asked to indicate an overall rating of the results on a (Likert) scale of 1 to 5. They
could do so at their own tempo, before proceeding to the next set of results. In order to avoid the possibility
of users forgetting to answer one of these questions, users were asked to confirm when they found no useful
results. The time it took to answer each question (in seconds) was also stored, and the experiment took around
30 minutes. After two practice screens, a total of 24 sets of results were presented. The order was randomized
by shuffling the order in which the filter types were presented, and then presenting the 8 questions per filter type
in a random order per user. This was done so the user could get accustomed to the different filters.</p>
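          <p>The blocked randomization described above can be sketched as follows. This is an illustrative reading of the procedure, not the code used in the experiment: the function name, the rng parameter and the numbering of questions from 1 to 8 are assumptions.</p>
          <preformat>
```python
import random

FILTER_TYPES = ("TextSearch", "FilterSearch", "PlaceboSearch")


def presentation_order(n_questions=8, rng=None):
    """Return a list of (filter type, question number) pairs for one user.

    The filter types are shuffled first, and the questions are then
    shuffled separately within each filter-type block.
    """
    if rng is None:
        rng = random.Random()
    filters = list(FILTER_TYPES)
    rng.shuffle(filters)
    order = []
    for filter_type in filters:
        questions = list(range(1, n_questions + 1))
        rng.shuffle(questions)
        order.extend((filter_type, q) for q in questions)
    return order
```
          </preformat>
          <p>Calling presentation_order() once per participant yields 24 result sets in three contiguous blocks of 8, which keeps each filter type together so that the user can get accustomed to it.</p>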
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>Results</title>
        <p>Table 3 shows basic descriptors of how users evaluated the various search engines. Having the proper filter
outperform the others in every aspect encourages further study, although further statistical analysis is required.
An ANOVA was performed using the mean Likert scores for each set of results shown (8 questions × 3 filters),
with the search engine as the independent variable. The result fell just short of statistical significance at the .05 level (F(2,27)
= 3.453, p = .0505), although the effect size was fairly large (η² = 0.247). In combination with the confidence
intervals, this suggests that the p value would decrease below .05 if more participants were added to the study.</p>
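        <p>For readers less familiar with the analysis, a one-way ANOVA and the η² effect size can be computed from sums of squares as in the sketch below. It is a generic textbook computation under the assumption of one list of scores per search condition; the function name is hypothetical, and the example values in the usage note are made up and are not the study data.</p>
        <preformat>
```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for lists of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]

    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Residual variation inside each group.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared
```
        </preformat>
        <p>With the made-up groups [1, 2, 3], [2, 3, 4] and [3, 4, 5], the function returns F = 3.0 with 2 and 6 degrees of freedom and η² = 0.5. The p value additionally requires the F distribution, which this sketch leaves out.</p>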
        <p>The discrepancy where the TextSearch filter has a large number of results selected but a relatively low Likert
rating is likely because this version tended to show the same file preview multiple times (from different sources).
The reduced decision time might be related to presenting results that are obviously relevant, but may also be influenced by
the TextSearch filter including more documents with lengthy previews.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We noted the important role that the user's (work) tasks could play in helping users. Users perform queries within
the context of a work task, and this context can be used to better understand their information needs. This is
especially the case for professional search, as many work tasks have a more structured and recurring nature than
is often the case in web search. Results suggest the importance of work tasks as the context for professional
search, although future work should consider the extent of this effect.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>The authors gratefully acknowledge the participants for their time and Arjen de Vries for his comments on the
present work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Clarkson</surname>
            ,
            <given-names>K.L.</given-names>
          </string-name>
          :
          <article-title>Supporting the complex dynamics of the information seeking process</article-title>
          .
          <source>PhD thesis</source>
          , Radboud University, Nijmegen, Netherlands (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Dumais</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cutrell</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cadiz</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jancke</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sarin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Robbins</surname>
            ,
            <given-names>D.C.</given-names>
          </string-name>
          :
          <article-title>Stuff I've seen: a system for personal information retrieval and re-use</article-title>
          .
          <source>In: ACM SIGIR Forum</source>
          . Volume
          <volume>49</volume>
          , ACM (
          <year>2016</year>
          )
          <fpage>28</fpage>
          –
          <lpage>35</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Hoenkamp</surname>
            ,
            <given-names>E.C.</given-names>
          </string-name>
          :
          <article-title>About the 'compromised information need' and optimal interaction as quality measure for search interfaces</article-title>
          .
          <source>In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , ACM (
          <year>2015</year>
          )
          <fpage>835</fpage>
          –
          <lpage>838</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Croft</surname>
            ,
            <given-names>W.B.</given-names>
          </string-name>
          :
          <article-title>Automatic boolean query suggestion for professional search</article-title>
          .
          <source>In: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval</source>
          , ACM (
          <year>2011</year>
          )
          <fpage>825</fpage>
          –
          <lpage>834</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Koster</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seibert</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Seutter</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The phasar search engine</article-title>
          .
          <source>In: International Conference on Application of Natural Language to Information Systems</source>
          , Springer (
          <year>2006</year>
          )
          <fpage>141</fpage>
          –
          <lpage>152</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Saastamoinen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Järvelin</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>Queries in authentic work tasks: the effects of task type and complexity</article-title>
          .
          <source>Journal of Documentation</source>
          <volume>72</volume>
          (
          <issue>6</issue>
          ) (
          <year>2016</year>
          )
          <fpage>1114</fpage>
          –
          <lpage>1133</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Spink</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Multiple search sessions model of end-user behavior: An exploratory study</article-title>
          .
          <source>Journal of the American Society for Information Science</source>
          <volume>47</volume>
          (
          <issue>8</issue>
          ) (
          <year>1996</year>
          )
          <fpage>603</fpage>
          –
          <lpage>609</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>