<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Workshop on Supporting Complex Search Tasks, March</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Complex Search Task: How to Make a Phone Safe for a Child</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sophie Rutter</string-name>
          <email>sarutter1@sheffield.ac.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Verena Blinzler</string-name>
          <email>verena.blinzler@stud.uni-regensburg.de</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chaoyu Ye</string-name>
          <email>psxcy1@nottingham.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael B. Twidale</string-name>
          <email>twidale@illinois.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Max L. Wilson</string-name>
          <email>max.wilson@nottingham.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign</institution>
          ,
          <addr-line>Champaign, IL</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Mixed Reality Lab, University of Nottingham</institution>
          ,
          <addr-line>Nottingham</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>School of Information Science, University of Sheffield</institution>
          ,
          <addr-line>Sheffield</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Regensburg</institution>
          ,
          <addr-line>Regensburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <volume>11</volume>
      <issue>2017</issue>
      <abstract>
        <p>There are many factors in task design that might make a task 'complex': having multiple components, having multiple cross-dependent components, or involving comparison, evaluation, estimation, or learning. In this paper, we discuss a case study of a complex task that we may consider to be highly natural, a common concern for many people, and one that 'should' have a clear answer, but doesn't: how do you make a phone safe for a child? For this question, there is a lot of opinion online, many possibilities for actions, many variations in hardware and software, but ultimately no one clear and correct answer for everyday phone users. We found very few objective behaviours that separated people in terms of performance, but have instead begun to identify some successful tactics that are not directly linked to domain knowledge.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 INTRODUCTION</title>
      <p>
        Designing complex tasks for a user study is hard. Wildemuth and
Freund [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] synthesized the aspects of search tasks that might
make them exploratory in nature, which include: learning goals,
general topics, open-ended topics, multi-focus needs,
multifaceted needs, uncertain aims, ill-structured problems, and
being “not too easy”. Exploratory tasks will therefore involve long
and dynamic searching processes, which are accompanied by
other information and cognitive activities, such as analysing,
organizing, and decision making [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Choosing complex tasks to
embody some portion of these factors for user studies, and
indeed comparing the observed behaviours with those seen in
other user studies, is a non-trivial process. As a community, we
may have studied many complex tasks, but we are perhaps still a
long way from a comprehensive and discriminatory model of
how people solve different complex tasks, and thus how search
systems can support them.
      </p>
      <p>CHIIR 2017 Workshop on Supporting Complex Search Tasks, Oslo, Norway.
Copyright for the individual papers remains with the authors. Copying permitted
for private and academic purposes. This volume is published and copyrighted by
its editors. Published on CEUR-WS, Volume 1798, http://ceur-ws.org/Vol-1798/</p>
      <p>It is also hard to generate complex search tasks, especially
those to be performed on uncontrolled collections like the
World Wide Web, as much online content is user-generated
content that reflects people’s attempts at solving possibly similar
questions. This means that it is difficult to reuse previously
designed complex tasks, because the solutions often become
available online<sup>1</sup>; indeed, such pages could appear during the
data collection period of a study.</p>
      <p>
        This paper presents a case study of a task used in our
ongoing Search Literacy project, where our final task was the
product of many failed attempts at selecting a task that would
engage participants for more than ten minutes, without asking
them to ignore certain websites or informational pages. We
asked participants to find out how to make a phone safe for a
child (Figure 1) and below we review how this task fits into the
exploratory task facets identified by Wildemuth and Freund [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
We then review our initial findings as a means to discuss
approaches to evaluating performance in such complex tasks.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2 STUDY AND TASK CONTEXT</title>
      <p>The aim of our research was to study Search Literacy, and
how it affects searchers; we wanted to compare how competent
searchers attempted a task with how less competent searchers
attempted it. A person with good search literacy should
be able to resolve, for example, technology problems even when
they are out of their depth in the domain. The secondary ongoing aim,
therefore, is to find design recommendations that would help a
person to become more competent.</p>
      <p><sup>1</sup> Although interesting new complex search tasks are posted on Dan Russell’s
Search Research blog, many people often post their solutions.</p>
      <p>A friend of yours has recently bought a new phone
(the one provided here). Sometimes their child uses
the phone. Your friend has asked for your help.</p>
      <p>1) They do not want to unnecessarily restrict
their child from downloading Apps. They do, however,
want to ensure that they do not hear more than mild
bad language and they do not see any violence
directed towards humans. Is this possible? What
should they do?</p>
      <p>2) They would like to set up a separate profile for
their child but have been unable to do so. Why would
your friend find this difficult? Can you do this for
them? If not, why not?</p>
      <p>3) What else would you recommend your friend
do to make the phone safe for a child to use?</p>
      <p>Please help your friend by searching the Internet
(on the laptop provided) to find solutions. When you
have found a solution(s) you should implement these
on the phone provided.</p>
      <p>
        Our chosen task, shown in Figure 1, was to find out how to make
an Android phone safe for a child. The goal of the task was
primarily to learn about how best to make the phone safe
(although to simulate a real work task [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], they were asked to
make changes to an actual Samsung phone running the Lollipop
version of Android’s operating system). The task was general in
terms of the topic, and the task had multiple targets. The task
was also multi-faceted, in that there were three related sub-tasks
that could be achieved. One of the sub-tasks was open-ended in
that there was no one correct answer to find. Two of the
sub-tasks were more specific in that participants needed to find a
solution to a problem and make direct recommendations. A key
aspect of the overall task was uncertainty. Firstly, the
best way to make a phone safe for a child is widely discussed
online, as is general internet safety for children, and the views
are conflicting. Secondly, for the second sub-task much of the
information available online (at least at the time of the study)
was misleading. As such this task was also ill-formed, in that it
was based upon what a person might want to achieve, rather
than what can be achieved. The task was also successfully
dynamic and long, in that the majority of participants used all 20
minutes without completing all three parts of the task. The task,
therefore, was also certainly “not too easy”, especially in that
creating profiles in the second part of the task was not
achievable in that version of Android on that make of phone.
Finally, based upon Wildemuth and Freund’s factors [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], the
task involved many related information and cognitive activities,
including sensemaking, comparing, and decision making.
      </p>
      <p>Several versions of the task, as well as alternatives, were
trialled in pilot studies. Many of the alternatives that we tried,
such as how to set up a Chrome browser, all eventually turned
out to have instructional videos. A key factor in the success of our final
task is that it is a) debated fundamentally in terms of child
protection approaches, b) achievable in many ways (protection vs
prevention, etc.) and c) implemented differently for different
hardware and software versions. Crucially this meant that ready
solutions were not available.</p>
      <p>2.2 Participants and Protocol</p>
      <p>
        We recruited 39 participants using two strategies. Initially,
we recruited participants to “take part in a study about solving
technical problems using a search engine”. We then used a
self-assessment scale for search literacy and technical competence,
based upon the EU Digital Competence framework [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. After
determining that our initial sample of participants had mostly
high search literacy and high tech domain knowledge, we later
recruited people with posters asking ‘do you ask other people to
solve your tech problems?’. Consequently, we obtained a mix of
participants: 17 had high search literacy and high domain
knowledge (HH), 13 had high search literacy and low domain
knowledge (HL), and 9 had low search literacy and low domain
knowledge (LL). Because of the chosen domain, however, we did
not have any participants that we classified as low search
literacy and high domain knowledge (LH), implying that having
good “tech knowledge” came along with higher search literacy in
our sample. We also later classified people as being successful at
different performance levels, described below, and gathered
information about other domains of knowledge, including
parenting and experience with different mobile phone platforms.
      </p>
      <p>
        After gathering informed consent, participants were
presented with the task in the form of a simulated work task [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
and given 20 minutes to make progress on it. We did not allocate
specific time periods to each sub-task, and so participants could
work towards the larger task by attempting the subtasks in any
order, or indeed in combination; finding advice for part 1 often
meant encountering information for part 3, for example. The
screen of the phone and the laptop were both recorded, and the
movement between them was recorded using a GoPro Hero 4
camera. The interaction with the Chrome browser was also
comprehensively logged using a custom extension<sup>2</sup>. After the
time was up, participants completed a short questionnaire,
before reviewing their laptop screen recording as part of a
post-task interview. This post-task interview allowed us to capture a
reflective cued-retrospective think aloud [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] of their search
processes and gain insight into their cognitive activities. The
browser and phone were reset between participants, to remove
revisitation indicators for subsequent participants. However, the
study was performed in a Computer Science department and so
the results could have been affected by our location.
      </p>
      <p>The study was approved by the school’s ethics board, and
participants received a £10 Amazon voucher as remuneration for
their time. Although the task was not entirely achievable, no
participants exhibited signs of distress at being unable to
complete the task. In fact, most participants were
so enthusiastically engaged that they did not want to stop
searching after the allotted time. Most believed that child
safety was so important that the information should be clearly
available, and if anything were frustrated that it was not.</p>
      <p><sup>2</sup> https://github.com/kelvinye/ChromeExtensionForWebData</p>
      <p>We broke participants’ performance into three levels, based on a
point-rating given to all three sub-tasks. The three groups are
typically (but not exclusively) characterized by their resolution
of the second part: 1) those that were unable to find basic
information, including the location of the phone’s settings (N=9),
2) those that thought they had completed the second part, but
had an incorrect solution (N=21), and 3) those that correctly
concluded that part two wasn’t possible (N=9). Participants in
the top group also tended to make more than one
recommendation in part three of the task.</p>
      <p>Objective Behavioural Differences</p>
      <p>
        We examined many metrics of search behaviour, from number of
queries and page views, to average query length, speed of
interactions, and dwell time. We found very few differences
between participants, when broken down either by performance
or by self-assessed search literacy and domain knowledge. In
fact, much of the time-based data was affected by the participant
interacting with the phone as they tested the information they had found.
Long periods of dwell time were not because participants were
reading results, but because they were testing them. Indeed, longer dwell times
associated with good searching techniques [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] were often more
evident in participants that struggled with the task. Two further
activities are considered to be good searching techniques:
evaluating search results and thus clicking deeper in the SERP.
However, we saw that the most effective participants clicked
very quickly on top results only, without examining the source,
whilst our initial results indicate that low-performing
participants clicked deeper in the search results. In interviews,
high-performing participants indicated that they simply trusted
the search engine to put reputable results on the top, but judged
the utility of the result after clicking on them. We plan to release
the logged behaviour data as part of a dataset in the future.
      </p>
      <p>3.2 Search Process and Tactical Differences</p>
      <p>
        Overall, the majority of differences that we saw between high-
and low-performing participants were more to do with their
search process and use of different tactics. Based on Bates’
search tactics [
        <xref ref-type="bibr" rid="ref2 ref3">2,3</xref>
        ] and Barry &amp; Schamber’s relevance criteria
[
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], we qualitatively analysed the post-task interviews to
evaluate the tactics that participants used to solve the task. We
found that participants used tactics to (1) manage the task: the
tactics that are used to answer the tasks and manage the search
process; (2) control the search: the moves made to direct what
information is received and to manage information across
multiple devices; and (3) evaluate and use information: the
tactics that participants use to select objects. For each of these
areas of concern, participants had tactics that they could use in
isolation or in combination to progress the search towards the
resolution of task problems.
      </p>
      <p>
        Although we are still finishing this analysis, early results
indicate that there are tactics associated with domain knowledge
and tactics associated with good performance, and that the two
are not an exact match. Different tactics are
available depending on the domain knowledge of the participant.
For example, when selecting search results, those with more tech
knowledge evaluated the date field in the snippet because they
were aware that information about technology quickly dates.
However, this tactic did not necessarily improve performance.
An example of a tactic that is associated with good performance,
rather than domain knowledge, was narrowing the query early
in the process. By including information about the phone (e.g.
model, make) the results returned were more specific to the
task. This tactic was used by high performers, including those
with high and low domain knowledge.
      </p>
      <p>
        In this study, we set users a very complex search task, involving
a general problem that was multi-faceted, and where the
solutions were not easily recognizable. Participants had to
engage in information and cognitive activities during the task, as
well as interacting with and testing solutions on a physical
phone in between searching. Overall, we found that objective log
data was not the best source for evaluating the open-ended,
dynamic, and extended periods of searching involved in
resolving complex search tasks. Instead, we were able to evaluate
the searching from the tactics that participants employed. Use of
different tactics made the largest difference in task performance.
We conclude that striving to convert logged behaviour into
tactics (e.g. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]) is important future work for evaluating complex
search tasks. Further, we expect that future search user
interfaces should a) encourage participants to move between
more and less specific searches when important, and b) help
searchers to identify key concepts in results and perform
secondary searches about them.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Barry</surname>
            ,
            <given-names>C.L.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Schamber</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <year>1998</year>
          .
          <article-title>Users' criteria for relevance evaluation: a cross-situational comparison</article-title>
          .
          <source>IP&amp;M</source>
          ,
          <volume>34</volume>
          (
          <issue>2-3</issue>
          ), pp.
          <fpage>219</fpage>
          -
          <lpage>236</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Bates</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <year>1979</year>
          .
          <article-title>Idea tactics</article-title>
          .
          <source>JASIST</source>
          ,
          <volume>30</volume>
          (
          <issue>5</issue>
          ), pp.
          <fpage>280</fpage>
          -
          <lpage>289</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Bates</surname>
            ,
            <given-names>M.J.</given-names>
          </string-name>
          ,
          <year>1979</year>
          .
          <article-title>Information search tactics</article-title>
          .
          <source>JASIST</source>
          ,
          <volume>30</volume>
          (
          <issue>4</issue>
          ), pp.
          <fpage>205</fpage>
          -
          <lpage>214</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Borlund</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Ingwersen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <year>1997</year>
          .
          <article-title>The development of a method for the evaluation of interactive information retrieval systems</article-title>
          .
          <source>JDOC</source>
          ,
          <volume>53</volume>
          (
          <issue>3</issue>
          ), pp.
          <fpage>225</fpage>
          -
          <lpage>250</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Ferrari</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <year>2013</year>
          .
          <article-title>DIGCOMP: A framework for developing and understanding digital competence in Europe.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>He</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Qvarfordt</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Halvey</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Golovchinsky</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <year>2016</year>
          .
          <article-title>Beyond actions: Exploring the discovery of tactics from user logs</article-title>
          .
          <source>IP&amp;M</source>
          <volume>52</volume>
          (
          <issue>6</issue>
          ), pp.
          <fpage>1200</fpage>
          -
          <lpage>1226</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Laxman</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <year>2010</year>
          .
          <article-title>A conceptual framework mapping the application of information search strategies to well and ill-structured problem solving</article-title>
          .
          <source>Computers &amp; Education</source>
          ,
          <volume>55</volume>
          (
          <issue>2</issue>
          ), pp.
          <fpage>513</fpage>
          -
          <lpage>526</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Vakkari</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luoma</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Pöntinen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <year>2014</year>
          .
          <article-title>Books' interest grading and dwell time in metadata in selecting fiction</article-title>
          .
          <source>In Proc. IIiX'14</source>
          (pp.
          <fpage>28</fpage>
          -
          <lpage>37</lpage>
          ). ACM.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Van Gog</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paas</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Van Merriënboer</surname>
            ,
            <given-names>J.J.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Witte</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <year>2005</year>
          .
          <article-title>Uncovering the problem-solving process: Cued retrospective reporting versus concurrent and retrospective reporting</article-title>
          .
          <source>Journal of Experimental Psychology: Applied</source>
          ,
          <volume>11</volume>
          (
          <issue>4</issue>
          ), p.
          <fpage>237</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Wildemuth</surname>
            ,
            <given-names>B.M.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Freund</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <year>2012</year>
          .
          <article-title>Assigning search tasks designed to elicit exploratory search behaviors</article-title>
          .
          <source>In Proc. HCIR'12, Article</source>
          <volume>4</volume>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>