Search Literacy: Learning to Search to Learn

Max L. Wilson¹, Chaoyu Ye¹, Michael B. Twidale², Hannah Grasse², Jacob Rosenthal², Max McKittrick²
¹ Mixed Reality Lab, School of Computer Science, University of Nottingham, UK
² School of Information Sciences, University of Illinois at Urbana-Champaign, USA
[max.wilson, psxcy1]@nottingham.ac.uk

ABSTRACT
People can often find themselves out of their depth when they face knowledge-based problems, such as faulty technology or medical concerns. This can also happen in everyday domains that users are simply inexperienced with, like cooking. These are common exploratory search conditions, where users don't quite know enough about the domain to know if they are submitting a good query, nor whether the results directly resolve their need or can be translated to do so. In such situations, people turn to their friends for help, or to forums like StackOverflow, so that someone can explain things to them and translate information to their specific need. This short paper describes work-in-progress within a Google-funded project focusing on Search Literacy in these situations, where improved search skills will help users to learn as they search, to search better, and to better comprehend the results. Focusing on the technology-problem domain, we present initial results from a qualitative study of questions asked and answers given on StackOverflow, and present plans for designing search engine support to help searchers learn as they search.

CCS Concepts
• Information systems → Information retrieval → Users and interactive retrieval → Search Interfaces
• Information systems → World Wide Web → Web searching and information discovery

Keywords
Search Literacy, Search User Interfaces, Information Seeking.

SAL 2016, July 21, 2016, Pisa, Italy. Copyright for this paper remains with the authors. Copying permitted for private and academic purposes.
1. INTRODUCTION
While there are many facets that create different kinds of exploratory search situations [18], and even less task-oriented casual-leisure situations [6], Exploratory Search was originally characterized as occurring when users are: 1) unfamiliar with their domain, 2) unsure of which words to use, and 3) unable to judge the usefulness of results [17]. This work aims to study how Search Literacy helps people make progress within such confusing search situations. To do this, we focus on searchers trying to solve "tech problems", where they are likely to experience all three Exploratory Search characterizations: they don't really understand the technology and may not really know what the underlying problem is; they may not know the correct terminology to describe the problem or to search for a solution; and they find it hard to judge whether results are relevant. Indeed, they may struggle to find a result that explains the solution in a way that they can understand without doing yet more searches.

In the "tech problems" case study domain, we see examples of domain-novice users struggling to comprehend technical jargon and to judge whether results will help them sort out their problems, but we also see examples of domain-expert users who fully understand the technical jargon, yet are synthesizing or diagnosing more complex or combined technical problems, and are seeking more specific, specialized knowledge. Our research is driven by the observation that the behavioral difference between techies and novices solving these problems is that techies use search skills, associated with higher search literacy [7, 15], to resolve the situation: e.g. when they encounter something they don't understand, they resolve the new information need with supplementary searches.

Regardless of domain expertise, research indicates that searching and learning are often closely interleaved [16]. A person may choose to learn about a technology, tinker with it, get stuck, search online for help, and find a resource (such as a tutorial, blogpost or how-to video), or ask for help in a forum. This can lead to either a solution or further learning goals. As well as searching-as-part-of-learning, a person may also be learning-as-part-of-searching: learning better search skills and information literacy, but also, in technical areas, learning how to debug a problem better, how to isolate the cause of the failure they have encountered, and how to better diagnose a technological impasse, or their understanding of that impasse. This project, therefore, aims to observe strong searching skills in order to design new Search User Interface features [19] that encourage search novices to learn and improve their search literacy.

2. INITIAL STUDY
To examine experiences of solving tech problems, we looked at the questions asked and answers given on StackOverflow (SO), a social collaborative Q&A site (SQA) for technical questions [e.g. 13]. The aim was to look at a venue where technical questions were asked and answered, often quite complex ones, by technically sophisticated question askers and answerers (although also including some novices). We wanted to get a better understanding of what it takes to ask good questions and obtain good answers in a social collaborative setting. However, our main focus was less on "how does SO work so well?" and more on the lessons and ideas it might inspire when thinking about how a search engine could help with searching for technical answers. Seeing how humans do it well can be informative, even if we cannot, or do not want to, directly translate the methods to a search agent.

As part of a larger Trace Ethnography [8] investigation, we first looked at our small sample of SO questions from the perspective of the features that seem to be part of what makes a good question in this setting. We then compared them with what we see in a generic search engine such as Google. Finally, we compared social question asking with the well-established and well-documented case of reference librarianship, where a designated professional tries to help people with all kinds of questions. These analyses are informing ongoing design ideas for a better search engine, and we note some preliminary ideas below.

2.1 Methods
We selected sixty-four questions from Stack Overflow. Special attention was paid to questions that had garnered responses from other users. The topics of the questions varied, but they all related in some way to programming in a range of languages. Topicality was limited to questions that the research team had prior knowledge about (so that we could analyze the discussion).

Based on an emergent thematic analysis approach, the questions were coded from three main perspectives (sketched as a data structure after this list):

• Aspects of the question. These include points informally characterized as: How do I?, Is this possible?, My main goal is X, and so I am trying to do Y, In particular, what I really want is…, Please recommend X, Why is this the case?, Here's a weird thing, Is X even possible?, etc.

• Supplemental information. This includes specific examples, code fragments, URLs and images, as well as what might be termed "due diligence": what the question asker has already tried, how that failed, places looked for information, etc.

• Tone of the question. This includes how the question was asked, the formality of phrasing, whether it was more of a narrative or particularly focussed, whether the asker identified their background as a newbie or expert, and issues of question-asking etiquette.
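To make the coding scheme concrete, here is a minimal illustrative sketch of how one coded question could be represented as a data structure. All type and field names are our own hypothetical choices mirroring the codebook; they are not part of StackOverflow or of any analysis tool we used.

```typescript
// Illustrative model of the three coding perspectives described above.
// Names are hypothetical: they mirror our codebook, not any SO API.

type QuestionAspect =
  | "how-do-i" | "is-this-possible" | "main-goal-x-trying-y"
  | "what-i-really-want-is" | "please-recommend"
  | "why-is-this-the-case" | "heres-a-weird-thing";

interface SupplementalInformation {
  codeFragments: boolean;  // did the asker include code?
  urls: boolean;           // links to docs, errors, or prior discussion
  images: boolean;         // screenshots of the problem
  dueDiligence: string[];  // e.g. "tried X", "searched for Y"
}

interface QuestionTone {
  narrative: boolean;                         // story-like vs. tersely focussed
  formality: "formal" | "informal";
  selfIdentifiedLevel?: "newbie" | "expert";  // stated techiness, if any
  etiquetteIssues: string[];
}

// One coded StackOverflow question in our sample of sixty-four.
interface CodedQuestion {
  id: number;
  aspects: QuestionAspect[];
  supplemental: SupplementalInformation;
  tone: QuestionTone;
}
```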
2.2 Selected findings from question analysis
Both search engine use and SQA have certain features in common: 1) an initial query; 2) results, ranked somehow; 3) selection of those to attend to; 4) assessment of quality and relevance; 5) query refinement and iteration. With SO, however, there is of course a human in the loop; indeed several humans, at each stage of the Information Seeking Process (ISP). This makes aspects of the ISP much more visible, and helps us to consider not just how it operates in this particular case, but how it might operate otherwise in other cases. Although we do not have space to present our three full taxonomies in this paper, we now present several initial findings.

There are various features that seem to help make a good question: one that is easier for others to answer, and indeed invites others to answer. These features are articulated in various kinds of advice given about best practices on SO, and are embodied in an established etiquette of SO usage. Examples of features that make a good question include context: the technical setup that the person was using, and the overall goal of what the person was actually trying to do that led to the particular question being asked. Some kind of due diligence information is also common. This can include what the asker tried in order to solve the problem herself (and what resulted, and why it was not helpful, thereby necessitating this help request), mention of search attempts to try and find an answer, and diagnostic activities to try and simplify down the initial problem to isolate the underlying cause.

One feature we found particularly interesting was not merely question refinement as an iterative activity (analogous to iterated queries in a search engine) but also question editing, both by the original asker and by other participants. The formulation of the question is considered important work in SO, and not to be done in a sloppy manner. The construction of the question title and the choice of tags need particular care, in order to catch the eye and interest of potential question answerers. These findings are similar to related work [10].

Question askers may occasionally answer their own question, and then take the time to report that back to SO. This, and the previous point about question phrasing, remind us that SO can also be seen in terms of knowledge management, including how answered questions can potentially be reused by others, whereas we usually think of queries into a search engine as use-once disposable activities. This insight reinforces the value of tracking user journeys of understanding [3], and might inspire developments in ideas like the retired SearchPad [5].

Taken together, these points serve to remind us that posing a question well is a learnable skill. Learning it is desirable to increase the odds of getting a good response from SO members. It can also help in understanding your own current problem, to the extent that you may be able to solve your problem yourself while you are waiting for a solution from others. Furthermore, learning how to form good questions may help in future tech problems as part of an armory of metacognitive strategies. This is similar to academic research, where asking the right question, or the question in the right way, is a critical part of the research process, and one to be taught to new researchers.

SO has various affordances for learning how to ask good questions. You can learn vicariously by seeing other people's questions as exemplars. Your own question can lead to follow-up clarification questions, or even to having your question edited by others, as a more immediate indicator of what you should have done and should think about doing next time. The voting mechanism also provides an indication of collective views of what makes a good question, including seeing your own question being up-voted as it is improved. In future work we want to think about how search interfaces (typically rather solitary places) might also facilitate the kinds of learning that occur almost spontaneously in SQA, given that we suspect SO was designed and developed with much more attention to giving answers than to explicitly facilitating these kinds of learning.

Finally, some question askers explicitly note their level of technical sophistication as a way of indicating the kind of answer they need or would appreciate. An expert can typically manage with a far more terse, abstract and technical answer than a novice. However, expertise is not a single scale. A person may be technically adept but currently asking about a problem with JavaScript, having never used it before. Although these levels of 'techiness' may be explicitly stated, they can also be inferred from the way the question is phrased, and it seems that this is taken into account in the answers given.
2.3 SQA vs. Reference Librarianship
We re-coded our sample of questions and answers for features identifying similarities to, and differences from, what is seen in reference librarianship (RL). For this we used the expertise of one of the authors, who works as a reference librarian and has studied its theory and practice, e.g. [11]. There were considerably more similarities than differences (twice as many coded items were similarities). Librarians use a combination of both hard and soft skills, and this creates an interesting lens through which to look at what happens in SO. Similarities included numerous instruction-like how-to operations (rather than simply giving a factual answer, known in RL as 'ready reference'). We also found considerable rapport-building communication, which is a core aspect of the reference interview (RI), despite the fact that SO guidelines discourage this kind of communication.

Half the threads included some kind of clarification question, a recurrent aspect of the RI, where it is often about trying to understand the underlying goals of what the patron really wants to do, and how that may contrast with what they have currently actually asked. This is something that basic search engines do not support, and with good reason: it is extremely hard to do well and requires considerable sensitivity on the part of the RL, because sometimes the patron may not want to answer questions like "Why are you asking about that?", "What do you want to do with it?", or "What are you actually trying to do?".

As noted, due diligence activities occur as ways of noting what the asker has tried so far. Other categories include whether the answer was complex or a simple fact-based answer, and whether external resources (documents or people) were pointed to. Complex answers can include pointers to resources that enable additional kinds of learning. They may include the answer to the particular question asked, but also additional insights, larger framings and generalizations. This kind of enrichment learning/teaching is a feature of reference interactions that are, in that context, termed 'research' rather than 'ready reference'.

Reference librarians are discouraged from offering judgments or opinions; instead they are meant to focus on 'just the facts' and on providing access to information resources. Similarly, SO guidelines discourage requests for opinions and recommendations. Despite this, we do find a significant minority of recommendation requests, especially for choosing between alternative approaches.

3. PROPOSED DESIGNS
The aim of the first period of work, described above, was to create design implications. Although the results and design implications are still underway, we present three initial designs below.

3.1 Design 1: Elaborating the Detail
The common technique seen in 'good questions' on SO is to make sure important contextual information is provided upfront. This helps answerers estimate Common Ground in understanding [4]. Our first initial prototype explores the possibilities of a search interface identifying the question type and prompting for the detail needed to reach the right answer.

Figure 1: A key factor of strong questions in StackOverflow is to enter all the valuable detail to set the context of the problem

As indicated in Figure 1, browsers could auto-detect some of this information and offer it to users. For example, the browser can detect the operating system and version being used. This can only be indicative, since searchers may be searching for tech problems experienced on other devices. The key challenge here is to iteratively discover the correct boxes to suggest, which becomes harder once we consider supporting many different search problems. In some kinds of remote library reference, patrons are asked to fill out an online form, which can help to structure the interaction, and at least help the patron to consider providing information that may help the librarian give the best advice.
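As a minimal sketch of the auto-detection idea, a page script can already infer the likely operating system from the browser's user agent string and offer it as an editable suggestion. The detectOS helper and the "#context-os" form field are hypothetical names of our own, not part of any existing interface.

```typescript
// Sketch: pre-fill an (assumed) "my setup" context box from what the
// browser can observe. Indicative only: the user may be asking about
// a different device, so the suggestion must remain editable.

function detectOS(userAgent: string): string {
  const mac = userAgent.match(/Mac OS X ([\d_]+)/);
  if (mac) return "macOS " + mac[1].replace(/_/g, ".");
  const android = userAgent.match(/Android ([\d.]+)/);
  if (android) return "Android " + android[1];
  if (/Windows NT 10\.0/.test(userAgent)) return "Windows 10";
  if (/Windows NT/.test(userAgent)) return "Windows (older version)";
  if (/Linux/.test(userAgent)) return "Linux";
  return "unknown OS";
}

// "#context-os" is a hypothetical field in our prototype's context form.
const contextBox = document.querySelector<HTMLInputElement>("#context-os");
if (contextBox && !contextBox.value) {
  contextBox.value = detectOS(navigator.userAgent);
}
```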
3.2 Design 2: Eliciting through Dialogue
With the increasing predominance of spoken interfaces like Google Now, Siri, and Cortana, spoken search [9] presents an opportunity for providing tech support through dialogue. Although we have come a long way since automated support like Clippy, there are still many open challenges with spoken search, like experiencing an error midway through a multi-stage interaction [12]. However, spoken search presents a new opportunity to learn from, and re-engage with, the ideas of Search Intermediaries [14] like reference librarians. Studies of Search Intermediaries led to dialogue-oriented systems in the 1990s [2], which may now have new relevance in spoken search. With dialogue-based interaction it becomes less critical for the question asker to provide all the contextual information up-front (as is desirable in SO), or indeed to know what that contextual information may be. Figure 2 shows a visual alternative to this scenario for non-spoken, on-screen dialogues.

Figure 2: For some StackOverflow questions, a back and forth conversation is needed to identify the right answer
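To illustrate how dialogue defers the burden of up-front context, here is a minimal sketch of a reference-interview-style slot-filling loop: the system asks only for whichever piece of context is still missing. The slot names and question wordings are our own invented examples, not drawn from any deployed spoken-search system.

```typescript
// Sketch: reference-interview-style slot filling. Instead of the asker
// front-loading all context (as good SO questions do), the system asks
// one clarifying question at a time for whatever is still unknown.

interface TechProblemContext {
  goal?: string;       // what the user is actually trying to achieve
  setup?: string;      // OS, browser, library versions, etc.
  attempted?: string;  // due diligence: what was already tried
}

const clarifyingQuestions: [keyof TechProblemContext, string][] = [
  ["goal", "What are you actually trying to do?"],
  ["setup", "What system or version are you seeing this on?"],
  ["attempted", "What have you tried so far?"],
];

// Returns the next question to ask, or null when enough context exists.
function nextClarification(ctx: TechProblemContext): string | null {
  for (const [slot, question] of clarifyingQuestions) {
    if (!ctx[slot]) return question;
  }
  return null; // all slots filled: ready to search with full context
}

// Example turn: the user has stated a goal but nothing else yet.
console.log(nextClarification({ goal: "stop my Node server crashing" }));
// -> "What system or version are you seeing this on?"
```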
3.3 Design 3: Exploring the Definitions
A key problem experienced by users is that they don't recognize whether items are object-specific or generic terms, and thus whether advice relates to their situation specifically, or only in principle. One way to improve search literacy would be to help users to interrogate key words or phrases. These might be used to find out more about query terms, or about terms in the SERPs, supporting what Bates called the TRACE tactic [1]. As shown in Figure 3, a key element of this idea might be to set or vary the existing knowledge of the user, which would help with the 'levels of techiness' problem, where those levels affect the kind of answer that you may want. Another key element of this idea is to remain within the context of the search, but be able to interrogate and explore concepts returned in the results, rather than queries, in situ, to develop confidence in the results.

Figure 3: Sometimes users do not understand the jargon and its implications, in either the query or the results
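As one way such in-situ interrogation might be prototyped (for instance via the Chrome extensions mentioned in the next section), the sketch below marks up glossary terms on a results page so that hovering reveals a short definition without leaving the search context. The jargonGlossary contents and the annotateJargon helper are placeholders of our own; a real prototype would query a glossary service and adapt to the user's stated level of techiness.

```typescript
// Sketch of a results-page content script: wrap known jargon terms so
// hovering reveals a short definition in situ. The term list here is a
// placeholder; a real prototype would query a glossary service.

const jargonGlossary: Record<string, string> = {
  "callback": "A function passed into another function, to be run later.",
  "DOM": "The browser's live, scriptable model of the current page.",
  "localhost": "This machine itself, reached over the network loopback.",
};

function annotateJargon(root: HTMLElement): void {
  const terms = Object.keys(jargonGlossary).join("|");
  const pattern = new RegExp(`\\b(${terms})\\b`, "g");

  // Collect text nodes first so we never mutate while walking.
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  const textNodes: Text[] = [];
  while (walker.nextNode()) textNodes.push(walker.currentNode as Text);

  for (const node of textNodes) {
    const text = node.textContent ?? "";
    pattern.lastIndex = 0; // reset the stateful global regex before testing
    if (!pattern.test(text)) continue;
    // Wrap each known term; a real prototype would escape the text first.
    const span = document.createElement("span");
    span.innerHTML = text.replace(
      pattern,
      (term) => `<mark title="${jargonGlossary[term]}">${term}</mark>`,
    );
    node.replaceWith(span);
  }
}

annotateJargon(document.body);
```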
4. CONCLUSIONS & FUTURE WORK
The aim of this on-going project is to investigate Search Literacy in situations where users become easily confused within their search domain. Solving technical problems leads to many kinds of interleaved search and learning. A person may be choosing to learn a technology and using search as part of their learning activities. Or they may be searching, hoping to find a simple fact-like answer, but along the way learn other things, including how to search better and how to go about learning other technologies better. They might struggle because they are domain novices, or they may even be domain experts who just lack the specific knowledge they need. It may be that technology-related problems have features that make certain kinds of search hard, not because technology problems are unique, but because they exacerbate issues seen in many other settings. These include multiple kinds of expertise, problems with not knowing terminology, confusing interactions with other technologies, and large amounts of prerequisite knowledge you may not yet have, or see as related.

Our on-going work has begun to produce three taxonomies of how users overcome these barriers by asking questions within the StackOverflow community. Initial analysis of these taxonomies has led to the identification of a few initial design ideas, presented above, that might help users to improve their search literacy and make progress even in confusing circumstances.

Future work will focus on elaborating these design ideas and producing further ones. Refining these designs will be important, as keeping them within the familiar experience of Google will help us study their potential benefits. Initial work has also begun on building Chrome extensions to customize the appearance of Google pages, in order to deploy final prototypes with real users. Subsequent observational and experimental studies will attempt to investigate the benefits of each design idea.

5. ACKNOWLEDGMENTS
This work was funded by a Google Faculty Research Award (2015_R1_669), as a collaboration between Nottingham and Illinois on "Understanding Search Literacy and Search Skills Adoption".

6. REFERENCES
[1] Marcia J. Bates. (1979). Information search tactics. J. Am. Soc. Inf. Sci., 30: 205–214.
[2] Nicholas J. Belkin, Colleen Cool, Adelheit Stein, and Ulrich Thiel. (1995). Cases, scripts, and information-seeking strategies: On the design of interactive information retrieval systems. Expert Systems with Applications 9(3): 379–395.
[3] Mikhail Bilenko and Ryen W. White. (2008). Mining the search trails of surfing crowds: identifying relevant websites from user activity. In Proc. WWW '08. ACM, New York, NY, USA, 51–60.
[4] Herbert H. Clark and Susan E. Brennan. (1991). Grounding in communication. Perspectives on Socially Shared Cognition 13: 127–149.
[5] Debora Donato, Francesco Bonchi, Tom Chi, and Yoelle Maarek. (2010). Do you want to take notes?: identifying research missions in Yahoo! Search Pad. In Proc. WWW '10. ACM, New York, NY, USA, 321–330.
[6] David Elsweiler, Max L. Wilson, and Brian Kirkegaard Lunn. (2011). Chapter 9: Understanding Casual-Leisure Information Behaviour. In Amanda Spink and Jannica Heinström (eds.), New Directions in Information Behaviour. Emerald Group Publishing Limited, 211–241.
[7] Susan R. Goldman. (2010). Literacy in the digital world: Comprehending and learning from multiple sources. In M.G. McKeown and L. Kucan (eds.), Bringing Reading Research to Life. New York: Guilford, 257–284.
[8] R. Stuart Geiger and David Ribes. (2011). Trace Ethnography: Following Coordination through Documentary Practices. In Proc. System Sciences (HICSS '11), Kauai, HI, 1–10.
[9] Larry P. Heck, Dilek Hakkani-Tür, Madhu Chinthakunta, Gökhan Tür, Rukmini Iyer, Partha Parthasarathy, Lisa Stifelman, Elizabeth Shriberg, and Ashley Fidler. (2013). Multi-Modal Conversational Search and Browse. In SLAM@INTERSPEECH, 96–101.
[10] Grace YoungJoo Jeon and Soo Young Rieh. (2015). Social search behavior in a social Q&A service: Goals, strategies, and outcomes. Proc. ASIST 2015, 52(1): 1–10.
[11] William A. Katz. (2002). Introduction to Reference Work. Boston: McGraw-Hill.
[12] Julia Kiseleva, Kyle Williams, Jiepu Jiang, Ahmed Hassan Awadallah, Aidan C. Crook, Imed Zitouni, and Tasos Anastasakos. (2016). Understanding User Satisfaction with Intelligent Assistants. In Proc. CHIIR '16, 121–130.
[13] Lena Mamykina, Bella Manoim, Manas Mittal, George Hripcsak, and Björn Hartmann. (2011). Design lessons from the fastest Q&A site in the west. In Proc. CHI '11, 2857–2866.
[14] Richard S. Marcus. (1983). An experimental comparison of the effectiveness of computers and humans as search intermediaries. J. Am. Soc. Inf. Sci., 34: 381–404.
[15] Daniel Russell. (2015). Mindtools: what does it mean to be literate in the age of Google? J. Comput. Sci. Coll. 30(3): 5–6.
[16] Pertti Vakkari. (2016). Searching as learning: A systematization based on literature. Journal of Information Science 42: 7–18.
[17] Ryen W. White, Bill Kules, Steven M. Drucker, and m.c. schraefel. (2006). Introduction. Comm. ACM 49(4): 36–39.
[18] Barbara M. Wildemuth and Luanne Freund. (2012). Assigning search tasks designed to elicit exploratory search behaviors. In Proc. HCIR '12. ACM, New York, NY, USA.
[19] Max L. Wilson. (2011). Search User Interface Design. Synthesis Lectures on Information Concepts, Retrieval, and Services 3(3): 1–143.