Search Literacy: Learning to Search to Learn

Max L. Wilson¹, Chaoyu Ye¹, Michael B. Twidale², Hannah Grasse², Jacob Rosenthal², Max McKittrick²
¹ Mixed Reality Lab, School of Computer Science, University of Nottingham, UK
² School of Information Sciences, University of Illinois at Urbana-Champaign, USA
[max.wilson, psxcy1]@nottingham.ac.uk

ABSTRACT
People can often find themselves out of their depth when they face knowledge-based problems, such as faulty technology or medical concerns. This can also happen in everyday domains that users are simply inexperienced with, like cooking. These are common exploratory search conditions, where users don't quite know enough about the domain to know if they are submitting a good query, nor whether the results directly resolve their need or can be translated to do so. In such situations, people turn to their friends for help, or to forums like StackOverflow, so that someone can explain things to them and translate information to their specific need. This short paper describes work-in-progress within a Google-funded project focusing on Search Literacy in these situations, where improved search skills will help users to learn as they search, to search better, and to better comprehend the results. Focusing on the technology-problem domain, we present initial results from a qualitative study of questions asked and answers given on StackOverflow, and present plans for designing search engine support to help searchers learn as they search.

CCS Concepts
• Information systems → Information retrieval → Users and interactive retrieval → Search Interfaces
• Information systems → World Wide Web → Web searching and information discovery

Keywords
Search Literacy, Search User Interfaces, Information Seeking.

SAL 2016, July 21, 2016, Pisa, Italy. Copyright for this paper remains with the authors. Copying permitted for private and academic purposes.
1. INTRODUCTION
While there are many facets that create different kinds of exploratory search situations [18], and even less task-oriented casual-leisure situations [6], Exploratory Search was originally characterized as occurring when users are: 1) unfamiliar with their domain, 2) unsure of which words to use, and 3) unable to judge the usefulness of results [17]. This work aims to study how Search Literacy helps people make progress within such confusing search situations. To do this, we focus on searchers trying to solve "tech problems", where they are likely to experience all three Exploratory Search characterizations: they don't really understand the technology and may not really know what the underlying problem is; they may not know the correct terminology to describe the problem or to search for a solution; and they find it hard to judge whether results are relevant. Indeed, they may struggle to find a result that explains the solution in a way that they can understand without doing yet more searches.

In the "tech problems" case study domain, we see examples of domain-novice users struggling to comprehend technical jargon and to judge whether results will help them sort out their problems, but we also see examples of domain-expert users who fully understand the technical jargon, yet are synthesizing or diagnosing more complex or combined technical problems, and are seeking more specific, specialized knowledge. Our research is driven by the observation that the behavioral difference between techies and novices solving these problems is that techies use search skills, associated with higher search literacy [7, 15], to resolve the situation: e.g. when they encounter something they don't understand, they resolve the new information need with supplementary searches.

Regardless of domain expertise, research indicates that searching and learning are often closely interleaved [16]. A person may choose to learn about a technology, tinker with it, get stuck, search online for help, and find a resource (such as a tutorial, blogpost or how-to video), or ask for help in a forum. This can lead to either a solution or further learning goals. As well as searching-as-part-of-learning, a person may also be learning-as-part-of-searching: learning better search skills and information literacy, but also, in technical areas, learning how to debug a problem better, how to isolate the cause of the failure they have encountered, and how to better diagnose a technological impasse, or their understanding of that impasse. This project, therefore, aims to observe strong searching skills in order to design new Search User Interface features [19] that encourage search novices to learn and improve their search literacy.

2. INITIAL STUDY
To examine experiences of solving tech problems, we looked at the questions asked and answers given on StackOverflow (SO), a social collaborative Q&A site (SQA) for technical questions [e.g. 13]. The aim was to look at a venue where technical questions were asked and answered, often quite complex ones, by technically sophisticated question askers and answerers (although also including some novices). We wanted to get a better understanding of what it takes to ask good questions and obtain good answers in a social collaborative setting. However, our main focus was less on "how does SO work so well?" and more on the lessons and ideas it might inspire when thinking about how a search engine could help with searching for technical answers. Seeing how humans do it well can be informative, even if we cannot, or do not want to, directly translate the methods to a search agent.

As part of a larger Trace Ethnography [8] investigation, we first looked at our small sample of SO questions from the perspective of the features that seem to be part of what makes a good question in this setting. We then compared them with what we see in a generic search engine such as Google. Finally, we compared social question asking with the well-established and well-documented case of reference librarianship, where a designated professional tries to help people with all kinds of questions. These analyses are informing ongoing design ideas for a better search engine, and we note some preliminary ideas below.

2.1 Methods
We selected sixty-four questions from Stack Overflow. Special attention was paid to questions that had garnered responses from other users. The topics of the questions varied, but they all related in some way to programming in a range of languages. Topicality was limited to questions that the research team had prior knowledge about (so that we could analyze the discussion).

Based on an emergent thematic analysis approach, the questions were coded from three main perspectives (sketched as a data structure after this list):

• Aspects of the question. These include points informally characterized as: How do I?, Is this possible?, My main goal is X, and so I am trying to do Y, In particular, what I really want is…, Please recommend X, Why is this the case?, Here's a weird thing, Is X even possible?, etc.

• Supplemental information. This includes specific examples, code fragments, URLs and images, as well as what might be termed "due diligence": what the question asker has already tried, how that failed, places looked for information, etc.

• Tone of the question. This includes how the question was asked, the formality of phrasing, whether it was more of a narrative or particularly focussed, whether the asker identified their background as a newbie or expert, and issues of question-asking etiquette.
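To make the coding scheme concrete, here is a minimal illustrative sketch of how one coded question could be represented as a data structure. All type and field names are our own hypothetical choices mirroring the codebook; they are not part of StackOverflow or of any analysis tool we used.

```typescript
// Illustrative model of the three coding perspectives described above.
// Names are hypothetical: they mirror our codebook, not any SO API.

type QuestionAspect =
  | "how-do-i" | "is-this-possible" | "main-goal-x-trying-y"
  | "what-i-really-want-is" | "please-recommend"
  | "why-is-this-the-case" | "heres-a-weird-thing";

interface SupplementalInformation {
  codeFragments: boolean;  // did the asker include code?
  urls: boolean;           // links to docs, errors, or prior discussion
  images: boolean;         // screenshots of the problem
  dueDiligence: string[];  // e.g. "tried X", "searched for Y"
}

interface QuestionTone {
  narrative: boolean;                         // story-like vs. tersely focussed
  formality: "formal" | "informal";
  selfIdentifiedLevel?: "newbie" | "expert";  // stated techiness, if any
  etiquetteIssues: string[];
}

// One coded StackOverflow question in our sample of sixty-four.
interface CodedQuestion {
  id: number;
  aspects: QuestionAspect[];
  supplemental: SupplementalInformation;
  tone: QuestionTone;
}
```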
2.2 Selected findings from question analysis
Both search engine use and SQA have certain features in common: 1) an initial query; 2) results, ranked somehow; 3) selection of those to attend to; 4) assessment of quality and relevance; 5) query refinement and iteration. With SO, however, there is of course a human in the loop; indeed several humans, at each stage of the Information Seeking Process (ISP). This makes aspects of the ISP much more visible, and helps us to consider not just how it operates in this particular case, but how it might operate otherwise in other cases. Although we do not have space to present our three full taxonomies in this paper, we now present several initial findings.

There are various features that seem to help make a good question: one that is easier for others to answer, and indeed invites others to answer. These features are articulated in various kinds of advice given about best practices on SO, and are embodied in an established etiquette of SO usage. Examples of features that make a good question include context: the technical setup that the person was using, and the overall goal of what the person was actually trying to do that led to the particular question being asked. Some kind of due diligence information is also common. This can include what the asker tried in order to solve the problem herself (and what resulted, and why it was not helpful, thereby necessitating this help request), mention of search attempts to try and find an answer, and diagnostic activities to try and simplify down the initial problem to isolate the underlying cause.

One feature we found particularly interesting was not merely question refinement as an iterative activity (analogous to iterated queries in a search engine) but also question editing, both by the original asker and by other participants. The formulation of the question is considered important work in SO, and not to be done in a sloppy manner. The construction of the question title and the choice of tags need particular care, in order to catch the eye and interest of potential question answerers. These findings are similar to related work [10].

Question askers may occasionally answer their own question, and then take the time to report that back to SO. This, and the previous point about question phrasing, remind us that SO can also be seen in terms of knowledge management, including how answered questions can potentially be reused by others, whereas we usually think of queries into a search engine as use-once disposable activities. This insight reinforces the value of tracking user journeys of understanding [3], and might inspire developments in ideas like the retired SearchPad [5].

Taken together, these points serve to remind us that posing a question well is a learnable skill. Learning it is desirable to increase the odds of getting a good response from SO members. It can also help in understanding your own current problem, to the extent that you may be able to solve your problem yourself while you are waiting for a solution from others. Furthermore, learning how to form good questions may help in future tech problems as part of an armory of metacognitive strategies. This is similar to academic research, where asking the right question, or the question in the right way, is a critical part of the research process, and one to be taught to new researchers.

SO has various affordances for learning how to ask good questions. You can learn vicariously by seeing other people's questions as exemplars. Your own question can lead to follow-up clarification questions, or even to having your question edited by others, as a more immediate indicator of what you should have done and should think about doing next time. The voting mechanism also provides an indication of collective views of what makes a good question, including seeing your own question being up-voted as it is improved. In future work we want to think about how search interfaces (typically rather solitary places) might also facilitate the kinds of learning that occur almost spontaneously in SQA, given that we suspect SO was designed and developed with much more attention to giving answers than to explicitly facilitating these kinds of learning.

Finally, some question askers explicitly note their level of technical sophistication as a way of indicating the kind of answer they need or would appreciate. An expert can typically manage with a far more terse, abstract and technical answer than a novice. However, expertise is not a single scale. A person may be technically adept but currently asking about a problem with JavaScript, having never used it before. Although these levels of 'techiness' may be explicitly stated, they can also be inferred from the way the question is phrased, and it seems that this is taken into account in the answers given.
2.3 SQA vs. Reference Librarianship
We re-coded our sample of questions and answers for features identifying similarities to, and differences from, what is seen in reference librarianship (RL). For this we used the expertise of one of the authors, who works as a reference librarian and has studied its theory and practice, e.g. [11]. There were considerably more similarities than differences (twice as many coded items were similarities). Librarians use a combination of both hard and soft skills, and this creates an interesting lens through which to look at what happens in SO. Similarities included numerous instruction-like how-to operations (rather than simply giving a factual answer, known in RL as 'ready reference'). We also found considerable rapport-building communication, which is a core aspect of the reference interview (RI), despite the fact that SO guidelines discourage this kind of communication.

Half the threads included some kind of clarification question, a recurrent aspect of the RI, where it is often about trying to understand the underlying goals of what the patron really wants to do, and how that may contrast with what they have currently actually asked. This is something that basic search engines do not support, and with good reason: it is extremely hard to do well and requires considerable sensitivity on the part of the RL, because sometimes the patron may not want to answer questions like "Why are you asking about that?", "What do you want to do with it?", or "What are you actually trying to do?".

As noted, due diligence activities occur as ways of noting what the asker has tried so far. Other categories include whether the answer was complex or a simple fact-based answer, and whether external resources (documents or people) were pointed to. Complex answers can include pointers to resources that enable additional kinds of learning. They may include the answer to the particular question asked, but also additional insights, larger framings and generalizations. This kind of enrichment learning/teaching is a feature of reference interactions that are, in that context, termed 'research' rather than 'ready reference'.

Reference librarians are discouraged from offering judgments or opinions; instead they are meant to focus on 'just the facts' and on providing access to information resources. Similarly, SO guidelines discourage requests for opinions and recommendations. Despite this, we do find a significant minority of recommendation requests, especially for choosing between alternative approaches.

3. PROPOSED DESIGNS
The aim of the first period of work, described above, was to create design implications. Although the results and design implications are still underway, we present three initial designs below.

3.1 Design 1: Elaborating the Detail
The common technique seen in 'good questions' on SO is to make sure important contextual information is provided upfront. This helps answerers estimate Common Ground in understanding [4]. Our first initial prototype explores the possibilities of a search interface identifying the question type and prompting for the detail needed to reach the right answer.

Figure 1: A key factor of strong questions in StackOverflow is to enter all the valuable detail to set the context of the problem

As indicated in Figure 1, browsers could auto-detect some of this information and offer it to users. For example, the browser can detect the operating system and version being used. This can only be indicative, since searchers may be searching for tech problems experienced on other devices. The key challenge here is to iteratively discover the correct boxes to suggest, which becomes harder once we consider supporting many different search problems. In some kinds of remote library reference, patrons are asked to fill out an online form, which can help to structure the interaction, and at least help the patron to consider providing information that may help the librarian give the best advice.
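As a minimal sketch of the auto-detection idea, a page script can already infer the likely operating system from the browser's user agent string and offer it as an editable suggestion. The detectOS helper and the "#context-os" form field are hypothetical names of our own, not part of any existing interface.

```typescript
// Sketch: pre-fill an (assumed) "my setup" context box from what the
// browser can observe. Indicative only: the user may be asking about
// a different device, so the suggestion must remain editable.

function detectOS(userAgent: string): string {
  const mac = userAgent.match(/Mac OS X ([\d_]+)/);
  if (mac) return "macOS " + mac[1].replace(/_/g, ".");
  const android = userAgent.match(/Android ([\d.]+)/);
  if (android) return "Android " + android[1];
  if (/Windows NT 10\.0/.test(userAgent)) return "Windows 10";
  if (/Windows NT/.test(userAgent)) return "Windows (older version)";
  if (/Linux/.test(userAgent)) return "Linux";
  return "unknown OS";
}

// "#context-os" is a hypothetical field in our prototype's context form.
const contextBox = document.querySelector<HTMLInputElement>("#context-os");
if (contextBox && !contextBox.value) {
  contextBox.value = detectOS(navigator.userAgent);
}
```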
3.2 Design 2: Eliciting through Dialogue
With the increasing predominance of spoken interfaces like Google Now, Siri, and Cortana, spoken search [9] presents an opportunity for providing tech support through dialogue. Although we have come a long way since automated support like Clippy, there are still many open challenges with spoken search, like experiencing an error midway through a multi-stage interaction [12]. However, spoken search presents a new opportunity to learn from, and re-engage with, the ideas of Search Intermediaries [14] like reference librarians. Studies of Search Intermediaries led to dialogue-oriented systems in the 1990s [2], which may now have new relevance in spoken search. With dialogue-based interaction it becomes less critical for the question asker to provide all the contextual information up-front (as is desirable in SO), or indeed to know what that contextual information may be. Figure 2 shows a visual alternative to this scenario for non-spoken, on-screen dialogues.

Figure 2: For some StackOverflow questions, a back and forth conversation is needed to identify the right answer
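To illustrate how dialogue defers the burden of up-front context, here is a minimal sketch of a reference-interview-style slot-filling loop: the system asks only for whichever piece of context is still missing. The slot names and question wordings are our own invented examples, not drawn from any deployed spoken-search system.

```typescript
// Sketch: reference-interview-style slot filling. Instead of the asker
// front-loading all context (as good SO questions do), the system asks
// one clarifying question at a time for whatever is still unknown.

interface TechProblemContext {
  goal?: string;       // what the user is actually trying to achieve
  setup?: string;      // OS, browser, library versions, etc.
  attempted?: string;  // due diligence: what was already tried
}

const clarifyingQuestions: [keyof TechProblemContext, string][] = [
  ["goal", "What are you actually trying to do?"],
  ["setup", "What system or version are you seeing this on?"],
  ["attempted", "What have you tried so far?"],
];

// Returns the next question to ask, or null when enough context exists.
function nextClarification(ctx: TechProblemContext): string | null {
  for (const [slot, question] of clarifyingQuestions) {
    if (!ctx[slot]) return question;
  }
  return null; // all slots filled: ready to search with full context
}

// Example turn: the user has stated a goal but nothing else yet.
console.log(nextClarification({ goal: "stop my Node server crashing" }));
// -> "What system or version are you seeing this on?"
```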
3.3 Design 3: Exploring the Definitions
A key problem experienced by users is that they don't recognize whether items are object-specific or generic terms, and thus whether advice relates to their situation specifically, or only in principle. One way to improve search literacy would be to help users to interrogate key words or phrases. These might be used to find out more about query terms, or about terms in the SERPs, supporting what Bates called the TRACE tactic [1]. As shown in Figure 3, a key element of this idea might be to set or vary the existing knowledge of the user, which would help with the 'levels of techiness' problem, where those levels affect the kind of answer that you may want. Another key element of this idea is to remain within the context of the search, but be able to interrogate and explore concepts returned in the results, rather than queries, in situ, to develop confidence in the results.

Figure 3: Sometimes users do not understand the jargon and its implications, in either the query or the results
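As one way such in-situ interrogation might be prototyped (for instance via the Chrome extensions mentioned in the next section), the sketch below marks up glossary terms on a results page so that hovering reveals a short definition without leaving the search context. The jargonGlossary contents and the annotateJargon helper are placeholders of our own; a real prototype would query a glossary service and adapt to the user's stated level of techiness.

```typescript
// Sketch of a results-page content script: wrap known jargon terms so
// hovering reveals a short definition in situ. The term list here is a
// placeholder; a real prototype would query a glossary service.

const jargonGlossary: Record<string, string> = {
  "callback": "A function passed into another function, to be run later.",
  "DOM": "The browser's live, scriptable model of the current page.",
  "localhost": "This machine itself, reached over the network loopback.",
};

function annotateJargon(root: HTMLElement): void {
  const terms = Object.keys(jargonGlossary).join("|");
  const pattern = new RegExp(`\\b(${terms})\\b`, "g");

  // Collect text nodes first so we never mutate while walking.
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  const textNodes: Text[] = [];
  while (walker.nextNode()) textNodes.push(walker.currentNode as Text);

  for (const node of textNodes) {
    const text = node.textContent ?? "";
    pattern.lastIndex = 0; // reset the stateful global regex before testing
    if (!pattern.test(text)) continue;
    // Wrap each known term; a real prototype would escape the text first.
    const span = document.createElement("span");
    span.innerHTML = text.replace(
      pattern,
      (term) => `<mark title="${jargonGlossary[term]}">${term}</mark>`,
    );
    node.replaceWith(span);
  }
}

annotateJargon(document.body);
```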
4. CONCLUSIONS & FUTURE WORK
The aim of this on-going project is to investigate Search Literacy in situations where users become easily confused within their search domain. Solving technical problems leads to many kinds of interleaved search and learning. A person may be choosing to learn a technology and using search as part of their learning activities. Or they may be searching, hoping to find a simple fact-like answer, but along the way learn other things, including how to search better and how to go about learning other technologies better. They might struggle because they are domain novices, or they may even be domain experts who just lack the specific knowledge they need. It may be that technology-related problems have features that make certain kinds of search hard, not because technology problems are unique, but because they exacerbate issues seen in many other settings. These include multiple kinds of expertise, problems with not knowing terminology, confusing interactions with other technologies, and large amounts of prerequisite knowledge you may not yet have, or see as related.

Our on-going work has begun to produce three taxonomies of how users overcome these barriers by asking questions within the StackOverflow community. Initial analysis of these taxonomies has led to the identification of a few initial design ideas, presented above, that might help users to improve their search literacy and make progress even in confusing circumstances.

Future work will focus on elaborating these design ideas and producing further ones. Refining these designs will be important, as keeping them within the familiar experience of Google will help us study their potential benefits. Initial work has also begun on building Chrome extensions to customize the appearance of Google pages, in order to deploy final prototypes with real users. Subsequent observational and experimental studies will attempt to investigate the benefits of each design idea.

5. ACKNOWLEDGMENTS
This work was funded by a Google Faculty Research Award (2015_R1_669), as a collaboration between Nottingham and Illinois on "Understanding Search Literacy and Search Skills Adoption".

6. REFERENCES
[1] Marcia J. Bates. (1979). Information search tactics. J. Am. Soc. Inf. Sci., 30: 205–214.
[2] Nicholas J. Belkin, Colleen Cool, Adelheit Stein, and Ulrich Thiel. (1995). Cases, scripts, and information-seeking strategies: On the design of interactive information retrieval systems. Expert Systems with Applications 9(3): 379–395.
[3] Mikhail Bilenko and Ryen W. White. (2008). Mining the search trails of surfing crowds: identifying relevant websites from user activity. In Proc. WWW '08. ACM, New York, NY, USA, 51–60.
[4] Herbert H. Clark and Susan E. Brennan. (1991). Grounding in communication. Perspectives on Socially Shared Cognition 13: 127–149.
[5] Debora Donato, Francesco Bonchi, Tom Chi, and Yoelle Maarek. (2010). Do you want to take notes?: identifying research missions in Yahoo! Search Pad. In Proc. WWW '10. ACM, New York, NY, USA, 321–330.
[6] David Elsweiler, Max L. Wilson, and Brian Kirkegaard Lunn. (2011). Chapter 9: Understanding Casual-Leisure Information Behaviour. In Amanda Spink and Jannica Heinström (eds.), New Directions in Information Behaviour. Emerald Group Publishing Limited, 211–241.
[7] Susan R. Goldman. (2010). Literacy in the digital world: Comprehending and learning from multiple sources. In M.G. McKeown and L. Kucan (eds.), Bringing Reading Research to Life. New York: Guilford, 257–284.
[8] R. Stuart Geiger and David Ribes. (2011). Trace Ethnography: Following Coordination through Documentary Practices. In Proc. System Sciences (HICSS '11), Kauai, HI, 1–10.
[9] Larry P. Heck, Dilek Hakkani-Tür, Madhu Chinthakunta, Gökhan Tür, Rukmini Iyer, Partha Parthasarathy, Lisa Stifelman, Elizabeth Shriberg, and Ashley Fidler. (2013). Multi-Modal Conversational Search and Browse. In SLAM@INTERSPEECH, 96–101.
[10] Grace YoungJoo Jeon and Soo Young Rieh. (2015). Social search behavior in a social Q&A service: Goals, strategies, and outcomes. Proc. ASIST 2015, 52(1): 1–10.
[11] William A. Katz. (2002). Introduction to Reference Work. Boston: McGraw-Hill.
[12] Julia Kiseleva, Kyle Williams, Jiepu Jiang, Ahmed Hassan Awadallah, Aidan C. Crook, Imed Zitouni, and Tasos Anastasakos. (2016). Understanding User Satisfaction with Intelligent Assistants. In Proc. CHIIR '16, 121–130.
[13] Lena Mamykina, Bella Manoim, Manas Mittal, George Hripcsak, and Björn Hartmann. (2011). Design lessons from the fastest Q&A site in the west. In Proc. CHI '11, 2857–2866.
[14] Richard S. Marcus. (1983). An experimental comparison of the effectiveness of computers and humans as search intermediaries. J. Am. Soc. Inf. Sci., 34: 381–404.
[15] Daniel Russell. (2015). Mindtools: what does it mean to be literate in the age of Google? J. Comput. Sci. Coll. 30(3): 5–6.
[16] Pertti Vakkari. (2016). Searching as learning: A systematization based on literature. Journal of Information Science 42: 7–18.
[17] Ryen W. White, Bill Kules, Steven M. Drucker, and m.c. schraefel. (2006). Introduction. Comm. ACM 49(4): 36–39.
[18] Barbara M. Wildemuth and Luanne Freund. (2012). Assigning search tasks designed to elicit exploratory search behaviors. In Proc. HCIR '12. ACM, New York, NY, USA.
[19] Max L. Wilson. (2011). Search User Interface Design. Synthesis Lectures on Information Concepts, Retrieval, and Services 3(3): 1–143.