Revisiting User Information Needs in Aggregated Search

Shanu Sushmita, University of California LA, shanusushmita@ucla.edu
Martin Halvey, Glasgow Caledonian University, Martin.Halvey@gcu.ac.uk
Robert Villa, University of Sheffield, r.villa@sheffield.ac.uk
Mounia Lalmas, Yahoo! Labs Barcelona, mounia@acm.org

Presented at EuroHCIR2012. Copyright © 2012 for the individual papers by the papers' authors. Copying permitted only for private and academic purposes. This volume is published and copyrighted by its editors.

ABSTRACT

Aggregated search interfaces are a common way to present web search results, mixing different types of results into one single result page. Although numerous efforts have been made to infer users' information needs in "standard" search, we know little about users' information needs within the context of aggregated search. This paper presents the outcomes of a survey of 117 respondents, investigating users' preferences for their type of search result (image, news, video) and their type of information need (informational, navigational and transactional). The survey reveals that users' result preferences differ based on their underlying information needs, suggesting that the taxonomy provided by Broder [1] requires updating to reflect user information needs in the context of aggregated search. For instance, respondents indicated a preference for diverse results (news and reviews about a particular software product) for navigational and transactional queries rather than a single result (the web page to download that software product).

1. INTRODUCTION AND BACKGROUND

Aggregated search is the technique of integrating search results from different verticals (e.g., web, image, video, news) on a single search result page so that users can access the increasingly diverse content available on the web. Aggregated search systems aim to facilitate users' access to "non-standard" web results without having to perform separate searches in the respective verticals, which are source-specific sub-collections provided by search engines [13].

Throughout the evolution of web search, users' interaction with search results has been studied by many to improve the quality of the search results and the search experience. Efforts were (and are still being) made to understand users' information seeking process, based upon which several taxonomies describing users' behaviours have been proposed [1, 5, 6, 9, 10, 11, 16].

For instance, in 2002, Broder [1] created a taxonomy of web search, classifying users' information needs into three categories, namely, informational, navigational and transactional. For navigational search, the immediate intent is to reach a particular site (e.g., BBC Homepage); for informational search, the intent is to acquire some information likely to be contained in one or more web pages (e.g., global warming); and finally, for transactional search, the intent is to perform some web-mediated activity (e.g., download, purchase).

Others such as Lindley et al. [16] looked at why people search or go online and identified five main web activities: respite, orienting, opportunistic use, purposeful use and lean-back internet. An example of a respite activity is when people use the web to take a break at work, or through a mobile phone to occupy themselves while waiting. Similarly, Chew et al. [10] explored the contextual and behavioural details of users' interaction with web-based images as they occur in the course of everyday life, showing that users interact with image results because these help them create connections to other people and remote places, or reflect on the past.

While there is a substantial body of work on understanding users' information needs and browsing activities in "standard" search, far less is known about these within the context of aggregated search. For instance, it is not clear if the existing taxonomies on information needs for "standard" search hold in an aggregated search scenario. In aggregated search, search results may originate from different media (e.g., images, maps) or may be of different genres (e.g., news, blogs). This may have an effect on the way users interact with the results, and affect their preferences for the types of results. A study in [15] investigated the former, but the latter remains largely unexplored. For instance, it is not known whether, for navigational queries, users prefer to view a specific website, as would be implied by [1]. A negative answer would mean that Broder's three main categories of information needs have to be revisited. Also, building an awareness of web activities in aggregated search, which cut across domains, media types and applications, can highlight important details when designing for interactions with the web [16].

The focus of this short paper is, therefore, two-fold: (1) to investigate the preference of search results sought by the users; and (2) to investigate the existing frameworks of web activities within the context of aggregated search. For this purpose, users' preferences for results of several media types and genres are investigated. Furthermore, since Broder's taxonomy has been heavily used (e.g., [3, 7, 9, 15]), we focus on the now classic informational, navigational and transactional categories. We nonetheless aim to extend this work with other taxonomies (e.g., ODP¹) in future work.

This paper makes the following contributions: (1) it investigates users' preference for search results (media and genres) for informational, navigational and transactional search tasks; and (2) it provides empirical evidence to support the need for updating the above three categories within the context of aggregated search. We present the results of a survey that investigated users' preferences for results of different media types and genres, as answers to informational, navigational and transactional queries.
2. STUDY

A survey containing sixteen questions (4 background questions and 12 search task questions) was distributed on various social networks. The survey allowed us to reach a large and diverse enough number of users, and is a common way to elicit user perceptions and preferences [4, 8]. A total of 117 respondents completed the survey, of which 60 were female and 54 male; the remaining 3 did not disclose their gender. The respondents' age ranged from 20 to 59 years (mean 29). Geographically, respondents were distributed across the US and Canada (3%), Europe (34%), Asia (62%) and Africa (1%). Most respondents were familiar with search engines and used them frequently.

2.1 Task

The aim of the survey was to elicit users' preferences for the types (media, genres) of search results for informational, navigational and transactional search tasks. To this end, we designed four search topics² for each of these three categories. The topics for each category are listed in Table 1. In total, there were twelve questions for each respondent to answer. The order of the questions was rotated to minimise ordering bias.

We designed topics that could be understood universally (e.g., global warming, checking emails, buying dvd, software download). Furthermore, the topics were devised to fit the informational, navigational and transactional categories. Therefore, we did not manipulate topics to suit specific media or genre. For instance, for the topic global warming, some people may want to read the latest news about global warming, some others may want to view pictures of melting icebergs, while some others may want to watch a documentary on global warming. Therefore this topic does not have an implicit type intent (e.g., image) but requires the gathering of information (informational search task) from many web pages; it is expected that users will look for multiple results to satisfy the corresponding information need. However, it will depend on users which result types (image, news, video, etc.) they prefer to view: only news articles, a few pictures, or a combination of both.

Table 1: List of topics presented to the respondents in the survey. The topics for each category (navigational, informational and transactional) are grouped here, but their order was rotated in the survey to minimise ordering bias.

Navigational Topics
1. When you wish to book tickets with British Airways, which results would be useful for you?
2. When you wish to find an address from yellow pages, which results would be useful for you?
3. When you wish to check courses of a University, which results would be useful for you?
4. When you wish to check your email (e.g, gmail, hotmail, msn, etc), which results would be useful for you?

Informational Topics
5. When you wish to learn about salsa dance, which results would be useful for you?
6. When you wish to gather information about global warming, which results would be useful for you?
7. When you wish to learn on how to make a pancake, which results would be useful for you?
8. When you wish to know about 2011 budget, and how it effected farmers, which results would be useful for you?

Transactional Topics
9. When you wish to download a free software, which results would be useful for you?
10. When you wish to download a song for your iTunes library, which results would be useful for you?
11. When you wish to file a property complaint, which results would be useful for you?
12. When you wish to buy a DVD online, which results would be useful for you?

2.2 Procedure

For each search topic, the respondents were given five choices, namely, web, news, image, video and other results³. The respondents were allowed to select as many options as they desired. That is, they were allowed to select just 'one' or 'all' options, and therefore were not forced to provide a preference for all the choices listed. This allowed a more natural selection of choices, and hence reduced any design bias. In cases when the respondents selected more than one option, they were asked to rank the choices, by providing "1st", "2nd", ..., "5th" preference for each choice. For instance, if image, news and others were selected as choices, these had to be ranked in order of preference (e.g., 1st preference - news, 2nd preference - image, 3rd preference - others).

Figure 1 shows the screenshot of an example question with the preference options. Next, the outcomes of the survey are presented.

Figure 1: Screenshot showing the preference options provided to the respondents for the selection of search result choices.

¹ http://www.dmoz.org/
² A search topic describes a search task scenario. The concept of a search task scenario was inspired by [2].
³ The definitions of these categories were not specified in the instructions and were left open to respondents' interpretation.

3. OUTCOMES

As the data obtained from the survey was non-parametric, we report medians and the interquartile range for the preference scores. The results are reported in Table 2, which shows the median rank of each vertical by information need. Friedman tests were performed to estimate the significance of preference for the result types, among and across the three categories (navigational, informational and transactional). Finally, multiple Wilcoxon tests were run in the post-hoc analyses while adjusting the p-values using the Bonferroni method. The outcomes from the post-hoc pairwise comparisons for navigational, informational and transactional categories are shown in Tables 3, 4 and 5 respectively. Each row in these tables indicates whether a particular result type was preferred over each of the other result types.

Table 2: Median and Interquartile Range for the Preference Rank Score, where Q1 and Q3 are 1st and 3rd quartile. Each cell gives Median (Q1 - Q3).

Result Type   Navigational   Informational   Transactional
Web           1 (1-1)        1 (1-2)         1 (1-1)
Image         3 (2-4)        3 (2-4)         3 (2-4)
Video         3 (3-4)        2 (1-3)         3 (2-4)
News          2 (2-4)        2 (1-4)         2 (2-4)
Others        4 (2-5)        4 (3-5)         4 (3-5)

As can be seen in Table 2, most respondents indicated the 'web page' as the most preferred type of results, when compared to the other four types (image, video, news and others). The difference was found to be significant for navigational, informational, and transactional cases (rows 1-4 in Tables 3, 4 and 5), thus suggesting that "standard" web results are the prime source of information sought by most users. After web results, news was the second most preferred type of results when compared to image, video and others (6th row in Table 2). For the navigational category, news results were significantly preferred over image, video and others results (rows 6, 8 and 9 in Table 3). However, video was equally preferred to news for informational and transactional categories (row 8 in Tables 4 and 5).

Finally, there is a trend for image and video results to come third in preference from respondents for most categories (4th and 5th rows in Table 2). However, post-hoc analyses suggest a significant difference of preference for video and image over 'other results' for all three categories (rows 7 and 10 in Tables 3, 4 and 5). In addition, video results were significantly preferred to image results for informational and transactional cases (row 5 in Tables 4 and 5), while no significant difference was observed for the navigational case (row 5 in Table 3). Therefore, it is possible that users may prefer image results instead of video results in some cases, and video results in other cases. In addition, image and video being the third preference indicates that providing image and video results for all queries may not be appreciated by users.

In Tables 3 to 5, on only two occasions was the ranking of result types not significantly different: image-video for navigational, and news-video for informational information needs. This indicates that for navigational needs, neither image nor video results are judged as important to users, backing up the results in Table 2, where both are ranked bottom. For informational information needs, both news and video were judged equally important to the search tasks, second only to web (Table 2).

Table 3: Results of post-hoc pairwise comparisons for navigational category.

Row no.   Pair             Z-Score   p-value
1         Web - Image      -14.09    < 0.0001
2         Web - Video      -13.95    < 0.0001
3         Web - News       -13.62    < 0.0001
4         Web - Others     -13.46    < 0.0001
5         Image - Video    -1.34     0.1814
6         Image - News     5.26      < 0.0001
7         Image - Others   -4.03     < 0.0001
8         News - Video     -7.69     < 0.0001
9         News - Others    -8.38     < 0.0001
10        Video - Others   -3.73     0.0001

Table 4: Results of post-hoc pairwise comparisons for informational category.

Row no.   Pair             Z-Score   p-value
1         Web - Image      11.94     < 0.0001
2         Web - Video      -7.40     < 0.0001
3         Web - News       -6.62     < 0.0001
4         Web - Others     -13.87    < 0.0001
5         Image - Video    8.55      < 0.0001
6         Image - News     3.96      < 0.0001
7         Image - Others   -9.06     < 0.0001
8         News - Video     0.58      0.5583
9         News - Others    -11.25    < 0.0001
10        Video - Others   -11.80    < 0.0001

Table 5: Results of post-hoc pairwise comparisons for transactional category.

Row no.   Pair             Z-Score   p-value
1         Web - Image      -13.40    < 0.0001
2         Web - Video      -12.65    < 0.0001
3         Web - News       -13.17    < 0.0001
4         Web - Others     -13.39    < 0.0001
5         Image - Video    4.64      < 0.0001
6         Image - News     5.33      < 0.0001
7         Image - Others   -4.34     < 0.0001
8         News - Video     -2.30     0.021
9         News - Others    -10.09    < 0.0001
10        Video - Others   -6.77     < 0.0001

4. DISCUSSION

The aim of our study was to investigate, via a survey, users' result preferences for navigational, informational, and transactional search topics.

Overall, three key observations can be made from this survey. First, for all query categories, web results continue to be the prime source of information sought by users (90% for navigational, 54% for informational and 85% for transactional), suggesting that for an aggregated search result page, web results should always be provided. This echoes the findings of [14], where the importance of web results for aggregated result pages was demonstrated through the mining of query logs.

Second, there appears to be a difference between the result preferences for navigational and transactional queries. From Broder [1], the corresponding information needs for these categories were identified to be focused (i.e., specific website, download, etc.). In contrast, our study suggests that users also prefer to view other results, and not just one ("to the point") result, or one type of result. More precisely, for the navigational search topics, in addition to web results, respondents also indicated a preference for news and video results. This may be due to the fact that, since an aggregated result page is often provided for most queries by modern search engines⁴, users are exposed to diverse results and, as a consequence, results other than web have now gained prominence. However, whether providing diverse results for informational and transactional information needs facilitates task completion, and/or increases user satisfaction, requires further investigation.

Third, users' preferences for the 'type' of results vary with the query category. For instance, for navigational and transactional search topics, web and news results seem to be preferred. The preference is more mixed for informational search topics, with image results least preferred. In itself, it is not surprising that users' preferences vary with query categories. However, concrete knowledge regarding which 'types' of sought results are preferred would allow for more appropriate aggregation of the different verticals under consideration. Similar investigations were carried out in [12] by Sushmita et al., where associations between query classifications (e.g., arts, health, etc.) and result types were indeed identified. Such knowledge may then be used by search systems to present particular types of result for different queries; for example, a system may not present (or may demote in importance) image results in response to an informational query.

⁴ http://www.slideshare.net/rankabove/com-score-rankabove-final

5. CONCLUSION AND FUTURE WORK

We presented the analysis of a survey of 117 respondents' preferences regarding the different types of results for navigational, informational, and transactional information needs. Although the study is small in terms of the number of users, and acknowledging the limitations of an online survey, interesting insights emerged from our investigation. The outcomes of the survey support the aggregated search paradigm, showing that users' preferences are for a diverse range of result types. The analysis also indicates a need to revisit the definition of the three categories of information needs [1], within the context of aggregated search. This work initiates two future research questions: (1) What information needs exist within the context of aggregated search? and (2) How to identify suitable results satisfying those information needs?

6. REFERENCES

[1] A. Broder, A taxonomy of web search, Journal of SIGIR Forum, 2002.
[2] P. Borlund, The IIR evaluation model: a framework for evaluation of interactive information retrieval systems, JASIST, 2003.
[3] L.A. Granka, T. Joachims & G. Gay, Eye-tracking analysis of user behavior in WWW search, SIGIR, 2004.
[4] S.A. Grandhi, Q. Jones & S. Karam, Sharing the big apple: a survey study of people, place and locatability, SIGCHI, 2005.
[5] M. Kellar, C. Watters & M. Shepherd, A Goal-based Classification of Web Information Tasks, ASIST, 2006.
[6] H. Dai, L. Zhao, Z. Nie, J.-R. Wen, L. Wang & Y. Li, Detecting online commercial intention, WWW, 2006.
[7] B.J. Jansen, D.L. Booth & A. Spink, Determining the user intent of web search engine queries, WWW, 2007.
[8] M.R. Morris, A survey of collaborative web search practices, SIGCHI, 2008.
[9] B.J. Jansen, D.L. Booth & A. Spink, Determining the informational, navigational, and transactional intent of Web queries, IP&M, 2008.
[10] B. Chew, J.A. Rode & A. Sellen, Understanding the Everyday Use of Images on the Web, NordiCHI, 2008.
[11] S. Stamou & L. Kozanidis, Impact of search results on user queries, WSDM, 2009.
[12] S. Sushmita, H. Joho, M. Lalmas & J.M. Jose, Understanding domain "relevance" in web search, WSSP at WWW, 2009.
[13] J. Arguello, F. Diaz, J. Callan & J.-F. Crespo, Sources of evidence for vertical selection, SIGIR, 2009.
[14] S. Sushmita, B. Piwowarski & M. Lalmas, Dynamics of Domains and Genre Intent, AIRS, 2010.
[15] S. Sushmita, H. Joho, M. Lalmas & R. Villa, Factors affecting click through behavior in aggregated interface, CIKM, 2010.
[16] S.E. Lindley, S. Meek, A. Sellen & R. Harper, "It's simply integral to what I do" enquiries into how the web is weaved into everyday life, WWW, 2012.
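As a supplementary illustration of the descriptive statistics and the multiple-comparison correction described in Section 3 (median and interquartile range of preference ranks, and Bonferroni adjustment across the ten pairwise comparisons among five result types), the following is a minimal sketch using only the Python standard library. It is not the authors' analysis script: the function names, the example ranks and the example p-values are hypothetical, the quartile convention ("inclusive") is an assumption because the paper does not state which one was used, and the Friedman and Wilcoxon tests themselves are not implemented here. How unselected result types were scored is also not specified in the paper, so the sketch simply takes a list of ranks as given.

```python
from statistics import median, quantiles


def median_iqr(ranks):
    """Return (median, Q1, Q3) for a list of preference ranks (1 = most preferred)."""
    # Assumption: the "inclusive" quartile convention. Other conventions
    # can give slightly different Q1/Q3 values on small samples.
    q1, _, q3 = quantiles(ranks, n=4, method="inclusive")
    return median(ranks), q1, q3


def bonferroni_adjust(p_values):
    """Bonferroni correction: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical ranks given by seven respondents to one result type.
example_ranks = [1, 1, 1, 2, 2, 3, 4]
print(median_iqr(example_ranks))

# Hypothetical unadjusted p-values, one per pairwise comparison
# (five result types give ten pairs).
example_p = [0.0001, 0.004, 0.03, 0.2, 0.0001,
             0.0001, 0.0001, 0.0001, 0.0001, 0.0001]
adjusted = bonferroni_adjust(example_p)
print([p < 0.05 for p in adjusted])
```

With ten comparisons, an unadjusted p-value has to fall below 0.005 to remain below 0.05 after this correction, which is why a pair can look different on its own yet not count as significantly different in a post-hoc analysis.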