Revisiting User Information Needs in Aggregated Search

Shanu Sushmita, University of California LA, shanusushmita@ucla.edu
Martin Halvey, Glasgow Caledonian University, Martin.Halvey@gcu.ac.uk
Robert Villa, University of Sheffield, r.villa@sheffield.ac.uk
Mounia Lalmas, Yahoo! Labs Barcelona, mounia@acm.org

Presented at EuroHCIR2012. Copyright © 2012 for the individual papers
by the papers’ authors. Copying permitted only for private and academic
purposes. This volume is published and copyrighted by its editors.

ABSTRACT
Aggregated search interfaces are a common way to present web search
results, mixing different types of results into one single result page.
Although numerous efforts have been made to infer users’ information
needs in “standard” search, we know little about users’ information
needs within the context of aggregated search. This paper presents the
outcomes of a survey of 117 respondents, investigating users’
preferences for their type of search result (image, news, video) and
their type of information need (informational, navigational and
transactional). The survey reveals that users’ result preferences differ
based on their underlying information needs, suggesting that the
taxonomy provided by Broder [1] requires updating to reflect user
information needs in the context of aggregated search. For instance,
respondents indicated a preference for diverse results (news and reviews
about a particular software product) for navigational and transactional
queries rather than a single result (the web page to download that
software product).

1.    INTRODUCTION AND BACKGROUND
Aggregated search is the technique of integrating search results from
different verticals (e.g., web, image, video, news) on a single search
result page so that users can access the increasingly diverse content
available on the web. Aggregated search systems aim to facilitate users’
access to “non-standard” web results without having to perform separate
searches in the respective verticals, which are source-specific
sub-collections provided by search engines [13].
   Throughout the evolution of web search, users’ interaction with
search results has been studied by many to improve the quality of the
search results and the search experience. Efforts were (and are still
being) made to understand users’ information seeking process, based upon
which several taxonomies describing users’ behaviours have been proposed
[1, 5, 6, 9, 10, 11, 16].
   For instance, in 2002, Broder [1] created a taxonomy of web search,
classifying users’ information needs into three categories, namely,
informational, navigational and transactional. For navigational search,
the immediate intent is to reach a particular site (e.g., BBC Homepage);
for informational search, the intent is to acquire some information
likely to be contained in one or more web pages (e.g., global warming);
and finally, for transactional search, the intent is to perform some
web-mediated activity (e.g., download, purchase).
   Others such as Lindley et al. [16] looked at why people search or go
online and identified five main web activities: respite, orienting,
opportunistic use, purposeful use and lean-back internet. An example of
a respite activity is when people use the web to take a break at work,
or through a mobile phone to occupy themselves while waiting. Similarly,
Chew et al. [10] explored the contextual and behavioural details of
users’ interaction with web-based images as they occur in the course of
everyday life, showing that users interact with image results because
these help in creating connections to other people and remote places, or
in reflecting on the past.
   While there is a substantial body of work on understanding users’
information needs and browsing activities in “standard” search, far less
is known about these within the context of aggregated search. For
instance, it is not clear if the existing taxonomies on information
needs for “standard” search hold in an aggregated search scenario. In
aggregated search, search results may originate from different media
(e.g., images, maps) or may be of different genres (e.g., news, blogs).
This may have an effect on the way users interact with the results, and
affect their preferences for the types of results. A study in [15]
investigated the former, but the latter remains largely unexplored. For
instance, it is not known whether, for navigational queries, users
prefer to view a specific website, as would be implied by [1]. A
negative answer would mean that Broder’s three main categories of
information needs need to be revisited. Also, building an awareness of
web activities in aggregated search, which cut across domains, media
types and applications, can highlight important details when designing
for interactions with the web [16].
   The focus of this short paper is, therefore, two-fold: (1) to
investigate the preference of search results sought by the users; and
(2) to investigate the existing frameworks of web activities within the
context of aggregated search. For this purpose, users’ preferences for
results of several media types and genres are investigated. Furthermore,
since Broder’s taxonomy has been heavily used (e.g., [3, 7, 9, 15]), we
focus on the now classic informational, navigational and transactional
categories. We nonetheless aim to extend this work with other taxonomies
(e.g., ODP¹) in future work. This paper makes the following
contributions: (1) it investigates users’ preferences for search results
(media and genres) for informational, navigational and transactional
search tasks; and (2) it provides empirical evidence to support the need
for updating the above three categories within the context of aggregated
search.
   We present the results of a survey that investigated users’
preferences for results of different media types and genres,
as answers to informational, navigational and transactional
queries.

2.    STUDY
A survey containing sixteen questions (4 background questions and 12
search task questions) was distributed on various social networks. The
survey allowed us to reach a sufficiently large and diverse set of
users, and is a common way to elicit user perceptions and preferences
[4, 8]. A total of 117 respondents completed the survey, of which 60
were female and 54 male; the remaining 3 did not disclose their gender.
The respondents’ ages ranged from 20 to 59 years (mean 29).
Geographically, respondents were distributed across the US and Canada
(3%), Europe (34%), Asia (62%) and Africa (1%). Most respondents were
familiar with search engines and used them frequently.

2.1    Task
The aim of the survey was to elicit users’ preferences for the types
(media, genres) of search results for informational, navigational and
transactional search tasks. To this end, we designed four search topics²
for each of these three categories. The topics for each category are
listed in Table 1. In total, there were twelve questions for each
respondent to answer. The order of the questions was rotated to minimise
ordering bias.
   We designed topics that could be understood universally (e.g., global
warming, checking emails, buying a DVD, software download).
Furthermore, the topics were devised to fit the informational,
navigational and transactional categories. Therefore, we did not
manipulate topics to suit specific media or genre. For instance, for the
topic global warming, some people may want to read the latest news about
global warming, some others may want to view pictures of melting
icebergs, while some others may want to watch a documentary on global
warming. Therefore this topic does not have an implicit type intent
(e.g., image) but requires the gathering of information (informational
search task) from many web pages; it is expected that users will look
for multiple results to satisfy the corresponding information need.
However, it will depend on users which result types (image, news, video,
etc.) they prefer to view: only news articles, a few pictures, or a
combination of both.

2.2    Procedure
For each search topic, the respondents were given five choices, namely,
web, news, image, video and other results³. The respondents were allowed
to select as many options as they desired. That is, they were allowed to
select just ‘one’ or ‘all’ options, and therefore were not forced to
provide a preference for all the choices listed. This allowed a more
natural selection of choices, and hence reduced any design bias. In
cases when the respondents selected more than one option, they were
asked to rank the choices, by providing “1st”, “2nd”, ..., “5th”
preference for each choice. For instance, if image, news and others were
selected as choices, these had to be ranked in order of preference
(e.g., 1st preference: news, 2nd preference: image, 3rd preference:
others).
   Figure 1 shows the screenshot of an example question with the
preference options. Next, the outcomes of the survey are presented.

Figure 1: Screenshot showing the preference options provided to the
respondents for the selection of search result choices.

¹ http://www.dmoz.org/
² A search topic describes a search task scenario. The concept of a
search task scenario was inspired by [2].
³ The definitions of these categories were not specified in the
instructions and were left open to respondents’ interpretation.

3.    OUTCOMES
As the data obtained from the survey was non-parametric, we report
medians and the interquartile range for the preference scores. The
results are reported in Table 2, which shows the median rank of each
vertical by information need. Friedman tests were performed to estimate
the significance of preference for the result types, among and across
the three categories (navigational, informational and transactional).
Finally, multiple Wilcoxon tests were run in the post-hoc analyses while
adjusting the p-values using the Bonferroni method. The outcomes from
the post-hoc pairwise comparisons for navigational, informational and
transactional categories are shown in Tables 3, 4 and 5 respectively.
Each row in these tables indicates whether a particular result type was
preferred over each of the other result types.

Table 2: Median and interquartile range for the preference rank score,
where Q1 and Q3 are the 1st and 3rd quartiles.

     Result Type   Navigational      Informational     Transactional
                   Median (Q1-Q3)    Median (Q1-Q3)    Median (Q1-Q3)
     Web           1 (1-1)           1 (1-2)           1 (1-1)
     Image         3 (2-4)           3 (2-4)           3 (2-4)
     Video         3 (3-4)           2 (1-3)           3 (2-4)
     News          2 (2-4)           2 (1-4)           2 (2-4)
     Others        4 (2-5)           4 (3-5)           4 (3-5)

   As can be seen in Table 2, most respondents indicated the ‘web page’
as the most preferred type of results, when
Table 1: List of topics presented to the respondents in the survey. The topics for each category (navigational,
informational and transactional) are grouped here, but their order was rotated in the survey to minimise
ordering bias.

     Navigational Topics
     1. When you wish to book tickets with British Airways, which results would be useful for you?
     2. When you wish to find an address from yellow pages, which results would be useful for you?
     3. When you wish to check courses of a University, which results would be useful for you?
     4. When you wish to check your email (e.g, gmail, hotmail, msn, etc), which results would be useful for you?
     Informational Topics
     5. When you wish to learn about salsa dance, which results would be useful for you?
     6. When you wish to gather information about global warming, which results would be useful for you?
     7. When you wish to learn on how to make a pancake, which results would be useful for you?
     8. When you wish to know about 2011 budget, and how it effected farmers, which results would be useful for you?
     Transactional Topics
     9. When you wish to download a free software, which results would be useful for you?
     10. When you wish to download a song for your iTunes library, which results would be useful for you?
     11. When you wish to file a property complaint, which results would be useful for you?
     12. When you wish to buy a DVD online, which results would be useful for you?


compared to the other four types (image, video, news and others). The
difference was found to be significant for navigational, informational,
and transactional cases (rows 1-4 in Tables 3, 4 and 5), thus suggesting
that “standard” web results are the prime source of information sought
by most users. After web results, news was the second most preferred
type of results when compared to image, video and others (6th row in
Table 2). For the navigational category, news results were significantly
preferred over image, video and others results (rows 6, 8 and 9 in
Table 3). However, video was equally preferred to news for informational
and transactional categories (row 8 in Tables 4 and 5).
   Finally, there is a trend for image and video results to come third
in preference from respondents for most categories (4th and 5th rows in
Table 2). However, post-hoc analyses suggest a significant difference of
preference for video and image over ‘other results’ for all three
categories (rows 7 and 10 in Tables 3, 4 and 5). In addition, video
results were significantly preferred to image results for informational
and transactional cases (row 5 in Tables 4 and 5), while no significant
difference was observed for the navigational case (row 5 in Table 3).
Therefore, it is possible that users may prefer image results instead of
video results in some cases, and video results in other cases. In
addition, image and video being the third preference indicates that
providing image and video results for all queries may not be appreciated
by users.
   In Tables 3 to 5, on only two occasions were the rankings of result
types not significantly different: image-video for navigational, and
news-video for informational information needs. This indicates that for
navigational needs, neither image nor video results are judged as
important to users, backing up the results in Table 2, where both are
ranked bottom. For informational information needs, both news and video
were judged equally important to the search tasks, second only to web
(Table 2).

Table 3: Results of post-hoc pairwise comparisons for the navigational
category.

     Row no.   Pair              Z-score    p-value
     1         Web - Image       -14.09     < 0.0001
     2         Web - Video       -13.95     < 0.0001
     3         Web - News        -13.62     < 0.0001
     4         Web - Others      -13.46     < 0.0001
     5         Image - Video      -1.34       0.1814
     6         Image - News        5.26     < 0.0001
     7         Image - Others     -4.03     < 0.0001
     8         News - Video       -7.69     < 0.0001
     9         News - Others      -8.38     < 0.0001
     10        Video - Others     -3.73       0.0001

Table 4: Results of post-hoc pairwise comparisons for the informational
category.

     Row no.   Pair              Z-score    p-value
     1         Web - Image        11.94     < 0.0001
     2         Web - Video        -7.40     < 0.0001
     3         Web - News         -6.62     < 0.0001
     4         Web - Others      -13.87     < 0.0001
     5         Image - Video       8.55     < 0.0001
     6         Image - News        3.96     < 0.0001
     7         Image - Others     -9.06     < 0.0001
     8         News - Video        0.58       0.5583
     9         News - Others     -11.25     < 0.0001
     10        Video - Others    -11.80     < 0.0001

Table 5: Results of post-hoc pairwise comparisons for the transactional
category.

     Row no.   Pair              Z-score    p-value
     1         Web - Image       -13.40     < 0.0001
     2         Web - Video       -12.65     < 0.0001
     3         Web - News        -13.17     < 0.0001
     4         Web - Others      -13.39     < 0.0001
     5         Image - Video       4.64     < 0.0001
     6         Image - News        5.33     < 0.0001
     7         Image - Others     -4.34     < 0.0001
     8         News - Video       -2.30       0.021
     9         News - Others     -10.09     < 0.0001
     10        Video - Others     -6.77     < 0.0001

4.    DISCUSSION
The aim of our study was to investigate, via a survey, users’ result
preferences for navigational, informational, and transactional search
topics.
   Overall, three key observations can be made from this survey. First,
for all query categories, web results continue to be the prime source of
information sought by users (90% for navigational, 54% for informational
and 85% for transactional), suggesting that for an aggregated search
result page, web results should always be provided. This echoes the
findings of [14], where the importance of web results for aggregated
result pages was demonstrated through the mining of query logs.
   Second, there appears to be a difference between the result
preferences for navigational and transactional queries. From Broder [1],
the corresponding information needs for these categories were identified
to be focused (i.e., specific website, download, etc.). In contrast, our
study suggests that users also prefer to view other results, and not
just one (“to the point”) result, or one type of result. More precisely,
for the navigational search topics, in addition to web results,
respondents also indicated a preference for news and video results. This
may be due to the fact that, since an aggregated result page is often
provided for most queries by modern search engines⁴, users are exposed
to diverse results and, as a consequence, results other than web have
now gained prominence. However, whether providing diverse results for
informational and transactional information needs facilitates task
completion, and/or increases user satisfaction, requires further
investigation.
   Third, users’ preferences for the ‘type’ of results vary with the
query category. For instance, for navigational and transactional search
topics, web and news results seem to be preferred. The preference is
more mixed for informational search topics, with image results least
preferred. In itself, it is not surprising that users’ preferences vary
with query categories. However, concrete knowledge regarding which
‘types’ of sought results are preferred would allow for more appropriate
aggregation of the different verticals under consideration. Similar
investigations were carried out in [12] by Sushmita et al., where
associations between query classifications (e.g., arts, health, etc.)
and result types were indeed identified. Such knowledge may then be used
by search systems to present particular types of result for different
queries; for example, a system may not present (or may demote in
importance) image results in response to an informational query.

⁴ http://www.slideshare.net/rankabove/com-score-rankabove-final

5.    CONCLUSION AND FUTURE WORK
We presented the analysis of a survey of 117 respondents’ preferences
regarding the different types of results for navigational,
informational, and transactional information needs. Although the study
is small in terms of the number of users, and acknowledging the
limitations of an online survey, interesting insights emerged from our
investigation. The outcomes of the survey support the aggregated search
paradigm, showing that users’ preferences are for a diverse range of
result types. The analysis also indicates a need to revisit the
definition of the three categories of information needs [1], within the
context of aggregated search. This work initiates two future research
questions: (1) What information needs exist within the context of
aggregated search? and (2) How can suitable results satisfying those
information needs be identified?

6.    REFERENCES
 [1] A. Broder. A taxonomy of web search. Journal of SIGIR Forum, 2002.
 [2] P. Borlund. The IIR evaluation model: a framework for evaluation of
     interactive information retrieval systems. JASIST, 2003.
 [3] L.A. Granka, T. Joachims & G. Gay. Eye-tracking analysis of user
     behavior in WWW search. SIGIR, 2004.
 [4] S.A. Grandhi, Q. Jones & S. Karam. Sharing the big apple: a survey
     study of people, place and locatability. SIGCHI, 2005.
 [5] M. Kellar, C. Watters & M. Shepherd. A Goal-based Classification of
     Web Information Tasks. ASIST, 2006.
 [6] H. Dai, L. Zhao, Z. Nie, J.-R. Wen, L. Wang & Y. Li. Detecting
     online commercial intention. WWW, 2006.
 [7] B.J. Jansen, D.L. Booth & A. Spink. Determining the user intent of
     web search engine queries. WWW, 2007.
 [8] M.R. Morris. A survey of collaborative web search practices.
     SIGCHI, 2008.
 [9] B.J. Jansen, D.L. Booth & A. Spink. Determining the informational,
     navigational, and transactional intent of Web queries. IP&M, 2008.
[10] B. Chew, J.A. Rode & A. Sellen. Understanding the Everyday Use of
     Images on the Web. NordiCHI, 2008.
[11] S. Stamou & L. Kozanidis. Impact of search results on user queries.
     WSDM, 2009.
[12] S. Sushmita, H. Joho, M. Lalmas & J.M. Jose. Understanding domain
     “relevance” in web search. WSSP at WWW, 2009.
[13] J. Arguello, F. Diaz, J. Callan & J.-F. Crespo. Sources of evidence
     for vertical selection. SIGIR, 2009.
[14] S. Sushmita, B. Piwowarski & M. Lalmas. Dynamics of Domains and
     Genre Intent. AIRS, 2010.
[15] S. Sushmita, H. Joho, M. Lalmas & R. Villa. Factors affecting click
     through behavior in aggregated interface. CIKM, 2010.
[16] S.E. Lindley, S. Meek, A. Sellen & R. Harper. “It’s simply integral
     to what I do”: enquiries into how the web is weaved into everyday
     life. WWW, 2012.
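
The Bonferroni step of the post-hoc analysis described in Section 3 can be illustrated with a minimal Python sketch. It is not the authors' analysis script. It assumes (a) that the p-values in Tables 3 to 5 are unadjusted, (b) a family-wise significance level of 0.05, which the paper does not state, and (c) that each category involves all 10 pairwise comparisons among the five result types. The function names are hypothetical.

```python
from itertools import combinations

RESULT_TYPES = ["Web", "Image", "Video", "News", "Others"]
ALPHA = 0.05  # assumed family-wise significance level (not stated in the paper)

# Five result types give 10 unordered pairs, matching the 10 rows of Tables 3-5.
n_pairs = len(list(combinations(RESULT_TYPES, 2)))


def bonferroni_adjust(p_value, n_comparisons):
    """Multiply a raw p-value by the number of comparisons, capped at 1."""
    return min(1.0, p_value * n_comparisons)


def is_significant(p_value, n_comparisons, alpha=ALPHA):
    """True if the Bonferroni-adjusted p-value falls below alpha."""
    return bonferroni_adjust(p_value, n_comparisons) < alpha


# A few p-values copied from Tables 3-5 (assumed here to be unadjusted).
reported = {
    ("Navigational", "Image - Video"): 0.1814,
    ("Navigational", "Video - Others"): 0.0001,
    ("Informational", "News - Video"): 0.5583,
    ("Transactional", "News - Video"): 0.021,
}

for (category, pair), p in reported.items():
    adjusted = bonferroni_adjust(p, n_pairs)
    verdict = "significant" if is_significant(p, n_pairs) else "not significant"
    print(f"{category:14s} {pair:15s} raw p = {p:.4f}  "
          f"adjusted p = {adjusted:.4f}  {verdict}")
```

Under these assumptions the transactional News - Video comparison (raw p = 0.021) does not survive the correction, which would agree with the statement in Section 3 that video was equally preferred to news for transactional topics. If the tabulated values were already adjusted, that pair would instead count as significant at 0.05.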