Massive Implicit Feedback: Organizing Search Logs into Topic Maps for Collaborative Surfing

Xuanhui Wang and ChengXiang Zhai
Department of Computer Science
University of Illinois at Urbana-Champaign
{xwang20, czhai}@illinois.edu

Copyright is held by the author/owner(s).
SIGIR’09, July 19-23, 2009, Boston, USA.

1. INTRODUCTION

Current search engines heavily emphasize direct querying, which tends to work well only for simple information needs such as navigational queries. However, direct querying may not support complex information needs such as exploratory search well [4, 11], since users’ interactions are mainly limited to submitting a query, viewing results, and reformulating queries [1]. As a way of information seeking that complements querying, browsing can be very useful for exploratory search or information foraging [5]. Unfortunately, with current search engines, browsing is mostly limited to following hyperlinks or navigating through structures consisting of a fixed set of categories or other available metadata [4, 2].

We have been developing a new collaborative surfing system that enables users to go beyond hyperlinks and browse flexibly for ad hoc information needs. Our main idea is to view search logs as information footprints left by users as they navigate the information space, and to organize these footprints into a multi-resolution topic map. The map makes it possible for users to navigate flexibly in the information space by following the footprints left by other users. As new users use the map for navigation, they leave more footprints, which can then be used to enrich and refine the map dynamically and continuously for the benefit of future users. Thus, by turning search logs into a topic map, we can establish a sustainable infrastructure that helps users surf the information space in a collaborative manner. Preliminary experimental results show that the topic map is effective in helping users satisfy exploratory information needs [8]. In the following, we describe our system in more detail and discuss its potential impact on understanding users for improving information seeking.

2. SYSTEM DESCRIPTION

Figure 1 shows the interface of our system, which is implemented as a meta-search engine interacting with Google. The interface has three panes: (1) The top pane is a query box, where a user can submit a keyword query. (2) The left pane shows a portion of a multi-resolution topic map built from search logs, where a user can click on a node to navigate into a topic region. (3) The right pane displays information corresponding to a topic region, including the clickthroughs made by previous users when they visited the topic region and the documents covered by the topic region.

Figure 1: Interface snapshot of the demo system.

These three panes allow a user to navigate in the information space in large, medium, and small steps, respectively. With the query box in the top pane, a user can make a long-distance navigation into any topic region (i.e., “large steps”); with the topic map in the left pane, the user can navigate into related topic regions (i.e., “medium steps”); and with the display of a topic region in the right pane, the user can navigate by following hyperlinks (i.e., “small steps”). A user can take any of these three navigation actions at any time. Thus our system implements a unified information seeking model where both querying and browsing are viewed as ways to navigate in the information space.

Inside the system, when a user submits a query, the system displays the most relevant part of the topic map in the left pane and shows the search results from Google for the query. When the user clicks on a node of the map (corresponding to a topic region), the system automatically updates the right pane to show the corresponding search results, using a query constructed from the selected node. In general, the right pane is always synchronized with the left pane to show the documents corresponding to the current node on the map.

The topic map promotes browsing and can naturally support exploratory search. For example, a user who wants to arrange a house can start with the query “table,” zoom into “dining table,” zoom out to “dining,” move horizontally to “kitchen,” and further move to “appliance.” From “table,” this user can also move horizontally to “chair,” to “desk,” or to “tablecloth.” Another example is “wedding.” From “wedding,” we can zoom into different aspects of weddings such as “wedding dress,” “wedding vows,” etc. We can also move horizontally to “vacation,” “honeymoon,” or “hotels.” All these browsing traces can be leveraged to infer users’ underlying information needs and to better serve users with complex exploratory information needs. The browsing logs can also be leveraged to improve the map and further help future users who have similar information needs.

A main technical challenge in developing this system is constructing the topic maps. Currently, the nodes in a topic map are valid queries in the search logs. All queries with the same number of keywords belong to the same level. The children of a map node are obtained by adding a keyword to the current query, and the neighbors of the node are obtained by substituting a keyword in the current query. All these surrounding nodes are ranked. Specifically, we rely on term co-occurrence in search logs to construct such a map; the technical details can be found in [10].

3. MASSIVE IMPLICIT FEEDBACK

From the viewpoint of understanding users and exploiting user information to provide better search support, our system implements a strategy of massive implicit feedback [3, 7, 6, 9], where the query logs and browsing logs of all users are captured and leveraged to provide better support for future users in both querying and browsing. Indeed, the implicit feedback information collectable by the system includes not only the queries and clickthroughs available in a current search engine but also the browsing traces left by users as they use the map. The system treats all these different kinds of user information uniformly as “information footprints” left by users and organizes them into a topic map to deliver benefits to future users. At the same time, new users leave new footprints, which allows the system to grow continuously over time and improve its support for browsing and querying. Thus, the system enables collaborative surfing where users help each other through sustained massive implicit feedback.

We hope our demo can stimulate discussions about many interesting questions related to the workshop: (1) How should we evaluate topic maps? (2) How should we evaluate such an interactive system? (3) How can we formally model a user based on both query logs and browsing logs? (4) How can we leverage maps to clarify user interests?

4. REFERENCES

[1] N. J. Belkin, S. T. Dumais, J. Scholtz, and R. Wilkinson. Evaluating interactive information retrieval systems: opportunities and challenges. In CHI Extended Abstracts, pages 1594–1595, 2004.
[2] M. A. Hearst. Clustering versus faceted categories for information exploration. Commun. ACM, 49(4):59–61, 2006.
[3] D. Kelly and N. J. Belkin. Display time as implicit feedback: understanding task effects. In Proceedings of ACM SIGIR 2004, pages 377–384, 2004.
[4] G. Marchionini. Exploratory search: from finding to understanding. Commun. ACM, 49(4):41–46, 2006.
[5] P. L. T. Pirolli. Information Foraging Theory: Adaptive Interaction with Information. Oxford University Press, June 2004.
[6] X. Shen, B. Tan, and C. Zhai. Implicit user modeling for personalized search. In Proceedings of CIKM 2005, pages 824–831, 2005.
[7] J. Teevan, S. T. Dumais, and E. Horvitz. Personalizing search via automated analysis of interests and activities. In Proceedings of ACM SIGIR 2005, pages 449–456, 2005.
[8] X. Wang, B. Tan, A. Shakery, and C. Zhai. Search logs as information footprints: Supporting guided navigation for exploratory search. Technical Report UIUCDCS-R-2008-3001, University of Illinois, 2008. https://www.ideals.uiuc.edu/bitstream/handle/2142/10971/UIUCDCS-R-2008-3001.pdf.
[9] X. Wang and C. Zhai. Learn from web search logs to organize search results. In Proceedings of ACM SIGIR 2007, pages 87–94, 2007.
[10] X. Wang and C. Zhai. Mining term association patterns from search logs for effective query reformulation. In Proceedings of CIKM 2008, pages 479–488, 2008.
[11] R. W. White and R. A. Roth. Exploratory Search: Beyond the Query-Response Paradigm. Morgan and Claypool, 2009.
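
As an illustration of the map structure described in Section 2 (nodes are logged queries, a node's level is its number of keywords, children add one keyword, neighbors substitute one keyword), the following is a minimal sketch and not the system's actual implementation. The function names and the sample log are hypothetical. Treating a query as an unordered set of keywords and ranking surrounding nodes by raw query frequency are simplifying assumptions made here; the system itself relies on term co-occurrence, as detailed in [10].

```python
from collections import Counter


def normalize(query):
    """Represent a query as a frozenset of lowercase keywords.

    Ignoring word order is an assumption of this sketch.
    """
    return frozenset(query.lower().split())


def build_index(log):
    """Map each distinct logged query (a map node) to its frequency in the log."""
    return Counter(normalize(q) for q in log if q.strip())


def level(node):
    """Queries with the same number of keywords sit on the same level."""
    return len(node)


def rank(nodes, index):
    """Order nodes by log frequency (a stand-in for the co-occurrence ranking of [10])."""
    return sorted(nodes, key=lambda q: (-index[q], sorted(q)))


def children(node, index):
    """Zoom in: logged queries formed by adding one keyword to the node."""
    return rank([q for q in index if len(q) == len(node) + 1 and node < q], index)


def parents(node, index):
    """Zoom out: logged queries formed by dropping one keyword from the node."""
    return rank([q for q in index if len(q) == len(node) - 1 and q < node], index)


def neighbors(node, index):
    """Move horizontally: logged queries that substitute exactly one keyword."""
    return rank(
        [q for q in index
         if len(q) == len(node) and len(q & node) == len(node) - 1],
        index,
    )


def label(node):
    """Readable label for a node (keywords in alphabetical order)."""
    return " ".join(sorted(node))


# Hypothetical sample log, used only to exercise the functions above.
SAMPLE_LOG = [
    "table", "table", "dining table", "dining table", "kitchen table",
    "dining", "chair", "chair", "chair", "desk", "dining room",
]

if __name__ == "__main__":
    index = build_index(SAMPLE_LOG)
    start = normalize("table")
    print("zoom in:", [label(q) for q in children(start, index)])
    print("move horizontally:", [label(q) for q in neighbors(start, index)])
```

With this representation, every single-keyword query is a substitution neighbor of every other single-keyword query, so a realistic ranking (such as the co-occurrence statistics of [10]) is what keeps the list of neighbors useful; frequency is used here only to keep the sketch self-contained.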