Position Paper: Promoting User Engagement and Learning in Search Tasks By Effective Document Representation

Piyush Arora, Gareth J. F. Jones
ADAPT Centre, School of Computing, Dublin City University, Dublin 9, Ireland
{parora,gfjones}@computing.dcu.ie

Search as Learning (SAL), July 21, 2016, Pisa, Italy. The copyright for this paper remains with its authors. Copying permitted for private and academic purposes.

ABSTRACT

Much research in information retrieval (IR) focuses on optimization of the rank of relevant retrieval results for single-shot ad hoc IR tasks with straightforward information needs. Relatively little research has been carried out to study and support user learning and engagement for more complex search tasks. We introduce an approach intended to improve the topical knowledge of a user while undertaking IR tasks. Specifically, we propose to explore methods of finding useful and informative textual units (semantic concepts) within retrieved documents, with the objective of creating improved document surrogates for presentation within the search process. We hypothesize that this strategy will promote improved implicit learning within search activities. We believe that the richer document representations proposed in the paper would help to promote engagement, understanding and learning as compared to more traditional search engine document snippets. We propose a framework for holistic evaluation of our proposed document representations and their use in search.

1. INTRODUCTION

Search engines are used to support a wide range of different task types ranging from simple fact finding to complex topic exploration. At present, search engines are optimized for look-up tasks where the searcher is generally looking for a specific piece of information, and not for tasks that require multiple interactions with information to examine a topic in a less focused way.

Jiang et al. report an experimental study of known-item, known-subject, interpretive and exploratory tasks [1]. This study showed that users often become less interested in the results after a few search iterations, since the results overlap significantly or include very similar information. Further, they demonstrated that presenting information using current document surrogate methods does not work well for different types of search tasks and information needs.

Document surrogates are the primary way in which potentially interesting documents are revealed to searchers, and are expected to reflect the information contained in the documents which is significant to the current query. However, in practice it is often hard to infer the contents of individual ranked results just by reading the surrogates used by current search engines, e.g. Google (https://www.google.com) and Bing (https://www.bing.com/), as shown in Figure 2.

Further, information in current surrogates is only intended to indicate the relevance of a document rather than to provide important and useful information about the document to address the information need or to improve user topical knowledge. We believe it is important to address this limitation of current search systems, and to study how to select and present information to users for different information needs, in order to measure and promote learning and engagement effectively through the development of enhanced document surrogates. Commercial search systems such as Google have already started using different methods of information representation for many popular known-item queries, rather than displaying only surrogates to view the information. This motivates us further to search for alternate representations for complex search tasks for more general information needs regardless of their popularity. A sample collection of queries and alternative document representations gathered during our initial investigation using the Google search system is available at the following link: https://goo.gl/Tz2w1y.

Our Position: We propose that providing informative surrogates would lead to an overall improved user search experience in terms of satisfaction and increase in topical knowledge, which will promote user engagement and implicit learning within a search process. We hypothesize that developing technologies to select and present information from documents which capture relevant topical information will promote the topical knowledge of a user in a search session for more complex search tasks. In this paper, we discuss a mechanism of user information presentation by developing alternative representations of document surrogates using more meaningful summaries and concepts; one such alternative representation is shown in Figure 3.

Motivation: The motivation for our study is based on the observation that as we read documents, we consume and interact with information, and we increase our knowledge about a particular topic and theme. Thus the principal question that needs to be investigated is: How can retrieved information be selected and arranged for search tasks to encourage and enhance user learning and engagement?

2. PROPOSITION

We believe it is important to study the improvement in the topical knowledge of a user with respect to the query, and to develop methods for analyzing documents by measuring concepts within them to promote user learning and engagement. We hypothesize that understanding document similarities and differences at the concept level can help us to identify cues to better formulate document surrogates (in terms of the document summaries, topics, visualisations etc.) to improve the learning experience.

There has been considerable work on developing and analyzing alternative surrogates by varying summary length and representation, for example by adding thumbnails or important sentences from the document [2, 4]. We propose to use the motivation and findings from this previous work with the aim of capturing human topical knowledge and learning. Our work differs from this earlier work since our main focus is on selection and representation of information from a document as informative summaries and meaningful concepts inclined towards improving learning and increasing topical knowledge, rather than only indicating the relevance of a document.

Learning has been measured in terms of query reformulation strategies [3]; we propose to measure learning in terms of how people interact with and consume different parts of the presented information based on their prior topical knowledge. Analyzing user interaction with the enhanced representation and measuring the changes in the concepts, in terms of clicks, views etc., in a session can indicate the improvement in user learning. Describing document content at the concept level would also enable the presentation of relevant information, and support the presentation of interesting and intriguing facts related to the user query.

Figure 1: Architecture for measuring engagement and learning in a search task

3. PROPOSED METHODOLOGY

In this research, we propose a new method of result representation that analyzes the initially retrieved results to extract meaningful units from the documents. Once we identify the relevant and important information in a document, we can represent it effectively as a meaningful surrogate to be presented to the user. An example of our proposed alternative representation is shown in Figure 3, where the relevant concepts for a document on "Depression Symptoms and Treatments" are indicated. Figure 2 shows the traditional surrogate representation used by the current Google search system.

Figure 1 shows the proposed system architecture for supporting enhanced representation. There are two main processes added to the traditional search model:

• Concept Extraction: this deals with forming models and a framework to extract relevant and important information from a document to be represented effectively, e.g. using a knowledge base or resources such as Wikipedia, DMOZ etc.
• Human learning modeling: this aims at forming models to measure the increase in topical knowledge of a user by analyzing their interactions with the enhanced representation.

The main challenges to be investigated and addressed within this proposed framework are contained in the following Research Questions (RQs):

• RQ1: How can we build efficient models to find and compare semantic concepts in documents?
  – In RQ1, the task is to identify effective methods for extracting semantic concepts to represent documents.
• RQ2: How can we provide representations of documents using semantic concepts?
  – In RQ2, the task is to explore different methods of presenting meaningful summaries and interpretive concepts to users in the form of enriched surrogates which promote engagement and learning during search.
• RQ3: Do user learning and engagement change when results are presented using more descriptive and interpretive concepts?
  – In RQ3, the task is to measure learning and engagement by conducting task-based experimental user studies. The main idea is to measure the relation between two important aspects: increase in topical knowledge and enhanced results representation.

Figure 2: Surrogate from Google search engine

Figure 3: Enhanced surrogate with marked semantic concepts

Evaluation of alternative representations should be carried out by studying search effectiveness in a task-based study in terms of different facets, such as user search experience and satisfaction, and the increase in users' topical knowledge, to measure engagement and learning. We believe that investigating these facets will improve understanding of information representation in search engines to promote more active, interactive and engaged search processes.

4. CONCLUSION

In this paper, we propose to study the use of richer document representations to understand and promote user learning and engagement, as compared to traditional search engine document surrogates for the same user queries. We highlight the advantages of richer alternate representations as compared to state-of-the-art representations. We also present the main challenges which need to be addressed to support improved document representation. We believe that this research will help to measure and promote user engagement and learning effectively, and will also lead to effective representation of summaries of web documents for mobile and emerging search environments in future.

Acknowledgment: This research is supported by Science Foundation Ireland (SFI) as a part of the ADAPT Centre at Dublin City University (Grant No: 12/CE/I2267).

5. REFERENCES

[1] J. Jiang, D. He, and J. Allan. Searching, browsing, and clicking in a search session: changes in user behavior by task and over time. In Proceedings of ACM SIGIR 2014, pages 607–616.
[2] H. Joho and J. M. Jose. Effectiveness of additional representations for the search result presentation on the web. Information Processing & Management, 44(1):226–241, 2008.
[3] P. Vakkari. Exploratory searching as conceptual exploration. Proceedings of HCIR, pages 24–27, 2010.
[4] R. W. White, J. M. Jose, and I. Ruthven. Using top-ranking sentences to facilitate effective information access. JASIST, 56(10):1113–1125, 2005.