=Paper=
{{Paper
|id=Vol-2903/IUI21WS-ESIDA-4
|storemode=property
|title=PaperExplorer: Personalized Exploratory Search for Conference Proceedings
|pdfUrl=https://ceur-ws.org/Vol-2903/IUI21WS-ESIDA-4.pdf
|volume=Vol-2903
|authors=Behnam Rahdari,Peter Brusilovsky
|dblpUrl=https://dblp.org/rec/conf/iui/RahdariB21
}}
==PaperExplorer: Personalized Exploratory Search for Conference Proceedings==
Behnam Rahdari, Peter Brusilovsky

School of Computing and Information, University of Pittsburgh, 135 North Bellefield Avenue, Pittsburgh, PA 15260, USA

Joint Proceedings of the ACM IUI 2021 Workshops, April 13-17, 2021, College Station, USA

Email: ber58@pitt.edu (B. Rahdari); peterb@pitt.edu (P. Brusilovsky). ORCID: 0000-0001-6514-912X (B. Rahdari); 0000-0002-1902-1464 (P. Brusilovsky)

© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

===Abstract===
This paper presents our attempt to create an exploratory search system, PaperExplorer, for a historic archive of conference proceedings. PaperExplorer uses concept extraction, knowledge graphs, and user-controlled recommendation to assist users with various levels of domain expertise in their information needs.

Keywords: Exploratory Search, Knowledge Graph, Information Exploration, Intelligent Interface

===1. Introduction and Background===
Exploratory search systems form an increasingly popular category of information access and exploration tools. These systems creatively combine search, browsing, and information analysis steps, shifting user effort from recall (formulating a query) to recognition (e.g., selecting a link) and helping users gradually learn more about the explored domain [1].

In this paper we present our attempt to augment the set of search systems focused on conference proceedings with a personalized exploratory search system, PaperExplorer (http://scythian.exp.sis.pitt.edu/ht/). We hope that PaperExplorer's ability to support information discovery, learning-while-searching, and personalization could help a broader set of users benefit from the assembled collection of conference proceedings.

====1.1. Exploratory Search====
A number of real-life search tasks require a considerable amount of learning during the search process to achieve adequate results. These tasks are known as exploratory search tasks [2]. Since simple search systems are usually not efficient in supporting exploratory search tasks, a range of specialized systems have been developed and evaluated.

More recently, a few projects in this area demonstrated that the effectiveness of exploratory search could be improved by using a personalized system, which builds a profile of user interests and adapts to the individual user [3]. The work presented in this paper investigates the ideas of profile-based exploratory search in the context of finding research publications related to a certain conference.

====1.2. Controllability====
User controllability has been recognized as a valuable component of advanced information access interfaces. The ideas of controllability were made popular by a stream of work on user-controllable recommender systems [4]. However, the value of extended user control has also been demonstrated in the area of exploratory search.

For example, NameSieve [5] presented a summary of search results in the form of entity clouds, which enabled controllable filtering and exploration of results. PeopleExplorer [6] offered users an option to re-sort people search results based on multiple user-related factors. uRank [7] introduced a controllable interface for refining and reorganizing search results, and SciNoon [8] simplifies the exploratory search process for scientific groups.

====1.3. Open User Profile====
The idea of applying open user profiles (also known as open user models) to better support personalized information access was among the early ideas explored in this field. Open user profiles allow users to examine and possibly change the content of their interest profiles, which are used to personalize their search or browsing process. Since open user profiles increase the interactivity, transparency, and controllability of the information exploration process, their application is a good match for the nature of exploratory search. While the first attempts to introduce "bag-of-words" open user profiles had mixed success [9], more recent work focused on semantic-level user profiles demonstrated their potential for personalized exploratory search [3, 10].

We start the paper with a presentation of the PaperExplorer interface and follow with details on the concept extraction, knowledge graph organization, and recommendation that enable the work of this interface.

===2. The Interface of PaperExplorer===
Personalized information exploration in PaperExplorer is centered around the user interest profile [11], a collection of concepts represented by keyphrases that express user interests. Unlike traditional search, which requires users to specify all keyphrases in a query, PaperExplorer supports users in the gradual discovery and refinement of their interests. It also allows users to control the importance of each keyphrase in recommending relevant results. The PaperExplorer interface consists of the following main sections.

Figure 1: Interface design of PaperExplorer, representing different parts of the system.

====2.1. Instant Search Box====
The search box (Figure 1A) is the gateway to the system. The instant search approach allows users to discover relevant keyphrases representing concepts of interest without a fully formulated query. When a user starts typing a query, a series of matching keyphrases appears, helping the user discover concepts of interest (e.g., User Interfaces and User Modeling). When an item is selected from the list, it is automatically added to the slider area (Figure 1C). At the same time, an updated list of search results is presented to the user.

====2.2. Recommended Keyphrases====
When at least one keyphrase has been added to the user's profile, the system recommends five semantically similar concepts (shown as keyphrases) in the Similar keyphrases area of the interface (Figure 1B). Users can add recommended keyphrases to their interest profiles by clicking on the plus button to the right of each keyphrase. As the user's profile grows and is refined, the set of recommended concepts is updated, since the system recommends instances similar to all concepts in the user's profile. Each recommended concept also comes with a short description. Clicking on the question mark button next to the add button opens a separate window containing the abstract of that concept's Wikipedia entry.

====2.3. Open User Profile====
The slider area (Figure 1C) displays the current user interest profile. PaperExplorer implements a content-based recommendation approach, which generates the list of recommended results (Figure 1D) using the profile. To support the transparency and controllability of this process, the interest profile is visible to and directly editable by the end users.

To build the profile, the user can add relevant concepts represented by keyphrases, as explained above, and remove less relevant keyphrases (using the red x) as they discover more relevant concepts or explore different interests.

Sliders associated with each keyphrase enable users to control the relative importance of the represented concept compared to others in their profile, ranging from 1 (least important) to 10 (most important). The use of sliders for fine-tuning the user profile was motivated by the keyword tuning approach in uRank [7], which was confirmed to be user-friendly and efficient in an exploratory search context. All actions within the profile (adding or removing keyphrases, or adjusting sliders) immediately affect the search results list.

====2.4. Search Results====
As soon as the user adds the first keyphrase to the interest profile, a table of the 20 most relevant publications is generated (Figure 1D). The first column of the table visualizes the combined relevance between the keyphrases in the user interest profile and each result. The colors in the stacked bar (Figure 1D1) match the colors of the sliders in the profile, and the size and opacity of each bar express the relevance of the result to each profile keyphrase.

The second column of the table lists the titles of relevant publications. Clicking on a title expands a window that holds the abstract of the paper. The mentioned keyphrases are highlighted with the corresponding colors. The opacity of each color reflects the relevance of the keyphrase to the paper and the current value of the slider for that keyphrase. To further assist users, PaperExplorer underlines all available keyphrases in the text (both in the title and the abstract).

Hovering over an underlined portion of the text opens a popup window (Figure 1D2) that enables the user to (1) see the relevance of the keyphrase to the text in the form of a vertical bar chart, (2) add the keyphrase directly to the interest profile, and (3) report improper keyphrases to the administrator for removal. The last option helps us improve the quality of the extracted keyphrases and eliminate occasional errors in the extraction process.

===3. The Knowledge Graph===
The knowledge graph consists of three main entity types (publications, authors, and keyphrases) and their relationships, extracted from our data set and hosted in the native graph database Neo4j (https://en.wikipedia.org/wiki/Neo4j).

Figure 2 presents a schematic representation of the knowledge graph. Authors are interconnected by the relation Co-Author (based on co-authorship) and connected to papers by the relation Published. Papers are connected to keyphrases by the Has-Key relationship. The latter carries a weight that determines the strength of the relationship between each keyphrase and the publication.

Figure 2: Graph schema representing the entities of the knowledge graph and the relationships between them.

====3.1. Data Source and Keyphrase Extraction====
We used the collection of proceedings from two main conferences (Hypertext and UMAP) as the main source of data to build the knowledge graph and extract the keyphrases. This collection covers all publications of these two conferences from 2008 to 2020. Using this dataset and the concept extraction explained below, we generated a knowledge graph covering 2023 publications. 14404 keyphrases were extracted from the titles and abstracts of these publications.

We used TopicRank [12], a graph-based keyphrase extraction method, to extract the initial set of candidate keyphrases from the title and abstract of each publication. We then used the Wikipedia API to filter all extracted keyphrases; only keyphrases with an entry in Wikipedia were kept in the knowledge graph. We further assigned a weight to each publication-keyphrase pair using the cosine similarity between the bags-of-words extracted from the Wikipedia page and the publication.

===4. Profile-Based Search===
We deployed a two-phase search process to produce the most relevant results based on the user interest profile. In the first phase, a primary list of candidates is selected from the graph; the second phase ensures that the results are presented to the user in the right order based on their relevance to the query. We describe these two phases in more detail below.

Candidate selection: We used the Cypher query language to generate the initial list of candidate publications. At each user interaction with the system (e.g., adding or removing keyphrases or tuning the sliders), the system considers all publications connected to at least one of the concepts of interest in the user profile.

Reordering the results: After generating the list of candidate results, the system rearranges the results so that the most relevant ones appear at the top of the list. To do that, the system first generates a complete list of the keyphrases that appear in the text (title and abstract) of each publication, along with their relevance scores (weights). Then, for every keyphrase that exists in the user interest profile, the weight is multiplied by the value of the corresponding slider. Finally, a relevance score is assigned to each candidate based on the candidate's similarity to each of the profile concepts and the values of the sliders.

===5. Experience and Future Work===
The PaperExplorer system has been deployed online and demonstrated to several target users. The early results indicate that the success of the system depends to a considerable extent on the quality of keyphrase extraction. We are interested in collaborating with experts on keyphrase extraction to develop approaches optimized for exploratory search.

===References===
[1] R. W. White, B. Kules, S. M. Drucker, et al., Supporting exploratory search, Communications of the ACM 49 (2006) 36–39.

[2] G. Marchionini, Exploratory search: From finding to understanding, Communications of the ACM 49 (2006) 41–46.

[3] F. Bakalov, B. König-Ries, A. Nauerz, M. Welsch, IntrospectiveViews: An interface for scrutinizing semantic user models, in: 18th International Conference on User Modeling, Adaptation, and Personalization, Springer, 2010, pp. 219–230.

[4] B. P. Knijnenburg, S. Bostandjiev, J. O'Donovan, A. Kobsa, Inspectability and control in social recommenders, in: 6th ACM Conference on Recommender Systems, 2012, pp. 43–50.

[5] J.-w. Ahn, P. Brusilovsky, J. Grady, D. He, R. Florian, Semantic annotation based exploratory search for information analysts, Information Processing & Management 46 (2010) 383–402.

[6] S. Han, D. He, J. Jiang, Z. Yue, Supporting exploratory people search: a study of factor transparency and user control, in: Proceedings of the 22nd ACM International Conference on Information & Knowledge Management, ACM, 2013, pp. 449–458.

[7] C. di Sciascio, V. Sabol, E. E. Veas, Rank as you go: User-driven exploration of search results, in: 21st International Conference on Intelligent User Interfaces, 2016, pp. 118–129.

[8] Y. Nedumov, A. Babichev, I. Mashonsky, N. Semina, SciNoon: Exploratory search system for scientific groups, in: IUI 2019 Workshop on Exploratory Search and Interactive Data Analytics, 2019. URL: http://ceur-ws.org/Vol-2327/IUI19WS-ESIDA-3.pdf.

[9] J.-w. Ahn, P. Brusilovsky, J. Grady, D. He, S. Y. Syn, Open user profiles for adaptive news systems: help or harm?, in: the 16th International Conference on World Wide Web, WWW '07, ACM, 2007, pp. 11–20.

[10] T. Ruotsalo, G. Jacucci, S. Kaski, Interactive faceted query suggestion for exploratory search: Whole-session effectiveness and interaction engagement, Journal of the Association for Information Science and Technology 71 (2020) 742–756.

[11] B. Rahdari, P. Brusilovsky, D. Babichenko, Personalizing information exploration with an open user model, in: 31st ACM Conference on Hypertext and Social Media (HT '20), Association for Computing Machinery, New York, NY, USA, 2020, p. 0. doi:10.1145/3372923.3404797.

[12] A. Bougouin, F. Boudin, B. Daille, TopicRank: Graph-based topic ranking for keyphrase extraction, in: Proceedings of the Sixth International Joint Conference on Natural Language Processing, Asian Federation of Natural Language Processing, Nagoya, Japan, 2013.
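
Illustrative note on the profile-based search: the two-phase process (candidate selection, then reordering by slider-weighted keyphrase relevance) and the bag-of-words cosine weighting can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The paper does not say how the per-keyphrase products are aggregated, so summation is assumed here; the original system selects candidates with Cypher queries against Neo4j, which is replaced by an in-memory dictionary; and the names `cosine_similarity`, `rank_candidates`, and the sample data are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(bag_a, bag_b):
    """Cosine similarity between two bags-of-words given as token lists."""
    counts_a, counts_b = Counter(bag_a), Counter(bag_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty bag is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_candidates(profile, publications, top_n=20):
    """Return the top_n (title, score) pairs for a user profile.

    profile:      keyphrase -> slider value (1 = least, 10 = most important)
    publications: title -> {keyphrase -> weight of the keyphrase in that paper}
    """
    scored = []
    for title, weights in publications.items():
        # Phase 1 (candidate selection): keep only publications connected
        # to at least one keyphrase in the profile.
        if not any(k in weights for k in profile):
            continue
        # Phase 2 (reordering): multiply each matching keyphrase weight by
        # its slider value; summing the products is an assumption.
        score = sum(weights[k] * s for k, s in profile.items() if k in weights)
        scored.append((title, score))
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


# Hypothetical data: two profile keyphrases and three publications.
profile = {"user modeling": 10, "hypertext": 2}
publications = {
    "Paper A": {"user modeling": 0.25, "hypertext": 0.75},
    "Paper B": {"user modeling": 0.5},
    "Paper C": {"social media": 0.75},
}
ranking = rank_candidates(profile, publications)
```

With these sample values, "Paper C" is dropped during candidate selection because it shares no keyphrase with the profile, and raising the slider for "hypertext" would move "Paper A" above "Paper B", which mirrors how slider changes are described as immediately affecting the result list.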