=Paper=
{{Paper
|id=Vol-1628/DC_3
|storemode=property
|title=Intelligent Nudging to Support Interactive Exploration of a Data Graph
|pdfUrl=https://ceur-ws.org/Vol-1628/DC_3.pdf
|volume=Vol-1628
|authors=Marwan Al-Tawil
|dblpUrl=https://dblp.org/rec/conf/ht/Al-Tawil16
}}
==Intelligent Nudging to Support Interactive Exploration of a Data Graph==
Intelligent Nudging to Support Interactive Exploration of Big Data Graphs

Marwan Al-Tawil, School of Computing, University of Leeds, UK, scmata@leeds.ac.uk

===ABSTRACT===
This research investigates how to support user exploration through big data graphs. Current successful approaches to interactive exploration take into account utility from the user's point of view. In this PhD, we focus on knowledge utility: how useful the trajectories in a data graph are for expanding the user's domain knowledge. The main goal of this research is to design intelligent nudging techniques that direct the user to ‘good’ trajectories for knowledge expansion. Our earlier work, which investigated empirical nudging strategies for users' exploration, suggests that paths which start with familiar and highly inclusive entities and bring in something unfamiliar are likely to increase the learning effect of the exploration. This directs us to investigate subsumption theory for meaningful learning and adopt it as our underpinning theoretical model for generating good trajectories, where familiar and highly inclusive entities are used as knowledge anchors to introduce and learn new knowledge. This calls for an automatic approach to identifying knowledge anchors in a data graph. We follow an analogy with basic level objects in domain taxonomies, which underpins our automated approach for identifying knowledge anchors. Several metrics for extracting knowledge anchors in a data graph are developed and examined.

===CCS Concepts===
G.2.2 [Mathematics of Computing]: Graph Theory - Graph algorithms; H.1.2, 4.7 [Information systems]: Data management systems - Graph-based database models, WWW - RDF.

===Keywords===
Big Data graphs; interactive exploration; knowledge utility; knowledge anchors.

===1. INTRODUCTION AND MOTIVATION===
With the emergence and growth of RDF Linked Data graphs, many applications take advantage of the knowledge encoded in the graphs to support users' interactive exploration [3, 4, 17]. Consequently, more and more big data graphs are being exposed to users for exploratory search tasks such as learning or investigating, where the users usually discover new connections and associations [12]. Layman users who are engaged in exploratory search sessions will usually have no (or limited) familiarity with the specific domain and little (or no) awareness of the knowledge encoded in the graph. In other words, the users' cognitive structures about the domain may not match the semantic structure of the data graph. This can be a major obstacle to interactive exploration, especially when the users need to learn new things, and can result in confusion and frustration.

This research aims to support users' interactive exploration in big data graphs by directing the users to trajectories that can bring some benefit (utility) for the users (e.g. efficiency, effectiveness, motivation, knowledge expansion) [12, 19]. Specifically, we focus on knowledge utility: how useful a trajectory in a data graph is for expanding one's knowledge in the domain. Earlier research has acknowledged that data graph exploration can promote expansion of domain knowledge through serendipitous learning (e.g. users discover concepts or relationships they were unaware of) [5, 15]. However, not all paths are beneficial for knowledge expansion, and ways of identifying ‘good’ trajectories are required.

Identifying good trajectories in a data graph requires anchoring entities that serve as knowledge bridges for learning new things. Our earlier work has acknowledged that when the user explores familiar entities, nudging should direct the user to explore something new [15]. This calls for investigating the subsumption theory for meaningful learning [6]. According to this theory, to incorporate new knowledge, the most familiar and inclusive entities in the user's cognition are used as knowledge anchors to subsume and learn new knowledge. Subsequently, the new knowledge can take on meaning by becoming anchored to the basic concepts in the user's cognitive structures.

However, identifying knowledge anchors in big data graphs is not a trivial task and brings forth various research challenges, including dealing with a much larger number of entities (hundreds of entities in a typical ontology versus millions of entities in a typical data graph) and the need to exploit the large number of data instances in the data graph.

The broader challenge of this PhD is to design intelligent nudging techniques that direct the user to ‘good’ trajectories through big data graphs for knowledge expansion. To meet this challenge, we address two research questions:

Question 1: How to develop automatic ways to identify data graph entities that provide knowledge anchors for navigation paths? This question focuses on the Cognitive Science notion of basic level objects (footnote 1) [7], in order to develop metrics that automatically identify knowledge anchors in a data graph.

Question 2: How can we use knowledge anchors in a data graph to design navigation paths for expanding users' domain knowledge? For the second question, subsumption theories will be adopted to nudge the user through navigation paths. We aim to maximize serendipitous learning by bringing the users first to the anchoring entities and then directing them to new and interesting concepts at different levels of abstraction in the graph.

Footnote 1: The term “basic level objects” has been used in Cognitive Science. Other developments, e.g. Formal Concept Analysis, call them “concepts”.

===2. RELATED WORK===
Recent research on exploratory search through linked data graphs has been examining different ways to provide intelligent support for users' navigation. This has brought together research from the Semantic Web, personalization, HCI and Cognitive Science to shape novel tools for interactive exploration of semantic data [14]. Work on personalized exploration includes improving search efficiency by considering user interests [8, 9, 17] or diversifying the user exploration paths with recommendations based on the browsing history [10]. Extracting semantic patterns from linked data sources to improve diversity in recommendation results has been proposed in [18]. Diversity is measured based on the semantic distance of topics and genres of the results. The work in [13] presents an approach to rank RDF statements with the expectation that some statements will be more valuable or interesting to users than other statements within some context. A wide range of tools for offering interactive exploration using linked data technologies can be found in a recent survey [12].

Our work brings a new dimension to this research stream by looking at the knowledge utility of the exploration path. We hypothesize that the cognitive learning theory of ‘meaningful learning’ [6] can be used to design paths with high knowledge utility, where new knowledge is subsumed under familiar and highly inclusive abstract entities. To identify knowledge anchors in a data graph, we operationalize the notion of basic level objects [7]. The problem of extracting important concepts from such information spaces using the notion of basic level objects has been tackled by two approaches: ontology summarization [2, 11] and Formal Concept Analysis (FCA) [22, 23]. These approaches utilize basic level objects with the aim of identifying key concepts to help domain experts in understanding and reengineering an ontology or a concept lattice, respectively.

In our work, we apply the notion of basic level objects in a data graph to identify anchoring entities which are likely to be familiar to layman users who are not domain experts. The formal framework that maps Rosch's definitions of basic level objects and cue validity [7] to data graphs is a major contribution of our work. We are unique in our use of these anchoring entities to support interactive exploration of a data graph.

===3. PROPOSED APPROACH===
We follow two approaches to address the two research questions in this work, respectively:

1. Develop automatic ways to identify data graph entities that can be used as knowledge anchors for navigation paths. We achieve this objective by adopting the notion of basic level objects, which was introduced in Cognitive Science research [7] and illustrates that domain taxonomies include category objects which are at the basic level of abstraction. Basic level categories “carry the most information, possess the highest category cue validity, and are, thus, the most differentiated from one another” [7]. We adopt two approaches to identify knowledge anchors: the distinctiveness approach, which is based on the formal definition of cue validity and identifies the most differentiated basic categories, whose attributes are associated with the category members but not with members of other categories; and the homogeneity approach, which identifies basic categories whose members share many attributes. The homogeneity approach is complementary to the distinctiveness feature. A basic category object with high cue validity will have a high number of entities common to its members.

2. Develop navigation paths using knowledge anchors. To achieve this objective we adopt subsumption strategies for meaningful learning [6]. Two subsumption strategies, the subordinate and the super-ordinate strategies, will be used [24]. On the one hand, the subordinate strategy can be used to direct the user to explore unfamiliar members linked to the anchoring entities in the data graph (i.e. nudge to explore subclass entities of an anchor). On the other hand, the super-ordinate strategy will be used when the anchoring entities are members of new, unfamiliar and more inclusive entities (i.e. nudge to explore superclass entities of an anchor).

===4. CURRENT OUTCOMES===
We formally describe and implement the metrics and the corresponding algorithms for identifying knowledge anchors [1]. The metrics were implemented by running SPARQL queries over the MusicPinta data graph stored in a triplestore [5]. The performance of the algorithms is examined using benchmarking sets with basic level entities identified by humans, corresponding to the cognitive structures humans form of the part of the world represented in the data graph. A user study based on free-naming tasks in the music domain, using the MusicPinta data graph, was carried out to identify the benchmarking sets. This resulted in two benchmarking sets: the StrongAnchors set, which includes entities closest to the human cognitive structures, and the WeakAnchors set, which includes entities people are likely to recognize when they are at a lower level of abstraction in the graph. Based on quantitative and qualitative analysis, the strengths and limitations of each metric are assessed, and a hybridization approach is proposed.

===5. CONCLUSION AND FUTURE WORK===
Interactive data exploration is becoming a key daily life activity. It involves exploring large amounts of data and deciding where to go next in the graph. The success of big data graphs in supporting interactive exploration brings forth the challenge of building intelligent approaches to nudge the user through beneficial paths with high knowledge utility. This emphasizes the importance of identifying anchoring entities in the graph that can be used to subsume and learn new knowledge.

Moving forward, the immediate future work is to apply the metrics for identifying knowledge anchors in another domain. The INSPIRE (footnote 2) data graph about career options will be used to identify anchoring career points that can assist the users in identifying paths that will be beneficial for expanding their awareness of their career options, including short or longer-term career paths. In the long run, we aim to develop nudging techniques using the subsumption strategies. An initial probing algorithm can be used to identify users' familiarity [20] (i.e. use knowledge anchors to identify new or interesting entities for the users). Another option is to have a quick probe at every knowledge anchor during the exploration. A user study will be conducted to evaluate the navigation strategies. Random paths will be our baseline for evaluating the navigation strategies.

The impact of this work is not limited to supporting data graph exploration. It can also be applied to ontology summarization, where anchoring entities allow capturing a lay person's view of the domain. Also, knowledge anchors can be used to initiate a dialog to solve the ‘cold start’ problem in personalization and adaptation.

Footnote 2: INSPIRE is a system under development at Birkbeck, University of London, about the career guidance domain, particularly career transitions.

===6. REFERENCES===
[1] Marwan Al-Tawil, Vania Dimitrova, Dhavalkumar Thakker and Brandon Bennett. Identifying Knowledge Anchors in a Data Graph. In HT2016, Halifax, Canada.

[2] Troullinou, G., Kondylakis, H., Daskalaki, E., Plexousakis, D. RDF Digest: Efficient Summarization of RDF/S KBs. In ESWC, 2015.

[3] Popov, I.O., Schraefel, M., Hall, W., Shadbolt, N. Connecting the Dots: A Multi-pivot approach to data exploration. In ISWC, 2011.

[4] Thellmann, K., Galkin, M., Orlandi, F., and Auer, S. LinkDaViz – Automatic Binding of Linked Data to Visualizations. In ISWC, 2015.

[5] Thakker, D., Dimitrova, V., Lau, L., Yang-Turner, F. & Despotakis, D. Assisting User Browsing over Linked Data: Requirements Elicitation with a User Study. In ICWE 2013.

[6] Ausubel, D. A Subsumption Theory of Meaningful Verbal Learning and Retention. In Journal of General Psychology. Volume 66, Issue 2, 1962, pp. 213-224.

[7] Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., & Boyes-Braem, P. Basic objects in natural categories. Cognitive Psychology, 1976, 8, 382-439.

[8] Sah, M. & Wade, V. Personalized Concept-based Search and Exploration on the Web of Data using Results Categorization. In ESWC 2013.

[9] Rossel, O. Implementation of a “search and browse” scenario for the LinkedData. In IESD, 2014.

[10] Vocht, et al. A Visual Exploration Workflow as Enabler for the Exploitation of Linked Open Data. In IESD, 2014.

[11] Peroni, S., Motta, E., d'Aquin, M. Identifying key concepts in an ontology through the integration of cognitive principles with statistical and topological measures. In ASWC, 2008.

[12] Marie, N., Gandon, F. Survey of linked data based exploration. In IESD@ISWC2014.

[13] Dean, M., Basu, P., Carterette, B., Partridge, C. and Hendler, J. What to Send First? A Study of Utility in the Semantic Web. In (LHD+SemQuant), 2012, @ ISWC2012.

[14] Waitelonis, J., Knuth, M., Wolf, L., Hercher, J., and Sack, H. The Path is the Destination - Enabling a New Search Paradigm with Linked Data. In LD in the Future Internet @ Future Internet Assembly, 2010.

[15] Al-Tawil, M., Thakker, D. and Dimitrova, V. Nudging to Expand User's Domain Knowledge while Exploring Linked Data. In (IESD), 2014, @ ISWC2014.

[16] Tanaka, J., & Taylor, M. Object Categories and Expertise: Is the Basic Level in the Eye of the Beholder? Cognitive Psychology, 1991, 23, pp 457-482.

[17] Marie, N., Corby, O., Gandon, F. and Ribiere, M. Composite interests' exploration thanks to on-the-fly linked data spreading activation. In Hypertext 2013.

[18] Maccatrozzo, V., Aroyo, L., Robert, R. Crowdsourced Evaluation of Semantic Patterns for Recommendations. In UMAP 2013. LBR.

[19] Nunes, T., Schwabe, D. Exploration of Semi-Structured Data Sources. In (IESD), 2014, @ ISWC2014.

[20] Al-Tawil, M., Dimitrova, V., Thakker, D. Using Basic Level Concepts in a Linked Data Graph to Detect User's Domain Familiarity. In UMAP2015, Dublin, Ireland.

[21] Thakker, D., Despotakis, D., Dimitrova, V., Lau, L., Brna, P. (2012). Taming digital traces for informal learning: A semantic-driven approach. In Proceedings of EC-TEL 2012.

[22] Belohlavek, R., Trnecka, M. Basic Level in Formal Concept Analysis: Interesting Concepts and Psychological Ramifications. In IJCAI 2013.

[23] Belohlavek, R., Trnecka, M. Basic level of concepts in formal concept analysis. In ICFCA, 2012, pp 28-44.

[24] Ausubel D., Novak, J., Hasian, H. Educational Psychology: a cognitive view - Rinehart Winston, New York, 1978.
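
As a supplementary illustration of the distinctiveness and homogeneity approaches outlined in Section 3, the following minimal sketch scores categories in a tiny in-memory graph. It is not the implementation reported in [1], which runs SPARQL queries over MusicPinta. It assumes the textbook reading of cue validity, i.e. the cue validity of an attribute for a category is P(category | attribute) and the category cue validity is the sum over the category's attributes, and it assumes mean pairwise Jaccard similarity as a stand-in for homogeneity. The data in `TOY_GRAPH` and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (not taken from MusicPinta):
# category -> member entity -> set of attributes.
TOY_GRAPH = {
    "Guitar": {
        "Acoustic guitar": {"strings", "fretboard", "plucked"},
        "Electric guitar": {"strings", "fretboard", "plucked", "amplified"},
    },
    "Violin": {
        "Fiddle": {"strings", "bowed"},
        "Viola": {"strings", "bowed"},
    },
    "Drum": {
        "Snare drum": {"membrane", "struck"},
        "Bass drum": {"membrane", "struck", "pedal"},
    },
}


def category_cue_validity(graph):
    """Distinctiveness: sum over a category's attributes of P(category | attribute)."""
    # Number of entities in the whole graph that carry each attribute.
    total = Counter(a for members in graph.values()
                    for attrs in members.values() for a in attrs)
    scores = {}
    for category, members in graph.items():
        # Number of members of this category that carry each attribute.
        local = Counter(a for attrs in members.values() for a in attrs)
        scores[category] = sum(local[a] / total[a] for a in local)
    return scores


def jaccard(x, y):
    """Jaccard similarity of two sets; two empty sets are treated as dissimilar."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def homogeneity(graph):
    """Assumed homogeneity measure: mean pairwise Jaccard similarity of members."""
    scores = {}
    for category, members in graph.items():
        pairs = list(combinations(members.values(), 2))
        # A category with a single member has no pair to compare.
        scores[category] = (sum(jaccard(x, y) for x, y in pairs) / len(pairs)
                            if pairs else 0.0)
    return scores


distinctiveness = category_cue_validity(TOY_GRAPH)
similarity = homogeneity(TOY_GRAPH)
```

In this toy graph the attribute "strings" is shared between two categories, so it contributes less to the distinctiveness of either one, while attributes that occur in only one category contribute fully. The two scores can disagree, which is one way to read the remark in Section 4 that each metric has its own strengths and limitations.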