Enabling Contextualized Knowledge Description with Non-symbolic Integration

Elton Soares, Raphael Thiago, Wallas Santos, Rodrigo Santos, and Marcio Moreno

IBM Research, Brazil, Av Pasteur 146, Rio de Janeiro - RJ, Brazil
{eltons,wallas.sousa,rodrigo.costa}@ibm.com, {mmoreno,raphaelt}@br.ibm.com

Abstract. Formal knowledge descriptions have been one of the main tools for enabling knowledge representation and reasoning in AI. The W3C Semantic Web initiative proposed a set of standards, including a common data interchange format, that enabled greater interoperability and integration between knowledge descriptions generated by different organizations in academia and industry. Nonetheless, these standards have been unable to provide contextualized integration of knowledge descriptions effectively, as they do not provide constructs to explicitly define contextualized n-dimensional relationships (e.g., time and space, n-ary and hierarchical facts) between concepts from different descriptions. This leads to inefficiencies in reasoning and query processing over decontextualized relationships, and to inconsistencies in the modeling approaches used to deal with this limitation. The main goal of this demo is to present how a hybrid knowledge representation model and description language, namely Hyperknowledge, allows contextualized integration of knowledge descriptions using constructs that enable the explicit definition of rich contextual relationships. This demonstration will be performed using the Knowledge Explorer System (KES), which supports visualization and management of the effective contextualization of Hyperknowledge expressivity w.r.t. n-dimensional relationships between concepts from multiple descriptions, and between those concepts and their corresponding non-symbolic content.
Keywords: Hyperknowledge; Hybrid Knowledge Representation; Knowledge Management; Knowledge Visualization; Knowledge Curation

1 Introduction

As the AI industry and research community grow, the number and size of knowledge descriptions produced and shared using the W3C Semantic Web Standards² (WSWS) tend to grow at a similar pace.

Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
² https://www.w3.org/standards/semanticweb/

Meanwhile, it has become clear that knowledge descriptions are not always absolute, but instead are assumed to hold under certain circumstances that define a specific context [7]. Descriptions may need to be contextualized, for example, with regard to dimensions such as time and space, as a fact that holds in a certain period, or in a specific location of the world, might not be true in another period or location.

Existing alternatives based on the WSWS, such as RDF graphs/datasets³, the OWL Time Ontology⁴, or Ontology Patterns for N-ary relations⁵, can be used to work around the expressivity limitations of the underlying conceptual model, but none of them provides a general approach. Respectively: RDF graphs/datasets admit multiple valid interpretations depending on the assumptions made about graph naming and meaning; the OWL Time Ontology addresses only one specific contextual dimension, using the same constructs that are used to represent the knowledge descriptions associated with it; and the ontology patterns propose multiple ways of representing the same n-ary relations using binary relations, as the underlying conceptual model is unable to represent them as first-class constructs.
Therefore, as the WSWS does not prescribe a standard approach to contextualize knowledge descriptions, the context of those descriptions is often expressed in their identifiers or using textual labels and annotations, which is both inefficient w.r.t. reasoning and query processing and hard to standardize across multiple organizations.

Several theories of context have been investigated in seminal works in the fields of AI and knowledge representation, for example, the formalization of contexts as first-class constructs [1] and the use of a "box" metaphor to properly represent contexts [7]. Both of these theories served as inspiration for the approach presented in this work, which addresses not only the problem of contextualized knowledge description integration, but also its integration with non-symbolic content (i.e., multimodal content such as machine learning (ML) models, images, video, text, and audio) in a given context.

The integration of knowledge and non-symbolic content has been deeply influenced by the semantic gap [8] between what the content means and its knowledge representation. This remained an open issue until the recent proposal of a hybrid conceptual model, namely Hyperknowledge [3], capable of representing relationships between conceptual descriptions and non-symbolic content. Previous solutions from the non-symbolic AI community, such as metadata standards and models, attempted to bridge this gap by allowing the specification of predefined fields to describe low-level aspects of non-symbolic content (e.g., MPEG-7⁶ and Dublin Core⁷).
Meanwhile, solutions from the symbolic AI community, such as the WSWS, focused on formally describing abstract concepts and semantic relationships between them (e.g., RDF⁸, OWL⁹).

³ https://www.w3.org/TR/rdf11-datasets
⁴ https://www.w3.org/TR/owl-time/
⁵ https://www.w3.org/2001/sw/BestPractices/OEP/n-aryRelations-20060323
⁶ https://mpeg.chiariglione.org/standards/mpeg-7
⁷ https://dublincore.org/specifications/dublin-core/

Hyperknowledge fills the gap left by previous solutions by providing first-class constructs that can be used to explicitly describe relationships between concepts and n-dimensional fragments of non-symbolic content [5]. It also provides constructs that enable the contextualization of those relationships, therefore enabling the description of rich relationships involving concepts from multiple ontologies in a given context.

2 The Hyperknowledge Conceptual Model

The Hyperknowledge conceptual model is composed of three main groups of entities: terminal nodes, composite nodes, and links [2]. A terminal node is composed of a collection of information units. The exact notion of what constitutes an information unit is part of the node definition and depends on its specialization. A context node is a composite node (or a set) that may contain links, terminal nodes, and composite nodes. Links define relationships among nodes' anchors. There are different types of links, such as SPO (subject-predicate-object) links, causal links, and constraint links.

Every node, either composite or terminal, can be connected with other nodes through anchors, which represent portions or fragments of a node. Anchors can be used to represent n-dimensional fragments of a node (e.g., along time and space dimensions), enabling the creation of links between either concept or content nodes within those dimensions.
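The entities described so far can be illustrated with a minimal sketch. All class names, field names, and identifiers below (Anchor, TerminalNode, Link, ContextNode, links_about, "context:RushHour", and so on) are hypothetical and chosen only to mirror the terms used in the text; they are not the actual Hyperknowledge implementation or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

# Hypothetical names throughout: this mirrors the vocabulary of the text,
# not the real Hyperknowledge code base.


@dataclass
class Anchor:
    """A fragment of a node, delimited along named dimensions."""
    node_id: str
    # e.g. {"x": (120, 260), "y": (40, 180)}; empty means the whole node
    dimensions: Dict[str, Tuple[float, float]] = field(default_factory=dict)


@dataclass
class TerminalNode:
    """A node made of information units (a concept or a piece of content)."""
    node_id: str
    information_units: List[str] = field(default_factory=list)


@dataclass
class Link:
    """A typed relationship between two anchors, e.g. an SPO link."""
    link_type: str  # "spo", "causal", "constraint", ...
    predicate: str
    source: Anchor
    target: Anchor


@dataclass
class ContextNode:
    """A composite node that may contain nodes and links."""
    node_id: str
    nodes: List[Union[TerminalNode, "ContextNode"]] = field(default_factory=list)
    links: List[Link] = field(default_factory=list)

    def links_about(self, node_id: str) -> List[Link]:
        """Return the links in this context whose anchors touch node_id."""
        return [
            link for link in self.links
            if node_id in (link.source.node_id, link.target.node_id)
        ]


# A concept node, a content node, and a spatial anchor on the content node.
car = TerminalNode("concept:Car", ["Car"])
frame = TerminalNode("content:frame_001.png", ["frame_001.png"])
region = Anchor(frame.node_id, {"x": (120, 260), "y": (40, 180)})

# The link holds only inside the enclosing context node.
depicts = Link("spo", "depicts", region, Anchor(car.node_id))
rush_hour = ContextNode("context:RushHour", [car, frame], [depicts])
```

In this sketch, the relationship between the image region and the concept exists only as a member of the context node, which is one simple way to read the idea of a link that is valid in a given context.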
Also, as every entity in the Hyperknowledge model can be contained in a composite node, it is possible to create links that represent the relationships between nodes in a given context. Therefore, the n-dimensional anchors and the context nodes can be used to explicitly contextualize n-dimensional relationships between concepts from multiple knowledge descriptions and corresponding non-symbolic content, which are represented as content nodes.

3 Knowledge Description and Non-Symbolic Integration

To enable the integration of knowledge descriptions and non-symbolic content using Hyperknowledge, we devised an architecture that enables both human users (e.g., knowledge engineers, domain experts, and developers) and other systems (e.g., AI assistants and tools) to interact with the same Hyperknowledge Base (HKBase).

An overview of this architecture is depicted in Figure 1, where we highlight the process of ingestion of knowledge descriptions, expressed using WSWS (e.g., RDF and OWL), and non-symbolic content (e.g., ML models and images). Human users can perform this process using KES, while other systems can communicate directly with the HKBase using a RESTful API [4]. During the ingestion of the knowledge descriptions, HKBase converts those descriptions to the Hyperknowledge representation and stores both the original descriptions and the corresponding Hyperknowledge descriptions using one of its storage options.

⁸ https://www.w3.org/RDF/
⁹ https://www.w3.org/OWL/

For this demonstration, the main storage option used for the Hyperknowledge descriptions is a triple store, as it enables efficient ingestion and querying of knowledge descriptions based on WSWS, which makes it easier to integrate applications that were originally implemented using WSWS. As illustrated by the dashed arrows and boxes in Figure 1, the Hyperknowledge descriptions could be stored and queried using a document or graph database as well.
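The ingestion step described above (convert a WSWS description to a Hyperknowledge representation and keep the original alongside it) can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how HKBase performs the conversion, so the mapping used here (each triple becomes one SPO link inside a single context, and subjects and objects become nodes) and all names (SpoLink, Context, IngestionRecord, ingest, "ctx:its-ontology") are hypothetical.

```python
from typing import List, NamedTuple, Set, Tuple

# Hypothetical sketch: the real HKBase conversion is not described in the
# text. A triple is modeled as a plain (subject, predicate, object) tuple.
Triple = Tuple[str, str, str]


class SpoLink(NamedTuple):
    subject: str
    predicate: str
    obj: str


class Context(NamedTuple):
    name: str
    nodes: Set[str]
    links: List[SpoLink]


class IngestionRecord(NamedTuple):
    original: List[Triple]   # the description as it was received
    hyperknowledge: Context  # its converted counterpart


def ingest(triples: List[Triple], context_name: str) -> IngestionRecord:
    """Convert triples into one context and keep the original input."""
    nodes: Set[str] = set()
    links: List[SpoLink] = []
    for subject, predicate, obj in triples:
        nodes.update((subject, obj))            # subjects/objects become nodes
        links.append(SpoLink(subject, predicate, obj))
    return IngestionRecord(list(triples), Context(context_name, nodes, links))


record = ingest(
    [
        ("ex:Car", "rdfs:subClassOf", "ex:Vehicle"),
        ("ex:Bus", "rdfs:subClassOf", "ex:Vehicle"),
    ],
    context_name="ctx:its-ontology",
)
```

Keeping the original triples next to the converted context reflects the statement that both forms are stored; which storage backend holds each form is left out of the sketch.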
For the ingestion of non-symbolic content, we currently make use of object storage, but future implementations will make use of optimized storage options for each type of content. Our main object storage option is IBM Cloud Object Storage¹⁰, but our system can be used with any AWS S3¹¹ compatible object storage. Likewise, all Hyperknowledge features are currently supported using either MongoDB¹² or JanusGraph¹³, but the extended RDF compatibility and SPARQL support are optimized when using a triple store, which in the current implementation is Apache Jena¹⁴.

Fig. 1. Knowledge Description and Non-Symbolic Integration Architecture. [Diagram labels: Knowledge Engineers, Domain Experts & Developers; AI Assistants & Tools; Knowledge Explorer System (KES); Hyperknowledge Base; Knowledge Descriptions; Non-Symbolic Content; Hyperknowledge Descriptions; Datasets; Storage: Triple Store, Document Database, Graph Database, Object Storage.]

¹⁰ https://www.ibm.com/cloud/object-storage
¹¹ https://docs.aws.amazon.com/AmazonS3
¹² https://www.mongodb.com/
¹³ https://janusgraph.org/
¹⁴ https://jena.apache.org/

4 Demo Script

The main goal of this demo is to show Hyperknowledge features that support the contextualized integration of multiple knowledge descriptions and non-symbolic content through KES. To this end, we defined a dataset containing traffic simulation images extracted from a visual perception benchmark [6], concepts extracted from these images, and several ontologies used in Intelligent Transportation Systems, to show how KES can support the contextualized specification of rich relationships between concepts from multiple knowledge descriptions for the Smart Mobility domain. The demo shows the collaborative capabilities of the system by illustrating different users having a synchronized view of the same knowledge description, while also being able to explore it individually, by expanding nodes of interest or visualizing the result of queries.
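As a rough illustration of the kind of query whose results such a demo could visualize, the sketch below filters concept-to-image relationships by context and by a spatial region of the image. It does not use KES or the HKBase API; the data, the bounding-box representation, and every name (Annotation, overlaps, spatial_query, "ctx:GroundTruth", "its:Car", and so on) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical sketch, unrelated to the actual KES query interface.
# A box is (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Annotation:
    context: str  # context in which the relationship holds
    concept: str  # concept linked to the image region
    image: str    # content node the region belongs to
    box: Box      # spatial fragment of the image


def overlaps(a: Box, b: Box) -> bool:
    """True when the two axis-aligned boxes share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def spatial_query(
    annotations: List[Annotation], context: str, image: str, region: Box
) -> List[str]:
    """Concepts linked, in the given context, to regions overlapping `region`."""
    return sorted(
        {
            a.concept
            for a in annotations
            if a.context == context and a.image == image and overlaps(a.box, region)
        }
    )


annotations = [
    Annotation("ctx:GroundTruth", "its:Car", "frame_001.png", (100, 200, 300, 320)),
    Annotation("ctx:GroundTruth", "its:TrafficLight", "frame_001.png", (500, 20, 540, 110)),
    Annotation("ctx:DetectorOutput", "its:Pedestrian", "frame_001.png", (120, 210, 160, 300)),
]

left_part = (0, 0, 400, 400)
found = spatial_query(annotations, "ctx:GroundTruth", "frame_001.png", left_part)
```

With this data, only the car annotation is both in the requested context and inside the queried region; the pedestrian overlaps the region but belongs to a different context, so it is filtered out.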
The images are used to demonstrate KES's capability for contextualized integration of knowledge descriptions with non-symbolic content. The ontologies illustrate the support for contextualized knowledge description integration. Finally, queries performed using KES will demonstrate its capability to answer contextualized spatial queries over non-symbolic content and knowledge descriptions.

Demo video 1: Collaborative ingestion and contextualized integration of traffic knowledge descriptions and non-symbolic content.
https://ibm.box.com/v/iswc2020-kes-demo1-video1

Demo video 2: Individualized exploration of the knowledge descriptions.
https://ibm.box.com/v/iswc2020-kes-demo1-video2

Demo video 3: Visualization of contextualized spatial queries over traffic images and knowledge descriptions.
https://ibm.box.com/v/iswc2020-kes-demo1-video3

References

1. McCarthy, J.: Notes on formalizing context. In: IJCAI (1993)
2. Moreno, M., Brandão, R., Santos, R., Sousa, W., Cerqueira, R.: Representing experts' interpretive trails with hyperknowledge specifications. In: Proceedings of the 5th International Conference on Frontiers of Educational Technologies (2019)
3. Moreno, M.F., Brandao, R., Cerqueira, R.: Extending hypermedia conceptual models to support hyperknowledge specifications. International Journal of Semantic Computing 11(01) (2017)
4. Moreno, M.F., Santos, R.C., dos Santos, W.H., Cerqueira, R.: KES: The knowledge explorer system. In: International Semantic Web Conference (P&D/Industry/BlueSky) (2018)
5. Moreno, M.F., Santos, R.C.M., Santos, W.H.S.d., Fiorini, S.R., Silva, R.M.d.G.: Multimedia search and temporal reasoning. arXiv preprint arXiv:1911.08225 (2019)
6. Richter, S.R., Hayder, Z., Koltun, V.: Playing for benchmarks. In: IEEE International Conference on Computer Vision, ICCV 2017. pp. 2232–2241 (2017)
7. Serafini, L., Homola, M.: Contextualized knowledge repositories for the semantic web. Journal of Web Semantics 12 (2012)
8. Smeulders, A.W., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(12) (2000)