Enabling Contextualized Knowledge Description
        with Non-symbolic Integration

    Elton Soares, Raphael Thiago, Wallas Santos, Rodrigo Santos, and Marcio
                                   Moreno

          IBM Research, Brazil, Av Pasteur 146 Rio de Janeiro - RJ, Brazil
            {eltons,wallas.sousa,rodrigo.costa}@ibm.com,{mmoreno,
                              raphaelt}@br.ibm.com


       Abstract. Formal knowledge descriptions have been one of the main
       tools for enabling knowledge representation and reasoning in AI. The
       W3C Semantic Web initiative proposed a set of standards, including a
       common data interchange format, that enabled greater interoperability
       and integration between knowledge descriptions generated by different
       organizations in academia and industry. Nonetheless, these standards
       have been unable to provide contextualized integration of knowledge
       descriptions effectively, as they do not provide constructs to explicitly
       define contextualized n-dimensional (e.g. time and space, n-ary and hi-
       erarchical facts, etc.) relationships between concepts from different de-
       scriptions, leading to inefficiencies in the reasoning and query processing
       over decontextualized relationships, and inconsistencies in the modeling
       approaches utilized for dealing with this limitation. The main goal of
       this demo is to present how a hybrid knowledge representation model
       and description language, namely Hyperknowledge, allows contextual-
       ized integration of knowledge descriptions using constructs that enable
       the explicit definition of rich contextual relationships. This
       demonstration will be performed using the Knowledge Explorer System
       (KES), which supports the visualization and management of Hyperknowledge
       descriptions and shows how the expressivity of Hyperknowledge enables
       the contextualization of n-dimensional relationships between concepts
       from multiple descriptions, and between those concepts and their
       corresponding non-symbolic content.

       Keywords: Hyperknowledge; Hybrid Knowledge Representation; Knowl-
       edge Management; Knowledge Visualization; Knowledge Curation


1     Introduction

As the AI industry and research community grow, the number and size of knowl-
edge descriptions produced and shared using the W3C Semantic Web Stan-
dards 2 (WSWS) tend to grow at a similar pace.
  Copyright © 2020 for this paper by its authors. Use permitted under Creative
  Commons License Attribution 4.0 International (CC BY 4.0).
2
  https://www.w3.org/standards/semanticweb/
    Meanwhile, it has become clear that knowledge descriptions are not always
absolute, but instead are assumed to hold under certain circumstances that define
a specific context [7]. Descriptions may need to be contextualized, for example,
with regard to dimensions such as time and space, as a fact that holds in a
certain period, or a specific location of the world, might not be true in another
period or location.
    Existing alternatives based on the WSWS, such as RDF graphs/datasets 3,
the OWL Time Ontology 4, or Ontology Patterns for N-ary relations 5, can be
used to work around the expressivity limitations of the underlying conceptual
model, but each fails to provide a general approach. RDF graphs/datasets admit
multiple valid interpretations, depending on the assumptions made about graph
naming and meaning. The OWL Time Ontology addresses only one specific
contextual dimension, using the same constructs that are used to represent the
knowledge descriptions associated with it. The n-ary relation patterns propose
multiple ways of representing the same n-ary relation using binary relations,
as the underlying conceptual model is unable to represent n-ary relations as
first-class constructs.
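
    The n-ary relation workaround mentioned above can be illustrated with a
minimal Python sketch that uses plain tuples instead of an RDF library. It
shows how a single fact qualified by time and place has to be split into
several binary triples attached to an intermediate node. The helper
reify_nary and all ex: identifiers are hypothetical and serve only as an
illustration; this is one possible encoding, not a normative one.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def reify_nary(relation_id: str, relation_type: str,
               roles: Dict[str, str]) -> List[Triple]:
    """Encode one n-ary fact as binary triples hanging off an intermediate node.

    Illustrative only: the role names and identifiers are assumptions.
    """
    triples: List[Triple] = [(relation_id, "rdf:type", relation_type)]
    for role, value in roles.items():
        triples.append((relation_id, role, value))
    return triples


# Hypothetical fact: "road R1 was congested on 2020-11-02 in Rio de Janeiro".
triples = reify_nary(
    "ex:obs1",
    "ex:CongestionObservation",
    {"ex:road": "ex:R1", "ex:date": "2020-11-02", "ex:place": "ex:RioDeJaneiro"},
)

for triple in triples:
    print(triple)
```

In this encoding the contextual qualifiers (date and place) are stored with
the same kind of statement as the asserted fact, so a query engine cannot
distinguish them without knowing the chosen pattern in advance.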
    Therefore, as the WSWS does not prescribe a standard approach to
contextualizing knowledge descriptions, the context of those descriptions is
often expressed in their identifiers or using textual labels and annotations,
which is both inefficient w.r.t. reasoning and query processing and hard to
standardize across multiple organizations.
    Several theories of context have been investigated in seminal works in the
fields of AI and knowledge representation, for example, the formalization of
contexts as first-class constructs [1] and the use of a "box" metaphor to
represent contexts [7]. Both of these theories inspired the approach presented
in this work, which addresses not only the problem of contextualized knowledge
description integration, but also the integration of those descriptions with
non-symbolic content (i.e., multimodal content such as machine learning (ML)
models, images, video, text, and audio) in a given context.
    The integration of knowledge and non-symbolic content has been deeply
influenced by the semantic gap [8] between what the content means and its
knowledge representation, which remained an open issue until the recent
proposal of a hybrid conceptual model, namely Hyperknowledge [3], capable of
representing relationships between conceptual descriptions and non-symbolic
content.
    Previous solutions from the non-symbolic AI community, such as metadata
standards and models, attempted to bridge this gap by allowing the specification
of predefined fields to describe low-level aspects of non-symbolic content (e.g.
MPEG-7 6 and Dublin Core 7 ). Meanwhile, solutions from the symbolic AI

3
  https://www.w3.org/TR/rdf11-datasets
4
  https://www.w3.org/TR/owl-time/
5
  https://www.w3.org/2001/sw/BestPractices/OEP/n-aryRelations-20060323
6
  https://mpeg.chiariglione.org/standards/mpeg-7
7
  https://dublincore.org/specifications/dublin-core/
community, such as the WSWS, focused on formally describing abstract concepts
and semantic relationships between them (e.g., RDF 8 , OWL 9 ).
    Hyperknowledge fills the gap left by previous solutions by providing first-
class constructs that can be used to explicitly describe relationships between
concepts and n-dimensional fragments of non-symbolic content [5]. It also pro-
vides constructs that enable the contextualization of those relationships, there-
fore enabling the description of rich relationships involving concepts from mul-
tiple ontologies in a given context.


2     The Hyperknowledge Conceptual Model

The Hyperknowledge conceptual model is composed of three main groups of
entities: terminal nodes, composite nodes, and links [2].
    A terminal node is composed of a collection of information units. The exact
notion of what constitutes an information unit is part of the node definition
and depends on its specialization. A context node is a composite node (or a
set) that may contain links, terminal nodes, and composite nodes. Links define
relationships among nodes’ anchors. There are different types of links such as
SPO (subject-predicate-object) links, causal links, and constraint links.
    Every node, either composite or terminal, can be connected with other nodes
through anchors, which represent portions or fragments of a node. Anchors can
be used to represent n-dimensional fragments of a node (e.g., time and space
dimensions), enabling the creation of links between either concept or content
nodes within those dimensions. Also, as every entity in the Hyperknowledge
model can be contained in a composite node, it is possible to create links that
represent the relationships between nodes in a given context.
    Therefore, the n-dimensional anchors and the context nodes can be used
to explicitly contextualize n-dimensional relationships between concepts from
multiple knowledge descriptions and corresponding non-symbolic content, which
are represented as content nodes.
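
    The entities described in this section can be sketched as a small set of
Python data classes. This is a minimal illustration under our own naming
assumptions: the class names (Anchor, Node, Link, Context), their fields, and
the "depicts" connector are hypothetical and do not reproduce the actual
Hyperknowledge specification or the HKBase API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Anchor:
    """A fragment of a node, delimited along named dimensions."""
    key: str
    # e.g. {"time": (0.0, 5.0)} or {"x": (10, 50), "y": (20, 80)}
    dimensions: Dict[str, Tuple[float, float]] = field(default_factory=dict)


@dataclass
class Node:
    """A terminal node (concept or content) that exposes anchors."""
    id: str
    kind: str  # "concept" or "content" (labels assumed for this sketch)
    anchors: Dict[str, Anchor] = field(default_factory=dict)

    def add_anchor(self, anchor: Anchor) -> "Node":
        self.anchors[anchor.key] = anchor
        return self


@dataclass
class Link:
    """Relates anchors of nodes; binds maps a role to (node id, anchor key)."""
    connector: str
    binds: Dict[str, Tuple[str, str]]


@dataclass
class Context:
    """A composite node that may contain nodes, links, and other contexts."""
    id: str
    members: List[Union[Node, Link, "Context"]] = field(default_factory=list)

    def add(self, *entities: Union[Node, Link, "Context"]) -> "Context":
        self.members.extend(entities)
        return self


# A spatial fragment of an image is linked to a concept inside one context.
image = Node("img_001", "content").add_anchor(
    Anchor("region1", {"x": (10, 50), "y": (20, 80)}))
car = Node("Car", "concept").add_anchor(Anchor("whole"))
depicts = Link("depicts", {"subject": ("img_001", "region1"),
                           "object": ("Car", "whole")})
scene = Context("traffic_scene").add(image, car, depicts)
print(len(scene.members))
```

Because the link lives inside the context rather than at the top level, the
statement that the region depicts a car holds only within traffic_scene, which
is the kind of explicit contextualization the model aims to provide.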


3     Knowledge Description and Non-Symbolic Integration

To enable the integration of knowledge descriptions and non-symbolic content
using Hyperknowledge, we devised an architecture that enables both human
users (e.g., knowledge engineers, domain experts, and developers) and other
systems (e.g., AI assistants and tools) to interact with the same Hyperknowledge
Base (HKBase).
   An overview of this architecture is depicted in Figure 1, where we highlight
the process of ingestion of knowledge descriptions, expressed using WSWS (e.g.,
8
    https://www.w3.org/RDF/
9
    https://www.w3.org/OWL/
RDF and OWL), and non-symbolic content (e.g., ML models and images). Hu-
man users can perform this process using KES while other systems can com-
municate directly with the HKBase using a RESTful API [4]. During the in-
gestion of the knowledge descriptions, HKBase converts those descriptions to
the Hyperknowledge representation and stores the original descriptions and the
corresponding Hyperknowledge descriptions using one of its storage options.
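
    The conversion step can be pictured with the following minimal sketch,
which maps each ingested triple to an SPO link and places it, together with
its endpoint nodes, inside a context named after the source description. The
function triples_to_hk and the dictionary layout (keys such as "type",
"parent", and "binds") are assumptions made for illustration; they are not the
actual HKBase conversion routine or serialization format.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def triples_to_hk(triples: List[Triple], context_id: str) -> List[Dict]:
    """Map triples to SPO links and make sure their endpoints exist as nodes.

    Every produced entity records the context it belongs to in "parent".
    """
    entities: List[Dict] = [{"id": context_id, "type": "context"}]
    seen = set()
    for subject, predicate, obj in triples:
        for node_id in (subject, obj):
            if node_id not in seen:
                seen.add(node_id)
                entities.append(
                    {"id": node_id, "type": "node", "parent": context_id})
        entities.append({
            "type": "link",
            "connector": predicate,
            "parent": context_id,
            "binds": {"subject": subject, "object": obj},
        })
    return entities


# Hypothetical fragment of a transportation ontology.
source_triples = [
    ("ex:Car", "rdfs:subClassOf", "ex:Vehicle"),
    ("ex:Bus", "rdfs:subClassOf", "ex:Vehicle"),
]
entities = triples_to_hk(source_triples, "its_ontology")
for entity in entities:
    print(entity)
```

In this sketch, keeping the source description as the parent context is what
allows concepts from several ingested ontologies to be linked later without
losing track of where each statement came from.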
    For this demonstration, the main storage option used for the Hyperknowl-
edge descriptions is a triple store as it enables efficient ingestion and querying
of knowledge descriptions based on WSWS, which makes it easier to integrate
applications that were originally implemented using WSWS. As illustrated by
the dashed arrows and boxes in Figure 1, the Hyperknowledge descriptions could
be stored and queried using a document or graph database as well.
    For the ingestion of non-symbolic content, we currently use object
storage, but future implementations will use storage options optimized
for each type of content.
   Our main object storage option is IBM Cloud Object Storage 10, but our
system can be used with any AWS S3 11 compatible object storage. Similarly, we
currently support all Hyperknowledge features using either MongoDB 12
or JanusGraph 13, but the extended RDF compatibility and SPARQL support
are optimized when using a triple store, which in the current implementation is
Apache Jena 14.
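
    The separation between description storage and object storage can be
sketched as follows. The interfaces below (DescriptionStore, InMemoryStore,
InMemoryObjectStore, ingest_content) are hypothetical stand-ins written for
this illustration; they do not call any real triple store, database, or
S3 client, and they are not the HKBase implementation.

```python
from typing import Dict, List, Optional, Protocol


class DescriptionStore(Protocol):
    """What a backend for Hyperknowledge descriptions is assumed to offer."""

    def put(self, entity: Dict) -> None: ...

    def find(self, **criteria: str) -> List[Dict]: ...


class InMemoryStore:
    """Toy description backend; a triple store or database would replace it."""

    def __init__(self) -> None:
        self._entities: List[Dict] = []

    def put(self, entity: Dict) -> None:
        self._entities.append(entity)

    def find(self, **criteria: str) -> List[Dict]:
        return [e for e in self._entities
                if all(e.get(k) == v for k, v in criteria.items())]


class InMemoryObjectStore:
    """Stand-in for an S3-compatible bucket: keys map to raw bytes."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put_object(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get_object(self, key: str) -> Optional[bytes]:
        return self._objects.get(key)


def ingest_content(descriptions: DescriptionStore,
                   objects: InMemoryObjectStore,
                   node_id: str, data: bytes, context: str) -> None:
    """Store the payload as an object and a content node that points to it."""
    objects.put_object(node_id, data)
    descriptions.put({"id": node_id, "type": "content",
                      "parent": context, "object_key": node_id})


descriptions, objects = InMemoryStore(), InMemoryObjectStore()
ingest_content(descriptions, objects, "img_001.png", b"\x89PNG", "traffic_scene")
print(descriptions.find(type="content", parent="traffic_scene"))
```

Under this assumption, swapping the description backend only requires another
class with the same put and find methods, while the content payload stays in
object storage and is referenced by key.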


    [Figure 1 diagram: Datasets (knowledge descriptions and non-symbolic
    content) reach the Hyperknowledge Base through the Knowledge Explorer
    System (KES), used by knowledge engineers, domain experts, and developers,
    or directly from AI assistants and tools. Storage: Hyperknowledge
    descriptions go to a triple store, document database, or graph database;
    non-symbolic content goes to object storage.]

        Fig. 1. Knowledge Description and Non-Symbolic Integration Architecture.


10
   https://www.ibm.com/cloud/object-storage
11
   https://docs.aws.amazon.com/AmazonS3
12
   https://www.mongodb.com/
13
   https://janusgraph.org/
14
   https://jena.apache.org/
4    Demo Script
The main goal of this demo is to show the Hyperknowledge features that support
the contextualized integration of multiple knowledge descriptions and
non-symbolic content through KES. To this end, we defined a dataset containing
traffic simulation images extracted from a visual perception benchmark [6],
concepts extracted from these images, and several ontologies used in
Intelligent Transportation Systems, to show how KES can support the
contextualized specification of rich relationships between concepts from
multiple knowledge descriptions for the Smart Mobility domain. The demo shows
the collaborative capabilities of the system by illustrating different users
having a synchronized view of the same knowledge description while also being
able to explore it individually, by expanding nodes of interest or visualizing
the results of queries. The images are used to demonstrate KES's capability
for contextualized integration of knowledge descriptions with non-symbolic
content, and the ontologies illustrate its support for contextualized
knowledge description integration. Finally, queries performed using KES
demonstrate its capability to answer contextualized spatial queries over
non-symbolic content and knowledge descriptions.
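
A contextualized spatial query of the kind mentioned above can be sketched in
a few lines of Python. The example assumes that each concept is linked to a
rectangular anchor of an image within a named context; the RegionLink class,
the concepts_in_area function, and all identifiers and coordinates are
hypothetical and do not reflect the query language or API used by KES.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), pixels


@dataclass(frozen=True)
class RegionLink:
    """A link from a concept to a rectangular anchor of an image, in a context."""
    context: str
    concept: str
    image: str
    region: Box


def intersects(a: Box, b: Box) -> bool:
    """True when two axis-aligned boxes share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def concepts_in_area(links: List[RegionLink], context: str,
                     image: str, area: Box) -> List[str]:
    """Concepts whose anchors overlap area in image, restricted to context."""
    return sorted({link.concept for link in links
                   if link.context == context
                   and link.image == image
                   and intersects(link.region, area)})


links = [
    RegionLink("scene_A", "Car", "img_001", (100, 200, 300, 350)),
    RegionLink("scene_A", "TrafficLight", "img_001", (500, 50, 540, 150)),
    RegionLink("scene_B", "Pedestrian", "img_001", (120, 220, 160, 340)),
]

result = concepts_in_area(links, "scene_A", "img_001", (0, 150, 400, 400))
print(result)
```

Here the pedestrian region overlaps the queried area but belongs to a
different context, so it is excluded: the context acts as a filter on top of
the spatial condition.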
Demo video 1: Collaborative ingestion and contextualized integration of traffic
knowledge descriptions and non-symbolic content.
https://ibm.box.com/v/iswc2020-kes-demo1-video1
Demo video 2: Individualized exploration of the knowledge descriptions.
https://ibm.box.com/v/iswc2020-kes-demo1-video2
Demo video 3: Visualization of contextualized spatial queries over traffic im-
ages and knowledge descriptions.
https://ibm.box.com/v/iswc2020-kes-demo1-video3

References
1. McCarthy, J.: Notes on formalizing context. In: IJCAI (1993)
2. Moreno, M., Brandão, R., Santos, R., Sousa, W., Cerqueira, R.: Representing ex-
   perts’ interpretive trails with hyperknowledge specifications. In: Proceedings of the
   5th International Conference on Frontiers of Educational Technologies (2019)
3. Moreno, M.F., Brandao, R., Cerqueira, R.: Extending hypermedia conceptual mod-
   els to support hyperknowledge specifications. International Journal of Semantic
   Computing 11(01) (2017)
4. Moreno, M.F., Santos, R.C., dos Santos, W.H., Cerqueira, R.: KES: The
   knowledge explorer system. In: International Semantic Web Conference
   (P&D/Industry/BlueSky) (2018)
5. Moreno, M.F., Santos, R.C.M., Santos, W.H.S.d., Fiorini, S.R., Silva, R.M.d.G.:
   Multimedia search and temporal reasoning. arXiv preprint arXiv:1911.08225 (2019)
6. Richter, S.R., Hayder, Z., Koltun, V.: Playing for benchmarks. In: IEEE Interna-
   tional Conference on Computer Vision, ICCV 2017. pp. 2232–2241 (2017)
7. Serafini, L., Homola, M.: Contextualized knowledge repositories for the semantic
   web. Journal of Web Semantics 12 (2012)
8. Smeulders, A.W., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image
   retrieval at the end of the early years. IEEE Transactions on pattern analysis and
   machine intelligence 22(12) (2000)