<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Improving the Cost of Updates in Virtual Knowledge Graphs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Romuald Esdras Wandji</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Diego Calvanese</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computing Science, Umeå Universitet</institution>
          ,
          <addr-line>Umeå</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Research Centre for Knowledge and Data, Faculty of Engineering, Free University of Bozen-Bolzano</institution>
          ,
          <addr-line>Bolzano</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Virtual Knowledge Graph (VKG) paradigm is a data integration approach used to efficiently manage the heterogeneity of richly structured data that is common inside several organizations, in interorganizational settings, and more openly on the Web. Although this paradigm continues to gain importance in both foundational and applied research, updates in VKG systems remain an open challenge that has received comparatively little attention. Yet a solution to this problem would be of great importance, as it would make VKG systems full-fledged, allowing end-users to fully manage source data through the lens of the ontology they are exposed to. This research aims to propose a comprehensive framework for instance-level updates in VKGs, addressing how updates posed over the ontology can be translated into source-level updates and, more importantly, how the side effects of propagating ontology-based updates to the underlying data source can be minimized.</p>
      </abstract>
      <kwd-group>
        <kwd>Knowledge Representation</kwd>
        <kwd>Virtual Knowledge Graph (VKG)</kwd>
        <kwd>Ontology-based Data Access</kwd>
        <kwd>View Updates</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Virtual Knowledge Graphs (VKGs) are a rapidly growing approach to integrating
heterogeneous data sources with the help of ontologies. VKG systems let users issue
high-level ontological queries, which are automatically translated into
equivalent low-level queries (like SQL in a relational setting) that the database engine at the
underlying data source can execute. In this paradigm, formerly known as Ontology-Based Data Access (OBDA),
a VKG system consists of three main components: an ontology, a set of data sources, and a
declarative mapping between the two [
        <xref ref-type="bibr" rid="ref1 ref2 ref3 ref4">1, 2, 3, 4</xref>
        ]. The ontology is a unified and abstract view of
multiple local data sources, thus allowing for more expressive data querying and improving
data integration, and is represented using a formal language. The ontology language specifically
tailored to the VKG setting is the OWL 2 QL profile (i.e., sub-language) [
        <xref ref-type="bibr" rid="ref5 ref6">5, 6</xref>
        ] of the Web Ontology
Language (OWL 2) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], standardized by the W3C. The data sources to be integrated, which are
typically relational, contain the information relevant for the domain of interest and are accessed
and managed by (possibly) different organizations. Finally, the mapping is a set of declarative
assertions expressed in the R2RML language [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] that describe how to populate the ontology
from the data retrieved from the sources. In the VKG approach, the facts generated via the
mapping from the underlying data source are kept virtual and are thus available to the user at
query time. The main reasoning service provided by VKG systems so far is query answering,
which is carried out through query rewriting and query unfolding [
        <xref ref-type="bibr" rid="ref1 ref9">1, 9</xref>
        ].
      </p>
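      <p>To make the interplay of the three components concrete, the following minimal Python sketch shows a class query being rewritten with respect to subclass axioms and then unfolded through the mapping into SQL. The source schema (staff, manager), the class names, and all helper functions are assumptions introduced for illustration only; the mapping is written as a plain dictionary rather than in R2RML, and the sketch is not meant to reflect how an actual VKG system is implemented.</p>
      <preformat><![CDATA[
```python
import sqlite3

# Hypothetical relational source (schema and data are illustrative assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff(id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE manager(id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO staff VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO manager VALUES (3, 'Cleo');
""")

# Mapping: each ontology class is defined by SQL queries over the source.
MAPPING = {
    "Employee": ["SELECT id FROM staff"],
    "Manager": ["SELECT id FROM manager"],
}

# Ontology: subclass axioms (child class -> parent class).
SUBCLASS_OF = {"Manager": "Employee"}


def rewrite(cls):
    """Query rewriting: return cls together with all of its (transitive) subclasses."""
    result = [cls]
    for child, parent in SUBCLASS_OF.items():
        if parent == cls:
            result.extend(rewrite(child))
    return result


def unfold(cls):
    """Query unfolding: replace each class by its mapping queries and union them."""
    queries = [q for c in rewrite(cls) for q in MAPPING.get(c, [])]
    return " UNION ".join(queries)


def instances(cls):
    """Answer the class query by running the unfolded SQL on the source."""
    return sorted(row[0] for row in conn.execute(unfold(cls)))
```
]]></preformat>
      <p>In this sketch, asking for the instances of Employee yields the identifiers from both tables, although no Employee fact is ever materialized: the facts stay virtual and are computed at query time.</p>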
      <p>
        Problem Statement and Contribution. VKGs provide the ability to query information
stored in data sources using Semantic Web technologies, such as Resource Description
Framework (RDF) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and OWL 2, which allows the user to leverage the open-world assumption
provided by the Semantic Web while maintaining the data in sources that traditionally operate
under the closed-world assumption. However, by taking advantage of a Knowledge Graph’s
capacity to handle incomplete information, it would be desirable to also provide support for
update operations over the source data through the lens of the ontology. Such a feature will
allow data and content owners to detach from low-level details of the underlying source
structure and organization. Unfortunately, the issue of updates in VKGs, which accounts for the
translation of a set of (complete/incomplete) insertion or deletion operations over the ontology
into equivalent operations over the data source, has received little attention so far and remains
a challenging task. A solution to this problem would be of great practical relevance since it
would allow the management of the key operations that are of interest in an information system
(i.e., queries and updates) through the lens of an ontology.
      </p>
      <p>This research aims to introduce into VKGs the notion of ontology-based update and evolution
and to study foundational and applied issues related to this extension. In particular, it would be
possible to: (a) Insert new objects into a class of the ontology and populate the corresponding
relations that are mapped to this class. (b) Add a new data property instance to an object in
a class and populate the corresponding attribute(s) that are mapped to this class. (c) Connect
two objects in two classes of the ontology through an object property instance and populate
the corresponding attributes and relations that are mapped to these classes. (d) Remove an
object, an instance of a data property, or an instance of an object property by deleting the
corresponding data from the underlying mapped relations. (e) Perform a combination of multiple
operations of the types above. Overall, this research aims at extending the capabilities of the
VKG framework from “read-only” to “write-also”, so as to dynamically manage and evolve data
through ontology-based operations.</p>
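      <p>As an illustration of operations (a) to (e), the sketch below translates ontology-level updates into SQL statements for the simplest conceivable case, in which every ontology element is mapped to exactly one table or column. The schema (person, knows), the class and property names, and the function names are hypothetical and chosen only for this example; realistic mappings are considerably more involved, and the translation is then generally neither this direct nor unique.</p>
      <preformat><![CDATA[
```python
import sqlite3

# Hypothetical source: class Person is mapped to person.id, data property
# hasName to person.name, object property knows to the table knows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person(id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE knows(src INTEGER, dst INTEGER);
""")


def insert_class_instance(obj_id):
    # (a) Insert an object into the class: populate the mapped relation.
    conn.execute("INSERT INTO person(id) VALUES (?)", (obj_id,))


def add_data_property(obj_id, name):
    # (b) Add a data property instance: populate the mapped attribute.
    conn.execute("UPDATE person SET name = ? WHERE id = ?", (name, obj_id))


def connect_objects(src, dst):
    # (c) Connect two objects through an object property instance.
    conn.execute("INSERT INTO knows(src, dst) VALUES (?, ?)", (src, dst))


def remove_object(obj_id):
    # (d) Remove an object and the property instances that mention it.
    conn.execute("DELETE FROM knows WHERE src = ? OR dst = ?", (obj_id, obj_id))
    conn.execute("DELETE FROM person WHERE id = ?", (obj_id,))


def apply_all(operations):
    # (e) Apply a combination of operations atomically: the connection used as
    # a context manager commits on success and rolls back on an exception.
    with conn:
        for operation, args in operations:
            operation(*args)
```
]]></preformat>
      <p>A combined update is passed to apply_all as a list of (function, arguments) pairs, so that either all source-level statements take effect or none does.</p>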
      <p>Importance. Enriching VKGs with update and evolution capabilities while maintaining
consistency represents an important step toward the practical usefulness of the VKG paradigm,
as it will impact how modern information systems handle data, making them more flexible
and responsive to changes. Using low-level languages like SQL to manage large and complex
data can be challenging and time-consuming, as it requires domain experts for maintenance.
However, by leveraging ontologies specified in user-friendly languages, organizations could
simplify data management, reducing reliance on domain experts, lowering operational costs,
and increasing organizational agility.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Research Questions</title>
      <p>
        We observe that in the typical context of incomplete information provided by an ontology, each
of the insertion operations and their combination may generate an inconsistency in the data
with respect to the axioms contained in the ontology. To characterize the semantics
of such a system, it therefore becomes necessary to rely on a suitably defined
inconsistency-tolerant semantics, e.g., based on the notion of repair. A second challenge in VKG systems
comes from the presence of mappings, due to the inherent ambiguity that such mappings
introduce when there is the need to propagate an ontology-based update (even one that does
not generate an inconsistency, such as a delete operation) to the source data. Indeed, a VKG
mapping is essentially a view that defines an ontology element (class or property) in terms of a
query over the data source. Hence, each update over the ontology element translates into an
update over the view that combines all queries that correspond to mappings for that ontology
element, and thus faces the view-update problem that has a long history in relational database
management [
        <xref ref-type="bibr" rid="ref11 ref12 ref13">11, 12, 13</xref>
        ].
      </p>
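      <p>The ambiguity introduced by mappings can be seen in a small example. The Python sketch below assumes a hypothetical property whose mapping joins two source relations (prof and course) on a department attribute, and enumerates by brute force all sets of source deletions that remove one given pair from the resulting view, together with the side effects (other view tuples lost) of each. The relations, the data, and the function names are illustrative assumptions, and exhaustive enumeration is used only for clarity; it is not proposed as a practical search strategy.</p>
      <preformat><![CDATA[
```python
from itertools import combinations

# Hypothetical source instance: (identifier, department) tuples.
PROF = {("p1", "cs"), ("p2", "cs")}
COURSE = {("c1", "cs"), ("c2", "cs")}


def teaches_view(prof, course):
    """Mapping of the property: join of prof and course on the department."""
    return {(p, c) for (p, d1) in prof for (c, d2) in course if d1 == d2}


def candidate_deletions(target):
    """Yield every set of source tuples whose removal deletes target from the
    view, paired with its side effects (other view tuples that disappear)."""
    facts = [("prof", t) for t in sorted(PROF)] + [("course", t) for t in sorted(COURSE)]
    before = teaches_view(PROF, COURSE)
    for size in range(1, len(facts) + 1):
        for subset in combinations(facts, size):
            prof = PROF - {t for (r, t) in subset if r == "prof"}
            course = COURSE - {t for (r, t) in subset if r == "course"}
            after = teaches_view(prof, course)
            if target not in after:
                yield subset, before - after - {target}


def minimal_realizations(target):
    """Keep the candidates with the fewest side effects and, among those,
    the fewest deleted source tuples."""
    candidates = list(candidate_deletions(target))
    fewest = min(len(effects) for _, effects in candidates)
    smallest = min(len(d) for d, e in candidates if len(e) == fewest)
    return [(d, e) for d, e in candidates if len(e) == fewest and len(d) == smallest]
```
]]></preformat>
      <p>For the pair (p1, c1), this sketch finds two equally good translations, deleting the tuple of p1 or deleting the tuple of c1, and each loses one further view tuple. The requested deletion is therefore neither free of side effects nor uniquely determined on this instance.</p>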
      <p>This scenario poses a set of challenges and research questions that we aim to investigate:</p>
      <p>RQ1 Under which conditions can update operations over a (virtual) knowledge graph in the
presence of an ontology be rewritten into update operations performed directly over
the (virtual) objects that constitute the knowledge graph (without the need to take into
account the ontology axioms)?</p>
      <p>RQ2 Under which conditions can update operations over a VKG, defined by a data source and
a set of declarative mappings, be faithfully realized through update operations over the
data source? Under which conditions is the translation uniquely determined?</p>
      <p>RQ3 How can one find the source update operation that realizes a given update operation over
a VKG? When the VKG update is not realizable, how can one find the best approximation
(within the space of all possible source updates)? How can one efficiently search over the
space of all source updates?</p>
      <p>RQ4 When an update over a VKG is either not realizable or not uniquely determined, which
additional information is it necessary to maintain in order to find an effective solution to
the view-update problem for VKG mappings?</p>
      <p>RQ5 How can update operations over a VKG be implemented effectively in a state-of-the-art
VKG system that supports query rewriting?</p>
    </sec>
    <sec id="sec-3">
      <title>Acknowledgments</title>
      <p>This work has been partially supported by the Wallenberg AI, Autonomous Systems and Software
Program (WASP) funded by the Knut and Alice Wallenberg Foundation, by the Province of
Bolzano and DFG through the project D2G2 (DFG grant n. 500249124), and by the HEU project
CyclOps (under GA n. 101135513).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          , G. De Giacomo,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <article-title>Linking data to ontologies</article-title>
          ,
          <source>J. on Data Semantics</source>
          <volume>10</volume>
          (
          <year>2008</year>
          )
          <fpage>133</fpage>
          -
          <lpage>173</lpage>
          . doi: 10.1007/978-3-540-77688-8_5.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          , G. De Giacomo,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rodriguez-Muro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <article-title>Ontologies and databases: The DL-Lite approach</article-title>
          , in:
          <string-name>
            <given-names>S.</given-names>
            <surname>Tessaris</surname>
          </string-name>
          , E. Franconi (Eds.),
          <source>Reasoning Web: Semantic Technologies for Information Systems - 5th Int. Summer School Tutorial Lectures (RW)</source>
          , volume
          <volume>5689</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2009</year>
          , pp.
          <fpage>255</fpage>
          -
          <lpage>356</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lembo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Poggi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rosati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zakharyaschev</surname>
          </string-name>
          ,
          <article-title>Ontology-based data access: A survey</article-title>
          , in:
          <source>Proc. of the 27th Int. Joint Conf. on Artificial Intelligence (IJCAI)</source>
          ,
          <publisher-name>IJCAI Org.</publisher-name>
          ,
          <year>2018</year>
          , pp.
          <fpage>5511</fpage>
          -
          <lpage>5519</lpage>
          . doi: 10.24963/ijcai.2018/777.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>G.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cogrel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <article-title>Virtual Knowledge Graphs: An overview of systems and use cases</article-title>
          ,
          <source>Data Intelligence</source>
          <volume>1</volume>
          (
          <year>2019</year>
          )
          <fpage>201</fpage>
          -
          <lpage>223</lpage>
          . doi: 10.1162/dint_a_00011.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Motik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. Cuenca</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fokoue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <article-title>OWL 2 Web Ontology Language Profiles (Second Edition)</article-title>
          ,
          <source>W3C Recommendation, World Wide Web Consortium</source>
          ,
          <year>2012</year>
          . Available at http://www.w3.org/TR/owl2-profiles/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Krötzsch</surname>
          </string-name>
          ,
          <article-title>OWL 2 profiles: An introduction to lightweight ontology languages</article-title>
          , in:
          <source>Reasoning Web: Semantic Technologies for Advanced Query Answering - 8th Int. Summer School Tutorial Lectures (RW)</source>
          , volume
          <volume>7487</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2012</year>
          , pp.
          <fpage>112</fpage>
          -
          <lpage>183</lpage>
          . doi: 10.1007/978-3-642-33158-9_4.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bao</surname>
          </string-name>
          , et al.,
          <article-title>OWL 2 Web Ontology Language Document Overview (Second Edition)</article-title>
          ,
          <source>W3C Recommendation, World Wide Web Consortium</source>
          ,
          <year>2012</year>
          . Available at http://www.w3.org/TR/owl2-overview/.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sundara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cyganiak</surname>
          </string-name>
          ,
          <article-title>R2RML: RDB to RDF Mapping Language</article-title>
          , W3C Recommendation, World Wide Web Consortium,
          <year>2012</year>
          . Available at http://www.w3.org/TR/r2rml/.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cogrel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Komla-Ebri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kontchakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rezk</surname>
          </string-name>
          , M. Rodriguez-Muro, G. Xiao,
          <article-title>Ontop: Answering SPARQL queries over relational databases</article-title>
          ,
          <source>Semantic Web J.</source>
          <volume>8</volume>
          (
          <year>2017</year>
          )
          <fpage>471</fpage>
          -
          <lpage>487</lpage>
          . doi: 10.3233/SW-160217.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J. Z.</given-names>
            <surname>Pan</surname>
          </string-name>
          , Resource Description Framework, in: Handbook on Ontologies, Springer,
          <year>2009</year>
          , pp.
          <fpage>71</fpage>
          -
          <lpage>90</lpage>
          . doi: 10.1007/978-3-540-92673-3_3.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>F.</given-names>
            <surname>Bancilhon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Spyratos</surname>
          </string-name>
          ,
          <article-title>Update semantics of relational views</article-title>
          ,
          <source>ACM Trans. on Database Systems</source>
          <volume>6</volume>
          (
          <year>1981</year>
          )
          <fpage>557</fpage>
          -
          <lpage>575</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Cosmadakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. H.</given-names>
            <surname>Papadimitriou</surname>
          </string-name>
          , Updates of relational views,
          <source>J. of the ACM</source>
          <volume>31</volume>
          (
          <year>1984</year>
          )
          <fpage>742</fpage>
          -
          <lpage>760</lpage>
          . doi: 10.1145/1634.1887.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Kakas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mancarella</surname>
          </string-name>
          ,
          <article-title>Database updates through abduction</article-title>
          , in:
          <source>Proc. of the 16th Int. Conf. on Very Large Data Bases (VLDB)</source>
          , volume
          <volume>90</volume>
          ,
          <year>1990</year>
          , pp.
          <fpage>650</fpage>
          -
          <lpage>661</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>