<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Model-theoretic Characterizations of Rule-based Ontologies?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Marco Console</string-name>
          <email>console@diag.uniroma1.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Phokion G. Kolaitis</string-name>
          <email>kolaitis@ucsc.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Pieris</string-name>
          <email>apieris@inf.ed.ac.uk</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Sapienza, University of Rome</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>UC Santa Cruz &amp; IBM Research</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Edinburgh</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>Model theory is the study of the interaction between formulas in some logical formalism and their models, that is, structures that satisfy the formulas. There are two directions in this interaction, namely, from syntax to semantics and from semantics to syntax. The rst direction aims to identify structural properties possessed by all models of formulas having common syntactic features. For example, it is easy to show that every universal rst-order sentence is preserved under substructures. The second direction aims to characterize formulas in terms of their structural properties. For example, the well known Los-Tarski Theorem asserts that if a rst-order sentence is preserved under substructures, then it is logically equivalent to a universal rst-order sentence. In general, establishing results in the second direction is a much harder task than establishing results in the rst. In other words, obtaining model-theoretic characterizations of formulas or of classes of formulas is a far greater challenge than identifying structural properties possessed by all models of formulas with common syntactic features. Makowsky and Vardi [9] were the rst to obtain model-theoretic characterizations of classes of database dependencies expressed in suitable fragments of rst-order logic. Furthermore, they classi ed the work on model-theoretic characterizations into two distinct approaches, which they called the preservation and the axiomatizability approach: Preservation Approach. One considers two logical formalisms L and L0, where L0 is typically a proper fragment of L, and the goal is to obtain modeltheoretic characterizations of the following form: a set of L-formulas is equivalent to a set 0 of L0-formulas if and only if the models of satisfy certain structural properties. For example, the Los-Tarski Theorem is a prime example in the preservation approach. Axiomatizability Approach. One considers a logical formalism L, and the goal is to obtain model-theoretic characterizations of the following form: a class C of structures is axiomatizable by a (possibly in nite) set of Lformulas if and only if the structures in C satisfy certain structural properties. Makowsky and Vardi [9] further distinguished the special case of nite axiomatizability, where the goal is to obtain model-theoretic characterizations of when a class of structures is axiomatizable by a nite set of L-formulas. They</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>then obtained model-theoretic characterizations of axiomatizability and
nite axiomatizability of classes of database dependencies.</p>
      <p>
        Database dependencies were originally used to formalize integrity constraints
in databases with much of the early work in this area focusing on the implication
problem between database dependencies (see [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for a survey). A prominent class
of database dependencies is that of tuple-generating dependencies (tgds), that
is, rst-order sentences of the form 8x8y ( (x; y) ! 9z (x; z)), where (x; y) is
a (possibly empty) conjunction of atoms, and (x; z) is a (non-empty)
conjunction of atoms. Later on, tgds found numerous uses in other data management
tasks. They have been used as schema-mapping speci cation languages and, as
such, have been successfully deployed in the study of data exchange [
        <xref ref-type="bibr" rid="ref1 ref5">1, 5</xref>
        ] and
data integration [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. More recently, tgds have been used for modeling ontologies.
In fact, tgds have been extensively studied in the context of ontologies under
di erent names, such as Datalog+/- [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and existential rules [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        Several model-theoretic characterizations following the preservation approach
have been established for tgds [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], as well as for description logics [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
Concerning the axiomatizability/ nite axiomatizability approach, model-theoretic
characterizations of full tgds (i.e., tgds with no existentially quanti ed variables in
the right-hand side) were obtained in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], while such characterizations of
sourceto-target tgds were obtained in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] (source-to-target tgds have been used to
formalize data exchange between a source schema and a target schema). These
results notwithstanding, however, the study of model-theoretic characterizations
of sets of arbitrary tgds in the axiomatizability/ nite axiomatizability approach
has remained largely unexplored so far.
      </p>
      <p>Summary of Results. Motivated by the preceding state of a airs, we
embark here on a systematic investigation of model-theoretic characterizations of
classes of tgds in the nite axiomatizability approach. This investigation is
carried out in the context of ontologies. From a syntactic point of view, an ontology
is speci ed by a set of formulas in some formalism. From a semantic point of
view, an ontology can be identi ed with the set of all structures ( nite or in nite)
that satisfy the formulas specifying the ontology. Hence, as a semantic object,
an ontology is an isomorphism-closed class of ( nite or in nite) structures over
some xed relational schema. Our goal is to answer the following question: what
are necessary and su cient conditions for an ontology (an isomorphism-closed
class of structures) to be speci ed by a set of tgds? The main outcome of this
investigation is to characterize the ontologies that are nitely axiomatizable by
a nite set of arbitrary tgds or by a nite set of tgds that belong to one of the
main subclasses of tgds, namely, full, frontier-guarded, guarded, and linear tgds.</p>
      <p>Our model-theoretic characterizations make use of structural properties
encountered in earlier model-theoretic characterizations, such as the ontology
being closed under direct products, and containing critical structures of every nite
cardinality, where a structure is critical if each of its relations contains all
possible tuples from the domain of the structure. The main innovation, however,
is the introduction and use of the notion of (n; m)-locality, where, intuitively, n
represents the number of universal quanti ers in the tgds and m represents the
number of existential quanti ers in the tgds. Our rst main result asserts that
an ontology O is axiomatizable by a nite set of tgds if and only if O is closed
under direct products, contains critical structures of every nite cardinality, and
is (n; m)-local for some non-negative integers n and m. The notion of (n;
m)locality turns out to be delicate, yet exible. Indeed, this notion can be tailored
to other classes of tgds, so that it gives rise to the re ned notions of
frontierguarded (n; m)-locality, guarded (n; m)-locality, and linear (n; m)-locality. Using
these re ned notions, we obtain model-theoretic characterizations of ontologies
that are axiomatizable by a nite set of, respectively, frontier-guarded, guarded,
and linear tgds.</p>
      <p>Acknowledgments. Part of Marco Console's work was done while he was a
research associate at the University of Edinburgh. Part of Andreas Pieris' work
was done while he was visiting the University of California, Santa Cruz. Marco
Console has been partially supported by MIUR under the PRIN 2017 project
\HOPE" (prot. 2017MMJJRE), and by the EU under the H2020-EU.2.1.1 project
TAILOR, grant id. 952215. Phokion Kolaitis has been partially supported by
NSF Award No. 1814152. Andreas Pieris was supported by the EPSRC grant
EP/S003800/1 \EQUID".</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Arenas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barcelo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Libkin</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Murlak</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Foundations of Data Exchange</article-title>
          . Cambridge University Press (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gottlob</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lukasiewicz</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Marnette</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pieris</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          : Datalog+/
          <article-title>-: A family of logical knowledge representation and query languages for new applications</article-title>
          . In: LICS. pp.
          <volume>228</volume>
          {
          <issue>242</issue>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. ten Cate,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Kolaitis</surname>
          </string-name>
          ,
          <string-name>
            <surname>P.G.</surname>
          </string-name>
          :
          <article-title>Structural characterizations of schema-mapping languages</article-title>
          .
          <source>In: ICDT</source>
          . pp.
          <volume>63</volume>
          {
          <issue>72</issue>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Console</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kolaitis</surname>
            ,
            <given-names>P.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pieris</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Model-theoretic characterizations of rulebased ontologies</article-title>
          .
          <source>In: PODS</source>
          . pp.
          <volume>416</volume>
          {
          <issue>428</issue>
          (
          <year>2021</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Fagin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kolaitis</surname>
            ,
            <given-names>P.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Miller</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Popa</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Data exchange: semantics and query answering</article-title>
          .
          <source>Theor. Comput. Sci</source>
          .
          <volume>336</volume>
          (
          <issue>1</issue>
          ),
          <volume>89</volume>
          {
          <fpage>124</fpage>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Fagin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vardi</surname>
          </string-name>
          , M.Y.:
          <article-title>The Theory of Data Dependencies - A Survey</article-title>
          .
          <source>In: Symposia in Applied Mathematics</source>
          , pp.
          <volume>19</volume>
          {
          <issue>71</issue>
          (
          <year>1986</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lenzerini</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Data integration: A theoretical perspective</article-title>
          .
          <source>In: PODS</source>
          . pp.
          <volume>233</volume>
          {
          <issue>246</issue>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lutz</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piro</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wolter</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Description logic tboxes: Model-theoretic characterizations and rewritability</article-title>
          .
          <source>In: IJCAI</source>
          . pp.
          <volume>983</volume>
          {
          <issue>988</issue>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Makowsky</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vardi</surname>
          </string-name>
          , M.Y.:
          <article-title>On the expressive power of data dependencies</article-title>
          .
          <source>Acta Inf</source>
          .
          <volume>23</volume>
          (
          <issue>3</issue>
          ),
          <volume>231</volume>
          {
          <fpage>244</fpage>
          (
          <year>1986</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Mugnier</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thomazo</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>An introduction to ontology-based query answering with existential rules</article-title>
          .
          <source>In: Reasoning Web</source>
          . pp.
          <volume>245</volume>
          {
          <issue>278</issue>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Zhang</surname>
            , H., Zhang,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jiang</surname>
          </string-name>
          , G.:
          <article-title>Model-theoretic characterizations of existential rule languages</article-title>
          .
          <source>In: IJCAI</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>