Model-theoretic Characterizations of Rule-based Ontologies? Marco Console1 , Phokion G. Kolaitis2 , and Andreas Pieris3 1 Sapienza, University of Rome console@diag.uniroma1.it 2 UC Santa Cruz & IBM Research kolaitis@ucsc.edu 3 University of Edinburgh apieris@inf.ed.ac.uk Model theory is the study of the interaction between formulas in some logical formalism and their models, that is, structures that satisfy the formulas. There are two directions in this interaction, namely, from syntax to semantics and from semantics to syntax. The first direction aims to identify structural proper- ties possessed by all models of formulas having common syntactic features. For example, it is easy to show that every universal first-order sentence is preserved under substructures. The second direction aims to characterize formulas in terms of their structural properties. For example, the well known Los-Tarski Theorem asserts that if a first-order sentence is preserved under substructures, then it is logically equivalent to a universal first-order sentence. In general, establishing results in the second direction is a much harder task than establishing results in the first. In other words, obtaining model-theoretic characterizations of formu- las or of classes of formulas is a far greater challenge than identifying structural properties possessed by all models of formulas with common syntactic features. Makowsky and Vardi [9] were the first to obtain model-theoretic character- izations of classes of database dependencies expressed in suitable fragments of first-order logic. Furthermore, they classified the work on model-theoretic char- acterizations into two distinct approaches, which they called the preservation and the axiomatizability approach: Preservation Approach. One considers two logical formalisms L and L0 , where L0 is typically a proper fragment of L, and the goal is to obtain model- theoretic characterizations of the following form: a set Σ of L-formulas is equivalent to a set Σ 0 of L0 -formulas if and only if the models of Σ sat- isfy certain structural properties. For example, the Los-Tarski Theorem is a prime example in the preservation approach. Axiomatizability Approach. One considers a logical formalism L, and the goal is to obtain model-theoretic characterizations of the following form: a class C of structures is axiomatizable by a (possibly infinite) set of L- formulas if and only if the structures in C satisfy certain structural properties. Makowsky and Vardi [9] further distinguished the special case of finite axiom- atizability, where the goal is to obtain model-theoretic characterizations of when a class of structures is axiomatizable by a finite set of L-formulas. They ? This is an extended abstract of [4]. “Copyright c 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).” then obtained model-theoretic characterizations of axiomatizability and fi- nite axiomatizability of classes of database dependencies. Database dependencies were originally used to formalize integrity constraints in databases with much of the early work in this area focusing on the implication problem between database dependencies (see [6] for a survey). A prominent class of database dependencies is that of tuple-generating dependencies (tgds), that is, first-order sentences of the form ∀x̄∀ȳ (φ(x̄, ȳ) → ∃z̄ ψ(x̄, z̄)), where φ(x̄, ȳ) is a (possibly empty) conjunction of atoms, and ψ(x̄, z̄) is a (non-empty) conjunc- tion of atoms. Later on, tgds found numerous uses in other data management tasks. They have been used as schema-mapping specification languages and, as such, have been successfully deployed in the study of data exchange [1, 5] and data integration [7]. More recently, tgds have been used for modeling ontologies. In fact, tgds have been extensively studied in the context of ontologies under different names, such as Datalog+/- [2] and existential rules [10]. Several model-theoretic characterizations following the preservation approach have been established for tgds [11], as well as for description logics [8]. Concern- ing the axiomatizability/finite axiomatizability approach, model-theoretic char- acterizations of full tgds (i.e., tgds with no existentially quantified variables in the right-hand side) were obtained in [9], while such characterizations of source- to-target tgds were obtained in [3] (source-to-target tgds have been used to formalize data exchange between a source schema and a target schema). These results notwithstanding, however, the study of model-theoretic characterizations of sets of arbitrary tgds in the axiomatizability/finite axiomatizability approach has remained largely unexplored so far. Summary of Results. Motivated by the preceding state of affairs, we em- bark here on a systematic investigation of model-theoretic characterizations of classes of tgds in the finite axiomatizability approach. This investigation is car- ried out in the context of ontologies. From a syntactic point of view, an ontology is specified by a set of formulas in some formalism. From a semantic point of view, an ontology can be identified with the set of all structures (finite or infinite) that satisfy the formulas specifying the ontology. Hence, as a semantic object, an ontology is an isomorphism-closed class of (finite or infinite) structures over some fixed relational schema. Our goal is to answer the following question: what are necessary and sufficient conditions for an ontology (an isomorphism-closed class of structures) to be specified by a set of tgds? The main outcome of this investigation is to characterize the ontologies that are finitely axiomatizable by a finite set of arbitrary tgds or by a finite set of tgds that belong to one of the main subclasses of tgds, namely, full, frontier-guarded, guarded, and linear tgds. Our model-theoretic characterizations make use of structural properties en- countered in earlier model-theoretic characterizations, such as the ontology be- ing closed under direct products, and containing critical structures of every finite cardinality, where a structure is critical if each of its relations contains all pos- sible tuples from the domain of the structure. The main innovation, however, is the introduction and use of the notion of (n, m)-locality, where, intuitively, n represents the number of universal quantifiers in the tgds and m represents the number of existential quantifiers in the tgds. Our first main result asserts that an ontology O is axiomatizable by a finite set of tgds if and only if O is closed under direct products, contains critical structures of every finite cardinality, and is (n, m)-local for some non-negative integers n and m. The notion of (n, m)- locality turns out to be delicate, yet flexible. Indeed, this notion can be tailored to other classes of tgds, so that it gives rise to the refined notions of frontier- guarded (n, m)-locality, guarded (n, m)-locality, and linear (n, m)-locality. Using these refined notions, we obtain model-theoretic characterizations of ontologies that are axiomatizable by a finite set of, respectively, frontier-guarded, guarded, and linear tgds. Acknowledgments. Part of Marco Console’s work was done while he was a research associate at the University of Edinburgh. Part of Andreas Pieris’ work was done while he was visiting the University of California, Santa Cruz. Marco Console has been partially supported by MIUR under the PRIN 2017 project “HOPE” (prot. 2017MMJJRE), and by the EU under the H2020-EU.2.1.1 project TAILOR, grant id. 952215. Phokion Kolaitis has been partially supported by NSF Award No. 1814152. Andreas Pieris was supported by the EPSRC grant EP/S003800/1 “EQUID”. References 1. Arenas, M., Barceló, P., Libkin, L., Murlak, F.: Foundations of Data Exchange. Cambridge University Press (2014) 2. Calı̀, A., Gottlob, G., Lukasiewicz, T., Marnette, B., Pieris, A.: Datalog+/-: A fam- ily of logical knowledge representation and query languages for new applications. In: LICS. pp. 228–242 (2010) 3. ten Cate, B., Kolaitis, P.G.: Structural characterizations of schema-mapping lan- guages. In: ICDT. pp. 63–72 (2009) 4. Console, M., Kolaitis, P.G., Pieris, A.: Model-theoretic characterizations of rule- based ontologies. In: PODS. pp. 416–428 (2021) 5. Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: semantics and query answering. Theor. Comput. Sci. 336(1), 89–124 (2005) 6. Fagin, R., Vardi, M.Y.: The Theory of Data Dependencies - A Survey. In: Symposia in Applied Mathematics, pp. 19–71 (1986) 7. Lenzerini, M.: Data integration: A theoretical perspective. In: PODS. pp. 233–246 (2002) 8. Lutz, C., Piro, R., Wolter, F.: Description logic tboxes: Model-theoretic character- izations and rewritability. In: IJCAI. pp. 983–988 (2011) 9. Makowsky, J.A., Vardi, M.Y.: On the expressive power of data dependencies. Acta Inf. 23(3), 231–244 (1986) 10. Mugnier, M., Thomazo, M.: An introduction to ontology-based query answering with existential rules. In: Reasoning Web. pp. 245–278 (2014) 11. Zhang, H., Zhang, Y., Jiang, G.: Model-theoretic characterizations of existential rule languages. In: IJCAI (2020)