=Paper=
{{Paper
|id=Vol-3739/abstract-19
|storemode=property
|title=Spectra of Cardinality Queries over Description Logic Knowledge Bases (Extended Abstract)
|pdfUrl=https://ceur-ws.org/Vol-3739/abstract-19.pdf
|volume=Vol-3739
|authors=Quentin Maniรจre,Marcin Przybyลko
|dblpUrl=https://dblp.org/rec/conf/dlog/ManiereP24
}}
==Spectra of Cardinality Queries over Description Logic Knowledge Bases (Extended Abstract)==
Spectra of Cardinality Queries over Description Logic Knowledge Bases (Extended Abstract) Quentin Maniรจre1,2 , Marcin Przybyลko1 1 Department of Computer Science, Leipzig University, Germany 2 Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Dresden/Leipzig, Germany Keywords Description Logics, Counting queries, Spectrum, Complexity of reasoning 1. Introduction A recent line of research has explored ways of leveraging ontology-mediated query answering (OMQA) to support counting queries, a class of aggregate queries that allows to perform analytics on data. Several semantics for such queries have been investigated, differing on how the possibility of multiple models can be taken into account [1, 2, 3]. In this paper we adopt the semantic proposed in [2], extending [4], that defines a counting query as a conjunctive query (CQ) in which some variables have been designated as counting variables. Such a query is evaluated in every model of the knowledge base (KB) by considering every possible homomorphism of the query into the model and by then returning the number of obtained assignments for the counting variables. A certain answer of the counting query over the KB is then defined as an interval J๐, ๐ K that contains all the possible answers across all possible models, i.e. a uniform lower bound ๐ and a uniform upper bound ๐ . The complexity of deciding whether a given interval is a certain answer under this semantics is now well-understood for a variety of DLs [5, 6]. In the present paper, rather than providing uniform bounds, we aim to compute (a representation of) the set of possible answers, which is a subset (of tuples) of natural numbers with infinity, i.e. a subset of Nโ := {0, 1, 2, . . . , โ}. We call this subset the spectrum of the counting query, inspired by the notion of spectrum of a formula, that is the set of the possible cardinalities of its models [7, 8]. To do so, we first investigate the possible shapes of spectra for counting conjunctive queries (CCQs) and ontologies expressed in the ๐โ๐โโฑ. Traditional CQ answering is well-understood in this expressive DL [9, 10] that supports functionality constraints whose interactions with counting queries have never been studied to the best of our knowledge (those proposed in [5] and denoted ๐ฉ โ being more restricted). One of the challenges encountered in our work is to clarify how to represent spectra. Indeed, the set of possible answers of a CCQ across models of a KB might, a priori, be an arbitrarily complex set of natural numbers, and thus hard to describe by means other than providing the CCQ-KB couple. We aim to identify classes of ontology-mediated queries (OMQs) whose spectra admit an effective representation. By effective, we intend that (i) the representation is finite, ideally with a size that can be bounded by the size of the OMQ, (ii) independent of the specific description logic, and (iii) spectrum membership can be efficiently tested, i.e. membership can be tested in polynomial time with respect to the size of the integer and of the representation. Finding such a representation, whenever it exists, can be viewed as a precomputation allowing for further analytics. DL 2024: 37th International Workshop on Description Logics, June 18โ21, 2024, Bergen, Norway $ quentin.maniere@uni-leipzig.de (Q. Maniรจre); marcin.przybylko@uni-leipzig.de (M. Przybyลko) 0000-0001-9618-8359 (Q. Maniรจre); 0000-0003-1859-7055 (M. Przybyลko) ยฉ 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR ceur-ws.org Workshop ISSN 1613-0073 Proceedings 2. Contributions Our main contributions are the following. First, we introduce the notion of a spectrum of a CCQ and formalize the problem of computing effective representation thereof, if it exists. We show that connected and individual-free CCQs evaluated on ๐โ๐โโฑ KBs always admit effectively representable spectra, as those must be finitely generated subsets of Nโ . This further motivates a focus on cardinality queries, i.e. Boolean atomic CCQs, introduced in [11], which admit effective representation. We fully characterize the possible shapes of spectra for concept cardinality queries on ๐โ๐โโฑ KBs, in particular showing that every finitely generated subset of Nโ can be realized. We also study several sublogics of ๐โ๐โโฑ, from โฐโ and DL-Litecore , for most of which we obtain full characterizations of possible shapes of spectra. For some, only simpler shapes, such as J๐, +โK, are possible (see Table 1). For the โฐโโโฑโฅ DL, corresponding to the Horn fragment of ๐โ๐โโฑ, we notably use tailored variations of the cycle-reversion techniques introduced to tackle finite model reasoning in such DLs [12, 13, 14]. Our work also features a wealth of examples to prove whether a given shape can indeed be obtained. We further investigate the case of role cardinality queries which feature challenging shapes of spectra already for โฐโโฅ KBs. Through connections with the concept case and refinements of the corresponding constructions, we are able to show that, also in the case of role cardinality queries, the possible shapes for โฐโโโฑโฅ KBs are better-behaved than for general ๐โ๐โโฑ KBs. We conclude with bounds regarding the data complexity of computing some of those effective representations, notably relying on existing results on DLs augmented with closed predicates. The rest of this extended abstract highlights preliminaries regarding spectra, shows a full char- acterization for ๐โ๐โโฑ and concept cardinality queries, and discusses our tailored version of the cycle-reversion technique. 3. Effective representations of spectra We assume familiarity with the DL ๐โ๐โโฑ, its sublogics and semantics [15]. We assume two disjoint sets of variables and counting variables. A counting conjunctive query (CCQ) takes the form ๐(๐ฅ ยฏ) = โ๐ฆยฏ โ๐งยฏ ๐(๐ฅยฏ , ๐ฆยฏ, ๐งยฏ), where ๐ฅ ยฏ and ๐ฆยฏ are tuples of distinct variables, ๐งยฏ is a tuple of distinct counting variables and ๐ is a conjunction of concept and role atoms whose terms are drawn from ๐ฅ ยฏ โช ๐ฆยฏ โช ๐งยฏ or individual names. For a given tuple ๐ ยฏ of individuals, and a given model โ of a KB ๐ฆ, we define: #๐(๐ ยฏ)โ := #{๐|๐งยฏ | ๐ : ๐ โ โ homomorphism such that ๐(๐ฅ ยฏ) = ๐ยฏ}. The spectrum of ๐ and ๐ ยฏ over ๐ฆ โ is further defined as: Sp๐ฆ (๐(๐ ยฏ)) := {#๐(๐ ยฏ) | โ |= ๐ฆ}. While we do not know whether all spectra can be effectively represented, in the sense explained in the introduction, we identify a class of CCQs, namely connected and individual-free CCQs, whose spectra admit such a representation based on the following property: Lemma 1. If ๐ is connected and individual-free, then Sp๐ฆ (๐) is closed under addition. It is indeed well-known that every subset of Nโ closed under addition is finitely generated (see e.g. [16], Chapter 2, Proposition 4.1). In particular, for every spectrum Sp๐ฆ (๐) of a satisfiable connected and individual-free CCQ ๐, there exist a finite subset ๐ of Nโ and two numbers ๐, ๐ โ Nโ such that Sp๐ฆ (๐) = ๐ โช {๐ + ๐ ยท ๐ | ๐ โ N}. Therefore, the problem of computing Sp๐ฆ (๐) for such a pair (๐ฆ, ๐) can be properly defined as the task of computing ๐, ๐ and ๐. Notice that Sp๐ฆ (๐) = โ if and only if ๐ฆ is unsatisfiable; and, similarly, Sp๐ฆ (๐) = {0} if and only if ๐ฆ is satisfiable but ๐ is unsatisfiable with respect to ๐ฆ. Spectra non-closed under addition are easily found by dropping the above restriction to connected and individual-free CCQs, as shown by the following: Example 1. Consider the empty KB ๐ฆ and the two Boolean CCQs ๐1 := โ๐ง1 โ๐ง2 C(๐ง1 ) โง C(๐ง2 ) and ๐2 := โ๐ง1 โ๐ง2 r(a, ๐ง1 ) โง r(a, ๐ง2 ), where ๐ง1 and ๐ง2 are counting variables. It is easily verified that Sp๐ฆ (๐1 ) = Sp๐ฆ (๐2 ) = {๐2 | ๐ โ N} โช {โ}. Table 1 Possible shapes of spectra for some description logics, here ๐ โ N and ๐ is any subsemigroup of N. โ indicates no other shape is possible. โ J๐, โK โ {0} {0} โช J๐, โK {โ} {0, โ} ๐ โช {โ} ๐โ๐โโฑ โ โ โ โ โ โ โ โ โฐโโโฑโฅ ยท โ โ โ โ โ โ ยท DL-Liteโฑ โ โ โ โ โ โ โ ยท DL-Litecore , ๐โ๐โ, ๐โ๐โฑ โ โ โ โ โ ยท ยท ยท โฐโโโฑ โ โ โ ยท ยท โ ยท ยท โฐโ โ โ ยท ยท ยท ยท ยท ยท 4. Spectrum of a concept cardinality query We now present two results regarding the query ๐C := โ๐ง C(๐ง), where C is a concept name and ๐ง a counting variable. Computing theโ spectrum of ๐C over a KB ๐ฆ thus corresponds to the natural task of deciding the possible values of โCโ โ across the models โ of ๐ฆ. Naturally, ๐C satisfies preconditions of โ Lemma 1 and, thus, its spectrum is finitely generated. Conversely, one can ask which sets are spectra of concept cardinality queries. For a DL โ, we say that a set ๐ is โ-concept realizable if there is a concept ๐ถ and an โ KB ๐ฆ such that Sp๐ฆ (๐C ) = ๐ . For ๐โ๐โโฑ KBs, we have the following complete characterization. Theorem 1. A subset of Nโ is ๐โ๐โโฑ-concept realizable iff it is โ , {0}, or any subsemigroup of Nโ containing โ. The โonly-ifโ direction is essentially a consequence of Lemma 1, while the โifโ direction is a general- ization of the following example in which Sp๐ฆ (๐C ) = 2N โช {โ}. Example 2. Consider the KB ๐ฆ with the empty ABox and the following ๐โ๐โโฑ TBox enforcing that ๐ is a bijection between ๐ด and ๐ต: CโกAโB A โ B โ โฅ A โ โr.B B โ โrโ .A โค โ โค 1r.โค โค โ โค 1rโ .โค We now turn to โฐโโโฑโฅ KBs, for which we are not able to obtain a full characterization of the realizable spectrum. However, we prove that for a set to be realizable, it must have ๐ผ = 1, that is the possible shapes simplify to ๐ โช J๐, โK for some ๐ โ Nโ and ๐ โ J0, ๐K. Theorem 2. If a subset of Nโ is โฐโโโฑโฅ -concept realizable, then it has shape โ , {0}, {โ}, {0, โ}, or ๐ โช J๐, โK for some ๐ โ N and ๐ โ J0, ๐K. The key ingredient to prove the aboveโ is aโ construction of two (potentially infinite) models โ and ๐ฅ of an โฐโโโฑโฅ KB ๐ฆ = (๐ฏ , ๐) in which โC๐ฅ โ = โCโ โ + 1 < โ. To this end, we refine a cycle-reversion โ โ technique which has been developed to study finite reasoning in โฐโโโฑโฅ [14]. More precisely, we tailor the notion of cycles to characterize under which conditions the extension of concept C may be finite, and then carefully manipulate the corresponding models to produce the above โ and ๐ฅ . Definition 1 below describes the cycles of interest for our study. An inverse functional path (IFP) is a sequence ๐พ0 , r1 , ๐พ1 , . . . , r๐ , ๐พ๐ where ๐ โฅ 1, ๐พ0 , . . . , ๐พ๐ are conjunctions of concept names and r1 , . . . , r๐ are (potentially inverse) roles such that for all 0 โค ๐ < ๐ we have ๐ฏ |= ๐พ๐ โ โr๐+1 .๐พ๐+1 and ๐ฏ |= ๐พ๐+1 โ โค 1rโ ๐+1 .๐พ๐ . The interesting cycles for a concept C are the IFPs looping on themselves and forcing the presence of (at least) one instance of C โper instance of the cycleโ, as follows: Definition 1. An IFP ๐พ0 , r1 , ๐พ1 , . . . , r๐ , ๐พ๐ is a C-generating cycle if ๐ฏ |= ๐พ๐ โ ๐พ0 and there exists an IFP ๐ฟ0 , s1 , ๐ฟ1 , . . . , s๐ , ๐ฟ๐ such that ๐ฏ |= ๐ฟ๐ โ C and ๐ฏ |= ๐พ๐ โ ๐ฟ0 for some 0 โค ๐ โค ๐. Reversing those cycles now means to consider the โฐโโโฑโฅ TBox ๐ฏC obtained from ๐ฏ by adding, for each C-generating cycle ๐พ0 , r1 , ๐พ1 , . . . , r๐ , ๐พ๐ and each 0 โค ๐ < ๐, the axioms ๐พ๐+1 โ โrโ ๐ .๐พ๐ and ๐พ๐ โ โค 1r๐+1 .๐พ๐+1 . A key result towards the proof of Theorem 2 is now: Lemma 2. There is a model โ of ๐ฆ such that โCโ โ < โ if and only if the KB (๐ฏC , ๐) is satisfiable. โ โ 5. Perspectives A full characterization of spectra shapes for โฐโโโฑโฅ appears challenging as Theorem 2 suggests that those spectra may have the shapes of arbitrary numerical semigroups. Furthermore, while Theorem 1 offers a complete characterization for ๐โ๐โโฑ, how to compute the corresponding effective representa- tions remains an open question. We believe that it could also be interesting to study the impact of our results on the closely related problem of answering (Boolean atomic) queries under the bag semantics. While the semantics adopted in the present paper does not coincide with bag semantics, as discussed for example in [17, 5], con- siderations regarding the spectra and some of the corresponding techniques might be adapted to this setting. Acknowledgments The authors acknowledge the financial support by the Federal Ministry of Education and Research of Germany and by the Sรคchsische Staatsministerium fรผr Wissenschaft Kultur und Tourismus in the program Center of Excellence for AI-research โCenter for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzigโ, project identification number: ScaDS.AI. Second author was supported by the DFG project LU 1417/3-1 QTEC. References [1] D. Calvanese, E. Kharlamov, W. Nutt, C. Thorne, Aggregate queries over ontologies, in: Proceedings of the 2nd International Workshop on Ontologies and Information Systems for the Semantic Web (ONISW), 2008, pp. 97โ104. [2] M. Bienvenu, Q. Maniรจre, M. Thomazo, Answering counting queries over DL-Lite ontologies, in: Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020, pp. 1608โ1614. [3] C. Feier, C. Lutz, M. Przybylko, Answer counting under guarded TGDs, in: Proceedings of the 24th International Conference on Database Theory (ICDT), 2021, pp. 11:1โ11:22. [4] E. V. Kostylev, J. L. Reutter, Complexity of answering counting aggregate queries over DL-Lite, Journal of Web Semantics (JWS) 33 (2015) 94โ111. [5] D. Calvanese, J. Corman, D. Lanti, S. Razniewski, Counting query answers over a DL-Lite knowl- edge base, in: Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020, pp. 1658โ1666. [6] M. Bienvenu, Q. Maniรจre, M. Thomazo, Counting queries over ๐โ๐โโ ontologies, in: Proceedings of the 19th International Conference on Principles of Knowledge Representation and Reasoning (KR), 2022, pp. 53โ62. [7] R. Fagin, Generalized first-order spectra and polynomial-time recognizable sets, Complexity of computation 7 (1974) 43โ73. [8] A. Durand, R. Fagin, B. Loescher, Spectra with only unary function symbols, in: Proceedings of the 11th International Workshop on Computer Science Logic (CSL), 1997, pp. 189โ202. [9] B. Glimm, I. Horrocks, C. Lutz, U. Sattler, Conjunctive query answering for the description logic ๐ฎโโ๐ฌ, Journal of Artificial Intelligence Research (JAIR) 31 (2008) 157โ204. [10] C. Lutz, The complexity of conjunctive query answering in expressive description logics, in: Proceedings of the 4th International Joint Conference on Automated Reasoning (IJCAR), 2008, pp. 179โ193. [11] M. Bienvenu, Q. Maniรจre, M. Thomazo, Cardinality queries over DL-Lite ontologies, in: Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021, pp. 1801โ1807. [12] S. S. Cosmadakis, P. C. Kanellakis, M. Y. Vardi, Polynomial-time implication problems for unary inclusion dependencies, Journal of the ACM 37 (1990) 15โ46. [13] R. Rosati, Finite model reasoning in DL-Lite, in: Proceedings of the 5th European Semantic Web Conference (ESWC), 2008, pp. 215โ229. [14] Y. A. Ibรกรฑez-Garcรญa, C. Lutz, T. Schneider, Finite Model Reasoning in Horn Description Logics, in: Proceedings of the 14th International Conference on Principles of Knowledge Representation and Reasoning (KR), 2014, pp. 490โ509. [15] F. Baader, I. Horrocks, C. Lutz, U. Sattler, An Introduction to Description Logic, Cambridge University Press, 2017. [16] P. A. Grillet, Commutative Semigroups, Springer New York, NY, 2001. [17] C. Nikolaou, E. V. Kostylev, G. Konstantinidis, M. Kaminski, B. Cuenca Grau, I. Horrocks, Founda- tions of ontology-based data access under bag semantics, Journal of Artificial Intelligence (AIJ) (2019) 91โ132.