=Paper= {{Paper |id=Vol-3739/abstract-19 |storemode=property |title=Spectra of Cardinality Queries over Description Logic Knowledge Bases (Extended Abstract) |pdfUrl=https://ceur-ws.org/Vol-3739/abstract-19.pdf |volume=Vol-3739 |authors=Quentin Maniรจre,Marcin Przybyล‚ko |dblpUrl=https://dblp.org/rec/conf/dlog/ManiereP24 }} ==Spectra of Cardinality Queries over Description Logic Knowledge Bases (Extended Abstract)== https://ceur-ws.org/Vol-3739/abstract-19.pdf
                         Spectra of Cardinality Queries over Description Logic
                         Knowledge Bases (Extended Abstract)
                         Quentin Maniรจre1,2 , Marcin Przybyล‚ko1
                         1
                             Department of Computer Science, Leipzig University, Germany
                         2
                             Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Dresden/Leipzig, Germany

                         Keywords
                         Description Logics, Counting queries, Spectrum, Complexity of reasoning




                         1. Introduction
                         A recent line of research has explored ways of leveraging ontology-mediated query answering (OMQA)
                         to support counting queries, a class of aggregate queries that allows to perform analytics on data.
                         Several semantics for such queries have been investigated, differing on how the possibility of multiple
                         models can be taken into account [1, 2, 3].
                            In this paper we adopt the semantic proposed in [2], extending [4], that defines a counting query
                         as a conjunctive query (CQ) in which some variables have been designated as counting variables.
                         Such a query is evaluated in every model of the knowledge base (KB) by considering every possible
                         homomorphism of the query into the model and by then returning the number of obtained assignments
                         for the counting variables. A certain answer of the counting query over the KB is then defined as an
                         interval J๐‘š, ๐‘€ K that contains all the possible answers across all possible models, i.e. a uniform lower
                         bound ๐‘š and a uniform upper bound ๐‘€ . The complexity of deciding whether a given interval is a
                         certain answer under this semantics is now well-understood for a variety of DLs [5, 6].
                            In the present paper, rather than providing uniform bounds, we aim to compute (a representation of)
                         the set of possible answers, which is a subset (of tuples) of natural numbers with infinity, i.e. a subset of
                         Nโˆž := {0, 1, 2, . . . , โˆž}. We call this subset the spectrum of the counting query, inspired by the notion
                         of spectrum of a formula, that is the set of the possible cardinalities of its models [7, 8]. To do so, we
                         first investigate the possible shapes of spectra for counting conjunctive queries (CCQs) and ontologies
                         expressed in the ๐’œโ„’๐’žโ„โ„ฑ. Traditional CQ answering is well-understood in this expressive DL [9, 10]
                         that supports functionality constraints whose interactions with counting queries have never been
                         studied to the best of our knowledge (those proposed in [5] and denoted ๐’ฉ โˆ’ being more restricted).
                            One of the challenges encountered in our work is to clarify how to represent spectra. Indeed, the set
                         of possible answers of a CCQ across models of a KB might, a priori, be an arbitrarily complex set of
                         natural numbers, and thus hard to describe by means other than providing the CCQ-KB couple. We aim
                         to identify classes of ontology-mediated queries (OMQs) whose spectra admit an effective representation.
                         By effective, we intend that (i) the representation is finite, ideally with a size that can be bounded by
                         the size of the OMQ, (ii) independent of the specific description logic, and (iii) spectrum membership
                         can be efficiently tested, i.e. membership can be tested in polynomial time with respect to the size of
                         the integer and of the representation. Finding such a representation, whenever it exists, can be viewed
                         as a precomputation allowing for further analytics.




                            DL 2024: 37th International Workshop on Description Logics, June 18โ€“21, 2024, Bergen, Norway
                          $ quentin.maniere@uni-leipzig.de (Q. Maniรจre); marcin.przybylko@uni-leipzig.de (M. Przybyล‚ko)
                           0000-0001-9618-8359 (Q. Maniรจre); 0000-0003-1859-7055 (M. Przybyล‚ko)
                                        ยฉ 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
2. Contributions
Our main contributions are the following. First, we introduce the notion of a spectrum of a CCQ
and formalize the problem of computing effective representation thereof, if it exists. We show that
connected and individual-free CCQs evaluated on ๐’œโ„’๐’žโ„โ„ฑ KBs always admit effectively representable
spectra, as those must be finitely generated subsets of Nโˆž . This further motivates a focus on cardinality
queries, i.e. Boolean atomic CCQs, introduced in [11], which admit effective representation. We fully
characterize the possible shapes of spectra for concept cardinality queries on ๐’œโ„’๐’žโ„โ„ฑ KBs, in particular
showing that every finitely generated subset of Nโˆž can be realized. We also study several sublogics
of ๐’œโ„’๐’žโ„โ„ฑ, from โ„ฐโ„’ and DL-Litecore , for most of which we obtain full characterizations of possible
shapes of spectra. For some, only simpler shapes, such as J๐‘š, +โˆžK, are possible (see Table 1). For
the โ„ฐโ„’โ„โ„ฑโŠฅ DL, corresponding to the Horn fragment of ๐’œโ„’๐’žโ„โ„ฑ, we notably use tailored variations
of the cycle-reversion techniques introduced to tackle finite model reasoning in such DLs [12, 13, 14].
Our work also features a wealth of examples to prove whether a given shape can indeed be obtained.
We further investigate the case of role cardinality queries which feature challenging shapes of spectra
already for โ„ฐโ„’โŠฅ KBs. Through connections with the concept case and refinements of the corresponding
constructions, we are able to show that, also in the case of role cardinality queries, the possible shapes
for โ„ฐโ„’โ„โ„ฑโŠฅ KBs are better-behaved than for general ๐’œโ„’๐’žโ„โ„ฑ KBs. We conclude with bounds regarding
the data complexity of computing some of those effective representations, notably relying on existing
results on DLs augmented with closed predicates.
   The rest of this extended abstract highlights preliminaries regarding spectra, shows a full char-
acterization for ๐’œโ„’๐’žโ„โ„ฑ and concept cardinality queries, and discusses our tailored version of the
cycle-reversion technique.


3. Effective representations of spectra
We assume familiarity with the DL ๐’œโ„’๐’žโ„โ„ฑ, its sublogics and semantics [15]. We assume two disjoint
sets of variables and counting variables. A counting conjunctive query (CCQ) takes the form ๐‘ž(๐‘ฅ        ยฏ) =
โˆƒ๐‘ฆยฏ โˆƒ๐‘งยฏ ๐œ“(๐‘ฅยฏ , ๐‘ฆยฏ, ๐‘งยฏ), where ๐‘ฅ
                              ยฏ and ๐‘ฆยฏ are tuples of distinct variables, ๐‘งยฏ is a tuple of distinct counting
variables and ๐œ“ is a conjunction of concept and role atoms whose terms are drawn from ๐‘ฅ           ยฏ โˆช ๐‘ฆยฏ โˆช ๐‘งยฏ
or individual names. For a given tuple ๐‘Ž     ยฏ of individuals, and a given model โ„ of a KB ๐’ฆ, we define:
#๐‘ž(๐‘Ž ยฏ)โ„ := #{๐œ‹|๐‘งยฏ | ๐œ‹ : ๐‘ž โ†’ โ„ homomorphism such that ๐œ‹(๐‘ฅ        ยฏ) = ๐‘Žยฏ}. The spectrum of ๐‘ž and ๐‘Ž ยฏ over ๐’ฆ
                                                โ„
is further defined as: Sp๐’ฆ (๐‘ž(๐‘Ž  ยฏ)) := {#๐‘ž(๐‘Ž ยฏ) | โ„ |= ๐’ฆ}.
   While we do not know whether all spectra can be effectively represented, in the sense explained
in the introduction, we identify a class of CCQs, namely connected and individual-free CCQs, whose
spectra admit such a representation based on the following property:

Lemma 1. If ๐‘ž is connected and individual-free, then Sp๐’ฆ (๐‘ž) is closed under addition.

   It is indeed well-known that every subset of Nโˆž closed under addition is finitely generated (see e.g.
[16], Chapter 2, Proposition 4.1). In particular, for every spectrum Sp๐’ฆ (๐‘ž) of a satisfiable connected
and individual-free CCQ ๐‘ž, there exist a finite subset ๐‘† of Nโˆž and two numbers ๐‘€, ๐‘› โˆˆ Nโˆž such that
Sp๐’ฆ (๐‘ž) = ๐‘† โˆช {๐‘€ + ๐‘˜ ยท ๐‘› | ๐‘˜ โˆˆ N}. Therefore, the problem of computing Sp๐’ฆ (๐‘ž) for such a pair (๐’ฆ, ๐‘ž)
can be properly defined as the task of computing ๐‘†, ๐‘€ and ๐‘›.
   Notice that Sp๐’ฆ (๐‘ž) = โˆ… if and only if ๐’ฆ is unsatisfiable; and, similarly, Sp๐’ฆ (๐‘ž) = {0} if and only if ๐’ฆ
is satisfiable but ๐‘ž is unsatisfiable with respect to ๐’ฆ. Spectra non-closed under addition are easily found
by dropping the above restriction to connected and individual-free CCQs, as shown by the following:

Example 1. Consider the empty KB ๐’ฆ and the two Boolean CCQs ๐‘ž1 := โˆƒ๐‘ง1 โˆƒ๐‘ง2 C(๐‘ง1 ) โˆง C(๐‘ง2 ) and
๐‘ž2 := โˆƒ๐‘ง1 โˆƒ๐‘ง2 r(a, ๐‘ง1 ) โˆง r(a, ๐‘ง2 ), where ๐‘ง1 and ๐‘ง2 are counting variables. It is easily verified that
Sp๐’ฆ (๐‘ž1 ) = Sp๐’ฆ (๐‘ž2 ) = {๐‘›2 | ๐‘› โˆˆ N} โˆช {โˆž}.
    Table 1
    Possible shapes of spectra for some description logics, here ๐‘š โˆˆ N and ๐‘‰ is any subsemigroup of N. โ‹†
    indicates no other shape is possible.
                 โ„’                     J๐‘š, โˆžK     โˆ…    {0}   {0} โˆช J๐‘š, โˆžK      {โˆž}     {0, โˆž}    ๐‘‰ โˆช {โˆž}
             ๐’œโ„’๐’žโ„โ„ฑ                 โ‹†      โœ“       โœ“    โœ“           โœ“            โœ“        โœ“           โœ“
             โ„ฐโ„’โ„โ„ฑโŠฅ                 ยท      โœ“       โœ“    โœ“           โœ“            โœ“        โœ“           ยท
            DL-Liteโ„ฑ               โ‹†      โœ“       โœ“    โœ“           โœ“            โœ“        โœ“           ยท
    DL-Litecore , ๐’œโ„’๐’žโ„, ๐’œโ„’๐’žโ„ฑ       โ‹†      โœ“       โœ“    โœ“           โœ“            ยท        ยท           ยท
              โ„ฐโ„’โ„โ„ฑ                 โ‹†      โœ“       โœ“    ยท           ยท            โœ“        ยท           ยท
                  โ„ฐโ„’               โ‹†      โœ“       ยท    ยท           ยท            ยท        ยท           ยท


4. Spectrum of a concept cardinality query
We now present two results regarding the query ๐‘žC := โˆƒ๐‘ง C(๐‘ง), where C is a concept name and ๐‘ง a
counting variable. Computing theโƒ’ spectrum    of ๐‘žC over a KB ๐’ฆ thus corresponds to the natural task of
deciding the possible values of โƒ’Cโ„ โƒ’ across the models โ„ of ๐’ฆ. Naturally, ๐‘žC satisfies preconditions of
                                    โƒ’

Lemma 1 and, thus, its spectrum is finitely generated. Conversely, one can ask which sets are spectra
of concept cardinality queries. For a DL โ„’, we say that a set ๐‘‰ is โ„’-concept realizable if there is a
concept ๐ถ and an โ„’ KB ๐’ฆ such that Sp๐’ฆ (๐‘žC ) = ๐‘‰ . For ๐’œโ„’๐’žโ„โ„ฑ KBs, we have the following complete
characterization.

Theorem 1. A subset of Nโˆž is ๐’œโ„’๐’žโ„โ„ฑ-concept realizable iff it is โˆ…, {0}, or any subsemigroup of Nโˆž
containing โˆž.

   The โ€œonly-ifโ€ direction is essentially a consequence of Lemma 1, while the โ€œifโ€ direction is a general-
ization of the following example in which Sp๐’ฆ (๐‘žC ) = 2N โˆช {โˆž}.

Example 2. Consider the KB ๐’ฆ with the empty ABox and the following ๐’œโ„’๐’žโ„โ„ฑ TBox enforcing that ๐‘Ÿ is
a bijection between ๐ด and ๐ต:

          Cโ‰กAโŠ”B          A โŠ“ B โŠ‘ โŠฅ A โŠ‘ โˆƒr.B           B โŠ‘ โˆƒrโˆ’ .A       โŠค โŠ‘ โ‰ค 1r.โŠค     โŠค โŠ‘ โ‰ค 1rโˆ’ .โŠค

  We now turn to โ„ฐโ„’โ„โ„ฑโŠฅ KBs, for which we are not able to obtain a full characterization of the
realizable spectrum. However, we prove that for a set to be realizable, it must have ๐›ผ = 1, that is the
possible shapes simplify to ๐‘† โˆช J๐‘š, โˆžK for some ๐‘š โˆˆ Nโˆž and ๐‘† โІ J0, ๐‘šK.

Theorem 2. If a subset of Nโˆž is โ„ฐโ„’โ„โ„ฑโŠฅ -concept realizable, then it has shape โˆ…, {0}, {โˆž}, {0, โˆž}, or
๐‘† โˆช J๐‘š, โˆžK for some ๐‘š โˆˆ N and ๐‘† โІ J0, ๐‘šK.

   The key ingredient to prove the aboveโƒ’ is aโƒ’ construction   of two (potentially infinite) models โ„ and ๐’ฅ
of an โ„ฐโ„’โ„โ„ฑโŠฅ KB ๐’ฆ = (๐’ฏ , ๐’œ) in which โƒ’C๐’ฅ โƒ’ = โƒ’Cโ„ โƒ’ + 1 < โˆž. To this end, we refine a cycle-reversion
                                                    โƒ’ โƒ’

technique which has been developed to study finite reasoning in โ„ฐโ„’โ„โ„ฑโŠฅ [14]. More precisely, we tailor
the notion of cycles to characterize under which conditions the extension of concept C may be finite,
and then carefully manipulate the corresponding models to produce the above โ„ and ๐’ฅ . Definition 1
below describes the cycles of interest for our study.
   An inverse functional path (IFP) is a sequence ๐พ0 , r1 , ๐พ1 , . . . , r๐‘› , ๐พ๐‘› where ๐‘› โ‰ฅ 1, ๐พ0 , . . . , ๐พ๐‘› are
conjunctions of concept names and r1 , . . . , r๐‘› are (potentially inverse) roles such that for all 0 โ‰ค ๐‘– < ๐‘›
we have ๐’ฏ |= ๐พ๐‘– โŠ‘ โˆƒr๐‘–+1 .๐พ๐‘–+1 and ๐’ฏ |= ๐พ๐‘–+1 โŠ‘ โ‰ค 1rโˆ’         ๐‘–+1 .๐พ๐‘– .
   The interesting cycles for a concept C are the IFPs looping on themselves and forcing the presence of
(at least) one instance of C โ€œper instance of the cycleโ€, as follows:

Definition 1. An IFP ๐พ0 , r1 , ๐พ1 , . . . , r๐‘› , ๐พ๐‘› is a C-generating cycle if ๐’ฏ |= ๐พ๐‘› โŠ‘ ๐พ0 and there exists
an IFP ๐ฟ0 , s1 , ๐ฟ1 , . . . , s๐‘š , ๐ฟ๐‘š such that ๐’ฏ |= ๐ฟ๐‘š โŠ‘ C and ๐’ฏ |= ๐พ๐‘– โŠ‘ ๐ฟ0 for some 0 โ‰ค ๐‘– โ‰ค ๐‘›.
  Reversing those cycles now means to consider the โ„ฐโ„’โ„โ„ฑโŠฅ TBox ๐’ฏC obtained from ๐’ฏ by adding, for
each C-generating cycle ๐พ0 , r1 , ๐พ1 , . . . , r๐‘› , ๐พ๐‘› and each 0 โ‰ค ๐‘– < ๐‘›, the axioms ๐พ๐‘–+1 โŠ‘ โˆƒrโˆ’
                                                                                               ๐‘– .๐พ๐‘– and
๐พ๐‘– โŠ‘ โ‰ค 1r๐‘–+1 .๐พ๐‘–+1 . A key result towards the proof of Theorem 2 is now:

Lemma 2. There is a model โ„ of ๐’ฆ such that โƒ’Cโ„ โƒ’ < โˆž if and only if the KB (๐’ฏC , ๐’œ) is satisfiable.
                                                    โƒ’ โƒ’


5. Perspectives
A full characterization of spectra shapes for โ„ฐโ„’โ„โ„ฑโŠฅ appears challenging as Theorem 2 suggests that
those spectra may have the shapes of arbitrary numerical semigroups. Furthermore, while Theorem 1
offers a complete characterization for ๐’œโ„’๐’žโ„โ„ฑ, how to compute the corresponding effective representa-
tions remains an open question.
   We believe that it could also be interesting to study the impact of our results on the closely related
problem of answering (Boolean atomic) queries under the bag semantics. While the semantics adopted
in the present paper does not coincide with bag semantics, as discussed for example in [17, 5], con-
siderations regarding the spectra and some of the corresponding techniques might be adapted to this
setting.


Acknowledgments
The authors acknowledge the financial support by the Federal Ministry of Education and Research
of Germany and by the Sรคchsische Staatsministerium fรผr Wissenschaft Kultur und Tourismus in
the program Center of Excellence for AI-research โ€œCenter for Scalable Data Analytics and Artificial
Intelligence Dresden/Leipzigโ€, project identification number: ScaDS.AI. Second author was supported
by the DFG project LU 1417/3-1 QTEC.


References
 [1] D. Calvanese, E. Kharlamov, W. Nutt, C. Thorne, Aggregate queries over ontologies, in: Proceedings
     of the 2nd International Workshop on Ontologies and Information Systems for the Semantic Web
     (ONISW), 2008, pp. 97โ€“104.
 [2] M. Bienvenu, Q. Maniรจre, M. Thomazo, Answering counting queries over DL-Lite ontologies, in:
     Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), 2020, pp.
     1608โ€“1614.
 [3] C. Feier, C. Lutz, M. Przybylko, Answer counting under guarded TGDs, in: Proceedings of the
     24th International Conference on Database Theory (ICDT), 2021, pp. 11:1โ€“11:22.
 [4] E. V. Kostylev, J. L. Reutter, Complexity of answering counting aggregate queries over DL-Lite,
     Journal of Web Semantics (JWS) 33 (2015) 94โ€“111.
 [5] D. Calvanese, J. Corman, D. Lanti, S. Razniewski, Counting query answers over a DL-Lite knowl-
     edge base, in: Proceedings of the 29th International Joint Conference on Artificial Intelligence
     (IJCAI), 2020, pp. 1658โ€“1666.
 [6] M. Bienvenu, Q. Maniรจre, M. Thomazo, Counting queries over ๐’œโ„’๐’žโ„‹โ„ ontologies, in: Proceedings
     of the 19th International Conference on Principles of Knowledge Representation and Reasoning
     (KR), 2022, pp. 53โ€“62.
 [7] R. Fagin, Generalized first-order spectra and polynomial-time recognizable sets, Complexity of
     computation 7 (1974) 43โ€“73.
 [8] A. Durand, R. Fagin, B. Loescher, Spectra with only unary function symbols, in: Proceedings of
     the 11th International Workshop on Computer Science Logic (CSL), 1997, pp. 189โ€“202.
 [9] B. Glimm, I. Horrocks, C. Lutz, U. Sattler, Conjunctive query answering for the description logic
     ๐’ฎโ„‹โ„๐’ฌ, Journal of Artificial Intelligence Research (JAIR) 31 (2008) 157โ€“204.
[10] C. Lutz, The complexity of conjunctive query answering in expressive description logics, in:
     Proceedings of the 4th International Joint Conference on Automated Reasoning (IJCAR), 2008, pp.
     179โ€“193.
[11] M. Bienvenu, Q. Maniรจre, M. Thomazo, Cardinality queries over DL-Lite ontologies, in: Proceedings
     of the 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021, pp. 1801โ€“1807.
[12] S. S. Cosmadakis, P. C. Kanellakis, M. Y. Vardi, Polynomial-time implication problems for unary
     inclusion dependencies, Journal of the ACM 37 (1990) 15โ€“46.
[13] R. Rosati, Finite model reasoning in DL-Lite, in: Proceedings of the 5th European Semantic Web
     Conference (ESWC), 2008, pp. 215โ€“229.
[14] Y. A. Ibรกรฑez-Garcรญa, C. Lutz, T. Schneider, Finite Model Reasoning in Horn Description Logics, in:
     Proceedings of the 14th International Conference on Principles of Knowledge Representation and
     Reasoning (KR), 2014, pp. 490โ€“509.
[15] F. Baader, I. Horrocks, C. Lutz, U. Sattler, An Introduction to Description Logic, Cambridge
     University Press, 2017.
[16] P. A. Grillet, Commutative Semigroups, Springer New York, NY, 2001.
[17] C. Nikolaou, E. V. Kostylev, G. Konstantinidis, M. Kaminski, B. Cuenca Grau, I. Horrocks, Founda-
     tions of ontology-based data access under bag semantics, Journal of Artificial Intelligence (AIJ)
     (2019) 91โ€“132.