<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>DL</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Spectra of Cardinality Queries over Description Logic Knowledge Bases (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Quentin Manière</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marcin Przybyłko</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI)</institution>
          ,
          <addr-line>Dresden/Leipzig</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, Leipzig University</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>37</volume>
      <fpage>18</fpage>
      <lpage>21</lpage>
      <abstract>
        <p>A recent line of research has explored ways of leveraging ontology-mediated query answering (OMQA) to support counting queries, a class of aggregate queries that allows to perform analytics on data. Several semantics for such queries have been investigated, difering on how the possibility of multiple models can be taken into account [1, 2, 3]. In this paper we adopt the semantic proposed in [2], extending [4], that defines a counting query as a conjunctive query (CQ) in which some variables have been designated as counting variables. Such a query is evaluated in every model of the knowledge base (KB) by considering every possible homomorphism of the query into the model and by then returning the number of obtained assignments for the counting variables. A certain answer of the counting query over the KB is then defined as an interval J,  K that contains all the possible answers across all possible models, i.e. a uniform lower bound  and a uniform upper bound  . The complexity of deciding whether a given interval is a certain answer under this semantics is now well-understood for a variety of DLs [5, 6]. In the present paper, rather than providing uniform bounds, we aim to compute (a representation of) the set of possible answers, which is a subset (of tuples) of natural numbers with infinity, i.e. a subset of N∞ := {0, 1, 2, . . . , ∞}. We call this subset the spectrum of the counting query, inspired by the notion of spectrum of a formula, that is the set of the possible cardinalities of its models [7, 8]. To do so, we ifrst investigate the possible shapes of spectra for counting conjunctive queries (CCQs) and ontologies expressed in the ℒℐℱ . Traditional CQ answering is well-understood in this expressive DL [9, 10] that supports functionality constraints whose interactions with counting queries have never been studied to the best of our knowledge (those proposed in [5] and denoted  − being more restricted). One of the challenges encountered in our work is to clarify how to represent spectra. Indeed, the set of possible answers of a CCQ across models of a KB might, a priori, be an arbitrarily complex set of natural numbers, and thus hard to describe by means other than providing the CCQ-KB couple. We aim to identify classes of ontology-mediated queries (OMQs) whose spectra admit an efective representation . By efective, we intend that (i) the representation is finite, ideally with a size that can be bounded by the size of the OMQ, (ii) independent of the specific description logic, and (iii) spectrum membership can be eficiently tested, i.e. membership can be tested in polynomial time with respect to the size of the integer and of the representation. Finding such a representation, whenever it exists, can be viewed as a precomputation allowing for further analytics.</p>
      </abstract>
      <kwd-group>
        <kwd>eol&gt;Description Logics</kwd>
        <kwd>Counting queries</kwd>
        <kwd>Spectrum</kwd>
        <kwd>Complexity of reasoning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Contributions</title>
      <p>Our main contributions are the following. First, we introduce the notion of a spectrum of a CCQ
and formalize the problem of computing efective representation thereof, if it exists. We show that
connected and individual-free CCQs evaluated on ℒℐℱ KBs always admit efectively representable
spectra, as those must be finitely generated subsets of N∞. This further motivates a focus on cardinality
queries, i.e. Boolean atomic CCQs, introduced in [11], which admit efective representation. We fully
characterize the possible shapes of spectra for concept cardinality queries on ℒℐℱ KBs, in particular
showing that every finitely generated subset of N∞ can be realized. We also study several sublogics
of ℒℐℱ , from ℰ ℒ and DL-Litecore, for most of which we obtain full characterizations of possible
shapes of spectra. For some, only simpler shapes, such as J, +∞K, are possible (see Table 1). For
the ℰ ℒℐℱ⊥ DL, corresponding to the Horn fragment of ℒℐℱ , we notably use tailored variations
of the cycle-reversion techniques introduced to tackle finite model reasoning in such DLs [ 12, 13, 14].
Our work also features a wealth of examples to prove whether a given shape can indeed be obtained.
We further investigate the case of role cardinality queries which feature challenging shapes of spectra
already for ℰ ℒ⊥ KBs. Through connections with the concept case and refinements of the corresponding
constructions, we are able to show that, also in the case of role cardinality queries, the possible shapes
for ℰ ℒℐℱ⊥ KBs are better-behaved than for general ℒℐℱ KBs. We conclude with bounds regarding
the data complexity of computing some of those efective representations, notably relying on existing
results on DLs augmented with closed predicates.</p>
      <p>The rest of this extended abstract highlights preliminaries regarding spectra, shows a full
characterization for ℒℐℱ and concept cardinality queries, and discusses our tailored version of the
cycle-reversion technique.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Efective representations of spectra</title>
      <p>We assume familiarity with the DL ℒℐℱ , its sublogics and semantics [15]. We assume two disjoint
sets of variables and counting variables. A counting conjunctive query (CCQ) takes the form (¯) =
∃¯ ∃¯  (¯, ¯, ¯), where ¯ and ¯ are tuples of distinct variables, ¯ is a tuple of distinct counting
variables and  is a conjunction of concept and role atoms whose terms are drawn from ¯ ∪ ¯ ∪ ¯
or individual names. For a given tuple ¯ of individuals, and a given model ℐ of a KB , we define:
#(¯)ℐ := #{ |¯ |  :  → ℐ homomorphism such that  (¯) = ¯}. The spectrum of  and ¯ over 
is further defined as: Sp((¯)) := {#(¯)ℐ | ℐ |= }.</p>
      <p>While we do not know whether all spectra can be efectively represented, in the sense explained
in the introduction, we identify a class of CCQs, namely connected and individual-free CCQs, whose
spectra admit such a representation based on the following property:
Lemma 1. If  is connected and individual-free, then Sp() is closed under addition.</p>
      <p>It is indeed well-known that every subset of N∞ closed under addition is finitely generated (see e.g.
[16], Chapter 2, Proposition 4.1). In particular, for every spectrum Sp() of a satisfiable connected
and individual-free CCQ , there exist a finite subset  of N∞ and two numbers ,  ∈ N∞ such that
Sp() =  ∪ { +  ·  |  ∈ N}. Therefore, the problem of computing Sp() for such a pair (, )
can be properly defined as the task of computing ,  and .</p>
      <p>Notice that Sp() = ∅ if and only if  is unsatisfiable; and, similarly, Sp() = {0} if and only if 
is satisfiable but  is unsatisfiable with respect to . Spectra non-closed under addition are easily found
by dropping the above restriction to connected and individual-free CCQs, as shown by the following:
Example 1. Consider the empty KB  and the two Boolean CCQs 1 := ∃1 ∃2 C(1) ∧ C(2) and
2 := ∃1 ∃2 r(a, 1) ∧ r(a, 2), where 1 and 2 are counting variables. It is easily verified that
Sp(1) = Sp(2) = {2 |  ∈ N} ∪ {∞}.</p>
      <p>ℒ
ℒℐℱ
ℰℒℐℱ⊥</p>
      <p>DL-Liteℱ
DL-Litecore, ℒℐ, ℒℱ
ℰℒℐℱ
ℰℒ
⋆
·
⋆
⋆
⋆
⋆</p>
      <p>J, ∞K
✓
✓
✓
✓
✓
✓
∅
✓
✓
✓
✓
✓
·
{0}
✓
✓
✓
✓
·
·
·
✓
✓
✓
✓
·
{0} ∪ J, ∞K
{0, ∞}</p>
      <p>∪ {∞}
{∞}
✓
✓
✓
·
✓
·
·
✓
✓
✓
·
·
·
✓
·
·
·
·</p>
    </sec>
    <sec id="sec-4">
      <title>4. Spectrum of a concept cardinality query</title>
      <p>We now present two results regarding the query C := ∃ C(), where C is a concept name and  a
counting variable. Computing the spectrum of C over a KB  thus corresponds to the natural task of
deciding the possible values of ⃒⃒ Cℐ ⃒⃒ across the models ℐ of . Naturally, C satisfies preconditions of
Lemma 1 and, thus, its spectrum is finitely generated. Conversely, one can ask which sets are spectra
of concept cardinality queries. For a DL ℒ, we say that a set  is ℒ-concept realizable if there is a
concept  and an ℒ KB  such that Sp(C) =  . For ℒℐℱ KBs, we have the following complete
characterization.</p>
      <p>Theorem 1. A subset of N∞ is ℒℐℱ -concept realizable if it is ∅, {0}, or any subsemigroup of N∞
containing ∞.</p>
      <p>The “only-if” direction is essentially a consequence of Lemma 1, while the “if” direction is a
generalization of the following example in which Sp(C) = 2N ∪ {∞}.</p>
      <p>Example 2. Consider the KB  with the empty ABox and the following ℒℐℱ TBox enforcing that  is
a bijection between  and :</p>
      <p>C ≡</p>
      <p>A ⊔ B</p>
      <p>A ⊓ B ⊑ ⊥</p>
      <p>A ⊑ ∃r.B</p>
      <p>B ⊑ ∃r− .A
⊤ ⊑ ≤ 1r.⊤
⊤ ⊑ ≤ 1r− .⊤</p>
      <p>We now turn to ℰ ℒℐℱ⊥ KBs, for which we are not able to obtain a full characterization of the
realizable spectrum. However, we prove that for a set to be realizable, it must have  = 1, that is the
possible shapes simplify to  ∪ J, ∞K for some  ∈ N∞ and  ⊆ J0, K.</p>
      <p>Theorem 2. If a subset of N∞ is ℰ ℒℐℱ⊥-concept realizable, then it has shape ∅, {0}, {∞}, {0, ∞}, or
 ∪ J, ∞K for some  ∈ N and  ⊆ J0, K.</p>
      <p>The key ingredient to prove the above is a construction of two (potentially infinite) models ℐ and 
of an ℰ ℒℐℱ⊥ KB  = ( , ) in which ⃒⃒ C ⃒⃒ = ⃒⃒ Cℐ ⃒⃒ + 1 &lt; ∞. To this end, we refine a cycle-reversion
technique which has been developed to study finite reasoning in ℰ ℒℐℱ⊥ [14]. More precisely, we tailor
the notion of cycles to characterize under which conditions the extension of concept C may be finite,
and then carefully manipulate the corresponding models to produce the above ℐ and  . Definition 1
below describes the cycles of interest for our study.</p>
      <p>An inverse functional path (IFP) is a sequence 0, r1, 1, . . . , r,  where  ≥ 1, 0, . . . ,  are
conjunctions of concept names and r1, . . . , r are (potentially inverse) roles such that for all 0 ≤  &lt; 
we have  |=  ⊑ ∃r+1.+1 and  |= +1 ⊑ ≤ 1r−+1..</p>
      <p>The interesting cycles for a concept C are the IFPs looping on themselves and forcing the presence of
(at least) one instance of C “per instance of the cycle”, as follows:
Definition 1. An IFP 0, r1, 1, . . . , r,  is a C-generating cycle if  |=  ⊑ 0 and there exists
an IFP 0, s1, 1, . . . , s,  such that  |=  ⊑ C and  |=  ⊑ 0 for some 0 ≤  ≤ .</p>
      <p>Reversing those cycles now means to consider the ℰ ℒℐℱ⊥ TBox C obtained from  by adding, for
each C-generating cycle 0, r1, 1, . . . , r,  and each 0 ≤  &lt; , the axioms +1 ⊑ ∃r− . and
 ⊑ ≤ 1r+1.+1. A key result towards the proof of Theorem 2 is now:
Lemma 2. There is a model ℐ of  such that ⃒⃒ Cℐ ⃒⃒ &lt; ∞ if and only if the KB (C, ) is satisfiable.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Perspectives</title>
      <p>A full characterization of spectra shapes for ℰ ℒℐℱ⊥ appears challenging as Theorem 2 suggests that
those spectra may have the shapes of arbitrary numerical semigroups. Furthermore, while Theorem 1
ofers a complete characterization for ℒℐℱ , how to compute the corresponding efective
representations remains an open question.</p>
      <p>
        We believe that it could also be interesting to study the impact of our results on the closely related
problem of answering (Boolean atomic) queries under the bag semantics. While the semantics adopted
in the present paper does not coincide with bag semantics, as discussed for example in [
        <xref ref-type="bibr" rid="ref5">17, 5</xref>
        ],
considerations regarding the spectra and some of the corresponding techniques might be adapted to this
setting.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The authors acknowledge the financial support by the Federal Ministry of Education and Research
of Germany and by the Sächsische Staatsministerium für Wissenschaft Kultur und Tourismus in
the program Center of Excellence for AI-research “Center for Scalable Data Analytics and Artificial
Intelligence Dresden/Leipzig”, project identification number: ScaDS.AI. Second author was supported
by the DFG project LU 1417/3-1 QTEC.
[10] C. Lutz, The complexity of conjunctive query answering in expressive description logics, in:
Proceedings of the 4th International Joint Conference on Automated Reasoning (IJCAR), 2008, pp.
179–193.
[11] M. Bienvenu, Q. Manière, M. Thomazo, Cardinality queries over DL-Lite ontologies, in: Proceedings
of the 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021, pp. 1801–1807.
[12] S. S. Cosmadakis, P. C. Kanellakis, M. Y. Vardi, Polynomial-time implication problems for unary
inclusion dependencies, Journal of the ACM 37 (1990) 15–46.
[13] R. Rosati, Finite model reasoning in DL-Lite, in: Proceedings of the 5th European Semantic Web</p>
      <p>Conference (ESWC), 2008, pp. 215–229.
[14] Y. A. Ibáñez-García, C. Lutz, T. Schneider, Finite Model Reasoning in Horn Description Logics, in:
Proceedings of the 14th International Conference on Principles of Knowledge Representation and
Reasoning (KR), 2014, pp. 490–509.
[15] F. Baader, I. Horrocks, C. Lutz, U. Sattler, An Introduction to Description Logic, Cambridge</p>
      <p>University Press, 2017.
[16] P. A. Grillet, Commutative Semigroups, Springer New York, NY, 2001.
[17] C. Nikolaou, E. V. Kostylev, G. Konstantinidis, M. Kaminski, B. Cuenca Grau, I. Horrocks,
Foundations of ontology-based data access under bag semantics, Journal of Artificial Intelligence (AIJ)
(2019) 91–132.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kharlamov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Nutt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Thorne</surname>
          </string-name>
          ,
          <article-title>Aggregate queries over ontologies</article-title>
          ,
          <source>in: Proceedings of the 2nd International Workshop on Ontologies and Information Systems for the Semantic Web (ONISW)</source>
          ,
          <year>2008</year>
          , pp.
          <fpage>97</fpage>
          -
          <lpage>104</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Manière</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Thomazo</surname>
          </string-name>
          ,
          <article-title>Answering counting queries over DL-Lite ontologies</article-title>
          ,
          <source>in: Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1608</fpage>
          -
          <lpage>1614</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Feier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Przybylko</surname>
          </string-name>
          ,
          <article-title>Answer counting under guarded TGDs</article-title>
          ,
          <source>in: Proceedings of the 24th International Conference on Database Theory (ICDT)</source>
          ,
          <year>2021</year>
          , pp.
          <volume>11</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          :
          <fpage>22</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E. V.</given-names>
            <surname>Kostylev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Reutter</surname>
          </string-name>
          ,
          <article-title>Complexity of answering counting aggregate queries over DL-Lite</article-title>
          ,
          <source>Journal of Web Semantics (JWS) 33</source>
          (
          <year>2015</year>
          )
          <fpage>94</fpage>
          -
          <lpage>111</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Corman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Lanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Razniewski</surname>
          </string-name>
          ,
          <article-title>Counting query answers over a DL-Lite knowledge base</article-title>
          ,
          <source>in: Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>1658</fpage>
          -
          <lpage>1666</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Manière</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Thomazo</surname>
          </string-name>
          ,
          <article-title>Counting queries over ℒℋℐ ontologies</article-title>
          ,
          <source>in: Proceedings of the 19th International Conference on Principles of Knowledge Representation and Reasoning (KR)</source>
          ,
          <year>2022</year>
          , pp.
          <fpage>53</fpage>
          -
          <lpage>62</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <article-title>Generalized first-order spectra and polynomial-time recognizable sets</article-title>
          ,
          <source>Complexity of computation 7</source>
          (
          <year>1974</year>
          )
          <fpage>43</fpage>
          -
          <lpage>73</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Durand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fagin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Loescher</surname>
          </string-name>
          ,
          <article-title>Spectra with only unary function symbols</article-title>
          ,
          <source>in: Proceedings of the 11th International Workshop on Computer Science Logic (CSL)</source>
          ,
          <year>1997</year>
          , pp.
          <fpage>189</fpage>
          -
          <lpage>202</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Glimm</surname>
          </string-name>
          , I. Horrocks,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          , U. Sattler,
          <article-title>Conjunctive query answering for the description logic ℋℐ</article-title>
          ,
          <source>Journal of Artificial Intelligence Research (JAIR) 31</source>
          (
          <year>2008</year>
          )
          <fpage>157</fpage>
          -
          <lpage>204</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>