<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Data Complexity in Expressive Description Logics With Path Expressions (Extended Abstract)</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Bartosz Bednarczyk</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computational Logic Group, Technische Universität Dresden</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Computer Science, University of Wrocław</institution>
          ,
          <country country="PL">Poland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Among the plethora of various features available in extensions of the basic DL called ℒ, an especially prominent one is · reg, supported by the popular  -family of DLs [1]. Using · reg, one can specify regular path constraints and hence, allow the user to navigate the underlying graph data-structure. In recent years, many extensions of ℒreg for ontology-engineering were proposed [2, 3, 4], and the complexity of their reasoning problems is mostly well-understood [1, 5, 6]. Many years ago, Vardi [7] proposed the vital notion of data complexity. When measuring the complexity of the knowledge-base (KB) satisfiability problem (KBSat), we treat the ontology (TBox) as fixed upfront and only the user's data (ABox) varies. The satisfiability problem for DLs is usually NP-complete w.r.t. the data complexity, including the two-variable counting logic [8] (encoding DLs up to ℒℬℐSelf) as well as ℛℐ [9], the Self [10] was logical core of OWL2. Regarding DLs with path expressions, NP-completeness of ℒℐreg established recently. This also applies to the (non)entailment of positive two-way regular path queries. In our recent IJCAI paper [11] we study the data complexity of  ℐ,  , and  ℐ, namely the maximal decidable fragments of  ℐ that possess the so-called quasi-forest model property, a suitable generalisation of the well-known forest model property for ℒ. For the uniformity of our approach, we actually focus on the satisfiability problem for full  ℐ but over quasi-forests. We obtain: Theorem 1. KBSat for  ℐ over quasi-forests is NP-complete w.r.t. the data complexity. In particular, knowledge-base satisfiability of  ℐ,  , and  ℐ is NP-complete w.r.t. the data complexity. ◀ This completes the data complexity landscape for decidable fragments of  ℐ that remained open for more than a decade, and reproves known results on the ℛ family by Kazakov [9]. Recall that regular path queries are queries of the form ℛ(x , y ) for a regular expression ℛ (with tests). The built-in support of regular path expression in the  family of DLs allow us then to reduce the non-entailment problem (,  ) ̸|= ℛ(x , y ) to the satisfiability of (,  ∪ {⊤ ⊑ ¬∃ℛ.⊤}). Hence: Theorem 2. The entailment problem of regular path queries for  ℐ-KBs over quasi-forests is coNPcomplete w.r.t. the data complexity. In particular, this applies to  ℐ,  ,  ℐ, and to the corresponding fragments of OWL2, namely ℛℐ, ℛ, and ℛℐ. ◀ We are currently working on lifting the results of Theorem 2 to the class of positive two-way regular path queries. We expect to publish them as the journal version of our IJCAI paper, somewhere next year. As a bonus result, we show how our algorithm used to establish Theorem 1 can be adapted to the entailment of unions rooted (connected and containing at least one answer variable) conjunctive queries over  ℐ-KB (in terms of combined complexity). The coNExpTime upper bound for the rooted entailment over  ℐ-KBs is supplemented with a novel matching lower bound for ℒ extended with Self. Theorem 3. The query entailment problem over  ℐ-KB is coNExpTime-complete for the class of (unions of) rooted conjunctive queries. The lower bound holds already for ℒSelf. ◀</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <p>We fix countably-infinite pairwise-disjoint sets NI, NC, NR of individual, concept, and role names.
We assume that the reader is familiar with the standard definitions concerning description logics
(DLs) and we do not recall them here. The set of simple roles NsRp is defined with the grammar
s ::= r ∈ NR | s− | s∩s | s∪s | s∖s. Simple roles are interpreted as expected by invoking the
underlying set-theoretic operations. ℐ-KBs  := (,  ) are, as usual, composed of ABoxes  and
TBoxes  . We assume that  only contains axioms ± r (a, b) and ± A(a) for non-complex concepts A.
ℐ-TBoxes are polynomial-time reducible into the following Scott’s normal form.</p>
      <p>A ≡ B, A ≡ ¬</p>
      <p>B, A ≡ { o}, A ≡ B ⊓ B′, r = s, A ≡ ∃</p>
      <p>A.⊤, A ≡ ∃ r .Self, A ≡ (≥  r ).⊤
for A, B, B′ ∈ NC ∪ {⊤, ⊥}, o ∈ NI, r ∈ NR, s ∈ NsRp,  ∈ N, and an automaton A over the finite
alphabet composed of simple roles and tests C? for atomic concepts C. As usual, the equivalence ≡
replaces the two concept inclusions ⊑. We interpret ∃A.⊤ as the set of all elements d for which there
is a path  starting from d that realises A (written:  |= A), namely A accepts some word of the form
1r1 . . . | |− 1r| |− 1| |, where r ∈ NsRp and  are (possibly empty) sequences of tests, satisfying
( ,  +1) ∈ rℐ and   ∈ Cℐ for all  ≤ |  | and tests C? in . The other DL features are defined as usual.
We define the DLs ℐ, ℐ, and , respectively, by dropping (a) number restrictions, (b)
nominals, and (c) role inverses from the syntax of ℐ.</p>
      <p>
        The key ingredients of our paper are quasi-forests [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We introduce them under the standard
set-theoretic reconstruction of the notion of an N-forest as a prefix-closed subset of N+ without .
Definition 1. Let NI and NI be finite subsets of NI, Root ∈ NC, and child , edge ∈ NR. An
interpretation ℐ is an (NI, NI )-quasi-forest if its domain Δℐ is an N-forest,
      </p>
      <p>Rootℐ = Δℐ ∩N = {aℐ | a ∈ NI} = {aℐ | a ∈ (NI ∪ NI )},
child ℐ = {(d, d· ) | d, d·  ∈ Δℐ ,  ∈ N},
edgeℐ = ⋃︀r∈NsRp r ℐ ,
and for all roles r ℐ and all pairs (d, e) ∈ r ℐ at least one of the following conditions hold: (i) d and e
belong to Rootℐ , (ii) one of d or e equals oℐ for some name o ∈ N , (iii) d = e, (iv) (d, e) ∈ child ℐ ,
I
or (v) (d, e) ∈ (child − )ℐ . A quasi-forest model of  := (,  ) is an (ind(), ind( ))-quasi-forest
satisfying , where ind() and ind( ) are sets of all individual names from  and  . ◀</p>
      <p>The names from NI are dubbed nominals, and are denoted with decorated letters o. We employ
the notions from graph theory such as node, root, child, parent, or descendant, defined as expected in
accordance with Δℐ , child ℐ and Rootℐ . For instance, d is a descendant of c whenever (c, d) ∈ (child ℐ )+.
A quasi-forest is N-bounded if all nodes have at most N children.</p>
      <p>A slightly wider notion of quasi-forests was employed by Calvanese et al. [1, Prop. 3.3] and Ortiz [4,
L. 3.4.1, Thm. 3.4.2] to reason about the  family. Based on their results we prove:
Lemma 1. Let :=(,  ) be a KB of , ℐ, or ℐ.  is satisfiable if  has a quasi-forest
model. Moreover, for all (unions of) CQs q , we have  ̸|= q if there exists an N-bounded quasi-forest
model of  violating q (for some N exponential in | |). Deciding if a ℐ-KB has a quasi-forest model
is ExpTime-complete w.r.t. the combined complexity. ◀
3. Data Complexity: An Overview and Some Tricks
To establish NP-completeness of the satisfiability for ℐ, , and ℐ in a uniform and elegant
way, we focus on the satisfiability of ℐ over quasi-forests. As the proof is highly technical, we
only mention a few of its important ingredients and provide a very rough sketch below.</p>
      <p>One of the key ingredients is an exponential time algorithm for deciding if a ℐ-KB has a
quasi-forest model [4, L. 3.4.1, Thm. 3.4.2]. While this algorithm cannot be used directly to solve the
satisfiability problem in optimal time, we stress that we can still employ it as a black-box to decide
quasiforest satisfiability of KBs that have sizes independent from the ABox. In our approach, we intend to
construct a quasi-forest model of an input ℐ-KB in two steps, i.e. we construct its root part (dubbed
the clearing) separately from its subtrees. Our algorithm first pre-computes (an exponential w.r.t. the
size of the TBox but of constant size if the TBox is fixed) set of quasi-forest-satisfiable ℐ-concepts
that indicate possible subtrees that can be “plugged in” to the clearing of the intended model. Then it
guesses (in NP) the intended clearing and verifies its consistency in PTime based on the pre-computed
concepts and roles. For the feasibility of our “modular construction” a lot of bookkeeping needs to be
done. Most importantly, certain decorations are employed to “relativise” and decide the satisfaction of
automata concepts and number restrictions by elements in an incomplete, fragmented forest.
I. The first type of decorations, given an automaton A, aggregate information about existing paths
realising A and starting at one of the roots of the intended model. As a single such path may visit
several subtrees, we cut such paths into relevant pieces and summarise them by means of “shortcut”
roles and ℐ-concepts describing paths fully contained inside a single subtree. One of the core
results is a complete characterisation of how paths in quasi-forest look like, and how they can be
decomposed into interesting pieces, axiomatisable in ℐ. Such “basic” paths are depicted below
(the decorated o denote nominals, while the other letters denote arbitrary roots).</p>
      <p>a
b
a
a
a
o
a
o
o
a
o
a
ö
II. The second type of decorations “localise” counting in the presence of nominals, as the nominals may
have successors outside their own subtree and the clearing. For instance, suppose that we deal with
the number restriction (≥ 2 r ).⊤ in an ({a, b}, {o})-quasi-forest ℐ sketched below (with r depicted as
a green arrow). For diferent thresholds t ∈ {=0, =1, =2, &gt;2} we label the roots of ℐ with concepts
Clrngtr , Cldtr , and Destr,o, informing how many r -successor a given root has among (i) the clearing, (b)
its children, and how many (c) for each nominal o, and how many of its descendants are r -successors
of o. The decorated quasi-forest is:</p>
      <p>CCDlherslndr=g,r=1or=10a</p>
      <p>CCDlheorslndr=g,r=2or=22
b CCDlherslnd≥rg,r=2or=+011</p>
      <p>These two “small tricks”, obfuscated by various technical dificulties, are the core ideas behind our
quasi-forest-satisfiability algorithm. We encourage the reader to check the full paper for details.
4. Hardness of Rooted Entailment in ℒSelf : An Overview and Tricks
We reduce from the NExpTime-complete 2 × 2 torus tiling problem [12, Cor. 4.15]. For its instance
(D, ¯c), we construct a rooted CQ q and an ℒSelf-KB  such that  ̸|= q if (D, ¯c) has a solution.
Models of  represent 2 × 2 tori (possibly incorrectly) labelled with tiles, while matches of q detect
violations of tilings. We represent tori by binary trees of height 2, where every leaf determines a
position (, ) in 2 × 2 and carries a selection of concepts Ad11 , . . . , Ad and Ad++11 , . . . , Ad22 ,
for bit-strings 1 . . .  and +1 . . . 2 encoding  and . Unusually, we equip every leaf representing
(, ) with three extra children, encoding (, ), (+1, ) and (, +1) (counting modulo 2). This
“localises” vertical and horizontal successors of each element. Call the resulting structures treepods.
00</p>
      <p>e
001</p>
      <p>Ad10
0</p>
      <p>e
01</p>
      <p>e
011
Ad10, Ad20 e
e</p>
      <p>Ad10, Ad21</p>
      <p>Ad11, Ad20 e
e</p>
      <p>Ad21, Ad21
e
e
e
e
e
e</p>
      <p>e</p>
      <p>e
10</p>
      <p>e
101</p>
      <p>Ad11
Lvl1
Lvl2</p>
      <p>The verification of tiling relations in treepods can be implemented locally in ℒ, but only under a
naïve assumption that all the nodes representing the same position of a torus carry precisely the same
tile. Such a “tile equality test” will be implemented with the query. To make this feasible, we design an
alternative way of encoding positions and tiles in treepods by means of tentacles, namely the outgoing
paths endowed with various self-loops replacing the leaves of treepods. An example tentacle replacing
the node 100 in the above treepod is depicted on Figure 2.</p>
      <p>⋆
100</p>
      <p>Lvl3
Ad11, Ad20
ad 11</p>
      <p>By replacing each leaf of a treepod with a suitable tentacle, we obtain jellyfishes . Note that the node
100 that “stores its address” with concepts Ad11, Ad20, and , has now an outgoing ad 11ad 2r 1?-path.
0
Similarly, 012 has an outgoing ad 10ad 2r 1? path. Our ℒSelf-KB  axiomatises jellyfishes, while
0
our query q compares such path-based addresses of nodes. Self-loops are crucial here as they are used
to simulate disjunction over paths.</p>
      <p>Acknowledgements I thank my PhD supervisor Sebastian Rudolph for his constant support and
extreme patience. This work was supported by the ERC Consolidator Grant No. 771779 (DeciGUT).
The full version of this paper is available on ArXiV, accessible via: https://arxiv.org/abs/2406.07095</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Eiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <article-title>Regular Path Queries in Expressive Description Logics with Nominals</article-title>
          ,
          <source>in: IJCAI</source>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bienvenu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Simkus</surname>
          </string-name>
          , Nested Regular Path Queries in Description Logics, in: KR,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Simkus</surname>
          </string-name>
          ,
          <article-title>Verification of Evolving Graph-structured Data under Expressive Path Constraints</article-title>
          , in: ICDT,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ortiz</surname>
          </string-name>
          , Query Answering in Expressive Description Logics: Techniques and
          <string-name>
            <given-names>Complexity</given-names>
            <surname>Results</surname>
          </string-name>
          ,
          <source>Ph.D. thesis</source>
          , TU Wien, AT,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Bednarczyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rudolph</surname>
          </string-name>
          ,
          <article-title>Worst-Case Optimal Querying of Very Expressive Description Logics with Path Expressions and Succinct Counting</article-title>
          , in: IJCAI,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>B.</given-names>
            <surname>Bednarczyk</surname>
          </string-name>
          , E. Kieroński,
          <article-title>Finite Entailment of Local Queries in the Z Family of Description Logics</article-title>
          , in: AAAI,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M. Y.</given-names>
            <surname>Vardi</surname>
          </string-name>
          ,
          <article-title>The Complexity of Relational Query Languages (Extended Abstract)</article-title>
          ,
          <source>in: STOC</source>
          ,
          <year>1982</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>I.</given-names>
            <surname>Pratt-Hartmann</surname>
          </string-name>
          ,
          <article-title>Data-Complexity of the Two-Variable Fragment with Counting Quantifiers, Information and Computation (</article-title>
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kazakov</surname>
          </string-name>
          ,
          <article-title>RIQ and SROIQ are harder than SHOIQ</article-title>
          , in: KR,
          <year>2008</year>
          , pp.
          <fpage>274</fpage>
          -
          <lpage>284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Schneider</surname>
          </string-name>
          ,
          <article-title>Querying the Unary Negation Fragment with Regular Path Expressions</article-title>
          , in: ICDT,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>B.</given-names>
            <surname>Bednarczyk</surname>
          </string-name>
          ,
          <article-title>Data Complexity in Expressive Description Logics with Path Expressions</article-title>
          , in: IJCAI,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lutz</surname>
          </string-name>
          ,
          <article-title>The Complexity of Description Logics with Concrete Domains</article-title>
          ,
          <source>Ph.D. thesis</source>
          , RWTH Aachen University, Germany,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>