Reasoning with the Core Chase: the Case of SHACL
                         Validation over ELHI Knowledge Bases
                         Anouk Oudshoorn1 , Magdalena Ortiz1 and Mantas Šimkus1
                         1
                             Institute of Logic and Computation, TU Wien, Austria


                                        Abstract
                                        Many approaches to query answering in Description Logics (DLs) adopt the certain answer semantics, which is
                                        natural for standard query languages without negation, but falls short when negation is involved. Since negation
                                        plays a crucial role in many applications, alternative semantics are needed. Core universal models are increasingly
                                        accepted as a means to give a semantics to queries with negation. In this paper, we consider the problem of
                                        evaluating (possibly recursive) SHACL expressions over the core universal model of an ℰℒℋℐ KB. SHACL,
                                        recently proposed as a standard constraint language for RDF data, is a natural and powerful query language, and
                                        the fragment we consider adds regular path navigation to semi-positive monadic Datalog with acyclic rule bodies.
                                        We first propose a construction of a finite representation of the core universal model, based on a novel calculus for
                                        ℰℒℋℐ that gives up the data-independence to avoid the best-case exponential behaviour of similar approaches,
                                        and which we believe can lead towards an efficient implementation. Then we leverage this finite representation
                                        to reduce validation in the presence of ontologies to plain SHACL validation, similarly to previous algorithms
                                        but avoiding their best-case exponential behaviour. Our algorithms yield tight data- and combined-complexity
                                        bounds for the studied SHACL validation problem, which coincide with plain consistency testing in ℰℒℋℐ.

                                        Keywords
                                        core chase, query rewriting, SHACL, ELHI, Horn DLs


                         1. Introduction
                         Evaluating queries over description logic (DL) knowledge bases has been the subject of huge research
                         efforts over almost two decades. Here, the certain answer semantics has played a prominent role in
                         elegantly handling information incompleteness for large classes of queries: answering a query consists
                         of computing the intersection of query answers over all models of the input knowledge base (KB). This
                         problem can easily become computationally very costly for expressive DLs, and it is provably more
                         expensive than standard reasoning in most of the popular extensions of 𝒜ℒ𝒞 [1]. But fortunately, this
                         computational hurdle can usually be avoided in the so-called Horn DLs, which disallow ontological
                         statements that involve disjunction, leading in turn to the universal model property [2, 3]. This property
                         guarantees that a consistent KB always has a model that represents all models (technically, that can be
                         homomorphically mapped into any model) and is thus sufficient to compute the certain answers to all
                         conjunctive queries.
                            The universal model property is closely related to the notion of a chase procedure known from
                         databases, originally studied for reasoning about integrity constraints [4]. There are several chase
                         variants with different properties (e.g., oblivious, Skolem, core) [5, 6, 7], and all of them can be used to
                         obtain a model that is universal in the sense above. Query answering is still far from trivial even when
                         universal models exists. For standard DLs, the chase does not terminate in general and universal models
                         can be infinite, so we need to reason about them without building them explicitly. Nevertheless, the
                         universal model property of Horn DLs has enabled the development of several algorithms and practical
                         implementations of query answering over large classes of positive queries, see e.g., [2, 3, 8, 9, 10, 11] and
                         their references.


                             DL 2024: 37th International Workshop on Description Logics, June 18–21, 2024, Bergen, Norway
                          $ anouk.oudshoorn@tuwien.ac.at (A. Oudshoorn); magdalena.ortiz@tuwien.ac.at (M. Ortiz); mantas.simkus@tuwien.ac.at
                          (M. Šimkus)
                           0009-0006-4638-5948 (A. Oudshoorn); 0000-0002-2344-9658 (M. Ortiz); 0000-0003-0632-0294 (M. Šimkus)
                                       © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
   Unfortunately, the situation is very different when queries involve negation as failure, which allows
us to query about the absence of information, e.g., to ask for objects that do not (provably) have
some property. Such queries are highly desirable in KR and have been advocated in several papers
[12, 13, 14, 15]. However, they are not closed under homomorphisms, which means that different
universal models may lead to different answers. Apparently innocuous differences in the chase procedure
may affect the query answers, and the certain answer semantics becomes inadequate; see, e.g., the
discussion in [14] and [15]. The core universal model is increasingly accepted as a way to define the
semantics of these queries [14, 15], since it can be seen as the structure that best represents the intended
meaning of the given input data and the terminological axioms, without any superfluous structures or
redundant elements.
   The first major contribution of this paper is a method for computing a finite representation of a
(possibly infinite)core universal model of an ℰℒℋℐ KB. This method includes a data-aware version of
a consequence-based calculus for ℰℒℋℐ. Specifically, we define a collection of reasoning rules that
incrementally builds a finite representation of a core universal model for a given input ABox. Unlike
previous consequence-based inference procedures for ℰℒℋℐ, are the rules in our calculus aware of
input data. This avoids the main drawback of the previous approaches: data-independent computations
easily lead to best-case exponential behavior.
   After presenting our calculus, we use it to address a timely and challenging problem: validation
of (possibly recursive) SHACL in the presence of ontologies. SHACL is a recently introduced W3C
recommendation for writing constraints over RDF data [16]. It has also been formalised in logic and
has connections to DLs [17, 18, 19, 20]. The SHACL standard envisions SHACL validation performed in
the presence of ontologies but does not describe how this should be realised. SHACL is closely related
to DLs and typical formalisms studied in the setting of ontology-mediated queries. We concentrate on
semi-positive SHACL shapes, which allow for negation of ontology predicates, but not of shape names.
This fragment can be seen as semi-positive monadic Datalog queries with acyclic rule bodies and regular
path expressions, and it already raises the issues with the certain answer semantics discussed above.
   Following our previous work [15], we use a core universal model of the data and ontology for
validation. This leads us to our second contribution: an algorithm for evaluating SHACL expressions
over the finite representation of a core universal model of an ℰℒℋℐ KB. This procedure is based on the
rewriting from [15], but unlike that work, the algorithm presented here is not best-case exponential,
potentially paving the way towards implementation. Moreover, we show that checking if a given
ℰℒℋℐ KB validates a shapes graph is ExpTime-complete and PTime-complete in combined and data
complexity, respectively. These bounds are the best we could hope for, since they coincide with the
complexity of consistency testing in ℰℒℋℐ.


2. Preliminaries
Let 𝑁𝐶 , 𝑁𝐼 , and 𝑁𝑅 be infinite sets of concept names, individual names, and role names, respectively.
Let 𝑁𝐶+ := 𝑁𝐶 ∪ {⊤} and 𝑁𝑅+ := {𝑝, 𝑝− | 𝑝 ∈ 𝑁𝑅 } denote roles. For every 𝑝 ∈ 𝑁𝑅 , let (𝑝− )− = 𝑝.
For each set of roles 𝑅 ⊆ 𝑁𝑅+ , set 𝑅− := {𝑟− | 𝑟 ∈ 𝑁𝑅+ }. Let 𝑁𝐵 be an infinite set of blank nodes
disjoint from 𝑁𝐼 . Call 𝐴(𝑐) and 𝑟(𝑐, 𝑑) atoms, for each 𝐴 ∈ 𝑁𝐶 , 𝑟 ∈ 𝑁𝑅+ and {𝑐, 𝑑} ⊆ 𝑁𝐼 ∪ 𝑁𝐵 . We let
Δ𝒜 denote the domain of 𝒜, i.e. the elements of 𝑁𝐼 ∪ 𝑁𝐵 that appear in 𝒜. In case Δ𝒜 ⊆ 𝑁𝐼 , we call
𝒜 an ABox. In the rest of this paper, we assume that for all ABoxes 𝒜, if 𝑟(𝑎, 𝑏) ∈ 𝒜, then 𝑟− (𝑏, 𝑎) ∈ 𝒜
too. For a tuple ⃗𝑥 = (𝑥1 , . . . , 𝑥𝑛 ) and 1 ≤ 𝑗 ≤ 𝑛, we let 𝜋𝑖 (𝑥
                                                                   ⃗ ) = 𝑥𝑖 be its 𝑖-th projection.
Morphisms. Let 𝒜 and 𝒜′ be sets of atoms. A homomorphism from 𝒜 to 𝒜′ is a function
                ′
ℎ : Δ𝒜 → Δ𝒜 such that for all {𝑐, 𝑑} ⊆ 𝑁𝐼 ∪ 𝑁𝐵 , all 𝐴 ∈ 𝑁𝐶 and all 𝑝 ∈ 𝑁𝑅 , (i) if 𝑐 ∈ Δ𝒜 ∩ 𝑁𝐼 ,
then ℎ(𝑐) = 𝑐, (ii) if 𝐴(𝑐) ∈ 𝒜, then 𝐴(ℎ(𝑐)) ∈ 𝒜′ , and (iii) if {𝑝(𝑐, 𝑑), 𝑝− (𝑑, 𝑐)} ∩ 𝒜 ̸= {},
then {𝑝(ℎ(𝑐), ℎ(𝑑)), 𝑝− (ℎ(𝑑), ℎ(𝑐))} ∩ 𝒜′ ̸= {}. A homomorphism is called strong when (ii) and
(iii) are strengthened to “𝐴(𝑐) ∈ 𝒜 iff 𝐴(ℎ(𝑐)) ∈ 𝒜′ ” and “{𝑝(𝑐, 𝑑), 𝑝− (𝑑, 𝑐)} ∩ 𝒜 =   ̸ {} iff
{𝑝(ℎ(𝑐), ℎ(𝑑)), 𝑝− (ℎ(𝑑), ℎ(𝑐))} ∩ 𝒜′ ̸= {}”, respectively. An embedding is a strong injective ho-
momorphism, an isomorphism is a surjective embedding and an endomorphism of 𝒜 is a homomorphism
from 𝒜 to itself. A set of atoms is a core when all its endomorphisms are embeddings. The core of a set
of atoms 𝒜 is a set of atoms ℬ ⊆ 𝒜, such that (i) there exists an endomorphism ℎ from 𝒜 to ℬ, (ii) ℬ is
                                                                    core
the restriction to the image of ℎ, and (iii) ℬ is a core.We write 𝒜 −−→ ℬ. Each finite set of atoms has a
core that is unique up to isomorphism [21].
The Description logic ℰℒℋℐ. We are considering a normalised version of ℰℒℋℐ, i.e., the axioms we
consider have one of the following normal forms.

                        𝐴1 ⊓ . . . ⊓ 𝐴𝑛 ⊑ 𝐵    𝐴1 ⊑ ∃𝑟.𝐴2     ∃𝑟.𝐴1 ⊑ 𝐵      𝑟 ⊑ 𝑟′

Here, {𝐴1 , . . . , 𝐴𝑛 } ⊆ 𝑁𝐶+ , 𝐵 ∈ 𝑁𝐶+ ∪ {⊥} and {𝑟, 𝑟′ } ⊆ 𝑁𝑅+ . An ℰℒℋℐ TBox 𝒯 is a set of such
axioms. An (ℰℒℋℐ) knowledge base (𝒯 , 𝒜) consists of an ABox 𝒜 and an ℰℒℋℐ TBox 𝒯 . As usual,
we use first-order interpretations to define the semantics. Note that any set of atoms 𝒜 can be viewed
as an interpretation with domain Δ𝒜 .
Building the chase. The function 𝑓𝑥 , for each 𝑥 ∈ 𝑁𝐼 ∪ 𝑁𝐵 , that translates concepts into a set
of atoms is inductively defined in the following way: 𝑓𝑥 (⊤) := {}, 𝑓𝑥 (𝐴) := {𝐴(𝑥)}, 𝑓𝑥 (∃𝑟.𝐴) :=
{𝑟(𝑥, 𝑦), 𝑟− (𝑦, 𝑥), 𝐴(𝑦)} for some fresh variable 𝑦 and 𝑓𝑥 (𝐶 ⊓ 𝐶 ′ ) := 𝑓𝑥 (𝐶) ∪ 𝑓𝑥 (𝐶 ′ ). We say 𝑐 is an
applicable match for an axiom 𝐶 ⊑ 𝐷 ∈ 𝒯 in a set of atoms 𝒜, when there exists a homomorphism
ℎ from 𝑓𝑥 (𝐶) to 𝒜, such that ℎ(𝑥) = 𝑐 and there is no homomorphism from 𝑓𝑥 (𝐷) to 𝒜 such that
ℎ(𝑥) = 𝑐. We use the notation (𝑐, 𝐶 ⊑ 𝐷) ∈ m∃ (𝒯 , 𝒜) when 𝐷 is of the form ∃𝑟.𝐴 for some 𝑟 ∈ 𝑁𝑅+ ,
𝐴 ∈ 𝑁𝐶+ , and the notation (𝑐, 𝐶 ⊑ 𝐷) ∈ m(𝒯 , 𝒜) when 𝐷 ∈ 𝑁𝐶 . Similarly, (𝑐, 𝑑) is an applicable
match for 𝑟 ⊑ 𝑟′ ∈ 𝒯 in a set of atoms 𝒜, when 𝑟(𝑐, 𝑑) ∈ 𝒜 and 𝑟′ (𝑐, 𝑑) ̸∈ 𝒜. In that case, we write
((𝑐, 𝑑), 𝑟 ⊑ 𝑟′ ) ∈ m(𝒯 , 𝒜).
                                                                                            𝒯 ,𝐼
Definition 1. For an ℰℒℋℐ TBox 𝒯 , a set 𝐼 ⊆ 𝑁𝐼 ∪ 𝑁𝐵 and ABoxes 𝒜, 𝒜′ , we let 𝒜 −−→ 𝒜′ if
                       ⋃︁                       ⋃︁                       ⋃︁
     𝒜′ = 𝒜 ∪                   𝑓𝑐 (𝐷) ∪                𝑓𝑐 (𝐷) ∪                   𝑟′ (𝑐, 𝑑).
                 (𝑐,𝐶⊑𝐷)∈m∃ (𝒯 ,𝒜),𝑐∈𝐼          (𝑐,𝐶⊑𝐷)∈m(𝒯 ,𝒜)           ((𝑐,𝑑),𝑟⊑𝑟′ )∈m(𝒯 ,𝒜)

When 𝐼 = 𝑁𝐼 ∪ 𝑁𝐵 , we drop the 𝐼.

  Note that the 𝐼 restricts the set of nodes for which we consider applicable existential matches. We
use the notation 𝒜(→)𝜔 𝒜′ to denote that there exists an 𝑛 such that 𝒜(→)𝑛 𝒜′ and for all 𝒜′′ such
that 𝒜′ → 𝒜′′ we find 𝒜′ = 𝒜′′ , for each binary relation →. With ∘ we denote the concatenation of
two binary relations, that is, 𝒜 → ∘ →′ ℬ iff there exists 𝒜′ such that 𝒜 → 𝒜′ and 𝒜′ →′ ℬ, for each
pair of binary relations → and →′ .
Core chase. In the core chase, all applicable matches are fired simultaneously, followed by a core
check. This procedure is repeated until it terminates (which is not guaranteed): given a knowledge
                                                                                           𝒯     core
base 𝒯 , 𝒜, the core chase is the unique, up to isomorphism, structure ℬ such that 𝒜(−     → ∘ −−→)𝜔 ℬ
[7]. This procedure finds a finite universal (core) model whenever it exists, but it does not specify how
to create an infinite structure from a series of finite chase structures: simply taking the union of ℬ𝑖 ,
              𝒯      core
such that 𝒜(− → ∘ −−→)𝑖 ℬ𝑖 , does in general not produce a core. Take for instance 𝒜 = {𝐴(𝑐)} and
𝒯 = {𝐴 ⊑ ∃𝑟.𝐴, 𝐴 ⊑ ∃𝑠.𝐴, 𝑟 ⊑ 𝑠}. The core chase is generalised to infinite structures in [22] by
the so-called stable chase. The austere universal model, introduced in [15] for DL-Liteℛ , is constructed
using a good successor configuration that ensures that the axioms are locally satisfied while preserving
conditions of a core at each step. Here we follow a similar approach, and moreover, we prove that these
local conditions are enough to obtain the core chase (whenever the procedure terminates).


3. Good Successor Configuration
In this section, we present our first major contribution: computing a finite representation of a core
universal model for a given ℰℒℋℐ KB. In a nutshell, the good successor configuration tells us for each
configuration of a node and its immediate ‘neighbourhood’, the precise configuration of blank nodes
that should be introduced as immediate successors.
   In contrast to DL-Liteℛ , in ℰℒℋℐ the relevant context is not only the incoming roles 𝑟, but also
the concepts satisfied at the already existing neighbours. For simplicity, we focus on describing the
good successors of the blank nodes introduced during the chase, which have a unique predecessor
in the forest-like structure. For describing the successor configurations, we use types and successor
types. A type 𝑡 ∈ ℱ is defined as an element of 𝒫(𝑁𝐶+ ) × 𝒫(𝑁𝑅+ ) × 𝒫(𝑁𝐶+ ), and a successor type
𝑢 ∈ 𝒮 is an element of 𝒫(𝑁𝑅+ ) × 𝒫(𝑁𝐶+ ). A type 𝑡 describes a pair of nodes and the roles between
them, while a successor type describes a set of roles leading to one node. A pair (𝑡, {𝑢1 , . . . , 𝑢𝑛 }) in
the good configuration means that a node that is connected to its predecessor as described by 𝑡, and
needs successors as prescribed by 𝑢1 , · · · , 𝑢𝑛 , see Figure 1. Furthermore, we define the inverse function
inv : 𝒫(𝑁𝑅+ )×𝒫(𝑁𝐶+ ) → 𝒫(𝑁𝐶+ )×𝒫(𝑁𝑅+ ) on successor types by setting inv (𝑢) := (𝜋2 (𝑢), (𝜋1 (𝑢))− ).
(Recall that 𝜋1 and 𝜋2 are the first and second projection of a tuple, respectively, as defined in the
preliminaries.)
   Defining the successors of all types that may occur in the chase for any possible ABox would
necessarily lead to a strictly exponential behaviour. While exponentiality is in general unavoidable,
as standard reasoning tasks are ExpTime complete, we want to avoid it for as many instances as
possible. Hence in the following we build the good successor configuration carefully, to avoid best-
case exponential behaviour. We start from the input ABox 𝒜, and extend it to satisfy all applicable
axioms, but we restrict the satisfaction of existential axioms to only create blank nodes that are directly
neighbouring the ABox. We call this structure 𝒜1 . The freshly introduced blank nodes in 𝒜1 have
exactly one predecessor and can thus be described by a type. We take exactly those types as initial
live types, and compute the pre-good configuration of successors that each live type needs. In each
step of the computation we add to the live types the new successor types. It might happen that we
encounter a so-called loop computation: the created successors may propagate some concept 𝐵 back
to the current node, This information is saved in a relation 𝐿 as the pair of the type and the implied
concept, and the pre-good successor pairs are updated with this information. Unlike DL-Liteℛ , derived
information in ℰℒℋℐcan propagate through the model, and in particular, affect the concepts satisfied
by ABox individuals. Thus the information from the loop computation 𝐿 must be used to update also
the extended ABox, from which we retrieve fresh, updated live types for the next round of building
pre-good successors. This continues in rounds until the process terminates.
Definition 2. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜). Let 𝒜1 be the unique (up to isomorphism) set of
                     𝐼 𝒯 ,𝑁        core
                        𝜔 ∘ −−→ 𝒜 . Let 𝐹 be the following set of types, for each 𝑖 ≥ 1
atoms such that 𝒜(−−−→)          1       𝑖

          𝐹𝑖 := {𝑡 ∈ ℱ | 𝑛 ∈ 𝑁𝐵 , 𝑐 ∈ 𝑁𝐼 , 𝑟 ∈ 𝑁𝑅+ , 𝜋1 (𝑡) = {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑛) ∈ 𝒜𝑖 } ∪ {⊤},
                                                            𝜋2 (𝑡) = {𝑟 ∈ 𝑁𝑅+ | 𝑟(𝑛, 𝑐) ∈ 𝒜𝑖 },
                                                            𝜋3 (𝑡) = {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑐) ∈ 𝒜𝑖 } ∪ {⊤}}.

Construct the pre-good successor relation, psucc𝑖 ⊆ ℱ × 𝒫(𝒮), initially set to (𝑡, {}) ∈ psucc𝑖 for all
𝑡 ∈ 𝐹𝑖 , and the loop computation relation, 𝐿 ⊆ ℱ × 𝑁𝐶 , initially empty, for each 𝑖 ≥ 1 by applying the
following rules exhaustively.
1. If {(𝑡, 𝐴1 ), . . . , (𝑡, 𝐴𝑛 )} ⊆ 𝐿 and 𝐴1 ⊓ · · · ⊓ 𝐴𝑛 ⊑ 𝐵 ∈ 𝒯 , then update 𝐿 by adding (𝑡, 𝐵);

2. If (𝑡, 𝑢) ∈ psucc𝑖 and 𝐴1 ⊓ · · · ⊓ 𝐴𝑛 ⊑ 𝐵 ∈ 𝒯 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝐴 ⊆ 𝜋2 (𝑢𝑗 ), but
   𝐵 ̸∈ 𝜋2 (𝑢𝑗 ), then update the entry (𝑡, 𝑢) in psucc𝑖 to (𝑡, {𝑢1 , . . . , (𝜋1 (𝑢𝑗 ), 𝜋2 (𝑢𝑗 ) ∪ {𝐵}), . . . , 𝑢𝑚 });

3. If (𝑡, 𝑢) ∈ psucc𝑖 and ∃𝑟.𝐴 ⊑ 𝐵 ∈ 𝒯 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝑟 ∈ 𝜋1 (𝑢𝑗 ) and
   𝐴 ∈ 𝜋2 (𝑢𝑗 ), but 𝐵 ̸∈ 𝜋1 (𝑡), then update 𝐿 by adding (𝑡, 𝐵);

4. If (𝑡, 𝑢) ∈ psucc𝑖 and ∃𝑟.𝐴 ⊑ 𝐵 ∈ 𝒯 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝑟− ∈ 𝜋1 (𝑢𝑗 )
   and 𝐴 ∈ 𝜋1 (𝑡) ∪ {𝐴′ | (𝑡, 𝐴′ ) ∈ 𝐿}, but 𝐵 ̸∈ 𝜋2 (𝑢𝑗 ), then update the entry (𝑡, 𝑢) in psucc𝑖 to
   (𝑡, {𝑢1 , . . . , (𝜋1 (𝑢𝑗 ), 𝜋2 (𝑢𝑗 ) ∪ {𝐵}), . . . , 𝑢𝑚 });
                        𝜋3 (𝑡)                               𝑐𝒜,𝑆 = {𝑐}
                                                          (¬𝑐)𝒜,𝑆 = Δ𝒜 ∖ {𝑐}
                    𝜋2 (𝑡)
                                                             𝐴𝒜,𝑆 = {𝑐 | 𝐴(𝑐) ∈ 𝒜}
                        𝜋1 (𝑡)                           (¬𝐴)𝒜,𝑆 = Δ𝒜 ∖ {𝑐 | 𝐴(𝑐) ∈ 𝒜}
               𝜋1 (𝑢)             𝜋1 (𝑢′ )          (𝜙1 ∧ 𝜙2 )𝒜,𝑆 = (𝜙1 )𝒜,𝑆 ∩ (𝜙2 )𝒜,𝑆
              𝜋2 (𝑢)              𝜋2 (𝑢′ )            (∃𝐸.𝜙)𝒜,𝑆 = {𝑒 ∈ Δ𝒜 | ∃𝑒′ : (𝑒, 𝑒′ ) ∈ 𝐸 𝒜 ∧ 𝑒′ ∈ 𝜙𝒜,𝑆 }
                                                             𝑠𝒜,𝑆 = {𝑐 | 𝑠(𝑐) ∈ 𝑆}
          Figure 1: Structure of the good
             successor configuration                        Figure 2: Evaluating shape expressions


5. If (𝑡, 𝑢) ∈ psucc𝑖 and 𝑟 ⊑ 𝑟′ ∈ 𝒯 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝑟 ∈ 𝜋1 (𝑢𝑗 ), but 𝑟′ ̸∈ 𝜋1 (𝑢𝑗 ),
   or 𝑟− ∈ 𝜋1 (𝑢𝑗 ), but 𝑟′− ̸∈ 𝜋1 (𝑢𝑗 ) then update the entry (𝑡, 𝑢) in psucc𝑖 to (𝑡, {𝑢1 , . . . , (𝜋1 (𝑢𝑗 ) ∪
   {𝑟′ }, 𝜋2 (𝑢𝑗 )), . . . , 𝑢𝑚 });

6. If {(𝑡, 𝑢), (𝑡′ , 𝑢′ )} ⊆ psucc𝑖 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝜋1 (𝑡) = 𝜋3 (𝑡′ ), 𝜋1 (𝑢𝑗 )− =
   𝜋2 (𝑡′ ) and 𝜋2 (𝑢𝑗 ) = 𝜋1 (𝑡′ ), then update psucc𝑖 (𝑡) to (𝑡, {𝑢1 , . . . , (𝜋1 (𝑢𝑗 ), 𝜋2 (𝑢𝑗 ) ∪ {𝐴′ | (𝑡′ , 𝐴′ ) ∈
   𝐿}), . . . , 𝑢𝑚 });

7. If (𝑡, 𝑢) ∈ psucc𝑖 and 𝐴 ⊑ ∃𝑟.𝐵 ∈ 𝒯 such that 𝐴 ∈ 𝜋1 (𝑡), and 𝑟 ̸∈ 𝜋2 (𝑡) or 𝐴 ̸∈ 𝜋3 (𝑡), and for all 1 ≤
   𝑗 ≤ 𝑚, 𝑟 ̸∈ 𝜋1 (𝑢𝑗 ) or 𝐵 ̸∈ 𝜋2 (𝑢𝑗 ), then update the entry (𝑡, 𝑢) in psucc𝑖 to (𝑡, 𝑢 ∪ {({𝑟}, {⊤, 𝐵})});

8. If (𝑡, 𝑢) ∈ psucc𝑖 and there exists 1 ≤ 𝑗, 𝑘 ≤ 𝑙 such that 𝑗 ̸= 𝑘, 𝜋1 (𝑢𝑗 ) ⊆ 𝜋1 (𝑢𝑘 ) and 𝜋2 (𝑢𝑗 ) ⊆ 𝜋2 (𝑢𝑘 ),
   then update the entry (𝑡, 𝑢) in psucc𝑖 to (𝑡, 𝑢 ∖ {𝑢𝑗 });

9. If (𝑡, 𝑢) ∈ psucc𝑖 , then for each 1 ≤ 𝑗 ≤ 𝑙, such that ((inv (𝑢𝑗 ), 𝜋1 (𝑡)), 𝑢′ ) ̸∈ psucc𝑖 , for any 𝑢′ , add
   ((inv (𝑢𝑗 ), 𝜋1 (𝑡), {}) to psucc𝑖 ;

where we use the notation 𝐴, 𝑢, 𝑢′ as a shorthand for {𝐴1 , . . . , 𝐴𝑛 }, {𝑢1 , . . . , 𝑢𝑚 }, resp. {𝑢′1 , . . . , 𝑢′𝑚′ }.
Now, let ℬ𝑖 be the smallest set of atoms such that 𝒜𝑖 ⊆ ℬ𝑖 and, if (𝑡, 𝐵) ∈ 𝐿 and there exists 𝑐 ∈ 𝑁𝐵 ,
𝑑 ∈ 𝑁𝐼 such that 𝜋1 (𝑡) ⊆ {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑐) ∈ ℬ𝑖 } ∪ {⊤}, 𝜋2 (𝑡) ⊆ {𝑟 ∈ 𝑁𝑅+ | 𝑟(𝑐, 𝑑) ∈ ℬ𝑖 } and
𝜋3 (𝑡) ⊆ {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑑) ∈ ℬ𝑖 } ∪ {⊤}, then 𝐵(𝑐) ∈ ℬ𝑖 . Define 𝒜𝑖+1 as the unique (up to isomorphism)
                              𝐼  𝒯 ,𝑁
                                𝜔 ∘ −−→ 𝒜    core
set of atoms such that ℬ𝑖 (−−−→)          𝑖+1 .


Proposition 1. There exists an 𝑛 ≥ 1 such that 𝒜𝑛 = 𝒜𝑛+1 . We use the notation 𝒜+
                                                                                𝒯 for 𝒜𝑛 .

  First, note that by construction, 𝒜𝑖 ⊆ 𝒜𝑖+1 . Furthermore, the amount of blank nodes neighbouring a
node in the data, such that they do not subsume each other, is restricted by the amount of existential
axioms in 𝒯 .
  We can now define the good successor configuration as the limit of psucc𝑛 , and the set ℱ𝒯 ,𝒜 of live
types as the types occurring in it.

Definition 3. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜). Let 𝑛 ≥ 1 be the smallest 𝑛 such that 𝒜𝑛 = 𝒜𝑛+1 .
Set succ𝒯 ,𝒜 = psucc𝑛 and let ℱ𝒯 ,𝒜 ⊆ ℱ be such that 𝑡 ∈ ℱ𝒯 ,𝒜 iff there exists 𝑢 such that (𝑡, 𝑢) ∈ succ𝒯 ,𝒜 .

  In the good successor configuration, we only add successors that are directly required by the axioms
and whose set of roles can be summarised by one role that implies all others. This will be useful later.
We denote by cl 𝒯 (𝑟) ⊆ 𝑁𝑅+ the smallest set such that 𝑟 ∈ cl 𝒯 (𝑟) and if 𝑟′ ∈ cl 𝒯 (𝑟) and 𝑟′ ⊑ 𝑟′′ ∈ 𝒯
or 𝑟′− ⊑ 𝑟′′− ∈ 𝒯 , then 𝑟′′ ∈ cl 𝒯 (𝑟).

Proposition 2. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜). If succ𝒯 ,𝒜 (𝑡, 𝑢), then for each 𝑢 ∈ 𝑢 there
exists a 𝐴 ⊑ ∃𝑟.𝐵 ∈ 𝒯 such that 𝐴 ∈ 𝜋1 (𝑡), cl 𝒯 (𝑟) = 𝜋1 (𝑢) and 𝐵 ∈ 𝜋2 (𝑢).
  When computing the good successor configuration, we are computing all loops, that is, all implications
that may propagate back during the chase. In a nutshell, using the good successors is analogous to
doing the chase with the closure of 𝒯 under semantic consequences 𝒯 |= 𝐶 ⊑ 𝐷.
Proposition 3. Let 𝒯 * := {𝐶 ⊑ 𝐷 | 𝒯 |= 𝐶 ⊑ 𝐷}, and let ℬ be the unique (up to isomorphism) set of
                   𝒯 * ,𝑁𝐼 𝜔 core
atoms such that 𝒜(−−−−→)    ∘ −−→ ℬ. Then 𝒜+ ∼
                                             = ℬ.  𝒯

  The proof relies on the fact that all relevant loop computations are covered in the construction of the
good successor configuration. As we consider live types, all possible configurations in the anonymous
parts are considered and all inferred loops are saved in the loop computation list.


4. Core Universal Model Construction
The good successor configuration allows us to easily build the desired universal core model.
  We introduce some additional notation. Let 𝒜 be a set of atoms. For 𝑋 ⊆ 𝑁𝐼 ∪ 𝑁𝐵 , we define the set
𝑛𝒜 (𝑋) of neighbours of 𝑋. We also define the type type 𝒜 (𝑐) of a blank node 𝑐 ∈ 𝑁𝐵 with |𝑛𝒜 (𝑐)| = 1,
where 𝑑 ∈ 𝑛𝒜 (𝑐).
                         ⋃︁
             𝑛𝒜 (𝑋) :=      {𝑥 ∈ 𝑁𝐼 ∪ 𝑁𝐵 | 𝑟(𝑐, 𝑥) ∈ 𝒜 ∨ 𝑟(𝑥, 𝑐) ∈ 𝒜, 𝑟 ∈ 𝑁𝑅 }
                          𝑐∈𝑋
            type 𝒜 (𝑐) := ({𝐴 ∈ 𝑁𝐶 | 𝐴(𝑐) ∈ 𝒜} ∪ {⊤}, {𝑟 ∈ 𝑁𝑅+ | 𝑟(𝑐, 𝑑) ∈ 𝒜},
                                                              {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑑) ∈ 𝒜} ∪ {⊤})

  We can now build the model, starting from 𝒜+
                                             𝒯 and adding successors according to succ𝒯 ,𝒜 .

Definition 4. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜), define the core universal model core(𝒯 , 𝒜) as
the union of the sequence 𝑋1 , 𝑋2 , . . ., that is recursively defined in the following way
    • 𝑋1 := 𝒜+
             𝒯;

    • 𝑋𝑛+1 is constructed from 𝑋𝑛 by adding, for each 𝑐 ∈ 𝑁𝐵 such that |𝑛𝑋𝑛 (𝑐)| = 1 and
      succ𝒯 ,𝒜 (type 𝑋𝑛 (𝑐), 𝑢) holds for some 𝑢, the following atoms for each 𝑢 ∈ 𝑢, with a fresh 𝑑 ∈ 𝑁𝐵
      for each 𝑢 ∈ 𝑢
         – {𝑟(𝑐, 𝑑), 𝑟− (𝑑, 𝑐)} ⊆ 𝑋𝑛+1 for each 𝑟 ∈ 𝜋1 (𝑢);
         – 𝐴(𝑑) ∈ 𝑋𝑛+1 for each 𝐴 ∈ 𝜋2 (𝑢).
   Note that we can see each entry in the sequence 𝑋1 , 𝑋2 , . . . as layers such that the freshly added
blank nodes in 𝑋𝑖 have distance 𝑖 to the ABox.
  We can now show that 𝑐𝑜𝑟𝑒(𝒯 , 𝒜) is a model, that it is a core, and that it is universal.
Proposition 4. For all ℰℒℋℐ knowledge bases (𝒯 , 𝒜), 𝑐𝑜𝑟𝑒(𝒯 , 𝒜) is a model of (𝒯 , 𝒜).
  The proof is based on the idea that all axioms that we consider are all in normal form. Thus, the rules
for the pre-good successor relation suffice to ensure that all axioms are satisfied in the part outside of
the ABox 𝒜. Closer to the data, the result follows from Proposition 3.
Proposition 5. For all ℰℒℋℐ knowledge bases (𝒯 , 𝒜), 𝑐𝑜𝑟𝑒(𝒯 , 𝒜) is a core.
Proof.⋃︀A set of atoms 𝒜 has a core cover when there exist finite sets of atoms ℬ1 ⊆ ℬ2 ⊆ . . . with
𝒜 = 𝑖≥1 ℬ𝑖 , such that for all ℬ𝑖 , each homomorphism ℎ : Δℬ𝑖 → Δ𝒜 is an embedding.
   From [22], Theorem 16, we learn that each set of atoms that has a core cover is a core. Thus, it suffices
to show that ℬ𝑖 = 𝑋𝑖 is a core cover for
                                           ⋃︁
                                       ℬ=      𝑋𝑖 = core(𝒯 , 𝒜).
                                             𝑖≥1
It is immediate that for each 𝑖, the identity mapping ℎ𝑖 : Δ𝑋𝑖 → Δcore(𝒯 ,𝒜) given by 𝑥 ↦→ 𝑥 is an
embedding. By induction on 𝑖 it is shown that each homomorphism from 𝑋𝑖 to core(𝒯 , 𝒜) is equal to
the identity mapping.
    • 𝑖 = 1. By definition 𝑋1 = 𝒜+  𝒯 , which is by construction a core. Note that for all endormorphisms
on cores it holds that if one node is mapped to itself, all nodes have to be mapped to themselves.
                                                       +
Furthermore, note that each 𝑥 ∈ Δcore(𝒯 ,𝒜) ∖ Δ𝒜𝒯 does not have a neighbour in the ABox, so the
image of each homomorphism from 𝒜+                                           +                     𝒜+
                                          𝒯 to core(𝒯 , 𝒜) is contained in 𝒜𝒯 . As all nodes in Δ
                                                                                                    𝒯 ∩ 𝑁𝐼

have to be mapped to themselves, we can conclude the required.
    • 𝑖 = 𝑛 + 1. Let us consider some 𝑥 ∈ Δ𝑋𝑛+1 ∖ Δ𝑋𝑛 . By induction hypothesis, we know that for all
homomorphisms ℎ : Δ𝑋𝑛+1 → Δcore , ℎ(𝑥′ ) = 𝑥′ for all 𝑥′ ∈ Δ𝑋𝑛 . Moreover, we know there exists
𝑦 ∈ Δ𝑋𝑛 with succ𝒯 ,𝒜 (type 𝑋𝑛 (𝑦), 𝑢) such that there exists 𝑢 ∈ 𝑢 that led to the introduction of 𝑥, i.e.,
𝑛𝑋𝑛+1 (𝑥) = {𝑦}. So, for each homomorphism ℎ : Δ𝑋𝑛+1 → Δcore(𝒯 ,𝒜) , we find ℎ(𝑥) ∈ 𝑛𝑐𝑜𝑟𝑒(𝒯 ,𝒜) (𝑦).
By construction, the neighbourhood of 𝑦 exists of (i) 𝑧 such that 𝑛𝑋𝑛 (𝑦) = {𝑧}, however, ℎ(𝑥) = 𝑧
would mean that the preconditions of introducing a new successor type in rule 7 in Definition 2 were
not applied properly, and (ii) 𝑥′ such that there exists 𝑢′ ∈ 𝑢 with 𝑢 ̸= 𝑢′ that led to the introduction of
𝑥′ , however, ℎ(𝑥) = 𝑥′ would mean that rule 8 of Definition 2 was not properly applied. Thus, we must
conclude that ℎ(𝑥) = 𝑥 for all 𝑥 ∈ Δ𝑋𝑛+1 , which concludes this induction.

Proposition 6. For all ℰℒℋℐ knowledge bases (𝒯 , 𝒜), 𝑐𝑜𝑟𝑒(𝒯 , 𝒜) is universal.

  The idea of the proof is that we only introduce new successors if there exists an applicable existential
axiom, and that all information in core(𝒯 , 𝒜) is a direct consequence of the applicable axioms and will
be derived in every model.
  Finally, we show that if the core chase terminates, it results in core(𝒯 , 𝒜).

Theorem 1. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜) such that ℬ is the unique, up to isomorphism,
                      𝒯   core
                      → ∘ −−→)𝜔 ℬ, then ℬ ∼
structure such that 𝒜(−                   = core(𝒯 , 𝒜).
Proof. First, note that both ℬ, by definition, and core(𝒯 , 𝒜), by Proposition 5 are cores. Therefore, as ℬ
is finite, it suffices to show that there exists homomorphisms in both directions. Note that core(𝒯 , 𝒜)
is a model by Proposition 4. As ℬ is a universal model, [7], it follows that there exists a homomorphism
from ℬ into each model, including core(𝒯 , 𝒜). By Proposition 6 we also have that core(𝒯 , 𝒜) is
universal, which means also the other required homomorphism must exist.

   The authors of [22] show that the stable chase does not always yield a universal model for existential
rules, but it is unclear whether the same result holds for the restriction to ℰℒℋℐ. If the stable chase for
ℰℒℋℐ is universal, then Theorem 1 can be extended to the stable chase.
   The core chase construction described in [7] uses multiple core computations. This makes it unsuitable
for implementation, as computing cores is hard in general [21]. The restricted chase yields a core chase
if the axioms are core-stratified [23], but there is no efficient algorithm for determining the core-
stratification order or even whether the axioms are core-stratified. Note that the few core computations
used in Definition 2 are very restricted and are thus not as hard: our approach is one of the first efficient
techniques for constructing the core chase. We note that an effective procedure for ℰℒℋ was presented
in [12], but the authors do not prove that their construction results in the core chase.


5. SHACL Validation
We now consider the problem of SHACL validation in the presence of ℰℒℋℐ ontologies.
   First we introduce the fragment of SHACL we consider, which we call semi-positive SHACL. Like in
[15], we follow the formalisation of SHACL in [17]. Let 𝑁𝑆 be an infinite set of shape names, disjoint
from 𝑁𝐼 , 𝑁𝐵 , 𝑁𝐶+ and 𝑁𝑅+ . A shape expression 𝜙 is built in the following way

                                𝜙 ::= 𝑐 | ¬𝑐 | 𝐴 | ¬𝐴 | 𝜙 ∧ 𝜙 | ∃𝐸.𝜙 | 𝑠,
for 𝑐 ∈ 𝑁𝐼 , 𝐴 ∈ 𝑁𝐶 , 𝑠 ∈ 𝑁𝑆 and 𝐸 a regular path expression such that 𝐸 ::= 𝑟 | 𝐸 ∪𝐸 | 𝐸 ∘𝐸 | 𝐸 * , for
𝑟 ∈ 𝑁𝑅+ , where "∪" is the standard union of relations, "∘" concatenation as described in the preliminaries
and "*" the Kleene star. A constraint is an expression of the form 𝑠 ← 𝜙, for 𝑠 ∈ 𝑁𝑆 and 𝜙 a shape
expression. A shapes graph (𝒞, 𝒢) is a pair consisting of a set of constraints 𝒞 and a set of targets 𝒢.
Targets are of the form 𝑠(𝑐), where 𝑠 ∈ 𝑁𝑆 and 𝑐 ∈ 𝑁𝐼 . Given a set of atoms 𝒜, a shape assignment
is a set of shape atoms of the form 𝑠(𝑐), for 𝑠 ∈ 𝑁𝑆 , 𝑐 ∈ Δ𝒜 . Given a shape expressions 𝜙, a shape
assignment 𝑆, and a set of atoms 𝒜, define 𝜙𝒜,𝑆 ⊆ Δ𝒜 inductively as in Figure 2, where (𝑒, 𝑒′ ) ∈ 𝐸 𝒜
if there exists 𝑤1 · . . . · 𝑤𝑛 in the language defined by 𝐸 and {𝑤1 (𝑒, 𝑒1 ), . . . , 𝑤𝑛 (𝑒𝑛−1 , 𝑒′ )} ⊆ 𝒜.
    Just like in [15], we consider the least fixed point semantics for SHACL. We define an immediate
consequence operator 𝑇𝒜,𝒞 that maps shape assignments to shape assignments: 𝑇𝒜,𝒞 (𝑆) := 𝑆 ∪ {𝑠(𝑎) |
𝑠 ← 𝜙 ∈ 𝒞, 𝑎 ∈ (𝜙)𝒜,𝑆 }. For a set of atoms 𝒜 and a shapes graph (𝒞, 𝒢), we say that 𝒜 validates
(𝒞, 𝒢) when 𝒢 is contained in the least fixed point of 𝑇𝒜,𝒞 . In the presence of a TBox 𝒯 , we define the
semantics of validation using the core universal model. That is, we say that (𝒯 , 𝒜) validates (𝒞, 𝒢) in
case core(𝒯 , 𝒜) validates (𝒞, 𝒢).
  Our goal is now to reduce validation in the presence of a TBox, to plain validation over a set of atoms.
The following normal form for SHACL constraints will facilitate this goal.
Definition 5. A SHACL constraint is in normal form if it has one of the following forms
                         (NC1) 𝑠 ← 𝑐        (NC2) 𝑠 ← ¬𝑐           (NC3) 𝑠 ← 𝐴
                         (NC4) 𝑠 ← ¬𝐴       (NC5) 𝑠 ← 𝑠′ ∧ 𝑠′′     (NC6) 𝑠 ← ∃𝑟.𝑠′ ,

where 𝑐 ∈ 𝑁𝐼 , 𝑟 ∈ 𝑁𝑅+ , {𝑠, 𝑠′ , 𝑠′′ } ⊆ 𝑁𝑆 and 𝐴 ∈ 𝑁𝐶 .
Proposition 7. Each set of semi-positive SHACL constraints 𝒞 can be translated in time polynomial in
|𝒞|, into a set of semi-positive constraints 𝒞 ′ in normal form such that for all 𝒢 that do not use the freshly
introduced shape names, and all 𝒯 and 𝒜, we have that (𝒯 , 𝒜) validates (𝒞, 𝒢) iff (𝒯 , 𝒜) validates
(𝒞 ′ , 𝒢).
Proof. With a similar construction as of Proposition 4.2 in [18], it is easy to see that each constraint can
be translated in polynomial time into the a normal form consisting of (NC1) until (NC5) as above, and
(NC6’) 𝑠 ← ∃𝐸.𝑠′ . To normalise 𝑠 ← ∃𝐸.𝑠′ , suppose ℳ = (𝑄, Σ, 𝑞𝐼 , Δ, 𝑞𝐹 ) is the nondeterministic
finite automaton recognising 𝐸. Take fresh shape names 𝑠𝑞 for each 𝑞 ∈ 𝑄 and add the following
constraints 𝑠 ← 𝑠𝑞𝐼 ∧ 𝑠𝑞𝐼 , 𝑠𝑞 ← ∃𝑟.𝑠′𝑞 if (𝑞, 𝑟, 𝑞 ′ ) ∈ Δ and 𝑠𝑞𝐹 ← 𝑠′ ∧ 𝑠′ . Note that this automaton can
be constructed in polynomial time, which is a standard result in the literature, see for instance [24].

   We are ready now to reduce validation to the case without a TBox. Given a TBox 𝒯 and a set of atoms
𝒜, or more specifically, the succ𝒯 ,𝒜 that represents a universal core model of (𝒯 , 𝒜). We transform a
shapes graph (𝒞, 𝒢) into a new shapes graph (𝒞𝒯 ,𝒜 , 𝒢) such that (𝒯 , 𝒜) validates (𝒞, 𝒢) iff 𝒜′ validates
(𝒞𝒯 ,𝒜 , 𝒢), for some set of atoms 𝒜′ that can be easily obtained from 𝒜 and the construction of succ𝒯 ,𝒜 .
Unlike the similar rewriting for DL-Liteℛ , the variant we introduce here does not work for every set of
atoms 𝒜: we can only guarantee soundness and completeness if the set of atoms 𝒜 is such that succ𝒯 ,𝒜
represents the core universal model of (𝒯 , 𝒜). We compromise the data independence, but in this way,
we avoid the best case exponentially of its predecessor [15].
   We use a fresh concept name 𝐴      ̂︀ ∈ 𝑁𝐶 to decorate the outer layer of 𝒜+ , and define 𝒜* :=
                                                                                  𝒯                  𝒯
                        +
𝒜+𝒯  ∪ {  ̂︀ | 𝑐 ∈ Δ𝒜𝒯 ∩ 𝑁𝐵 }.The elements in our rewriting are triples (𝑡, 𝑊, 𝐻) ⊆ ℱ𝒯 ,𝒜 × 𝒫({∃𝑟.𝑠 |
         𝐴(𝑐)
𝑟 ∈ 𝑁𝑅+ , 𝑠 ∈ 𝑁𝑆 }) × 𝒫(𝑁𝑆 ): 𝑡 is a live type, 𝑊 a set of assumptions and 𝐻 a set of shape names. The
starting set 𝐼 of our rewriting is such that (𝑡, 𝑊, 𝐻) ∈ 𝐼 iff
     • if ∃𝑟.𝑠′ ∈ 𝑊 , there exists 𝑠 ← ∃𝑟.𝑠′ ∈ 𝒞 and 𝑟 ∈ 𝜋2 (𝑡), and
     • 𝑠 ∈ 𝐻 iff there exists ∃𝑟.𝑠′ ∈ 𝑊 such that 𝑠 ← ∃𝑟.𝑠′ ∈ 𝒞.
   Our rewriting algorithm adds shape names 𝑠 to the heads 𝐻. Specifically, we update (𝑡, 𝑊, 𝐻) ∈ 𝐼,
to (𝑡, 𝑊, 𝐻 ∪ {𝑠}) when any of the following holds:
1. 𝑠 ← 𝐴 ∈ 𝒞, 𝐴 ∈ 𝜋1 (𝑡), 𝑠 ̸∈ 𝐻, or
2. 𝑠 ← ¬𝐴 ∈ 𝒞, 𝐴 ̸∈ 𝜋1 (𝑡), 𝑠 ̸∈ 𝐻, or
3. 𝑠 ← ¬𝑐 ∈ 𝒞, 𝑠 ̸∈ 𝐻, or
4. 𝑠 ← 𝑠1 ∧ 𝑠2 ∈ 𝒞, {𝑠1 , 𝑠2 } ⊆ 𝐻, 𝑠 ̸∈ 𝐻, or
5. 𝑠 ← ∃𝑟.𝑠1 ∈ 𝒞, and there exists (𝑡′ , 𝑊 ′ , 𝐻 ′ ) ∈ 𝐼 such that, for some 𝑢 ∈ 𝑢 with succ𝒯 ,𝒜 (𝑡, 𝑢), we
   have 𝑡′ = (inv (𝑢), 𝜋1 (𝑡)), 𝑠1 ∈ 𝐻 ′ , 𝑟− ∈ 𝜋2 (𝑡) and {𝑠 | ∃𝑟.𝑠 ∈ 𝑊 ′ } ⊆ 𝐻.

   Let sat(𝐼) be the result of exhaustively adding shape names as above. The triples in sat(𝐼) are then
translated into the set of constraints 𝒞𝒯 ,𝒜 in the following way.
   If (𝑡, 𝑊, 𝐻) ∈ sat(𝐼), then for each 𝑠 ∈ 𝐻 the following constraint is contained in 𝒞𝒯 ,𝒜
                                  ⋀︁               ⋀︁          ⋀︁
                            𝑠←          𝐴 ∧ ∃𝑟.          𝐵∧        ∃𝑟.𝑠′ ∧ 𝐴,
                                                                           ̂︀
                                    𝐴∈𝜋1 (𝑡)          𝐵∈𝜋3 (𝑡)       ∃𝑟.𝑠′ ∈𝑊

where 𝑟 ∈ 𝜋2 (𝑡) is such that for all 𝑟′ ∈ 𝜋2 (𝑡) we have 𝑟′ ∈ cl 𝒯 (𝑟). Note that the existence of 𝑟 is
guaranteed by Proposition 2.
  This gives us the desired rewriting. Its proof of correctness is analogous to the one in [15].

Theorem 2. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜) and (𝒞, 𝒢) a semi-positive SHACL shapes graph.
Then it holds that (𝒯 , 𝒜) validates (𝒞, 𝒢) iff 𝒜*𝒯 validates (𝒞𝒯 ,𝒜 , 𝒢).

   Since we are taking the input 𝒜 into account, we do not have to consider all possible types and 𝒞𝒯
is not best-case exponential. However, there are still ways to optimise the rewriting. The first and
simplest optimisation is to restrict the form of the starting sets even more, that is, not all 𝑠 ∈ 𝑁𝑆 will
ever interact with each other. So, let rel (𝑠) ⊆ 𝑁𝑆 be defined as the smallest set such that 𝑠 ∈ rel (𝑠); if
𝑠′ ← 𝑠′′ ∧ 𝑠′′′ ∈ 𝒞 and {𝑠′ , 𝑠′′ , 𝑠′′′ } ∩ rel (𝑠) ̸= {}, then {𝑠′ , 𝑠′′ , 𝑠′′′ } ∈ rel (𝑠); and if 𝑠′ ← ∃𝑟.𝑠′′ ∈ 𝒞 and
{𝑠′ , 𝑠′′ } ∩ rel (𝑠) ̸= {}, then {𝑠′ , 𝑠′′ } ∈ rel (𝑠). The restriction is then that (𝑡, 𝑊, 𝐻) is only considered
to be in the starting set in case there exists 𝑠 ∈ 𝑁𝑆 such that 𝐻 ⊆ rel (𝑠). This optimisation is mostly
useful for sets of constraints that relatively use a small amount of role names. Another more involved
optimisation is based on the close correspondence between the SHACL fragment we are considering and
ℰℒℋℐ in normalised form. We can translate the SHACL constraints into ℰℒℋℐ axioms, introducing for
each existing concept name a fresh concept name to denote the absence of this concept name in succ𝒯 ,𝒜 .
A similar algorithm to the one in Section 3 will then decorate the good successor configuration with all
relevant shape names. We believe that the latter version may be better suited for implementation.
   However, all these optimisations cannot avoid ExpTime-complexity in the worst case. The proof
of the next claim is an easy generalisation of the one in [15]. To show PTime-completeness in data
complexity the rewriting can be made data independent as in [15].

Theorem 3. In the presence of ℰℒℋℐ TBoxes, SHACL validation is ExpTime-complete in combined
complexity and PTime-complete in data complexity. This holds also for positive constraints.


6. Conclusions and Outlook
In this paper, we introduced a new, more efficient representation of a core universal model. We had
shown in [15] that this can be done in a data-independent way; the construction there is for DL-Liteℛ ,
but a data-independent computation for ℰℒℋℐ would not be harder than the one we have presented
here. Instead, by considering the exact configurations that appear in the data, we have chosen to focus
solely on computing the relevant parts, highly reducing the number of good successor configurations
and TBox subsumptions that need to be computed. Thus, this is a first step towards an efficient
implementation of the core chase.
   We further focus on one of many application areas of this technique: we use the alternative represen-
tation of a core universal model to reduce SHACL validation combined with ℰℒℋℐ TBoxes to plain
SHACL validation. Also here, we could take all types for a data-independent rewriting, but restricting it
to the live types allows us to avoid an exponential computation in many cases, as not all combinations
of types and assumptions need to be considered. This means the proposed algorithm also brings us
closer to implementing the combination of SHACL with ontologies. We are confident that the proposed
techniques can most likely be optimised further, but leave this for future work. To keep the presentation
simple we focused on semi-positive SHACL in this paper, but we note that the rewriting presented in
the last section can be extended to SHACL with stratified negation following the ideas of [15].


Acknowledgments
      The project leading to this application has received funding from the European Union’s Horizon
2020 research and innovation programme under grant agreement No 101034440.
  This work was also partially supported by the Austrian Science Fund (FWF) project P30873.


References
 [1] M. Ortiz, M. Simkus, Reasoning and query answering in description logics, in: Reasoning Web,
     volume 7487 of Lecture Notes in Computer Science, Springer, 2012, pp. 1–53.
 [2] M. Bienvenu, M. Ortiz, Ontology-mediated query answering with data-tractable description logics,
     in: Reasoning Web, volume 9203 of Lecture Notes in Computer Science, Springer, 2015, pp. 218–307.
 [3] M. Ortiz, S. Rudolph, M. Simkus, Query answering in the horn fragments of the description logics
     SHOIQ and SROIQ, in: IJCAI, IJCAI/AAAI, 2011, pp. 1039–1044.
 [4] D. S. Johnson, A. C. Klug, Testing containment of conjunctive queries under functional and
     inclusion dependencies, J. Comput. Syst. Sci. 28 (1984) 167–189.
 [5] R. Fagin, P. G. Kolaitis, R. J. Miller, L. Popa, Data exchange: semantics and query answering, Theor.
     Comput. Sci. 336 (2005) 89–124.
 [6] B. Marnette, Generalized schema-mappings: from termination to tractability, in: J. Paredaens,
     J. Su (Eds.), Proceedings of the Twenty-Eigth ACM SIGMOD-SIGACT-SIGART Symposium on
     Principles of Database Systems, PODS 2009, June 19 - July 1, 2009, Providence, Rhode Island, USA,
     ACM, 2009, pp. 13–22.
 [7] A. Deutsch, A. Nash, J. B. Remmel, The chase revisited, in: M. Lenzerini, D. Lembo (Eds.),
     Proceedings of the Twenty-Seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of
     Database Systems, PODS 2008, June 9-11, 2008, Vancouver, BC, Canada, ACM, 2008, pp. 149–158.
 [8] T. Eiter, M. Ortiz, M. Simkus, T. Tran, G. Xiao, Query rewriting for Horn-SHIQ plus rules, in:
     J. Hoffmann, B. Selman (Eds.), Proceedings of the Twenty-Sixth AAAI Conference on Artificial
     Intelligence, July 22-26, 2012, Toronto, Ontario, Canada, AAAI Press, 2012, pp. 726–733.
 [9] H. Pérez-Urbina, B. Motik, I. Horrocks, Tractable query answering and rewriting under description
     logic constraints, J. Appl. Log. 8 (2010) 186–209.
[10] D. Calvanese, G. D. Giacomo, D. Lembo, M. Lenzerini, A. Poggi, M. Rodriguez-Muro, R. Rosati,
     M. Ruzzi, D. F. Savo, The MASTRO system for ontology-based data access, Semantic Web 2 (2011)
     43–53.
[11] D. Calvanese, B. Cogrel, S. Komla-Ebri, R. Kontchakov, D. Lanti, M. Rezk, M. Rodriguez-Muro,
     G. Xiao, Ontop: Answering SPARQL queries over relational databases, Semantic Web 8 (2017)
     471–487.
[12] S. Borgwardt, W. Forkel, Closed-world semantics for conjunctive queries with negation over
     ELH-bottom ontologies, in: S. Kraus (Ed.), Proceedings of the Twenty-Eighth International Joint
     Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019, ijcai.org, 2019,
     pp. 6131–6135.
[13] M. Arenas, G. Gottlob, A. Pieris, Expressive languages for querying the semantic web, ACM Trans.
     Database Syst. 43 (2018) 13:1–13:45.
[14] S. Ellmauthaler, M. Krötzsch, S. Mennicke, Answering queries with negation over existential rules,
     in: Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, AAAI Press, 2022, pp.
     5626–5633.
[15] S. Ahmetaj, M. Ortiz, A. Oudshoorn, M. Šimkus, Reconciling shacl and ontologies: Semantics and
     validation via rewriting, in: ECAI 2023, IOS Press, 2023, pp. 27–35.
[16] W3C, Shape constraint language (SHACL), Technical Report. (2017). https://www.w3.org/TR/shacl.
[17] J. Corman, J. L. Reutter, O. Savkovic, Semantics and validation of recursive SHACL, in: D. Vrandecic,
     K. Bontcheva, M. C. Suárez-Figueroa, V. Presutti, I. Celino, M. Sabou, L. Kaffee, E. Simperl (Eds.),
     The Semantic Web - ISWC 2018 - 17th International Semantic Web Conference, Monterey, CA,
     USA, October 8-12, 2018, Proceedings, Part I, volume 11136 of Lecture Notes in Computer Science,
     Springer, 2018, pp. 318–336.
[18] M. Andresel, J. Corman, M. Ortiz, J. L. Reutter, O. Savkovic, M. Šimkus, Stable model semantics for
     recursive SHACL, in: Proc. of The Web Conference 2020, ACM, 2020, p. 1570–1580.
[19] P. Pareti, G. Konstantinidis, F. Mogavero, Satisfiability and containment of recursive SHACL,
     Journal of Web Semantics 74 (2022) 100721.
[20] B. Bogaerts, M. Jakubowski, J. Van den Bussche, SHACL: A description logic in disguise, in:
     Proc. of LPNMR 2022, volume 13416 of Lecture Notes in Computer Science, Springer, 2022, pp. 75–88.
[21] P. Hell, J. Nesetril, The core of a graph, Discret. Math. 109 (1992) 117–126.
[22] D. Carral, M. Krötzsch, M. Marx, A. Ozaki, S. Rudolph, Preserving constraints with the stable chase,
     in: B. Kimelfeld, Y. Amsterdamer (Eds.), 21st International Conference on Database Theory, ICDT
     2018, March 26-29, 2018, Vienna, Austria, volume 98 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum
     für Informatik, 2018, pp. 12:1–12:19.
[23] M. Krötzsch, Computing cores for existential rules with the standard chase and ASP, in: D. Cal-
     vanese, E. Erdem, M. Thielscher (Eds.), Proceedings of the 17th International Conference on
     Principles of Knowledge Representation and Reasoning, KR 2020, Rhodes, Greece, September
     12-18, 2020, 2020, pp. 603–613.
[24] J. Hromkovic, S. Seibert, T. Wilke, Translating regular expressions into small 𝜖-free nondetermin-
     istic finite automata, J. Comput. Syst. Sci. 62 (2001) 565–588.