Reasoning with the Core Chase: the Case of SHACL Validation over ELHI Knowledge Bases Anouk Oudshoorn1 , Magdalena Ortiz1 and Mantas Šimkus1 1 Institute of Logic and Computation, TU Wien, Austria Abstract Many approaches to query answering in Description Logics (DLs) adopt the certain answer semantics, which is natural for standard query languages without negation, but falls short when negation is involved. Since negation plays a crucial role in many applications, alternative semantics are needed. Core universal models are increasingly accepted as a means to give a semantics to queries with negation. In this paper, we consider the problem of evaluating (possibly recursive) SHACL expressions over the core universal model of an ℰℒℋℐ KB. SHACL, recently proposed as a standard constraint language for RDF data, is a natural and powerful query language, and the fragment we consider adds regular path navigation to semi-positive monadic Datalog with acyclic rule bodies. We first propose a construction of a finite representation of the core universal model, based on a novel calculus for ℰℒℋℐ that gives up the data-independence to avoid the best-case exponential behaviour of similar approaches, and which we believe can lead towards an efficient implementation. Then we leverage this finite representation to reduce validation in the presence of ontologies to plain SHACL validation, similarly to previous algorithms but avoiding their best-case exponential behaviour. Our algorithms yield tight data- and combined-complexity bounds for the studied SHACL validation problem, which coincide with plain consistency testing in ℰℒℋℐ. Keywords core chase, query rewriting, SHACL, ELHI, Horn DLs 1. Introduction Evaluating queries over description logic (DL) knowledge bases has been the subject of huge research efforts over almost two decades. Here, the certain answer semantics has played a prominent role in elegantly handling information incompleteness for large classes of queries: answering a query consists of computing the intersection of query answers over all models of the input knowledge base (KB). This problem can easily become computationally very costly for expressive DLs, and it is provably more expensive than standard reasoning in most of the popular extensions of 𝒜ℒ𝒞 [1]. But fortunately, this computational hurdle can usually be avoided in the so-called Horn DLs, which disallow ontological statements that involve disjunction, leading in turn to the universal model property [2, 3]. This property guarantees that a consistent KB always has a model that represents all models (technically, that can be homomorphically mapped into any model) and is thus sufficient to compute the certain answers to all conjunctive queries. The universal model property is closely related to the notion of a chase procedure known from databases, originally studied for reasoning about integrity constraints [4]. There are several chase variants with different properties (e.g., oblivious, Skolem, core) [5, 6, 7], and all of them can be used to obtain a model that is universal in the sense above. Query answering is still far from trivial even when universal models exists. For standard DLs, the chase does not terminate in general and universal models can be infinite, so we need to reason about them without building them explicitly. Nevertheless, the universal model property of Horn DLs has enabled the development of several algorithms and practical implementations of query answering over large classes of positive queries, see e.g., [2, 3, 8, 9, 10, 11] and their references. DL 2024: 37th International Workshop on Description Logics, June 18–21, 2024, Bergen, Norway $ anouk.oudshoorn@tuwien.ac.at (A. Oudshoorn); magdalena.ortiz@tuwien.ac.at (M. Ortiz); mantas.simkus@tuwien.ac.at (M. Šimkus)  0009-0006-4638-5948 (A. Oudshoorn); 0000-0002-2344-9658 (M. Ortiz); 0000-0003-0632-0294 (M. Šimkus) © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR ceur-ws.org Workshop ISSN 1613-0073 Proceedings Unfortunately, the situation is very different when queries involve negation as failure, which allows us to query about the absence of information, e.g., to ask for objects that do not (provably) have some property. Such queries are highly desirable in KR and have been advocated in several papers [12, 13, 14, 15]. However, they are not closed under homomorphisms, which means that different universal models may lead to different answers. Apparently innocuous differences in the chase procedure may affect the query answers, and the certain answer semantics becomes inadequate; see, e.g., the discussion in [14] and [15]. The core universal model is increasingly accepted as a way to define the semantics of these queries [14, 15], since it can be seen as the structure that best represents the intended meaning of the given input data and the terminological axioms, without any superfluous structures or redundant elements. The first major contribution of this paper is a method for computing a finite representation of a (possibly infinite)core universal model of an ℰℒℋℐ KB. This method includes a data-aware version of a consequence-based calculus for ℰℒℋℐ. Specifically, we define a collection of reasoning rules that incrementally builds a finite representation of a core universal model for a given input ABox. Unlike previous consequence-based inference procedures for ℰℒℋℐ, are the rules in our calculus aware of input data. This avoids the main drawback of the previous approaches: data-independent computations easily lead to best-case exponential behavior. After presenting our calculus, we use it to address a timely and challenging problem: validation of (possibly recursive) SHACL in the presence of ontologies. SHACL is a recently introduced W3C recommendation for writing constraints over RDF data [16]. It has also been formalised in logic and has connections to DLs [17, 18, 19, 20]. The SHACL standard envisions SHACL validation performed in the presence of ontologies but does not describe how this should be realised. SHACL is closely related to DLs and typical formalisms studied in the setting of ontology-mediated queries. We concentrate on semi-positive SHACL shapes, which allow for negation of ontology predicates, but not of shape names. This fragment can be seen as semi-positive monadic Datalog queries with acyclic rule bodies and regular path expressions, and it already raises the issues with the certain answer semantics discussed above. Following our previous work [15], we use a core universal model of the data and ontology for validation. This leads us to our second contribution: an algorithm for evaluating SHACL expressions over the finite representation of a core universal model of an ℰℒℋℐ KB. This procedure is based on the rewriting from [15], but unlike that work, the algorithm presented here is not best-case exponential, potentially paving the way towards implementation. Moreover, we show that checking if a given ℰℒℋℐ KB validates a shapes graph is ExpTime-complete and PTime-complete in combined and data complexity, respectively. These bounds are the best we could hope for, since they coincide with the complexity of consistency testing in ℰℒℋℐ. 2. Preliminaries Let 𝑁𝐶 , 𝑁𝐼 , and 𝑁𝑅 be infinite sets of concept names, individual names, and role names, respectively. Let 𝑁𝐶+ := 𝑁𝐶 ∪ {⊤} and 𝑁𝑅+ := {𝑝, 𝑝− | 𝑝 ∈ 𝑁𝑅 } denote roles. For every 𝑝 ∈ 𝑁𝑅 , let (𝑝− )− = 𝑝. For each set of roles 𝑅 ⊆ 𝑁𝑅+ , set 𝑅− := {𝑟− | 𝑟 ∈ 𝑁𝑅+ }. Let 𝑁𝐵 be an infinite set of blank nodes disjoint from 𝑁𝐼 . Call 𝐴(𝑐) and 𝑟(𝑐, 𝑑) atoms, for each 𝐴 ∈ 𝑁𝐶 , 𝑟 ∈ 𝑁𝑅+ and {𝑐, 𝑑} ⊆ 𝑁𝐼 ∪ 𝑁𝐵 . We let Δ𝒜 denote the domain of 𝒜, i.e. the elements of 𝑁𝐼 ∪ 𝑁𝐵 that appear in 𝒜. In case Δ𝒜 ⊆ 𝑁𝐼 , we call 𝒜 an ABox. In the rest of this paper, we assume that for all ABoxes 𝒜, if 𝑟(𝑎, 𝑏) ∈ 𝒜, then 𝑟− (𝑏, 𝑎) ∈ 𝒜 too. For a tuple ⃗𝑥 = (𝑥1 , . . . , 𝑥𝑛 ) and 1 ≤ 𝑗 ≤ 𝑛, we let 𝜋𝑖 (𝑥 ⃗ ) = 𝑥𝑖 be its 𝑖-th projection. Morphisms. Let 𝒜 and 𝒜′ be sets of atoms. A homomorphism from 𝒜 to 𝒜′ is a function ′ ℎ : Δ𝒜 → Δ𝒜 such that for all {𝑐, 𝑑} ⊆ 𝑁𝐼 ∪ 𝑁𝐵 , all 𝐴 ∈ 𝑁𝐶 and all 𝑝 ∈ 𝑁𝑅 , (i) if 𝑐 ∈ Δ𝒜 ∩ 𝑁𝐼 , then ℎ(𝑐) = 𝑐, (ii) if 𝐴(𝑐) ∈ 𝒜, then 𝐴(ℎ(𝑐)) ∈ 𝒜′ , and (iii) if {𝑝(𝑐, 𝑑), 𝑝− (𝑑, 𝑐)} ∩ 𝒜 ̸= {}, then {𝑝(ℎ(𝑐), ℎ(𝑑)), 𝑝− (ℎ(𝑑), ℎ(𝑐))} ∩ 𝒜′ ̸= {}. A homomorphism is called strong when (ii) and (iii) are strengthened to “𝐴(𝑐) ∈ 𝒜 iff 𝐴(ℎ(𝑐)) ∈ 𝒜′ ” and “{𝑝(𝑐, 𝑑), 𝑝− (𝑑, 𝑐)} ∩ 𝒜 = ̸ {} iff {𝑝(ℎ(𝑐), ℎ(𝑑)), 𝑝− (ℎ(𝑑), ℎ(𝑐))} ∩ 𝒜′ ̸= {}”, respectively. An embedding is a strong injective ho- momorphism, an isomorphism is a surjective embedding and an endomorphism of 𝒜 is a homomorphism from 𝒜 to itself. A set of atoms is a core when all its endomorphisms are embeddings. The core of a set of atoms 𝒜 is a set of atoms ℬ ⊆ 𝒜, such that (i) there exists an endomorphism ℎ from 𝒜 to ℬ, (ii) ℬ is core the restriction to the image of ℎ, and (iii) ℬ is a core.We write 𝒜 −−→ ℬ. Each finite set of atoms has a core that is unique up to isomorphism [21]. The Description logic ℰℒℋℐ. We are considering a normalised version of ℰℒℋℐ, i.e., the axioms we consider have one of the following normal forms. 𝐴1 ⊓ . . . ⊓ 𝐴𝑛 ⊑ 𝐵 𝐴1 ⊑ ∃𝑟.𝐴2 ∃𝑟.𝐴1 ⊑ 𝐵 𝑟 ⊑ 𝑟′ Here, {𝐴1 , . . . , 𝐴𝑛 } ⊆ 𝑁𝐶+ , 𝐵 ∈ 𝑁𝐶+ ∪ {⊥} and {𝑟, 𝑟′ } ⊆ 𝑁𝑅+ . An ℰℒℋℐ TBox 𝒯 is a set of such axioms. An (ℰℒℋℐ) knowledge base (𝒯 , 𝒜) consists of an ABox 𝒜 and an ℰℒℋℐ TBox 𝒯 . As usual, we use first-order interpretations to define the semantics. Note that any set of atoms 𝒜 can be viewed as an interpretation with domain Δ𝒜 . Building the chase. The function 𝑓𝑥 , for each 𝑥 ∈ 𝑁𝐼 ∪ 𝑁𝐵 , that translates concepts into a set of atoms is inductively defined in the following way: 𝑓𝑥 (⊤) := {}, 𝑓𝑥 (𝐴) := {𝐴(𝑥)}, 𝑓𝑥 (∃𝑟.𝐴) := {𝑟(𝑥, 𝑦), 𝑟− (𝑦, 𝑥), 𝐴(𝑦)} for some fresh variable 𝑦 and 𝑓𝑥 (𝐶 ⊓ 𝐶 ′ ) := 𝑓𝑥 (𝐶) ∪ 𝑓𝑥 (𝐶 ′ ). We say 𝑐 is an applicable match for an axiom 𝐶 ⊑ 𝐷 ∈ 𝒯 in a set of atoms 𝒜, when there exists a homomorphism ℎ from 𝑓𝑥 (𝐶) to 𝒜, such that ℎ(𝑥) = 𝑐 and there is no homomorphism from 𝑓𝑥 (𝐷) to 𝒜 such that ℎ(𝑥) = 𝑐. We use the notation (𝑐, 𝐶 ⊑ 𝐷) ∈ m∃ (𝒯 , 𝒜) when 𝐷 is of the form ∃𝑟.𝐴 for some 𝑟 ∈ 𝑁𝑅+ , 𝐴 ∈ 𝑁𝐶+ , and the notation (𝑐, 𝐶 ⊑ 𝐷) ∈ m(𝒯 , 𝒜) when 𝐷 ∈ 𝑁𝐶 . Similarly, (𝑐, 𝑑) is an applicable match for 𝑟 ⊑ 𝑟′ ∈ 𝒯 in a set of atoms 𝒜, when 𝑟(𝑐, 𝑑) ∈ 𝒜 and 𝑟′ (𝑐, 𝑑) ̸∈ 𝒜. In that case, we write ((𝑐, 𝑑), 𝑟 ⊑ 𝑟′ ) ∈ m(𝒯 , 𝒜). 𝒯 ,𝐼 Definition 1. For an ℰℒℋℐ TBox 𝒯 , a set 𝐼 ⊆ 𝑁𝐼 ∪ 𝑁𝐵 and ABoxes 𝒜, 𝒜′ , we let 𝒜 −−→ 𝒜′ if ⋃︁ ⋃︁ ⋃︁ 𝒜′ = 𝒜 ∪ 𝑓𝑐 (𝐷) ∪ 𝑓𝑐 (𝐷) ∪ 𝑟′ (𝑐, 𝑑). (𝑐,𝐶⊑𝐷)∈m∃ (𝒯 ,𝒜),𝑐∈𝐼 (𝑐,𝐶⊑𝐷)∈m(𝒯 ,𝒜) ((𝑐,𝑑),𝑟⊑𝑟′ )∈m(𝒯 ,𝒜) When 𝐼 = 𝑁𝐼 ∪ 𝑁𝐵 , we drop the 𝐼. Note that the 𝐼 restricts the set of nodes for which we consider applicable existential matches. We use the notation 𝒜(→)𝜔 𝒜′ to denote that there exists an 𝑛 such that 𝒜(→)𝑛 𝒜′ and for all 𝒜′′ such that 𝒜′ → 𝒜′′ we find 𝒜′ = 𝒜′′ , for each binary relation →. With ∘ we denote the concatenation of two binary relations, that is, 𝒜 → ∘ →′ ℬ iff there exists 𝒜′ such that 𝒜 → 𝒜′ and 𝒜′ →′ ℬ, for each pair of binary relations → and →′ . Core chase. In the core chase, all applicable matches are fired simultaneously, followed by a core check. This procedure is repeated until it terminates (which is not guaranteed): given a knowledge 𝒯 core base 𝒯 , 𝒜, the core chase is the unique, up to isomorphism, structure ℬ such that 𝒜(− → ∘ −−→)𝜔 ℬ [7]. This procedure finds a finite universal (core) model whenever it exists, but it does not specify how to create an infinite structure from a series of finite chase structures: simply taking the union of ℬ𝑖 , 𝒯 core such that 𝒜(− → ∘ −−→)𝑖 ℬ𝑖 , does in general not produce a core. Take for instance 𝒜 = {𝐴(𝑐)} and 𝒯 = {𝐴 ⊑ ∃𝑟.𝐴, 𝐴 ⊑ ∃𝑠.𝐴, 𝑟 ⊑ 𝑠}. The core chase is generalised to infinite structures in [22] by the so-called stable chase. The austere universal model, introduced in [15] for DL-Liteℛ , is constructed using a good successor configuration that ensures that the axioms are locally satisfied while preserving conditions of a core at each step. Here we follow a similar approach, and moreover, we prove that these local conditions are enough to obtain the core chase (whenever the procedure terminates). 3. Good Successor Configuration In this section, we present our first major contribution: computing a finite representation of a core universal model for a given ℰℒℋℐ KB. In a nutshell, the good successor configuration tells us for each configuration of a node and its immediate ‘neighbourhood’, the precise configuration of blank nodes that should be introduced as immediate successors. In contrast to DL-Liteℛ , in ℰℒℋℐ the relevant context is not only the incoming roles 𝑟, but also the concepts satisfied at the already existing neighbours. For simplicity, we focus on describing the good successors of the blank nodes introduced during the chase, which have a unique predecessor in the forest-like structure. For describing the successor configurations, we use types and successor types. A type 𝑡 ∈ ℱ is defined as an element of 𝒫(𝑁𝐶+ ) × 𝒫(𝑁𝑅+ ) × 𝒫(𝑁𝐶+ ), and a successor type 𝑢 ∈ 𝒮 is an element of 𝒫(𝑁𝑅+ ) × 𝒫(𝑁𝐶+ ). A type 𝑡 describes a pair of nodes and the roles between them, while a successor type describes a set of roles leading to one node. A pair (𝑡, {𝑢1 , . . . , 𝑢𝑛 }) in the good configuration means that a node that is connected to its predecessor as described by 𝑡, and needs successors as prescribed by 𝑢1 , · · · , 𝑢𝑛 , see Figure 1. Furthermore, we define the inverse function inv : 𝒫(𝑁𝑅+ )×𝒫(𝑁𝐶+ ) → 𝒫(𝑁𝐶+ )×𝒫(𝑁𝑅+ ) on successor types by setting inv (𝑢) := (𝜋2 (𝑢), (𝜋1 (𝑢))− ). (Recall that 𝜋1 and 𝜋2 are the first and second projection of a tuple, respectively, as defined in the preliminaries.) Defining the successors of all types that may occur in the chase for any possible ABox would necessarily lead to a strictly exponential behaviour. While exponentiality is in general unavoidable, as standard reasoning tasks are ExpTime complete, we want to avoid it for as many instances as possible. Hence in the following we build the good successor configuration carefully, to avoid best- case exponential behaviour. We start from the input ABox 𝒜, and extend it to satisfy all applicable axioms, but we restrict the satisfaction of existential axioms to only create blank nodes that are directly neighbouring the ABox. We call this structure 𝒜1 . The freshly introduced blank nodes in 𝒜1 have exactly one predecessor and can thus be described by a type. We take exactly those types as initial live types, and compute the pre-good configuration of successors that each live type needs. In each step of the computation we add to the live types the new successor types. It might happen that we encounter a so-called loop computation: the created successors may propagate some concept 𝐵 back to the current node, This information is saved in a relation 𝐿 as the pair of the type and the implied concept, and the pre-good successor pairs are updated with this information. Unlike DL-Liteℛ , derived information in ℰℒℋℐcan propagate through the model, and in particular, affect the concepts satisfied by ABox individuals. Thus the information from the loop computation 𝐿 must be used to update also the extended ABox, from which we retrieve fresh, updated live types for the next round of building pre-good successors. This continues in rounds until the process terminates. Definition 2. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜). Let 𝒜1 be the unique (up to isomorphism) set of 𝐼 𝒯 ,𝑁 core 𝜔 ∘ −−→ 𝒜 . Let 𝐹 be the following set of types, for each 𝑖 ≥ 1 atoms such that 𝒜(−−−→) 1 𝑖 𝐹𝑖 := {𝑡 ∈ ℱ | 𝑛 ∈ 𝑁𝐵 , 𝑐 ∈ 𝑁𝐼 , 𝑟 ∈ 𝑁𝑅+ , 𝜋1 (𝑡) = {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑛) ∈ 𝒜𝑖 } ∪ {⊤}, 𝜋2 (𝑡) = {𝑟 ∈ 𝑁𝑅+ | 𝑟(𝑛, 𝑐) ∈ 𝒜𝑖 }, 𝜋3 (𝑡) = {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑐) ∈ 𝒜𝑖 } ∪ {⊤}}. Construct the pre-good successor relation, psucc𝑖 ⊆ ℱ × 𝒫(𝒮), initially set to (𝑡, {}) ∈ psucc𝑖 for all 𝑡 ∈ 𝐹𝑖 , and the loop computation relation, 𝐿 ⊆ ℱ × 𝑁𝐶 , initially empty, for each 𝑖 ≥ 1 by applying the following rules exhaustively. 1. If {(𝑡, 𝐴1 ), . . . , (𝑡, 𝐴𝑛 )} ⊆ 𝐿 and 𝐴1 ⊓ · · · ⊓ 𝐴𝑛 ⊑ 𝐵 ∈ 𝒯 , then update 𝐿 by adding (𝑡, 𝐵); 2. If (𝑡, 𝑢) ∈ psucc𝑖 and 𝐴1 ⊓ · · · ⊓ 𝐴𝑛 ⊑ 𝐵 ∈ 𝒯 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝐴 ⊆ 𝜋2 (𝑢𝑗 ), but 𝐵 ̸∈ 𝜋2 (𝑢𝑗 ), then update the entry (𝑡, 𝑢) in psucc𝑖 to (𝑡, {𝑢1 , . . . , (𝜋1 (𝑢𝑗 ), 𝜋2 (𝑢𝑗 ) ∪ {𝐵}), . . . , 𝑢𝑚 }); 3. If (𝑡, 𝑢) ∈ psucc𝑖 and ∃𝑟.𝐴 ⊑ 𝐵 ∈ 𝒯 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝑟 ∈ 𝜋1 (𝑢𝑗 ) and 𝐴 ∈ 𝜋2 (𝑢𝑗 ), but 𝐵 ̸∈ 𝜋1 (𝑡), then update 𝐿 by adding (𝑡, 𝐵); 4. If (𝑡, 𝑢) ∈ psucc𝑖 and ∃𝑟.𝐴 ⊑ 𝐵 ∈ 𝒯 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝑟− ∈ 𝜋1 (𝑢𝑗 ) and 𝐴 ∈ 𝜋1 (𝑡) ∪ {𝐴′ | (𝑡, 𝐴′ ) ∈ 𝐿}, but 𝐵 ̸∈ 𝜋2 (𝑢𝑗 ), then update the entry (𝑡, 𝑢) in psucc𝑖 to (𝑡, {𝑢1 , . . . , (𝜋1 (𝑢𝑗 ), 𝜋2 (𝑢𝑗 ) ∪ {𝐵}), . . . , 𝑢𝑚 }); 𝜋3 (𝑡) 𝑐𝒜,𝑆 = {𝑐} (¬𝑐)𝒜,𝑆 = Δ𝒜 ∖ {𝑐} 𝜋2 (𝑡) 𝐴𝒜,𝑆 = {𝑐 | 𝐴(𝑐) ∈ 𝒜} 𝜋1 (𝑡) (¬𝐴)𝒜,𝑆 = Δ𝒜 ∖ {𝑐 | 𝐴(𝑐) ∈ 𝒜} 𝜋1 (𝑢) 𝜋1 (𝑢′ ) (𝜙1 ∧ 𝜙2 )𝒜,𝑆 = (𝜙1 )𝒜,𝑆 ∩ (𝜙2 )𝒜,𝑆 𝜋2 (𝑢) 𝜋2 (𝑢′ ) (∃𝐸.𝜙)𝒜,𝑆 = {𝑒 ∈ Δ𝒜 | ∃𝑒′ : (𝑒, 𝑒′ ) ∈ 𝐸 𝒜 ∧ 𝑒′ ∈ 𝜙𝒜,𝑆 } 𝑠𝒜,𝑆 = {𝑐 | 𝑠(𝑐) ∈ 𝑆} Figure 1: Structure of the good successor configuration Figure 2: Evaluating shape expressions 5. If (𝑡, 𝑢) ∈ psucc𝑖 and 𝑟 ⊑ 𝑟′ ∈ 𝒯 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝑟 ∈ 𝜋1 (𝑢𝑗 ), but 𝑟′ ̸∈ 𝜋1 (𝑢𝑗 ), or 𝑟− ∈ 𝜋1 (𝑢𝑗 ), but 𝑟′− ̸∈ 𝜋1 (𝑢𝑗 ) then update the entry (𝑡, 𝑢) in psucc𝑖 to (𝑡, {𝑢1 , . . . , (𝜋1 (𝑢𝑗 ) ∪ {𝑟′ }, 𝜋2 (𝑢𝑗 )), . . . , 𝑢𝑚 }); 6. If {(𝑡, 𝑢), (𝑡′ , 𝑢′ )} ⊆ psucc𝑖 such that there exists 1 ≤ 𝑗 ≤ 𝑚, with 𝜋1 (𝑡) = 𝜋3 (𝑡′ ), 𝜋1 (𝑢𝑗 )− = 𝜋2 (𝑡′ ) and 𝜋2 (𝑢𝑗 ) = 𝜋1 (𝑡′ ), then update psucc𝑖 (𝑡) to (𝑡, {𝑢1 , . . . , (𝜋1 (𝑢𝑗 ), 𝜋2 (𝑢𝑗 ) ∪ {𝐴′ | (𝑡′ , 𝐴′ ) ∈ 𝐿}), . . . , 𝑢𝑚 }); 7. If (𝑡, 𝑢) ∈ psucc𝑖 and 𝐴 ⊑ ∃𝑟.𝐵 ∈ 𝒯 such that 𝐴 ∈ 𝜋1 (𝑡), and 𝑟 ̸∈ 𝜋2 (𝑡) or 𝐴 ̸∈ 𝜋3 (𝑡), and for all 1 ≤ 𝑗 ≤ 𝑚, 𝑟 ̸∈ 𝜋1 (𝑢𝑗 ) or 𝐵 ̸∈ 𝜋2 (𝑢𝑗 ), then update the entry (𝑡, 𝑢) in psucc𝑖 to (𝑡, 𝑢 ∪ {({𝑟}, {⊤, 𝐵})}); 8. If (𝑡, 𝑢) ∈ psucc𝑖 and there exists 1 ≤ 𝑗, 𝑘 ≤ 𝑙 such that 𝑗 ̸= 𝑘, 𝜋1 (𝑢𝑗 ) ⊆ 𝜋1 (𝑢𝑘 ) and 𝜋2 (𝑢𝑗 ) ⊆ 𝜋2 (𝑢𝑘 ), then update the entry (𝑡, 𝑢) in psucc𝑖 to (𝑡, 𝑢 ∖ {𝑢𝑗 }); 9. If (𝑡, 𝑢) ∈ psucc𝑖 , then for each 1 ≤ 𝑗 ≤ 𝑙, such that ((inv (𝑢𝑗 ), 𝜋1 (𝑡)), 𝑢′ ) ̸∈ psucc𝑖 , for any 𝑢′ , add ((inv (𝑢𝑗 ), 𝜋1 (𝑡), {}) to psucc𝑖 ; where we use the notation 𝐴, 𝑢, 𝑢′ as a shorthand for {𝐴1 , . . . , 𝐴𝑛 }, {𝑢1 , . . . , 𝑢𝑚 }, resp. {𝑢′1 , . . . , 𝑢′𝑚′ }. Now, let ℬ𝑖 be the smallest set of atoms such that 𝒜𝑖 ⊆ ℬ𝑖 and, if (𝑡, 𝐵) ∈ 𝐿 and there exists 𝑐 ∈ 𝑁𝐵 , 𝑑 ∈ 𝑁𝐼 such that 𝜋1 (𝑡) ⊆ {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑐) ∈ ℬ𝑖 } ∪ {⊤}, 𝜋2 (𝑡) ⊆ {𝑟 ∈ 𝑁𝑅+ | 𝑟(𝑐, 𝑑) ∈ ℬ𝑖 } and 𝜋3 (𝑡) ⊆ {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑑) ∈ ℬ𝑖 } ∪ {⊤}, then 𝐵(𝑐) ∈ ℬ𝑖 . Define 𝒜𝑖+1 as the unique (up to isomorphism) 𝐼 𝒯 ,𝑁 𝜔 ∘ −−→ 𝒜 core set of atoms such that ℬ𝑖 (−−−→) 𝑖+1 . Proposition 1. There exists an 𝑛 ≥ 1 such that 𝒜𝑛 = 𝒜𝑛+1 . We use the notation 𝒜+ 𝒯 for 𝒜𝑛 . First, note that by construction, 𝒜𝑖 ⊆ 𝒜𝑖+1 . Furthermore, the amount of blank nodes neighbouring a node in the data, such that they do not subsume each other, is restricted by the amount of existential axioms in 𝒯 . We can now define the good successor configuration as the limit of psucc𝑛 , and the set ℱ𝒯 ,𝒜 of live types as the types occurring in it. Definition 3. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜). Let 𝑛 ≥ 1 be the smallest 𝑛 such that 𝒜𝑛 = 𝒜𝑛+1 . Set succ𝒯 ,𝒜 = psucc𝑛 and let ℱ𝒯 ,𝒜 ⊆ ℱ be such that 𝑡 ∈ ℱ𝒯 ,𝒜 iff there exists 𝑢 such that (𝑡, 𝑢) ∈ succ𝒯 ,𝒜 . In the good successor configuration, we only add successors that are directly required by the axioms and whose set of roles can be summarised by one role that implies all others. This will be useful later. We denote by cl 𝒯 (𝑟) ⊆ 𝑁𝑅+ the smallest set such that 𝑟 ∈ cl 𝒯 (𝑟) and if 𝑟′ ∈ cl 𝒯 (𝑟) and 𝑟′ ⊑ 𝑟′′ ∈ 𝒯 or 𝑟′− ⊑ 𝑟′′− ∈ 𝒯 , then 𝑟′′ ∈ cl 𝒯 (𝑟). Proposition 2. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜). If succ𝒯 ,𝒜 (𝑡, 𝑢), then for each 𝑢 ∈ 𝑢 there exists a 𝐴 ⊑ ∃𝑟.𝐵 ∈ 𝒯 such that 𝐴 ∈ 𝜋1 (𝑡), cl 𝒯 (𝑟) = 𝜋1 (𝑢) and 𝐵 ∈ 𝜋2 (𝑢). When computing the good successor configuration, we are computing all loops, that is, all implications that may propagate back during the chase. In a nutshell, using the good successors is analogous to doing the chase with the closure of 𝒯 under semantic consequences 𝒯 |= 𝐶 ⊑ 𝐷. Proposition 3. Let 𝒯 * := {𝐶 ⊑ 𝐷 | 𝒯 |= 𝐶 ⊑ 𝐷}, and let ℬ be the unique (up to isomorphism) set of 𝒯 * ,𝑁𝐼 𝜔 core atoms such that 𝒜(−−−−→) ∘ −−→ ℬ. Then 𝒜+ ∼ = ℬ. 𝒯 The proof relies on the fact that all relevant loop computations are covered in the construction of the good successor configuration. As we consider live types, all possible configurations in the anonymous parts are considered and all inferred loops are saved in the loop computation list. 4. Core Universal Model Construction The good successor configuration allows us to easily build the desired universal core model. We introduce some additional notation. Let 𝒜 be a set of atoms. For 𝑋 ⊆ 𝑁𝐼 ∪ 𝑁𝐵 , we define the set 𝑛𝒜 (𝑋) of neighbours of 𝑋. We also define the type type 𝒜 (𝑐) of a blank node 𝑐 ∈ 𝑁𝐵 with |𝑛𝒜 (𝑐)| = 1, where 𝑑 ∈ 𝑛𝒜 (𝑐). ⋃︁ 𝑛𝒜 (𝑋) := {𝑥 ∈ 𝑁𝐼 ∪ 𝑁𝐵 | 𝑟(𝑐, 𝑥) ∈ 𝒜 ∨ 𝑟(𝑥, 𝑐) ∈ 𝒜, 𝑟 ∈ 𝑁𝑅 } 𝑐∈𝑋 type 𝒜 (𝑐) := ({𝐴 ∈ 𝑁𝐶 | 𝐴(𝑐) ∈ 𝒜} ∪ {⊤}, {𝑟 ∈ 𝑁𝑅+ | 𝑟(𝑐, 𝑑) ∈ 𝒜}, {𝐴 ∈ 𝑁𝐶 | 𝐴(𝑑) ∈ 𝒜} ∪ {⊤}) We can now build the model, starting from 𝒜+ 𝒯 and adding successors according to succ𝒯 ,𝒜 . Definition 4. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜), define the core universal model core(𝒯 , 𝒜) as the union of the sequence 𝑋1 , 𝑋2 , . . ., that is recursively defined in the following way • 𝑋1 := 𝒜+ 𝒯; • 𝑋𝑛+1 is constructed from 𝑋𝑛 by adding, for each 𝑐 ∈ 𝑁𝐵 such that |𝑛𝑋𝑛 (𝑐)| = 1 and succ𝒯 ,𝒜 (type 𝑋𝑛 (𝑐), 𝑢) holds for some 𝑢, the following atoms for each 𝑢 ∈ 𝑢, with a fresh 𝑑 ∈ 𝑁𝐵 for each 𝑢 ∈ 𝑢 – {𝑟(𝑐, 𝑑), 𝑟− (𝑑, 𝑐)} ⊆ 𝑋𝑛+1 for each 𝑟 ∈ 𝜋1 (𝑢); – 𝐴(𝑑) ∈ 𝑋𝑛+1 for each 𝐴 ∈ 𝜋2 (𝑢). Note that we can see each entry in the sequence 𝑋1 , 𝑋2 , . . . as layers such that the freshly added blank nodes in 𝑋𝑖 have distance 𝑖 to the ABox. We can now show that 𝑐𝑜𝑟𝑒(𝒯 , 𝒜) is a model, that it is a core, and that it is universal. Proposition 4. For all ℰℒℋℐ knowledge bases (𝒯 , 𝒜), 𝑐𝑜𝑟𝑒(𝒯 , 𝒜) is a model of (𝒯 , 𝒜). The proof is based on the idea that all axioms that we consider are all in normal form. Thus, the rules for the pre-good successor relation suffice to ensure that all axioms are satisfied in the part outside of the ABox 𝒜. Closer to the data, the result follows from Proposition 3. Proposition 5. For all ℰℒℋℐ knowledge bases (𝒯 , 𝒜), 𝑐𝑜𝑟𝑒(𝒯 , 𝒜) is a core. Proof.⋃︀A set of atoms 𝒜 has a core cover when there exist finite sets of atoms ℬ1 ⊆ ℬ2 ⊆ . . . with 𝒜 = 𝑖≥1 ℬ𝑖 , such that for all ℬ𝑖 , each homomorphism ℎ : Δℬ𝑖 → Δ𝒜 is an embedding. From [22], Theorem 16, we learn that each set of atoms that has a core cover is a core. Thus, it suffices to show that ℬ𝑖 = 𝑋𝑖 is a core cover for ⋃︁ ℬ= 𝑋𝑖 = core(𝒯 , 𝒜). 𝑖≥1 It is immediate that for each 𝑖, the identity mapping ℎ𝑖 : Δ𝑋𝑖 → Δcore(𝒯 ,𝒜) given by 𝑥 ↦→ 𝑥 is an embedding. By induction on 𝑖 it is shown that each homomorphism from 𝑋𝑖 to core(𝒯 , 𝒜) is equal to the identity mapping. • 𝑖 = 1. By definition 𝑋1 = 𝒜+ 𝒯 , which is by construction a core. Note that for all endormorphisms on cores it holds that if one node is mapped to itself, all nodes have to be mapped to themselves. + Furthermore, note that each 𝑥 ∈ Δcore(𝒯 ,𝒜) ∖ Δ𝒜𝒯 does not have a neighbour in the ABox, so the image of each homomorphism from 𝒜+ + 𝒜+ 𝒯 to core(𝒯 , 𝒜) is contained in 𝒜𝒯 . As all nodes in Δ 𝒯 ∩ 𝑁𝐼 have to be mapped to themselves, we can conclude the required. • 𝑖 = 𝑛 + 1. Let us consider some 𝑥 ∈ Δ𝑋𝑛+1 ∖ Δ𝑋𝑛 . By induction hypothesis, we know that for all homomorphisms ℎ : Δ𝑋𝑛+1 → Δcore , ℎ(𝑥′ ) = 𝑥′ for all 𝑥′ ∈ Δ𝑋𝑛 . Moreover, we know there exists 𝑦 ∈ Δ𝑋𝑛 with succ𝒯 ,𝒜 (type 𝑋𝑛 (𝑦), 𝑢) such that there exists 𝑢 ∈ 𝑢 that led to the introduction of 𝑥, i.e., 𝑛𝑋𝑛+1 (𝑥) = {𝑦}. So, for each homomorphism ℎ : Δ𝑋𝑛+1 → Δcore(𝒯 ,𝒜) , we find ℎ(𝑥) ∈ 𝑛𝑐𝑜𝑟𝑒(𝒯 ,𝒜) (𝑦). By construction, the neighbourhood of 𝑦 exists of (i) 𝑧 such that 𝑛𝑋𝑛 (𝑦) = {𝑧}, however, ℎ(𝑥) = 𝑧 would mean that the preconditions of introducing a new successor type in rule 7 in Definition 2 were not applied properly, and (ii) 𝑥′ such that there exists 𝑢′ ∈ 𝑢 with 𝑢 ̸= 𝑢′ that led to the introduction of 𝑥′ , however, ℎ(𝑥) = 𝑥′ would mean that rule 8 of Definition 2 was not properly applied. Thus, we must conclude that ℎ(𝑥) = 𝑥 for all 𝑥 ∈ Δ𝑋𝑛+1 , which concludes this induction. Proposition 6. For all ℰℒℋℐ knowledge bases (𝒯 , 𝒜), 𝑐𝑜𝑟𝑒(𝒯 , 𝒜) is universal. The idea of the proof is that we only introduce new successors if there exists an applicable existential axiom, and that all information in core(𝒯 , 𝒜) is a direct consequence of the applicable axioms and will be derived in every model. Finally, we show that if the core chase terminates, it results in core(𝒯 , 𝒜). Theorem 1. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜) such that ℬ is the unique, up to isomorphism, 𝒯 core → ∘ −−→)𝜔 ℬ, then ℬ ∼ structure such that 𝒜(− = core(𝒯 , 𝒜). Proof. First, note that both ℬ, by definition, and core(𝒯 , 𝒜), by Proposition 5 are cores. Therefore, as ℬ is finite, it suffices to show that there exists homomorphisms in both directions. Note that core(𝒯 , 𝒜) is a model by Proposition 4. As ℬ is a universal model, [7], it follows that there exists a homomorphism from ℬ into each model, including core(𝒯 , 𝒜). By Proposition 6 we also have that core(𝒯 , 𝒜) is universal, which means also the other required homomorphism must exist. The authors of [22] show that the stable chase does not always yield a universal model for existential rules, but it is unclear whether the same result holds for the restriction to ℰℒℋℐ. If the stable chase for ℰℒℋℐ is universal, then Theorem 1 can be extended to the stable chase. The core chase construction described in [7] uses multiple core computations. This makes it unsuitable for implementation, as computing cores is hard in general [21]. The restricted chase yields a core chase if the axioms are core-stratified [23], but there is no efficient algorithm for determining the core- stratification order or even whether the axioms are core-stratified. Note that the few core computations used in Definition 2 are very restricted and are thus not as hard: our approach is one of the first efficient techniques for constructing the core chase. We note that an effective procedure for ℰℒℋ was presented in [12], but the authors do not prove that their construction results in the core chase. 5. SHACL Validation We now consider the problem of SHACL validation in the presence of ℰℒℋℐ ontologies. First we introduce the fragment of SHACL we consider, which we call semi-positive SHACL. Like in [15], we follow the formalisation of SHACL in [17]. Let 𝑁𝑆 be an infinite set of shape names, disjoint from 𝑁𝐼 , 𝑁𝐵 , 𝑁𝐶+ and 𝑁𝑅+ . A shape expression 𝜙 is built in the following way 𝜙 ::= 𝑐 | ¬𝑐 | 𝐴 | ¬𝐴 | 𝜙 ∧ 𝜙 | ∃𝐸.𝜙 | 𝑠, for 𝑐 ∈ 𝑁𝐼 , 𝐴 ∈ 𝑁𝐶 , 𝑠 ∈ 𝑁𝑆 and 𝐸 a regular path expression such that 𝐸 ::= 𝑟 | 𝐸 ∪𝐸 | 𝐸 ∘𝐸 | 𝐸 * , for 𝑟 ∈ 𝑁𝑅+ , where "∪" is the standard union of relations, "∘" concatenation as described in the preliminaries and "*" the Kleene star. A constraint is an expression of the form 𝑠 ← 𝜙, for 𝑠 ∈ 𝑁𝑆 and 𝜙 a shape expression. A shapes graph (𝒞, 𝒢) is a pair consisting of a set of constraints 𝒞 and a set of targets 𝒢. Targets are of the form 𝑠(𝑐), where 𝑠 ∈ 𝑁𝑆 and 𝑐 ∈ 𝑁𝐼 . Given a set of atoms 𝒜, a shape assignment is a set of shape atoms of the form 𝑠(𝑐), for 𝑠 ∈ 𝑁𝑆 , 𝑐 ∈ Δ𝒜 . Given a shape expressions 𝜙, a shape assignment 𝑆, and a set of atoms 𝒜, define 𝜙𝒜,𝑆 ⊆ Δ𝒜 inductively as in Figure 2, where (𝑒, 𝑒′ ) ∈ 𝐸 𝒜 if there exists 𝑤1 · . . . · 𝑤𝑛 in the language defined by 𝐸 and {𝑤1 (𝑒, 𝑒1 ), . . . , 𝑤𝑛 (𝑒𝑛−1 , 𝑒′ )} ⊆ 𝒜. Just like in [15], we consider the least fixed point semantics for SHACL. We define an immediate consequence operator 𝑇𝒜,𝒞 that maps shape assignments to shape assignments: 𝑇𝒜,𝒞 (𝑆) := 𝑆 ∪ {𝑠(𝑎) | 𝑠 ← 𝜙 ∈ 𝒞, 𝑎 ∈ (𝜙)𝒜,𝑆 }. For a set of atoms 𝒜 and a shapes graph (𝒞, 𝒢), we say that 𝒜 validates (𝒞, 𝒢) when 𝒢 is contained in the least fixed point of 𝑇𝒜,𝒞 . In the presence of a TBox 𝒯 , we define the semantics of validation using the core universal model. That is, we say that (𝒯 , 𝒜) validates (𝒞, 𝒢) in case core(𝒯 , 𝒜) validates (𝒞, 𝒢). Our goal is now to reduce validation in the presence of a TBox, to plain validation over a set of atoms. The following normal form for SHACL constraints will facilitate this goal. Definition 5. A SHACL constraint is in normal form if it has one of the following forms (NC1) 𝑠 ← 𝑐 (NC2) 𝑠 ← ¬𝑐 (NC3) 𝑠 ← 𝐴 (NC4) 𝑠 ← ¬𝐴 (NC5) 𝑠 ← 𝑠′ ∧ 𝑠′′ (NC6) 𝑠 ← ∃𝑟.𝑠′ , where 𝑐 ∈ 𝑁𝐼 , 𝑟 ∈ 𝑁𝑅+ , {𝑠, 𝑠′ , 𝑠′′ } ⊆ 𝑁𝑆 and 𝐴 ∈ 𝑁𝐶 . Proposition 7. Each set of semi-positive SHACL constraints 𝒞 can be translated in time polynomial in |𝒞|, into a set of semi-positive constraints 𝒞 ′ in normal form such that for all 𝒢 that do not use the freshly introduced shape names, and all 𝒯 and 𝒜, we have that (𝒯 , 𝒜) validates (𝒞, 𝒢) iff (𝒯 , 𝒜) validates (𝒞 ′ , 𝒢). Proof. With a similar construction as of Proposition 4.2 in [18], it is easy to see that each constraint can be translated in polynomial time into the a normal form consisting of (NC1) until (NC5) as above, and (NC6’) 𝑠 ← ∃𝐸.𝑠′ . To normalise 𝑠 ← ∃𝐸.𝑠′ , suppose ℳ = (𝑄, Σ, 𝑞𝐼 , Δ, 𝑞𝐹 ) is the nondeterministic finite automaton recognising 𝐸. Take fresh shape names 𝑠𝑞 for each 𝑞 ∈ 𝑄 and add the following constraints 𝑠 ← 𝑠𝑞𝐼 ∧ 𝑠𝑞𝐼 , 𝑠𝑞 ← ∃𝑟.𝑠′𝑞 if (𝑞, 𝑟, 𝑞 ′ ) ∈ Δ and 𝑠𝑞𝐹 ← 𝑠′ ∧ 𝑠′ . Note that this automaton can be constructed in polynomial time, which is a standard result in the literature, see for instance [24]. We are ready now to reduce validation to the case without a TBox. Given a TBox 𝒯 and a set of atoms 𝒜, or more specifically, the succ𝒯 ,𝒜 that represents a universal core model of (𝒯 , 𝒜). We transform a shapes graph (𝒞, 𝒢) into a new shapes graph (𝒞𝒯 ,𝒜 , 𝒢) such that (𝒯 , 𝒜) validates (𝒞, 𝒢) iff 𝒜′ validates (𝒞𝒯 ,𝒜 , 𝒢), for some set of atoms 𝒜′ that can be easily obtained from 𝒜 and the construction of succ𝒯 ,𝒜 . Unlike the similar rewriting for DL-Liteℛ , the variant we introduce here does not work for every set of atoms 𝒜: we can only guarantee soundness and completeness if the set of atoms 𝒜 is such that succ𝒯 ,𝒜 represents the core universal model of (𝒯 , 𝒜). We compromise the data independence, but in this way, we avoid the best case exponentially of its predecessor [15]. We use a fresh concept name 𝐴 ̂︀ ∈ 𝑁𝐶 to decorate the outer layer of 𝒜+ , and define 𝒜* := 𝒯 𝒯 + 𝒜+𝒯 ∪ { ̂︀ | 𝑐 ∈ Δ𝒜𝒯 ∩ 𝑁𝐵 }.The elements in our rewriting are triples (𝑡, 𝑊, 𝐻) ⊆ ℱ𝒯 ,𝒜 × 𝒫({∃𝑟.𝑠 | 𝐴(𝑐) 𝑟 ∈ 𝑁𝑅+ , 𝑠 ∈ 𝑁𝑆 }) × 𝒫(𝑁𝑆 ): 𝑡 is a live type, 𝑊 a set of assumptions and 𝐻 a set of shape names. The starting set 𝐼 of our rewriting is such that (𝑡, 𝑊, 𝐻) ∈ 𝐼 iff • if ∃𝑟.𝑠′ ∈ 𝑊 , there exists 𝑠 ← ∃𝑟.𝑠′ ∈ 𝒞 and 𝑟 ∈ 𝜋2 (𝑡), and • 𝑠 ∈ 𝐻 iff there exists ∃𝑟.𝑠′ ∈ 𝑊 such that 𝑠 ← ∃𝑟.𝑠′ ∈ 𝒞. Our rewriting algorithm adds shape names 𝑠 to the heads 𝐻. Specifically, we update (𝑡, 𝑊, 𝐻) ∈ 𝐼, to (𝑡, 𝑊, 𝐻 ∪ {𝑠}) when any of the following holds: 1. 𝑠 ← 𝐴 ∈ 𝒞, 𝐴 ∈ 𝜋1 (𝑡), 𝑠 ̸∈ 𝐻, or 2. 𝑠 ← ¬𝐴 ∈ 𝒞, 𝐴 ̸∈ 𝜋1 (𝑡), 𝑠 ̸∈ 𝐻, or 3. 𝑠 ← ¬𝑐 ∈ 𝒞, 𝑠 ̸∈ 𝐻, or 4. 𝑠 ← 𝑠1 ∧ 𝑠2 ∈ 𝒞, {𝑠1 , 𝑠2 } ⊆ 𝐻, 𝑠 ̸∈ 𝐻, or 5. 𝑠 ← ∃𝑟.𝑠1 ∈ 𝒞, and there exists (𝑡′ , 𝑊 ′ , 𝐻 ′ ) ∈ 𝐼 such that, for some 𝑢 ∈ 𝑢 with succ𝒯 ,𝒜 (𝑡, 𝑢), we have 𝑡′ = (inv (𝑢), 𝜋1 (𝑡)), 𝑠1 ∈ 𝐻 ′ , 𝑟− ∈ 𝜋2 (𝑡) and {𝑠 | ∃𝑟.𝑠 ∈ 𝑊 ′ } ⊆ 𝐻. Let sat(𝐼) be the result of exhaustively adding shape names as above. The triples in sat(𝐼) are then translated into the set of constraints 𝒞𝒯 ,𝒜 in the following way. If (𝑡, 𝑊, 𝐻) ∈ sat(𝐼), then for each 𝑠 ∈ 𝐻 the following constraint is contained in 𝒞𝒯 ,𝒜 ⋀︁ ⋀︁ ⋀︁ 𝑠← 𝐴 ∧ ∃𝑟. 𝐵∧ ∃𝑟.𝑠′ ∧ 𝐴, ̂︀ 𝐴∈𝜋1 (𝑡) 𝐵∈𝜋3 (𝑡) ∃𝑟.𝑠′ ∈𝑊 where 𝑟 ∈ 𝜋2 (𝑡) is such that for all 𝑟′ ∈ 𝜋2 (𝑡) we have 𝑟′ ∈ cl 𝒯 (𝑟). Note that the existence of 𝑟 is guaranteed by Proposition 2. This gives us the desired rewriting. Its proof of correctness is analogous to the one in [15]. Theorem 2. Given an ℰℒℋℐ knowledge base (𝒯 , 𝒜) and (𝒞, 𝒢) a semi-positive SHACL shapes graph. Then it holds that (𝒯 , 𝒜) validates (𝒞, 𝒢) iff 𝒜*𝒯 validates (𝒞𝒯 ,𝒜 , 𝒢). Since we are taking the input 𝒜 into account, we do not have to consider all possible types and 𝒞𝒯 is not best-case exponential. However, there are still ways to optimise the rewriting. The first and simplest optimisation is to restrict the form of the starting sets even more, that is, not all 𝑠 ∈ 𝑁𝑆 will ever interact with each other. So, let rel (𝑠) ⊆ 𝑁𝑆 be defined as the smallest set such that 𝑠 ∈ rel (𝑠); if 𝑠′ ← 𝑠′′ ∧ 𝑠′′′ ∈ 𝒞 and {𝑠′ , 𝑠′′ , 𝑠′′′ } ∩ rel (𝑠) ̸= {}, then {𝑠′ , 𝑠′′ , 𝑠′′′ } ∈ rel (𝑠); and if 𝑠′ ← ∃𝑟.𝑠′′ ∈ 𝒞 and {𝑠′ , 𝑠′′ } ∩ rel (𝑠) ̸= {}, then {𝑠′ , 𝑠′′ } ∈ rel (𝑠). The restriction is then that (𝑡, 𝑊, 𝐻) is only considered to be in the starting set in case there exists 𝑠 ∈ 𝑁𝑆 such that 𝐻 ⊆ rel (𝑠). This optimisation is mostly useful for sets of constraints that relatively use a small amount of role names. Another more involved optimisation is based on the close correspondence between the SHACL fragment we are considering and ℰℒℋℐ in normalised form. We can translate the SHACL constraints into ℰℒℋℐ axioms, introducing for each existing concept name a fresh concept name to denote the absence of this concept name in succ𝒯 ,𝒜 . A similar algorithm to the one in Section 3 will then decorate the good successor configuration with all relevant shape names. We believe that the latter version may be better suited for implementation. However, all these optimisations cannot avoid ExpTime-complexity in the worst case. The proof of the next claim is an easy generalisation of the one in [15]. To show PTime-completeness in data complexity the rewriting can be made data independent as in [15]. Theorem 3. In the presence of ℰℒℋℐ TBoxes, SHACL validation is ExpTime-complete in combined complexity and PTime-complete in data complexity. This holds also for positive constraints. 6. Conclusions and Outlook In this paper, we introduced a new, more efficient representation of a core universal model. We had shown in [15] that this can be done in a data-independent way; the construction there is for DL-Liteℛ , but a data-independent computation for ℰℒℋℐ would not be harder than the one we have presented here. Instead, by considering the exact configurations that appear in the data, we have chosen to focus solely on computing the relevant parts, highly reducing the number of good successor configurations and TBox subsumptions that need to be computed. Thus, this is a first step towards an efficient implementation of the core chase. We further focus on one of many application areas of this technique: we use the alternative represen- tation of a core universal model to reduce SHACL validation combined with ℰℒℋℐ TBoxes to plain SHACL validation. Also here, we could take all types for a data-independent rewriting, but restricting it to the live types allows us to avoid an exponential computation in many cases, as not all combinations of types and assumptions need to be considered. This means the proposed algorithm also brings us closer to implementing the combination of SHACL with ontologies. We are confident that the proposed techniques can most likely be optimised further, but leave this for future work. To keep the presentation simple we focused on semi-positive SHACL in this paper, but we note that the rewriting presented in the last section can be extended to SHACL with stratified negation following the ideas of [15]. Acknowledgments The project leading to this application has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101034440. This work was also partially supported by the Austrian Science Fund (FWF) project P30873. References [1] M. Ortiz, M. Simkus, Reasoning and query answering in description logics, in: Reasoning Web, volume 7487 of Lecture Notes in Computer Science, Springer, 2012, pp. 1–53. [2] M. Bienvenu, M. Ortiz, Ontology-mediated query answering with data-tractable description logics, in: Reasoning Web, volume 9203 of Lecture Notes in Computer Science, Springer, 2015, pp. 218–307. [3] M. Ortiz, S. Rudolph, M. Simkus, Query answering in the horn fragments of the description logics SHOIQ and SROIQ, in: IJCAI, IJCAI/AAAI, 2011, pp. 1039–1044. [4] D. S. Johnson, A. C. Klug, Testing containment of conjunctive queries under functional and inclusion dependencies, J. Comput. Syst. Sci. 28 (1984) 167–189. [5] R. Fagin, P. G. Kolaitis, R. J. Miller, L. Popa, Data exchange: semantics and query answering, Theor. Comput. Sci. 336 (2005) 89–124. [6] B. Marnette, Generalized schema-mappings: from termination to tractability, in: J. Paredaens, J. Su (Eds.), Proceedings of the Twenty-Eigth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2009, June 19 - July 1, 2009, Providence, Rhode Island, USA, ACM, 2009, pp. 13–22. [7] A. Deutsch, A. Nash, J. B. Remmel, The chase revisited, in: M. Lenzerini, D. Lembo (Eds.), Proceedings of the Twenty-Seventh ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, PODS 2008, June 9-11, 2008, Vancouver, BC, Canada, ACM, 2008, pp. 149–158. [8] T. Eiter, M. Ortiz, M. Simkus, T. Tran, G. Xiao, Query rewriting for Horn-SHIQ plus rules, in: J. Hoffmann, B. Selman (Eds.), Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, July 22-26, 2012, Toronto, Ontario, Canada, AAAI Press, 2012, pp. 726–733. [9] H. Pérez-Urbina, B. Motik, I. Horrocks, Tractable query answering and rewriting under description logic constraints, J. Appl. Log. 8 (2010) 186–209. [10] D. Calvanese, G. D. Giacomo, D. Lembo, M. Lenzerini, A. Poggi, M. Rodriguez-Muro, R. Rosati, M. Ruzzi, D. F. Savo, The MASTRO system for ontology-based data access, Semantic Web 2 (2011) 43–53. [11] D. Calvanese, B. Cogrel, S. Komla-Ebri, R. Kontchakov, D. Lanti, M. Rezk, M. Rodriguez-Muro, G. Xiao, Ontop: Answering SPARQL queries over relational databases, Semantic Web 8 (2017) 471–487. [12] S. Borgwardt, W. Forkel, Closed-world semantics for conjunctive queries with negation over ELH-bottom ontologies, in: S. Kraus (Ed.), Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019, ijcai.org, 2019, pp. 6131–6135. [13] M. Arenas, G. Gottlob, A. Pieris, Expressive languages for querying the semantic web, ACM Trans. Database Syst. 43 (2018) 13:1–13:45. [14] S. Ellmauthaler, M. Krötzsch, S. Mennicke, Answering queries with negation over existential rules, in: Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI 2022, AAAI Press, 2022, pp. 5626–5633. [15] S. Ahmetaj, M. Ortiz, A. Oudshoorn, M. Šimkus, Reconciling shacl and ontologies: Semantics and validation via rewriting, in: ECAI 2023, IOS Press, 2023, pp. 27–35. [16] W3C, Shape constraint language (SHACL), Technical Report. (2017). https://www.w3.org/TR/shacl. [17] J. Corman, J. L. Reutter, O. Savkovic, Semantics and validation of recursive SHACL, in: D. Vrandecic, K. Bontcheva, M. C. Suárez-Figueroa, V. Presutti, I. Celino, M. Sabou, L. Kaffee, E. Simperl (Eds.), The Semantic Web - ISWC 2018 - 17th International Semantic Web Conference, Monterey, CA, USA, October 8-12, 2018, Proceedings, Part I, volume 11136 of Lecture Notes in Computer Science, Springer, 2018, pp. 318–336. [18] M. Andresel, J. Corman, M. Ortiz, J. L. Reutter, O. Savkovic, M. Šimkus, Stable model semantics for recursive SHACL, in: Proc. of The Web Conference 2020, ACM, 2020, p. 1570–1580. [19] P. Pareti, G. Konstantinidis, F. Mogavero, Satisfiability and containment of recursive SHACL, Journal of Web Semantics 74 (2022) 100721. [20] B. Bogaerts, M. Jakubowski, J. Van den Bussche, SHACL: A description logic in disguise, in: Proc. of LPNMR 2022, volume 13416 of Lecture Notes in Computer Science, Springer, 2022, pp. 75–88. [21] P. Hell, J. Nesetril, The core of a graph, Discret. Math. 109 (1992) 117–126. [22] D. Carral, M. Krötzsch, M. Marx, A. Ozaki, S. Rudolph, Preserving constraints with the stable chase, in: B. Kimelfeld, Y. Amsterdamer (Eds.), 21st International Conference on Database Theory, ICDT 2018, March 26-29, 2018, Vienna, Austria, volume 98 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018, pp. 12:1–12:19. [23] M. Krötzsch, Computing cores for existential rules with the standard chase and ASP, in: D. Cal- vanese, E. Erdem, M. Thielscher (Eds.), Proceedings of the 17th International Conference on Principles of Knowledge Representation and Reasoning, KR 2020, Rhodes, Greece, September 12-18, 2020, 2020, pp. 603–613. [24] J. Hromkovic, S. Seibert, T. Wilke, Translating regular expressions into small 𝜖-free nondetermin- istic finite automata, J. Comput. Syst. Sci. 62 (2001) 565–588.