=Paper=
{{Paper
|id=Vol-3263/paper-14
|storemode=property
|title=Computing Concept Referring Expressions with Standard OWL Reasoners
|pdfUrl=https://ceur-ws.org/Vol-3263/paper-14.pdf
|volume=Vol-3263
|authors=Moritz Illich,Birte Glimm
|dblpUrl=https://dblp.org/rec/conf/dlog/IllichG22
}}
==Computing Concept Referring Expressions with Standard OWL Reasoners==
Computing Concept Referring Expressions with Standard OWL Reasoners Moritz Illich1 , Birte Glimm1 1 Ulm University, James-Franck-Ring, Ulm, 89081, Germany Abstract Classical instance queries over an ontology only consider explicitly named individuals. Concept referring expressions (CREs) also allow for returning answers in the form of concepts that describe implicitly given individuals in terms of their relation to an explicitly named one. Existing approaches, e.g., based on tree automata, can neither be integrated into state-of-the-art OWL reasoners nor are they directly amenable for an efficient implementation. To address this, we devise a novel algorithm that uses highly optimized OWL reasoners as a black box. In addition to the standard criteria of singularity and certainty for CREs, we devise and consider the criterion of uniqueness of CREs for Horn ๐โ๐ ontologies. The evaluation of our prototypical implementation shows that computing CREs for the most general concept (โค) can be done in less than one minute for ontologies with thousands of individuals and concepts. Keywords ontologies, instance retrieval, referring expressions 1. Introduction Regarding Description Logic (DL) ontologies, automated reasoning systems derive implicit consequences of explicitly stated information, e.g., when answering queries for concept instances. Classically, answers to such queries consist of individual names that are used in the knowledge base. Consider, for example, a knowledge base that states that "every person has a mother who is a person" and that "John is a person" (in DL syntax: Person โ โhasMother.Person and Person(john)). Obviously, john is an answer to the query for instances of the concept Person and such names are also called referring expressions as they refer to elements in the models of the knowledge base. In each model of the knowledge base, however, there is also an element that represents "the mother of John". This anonymous element can be described by a concept referring expression (CRE) [1, 2, 3], i.e., by a concept that describes an anonymous element w.r.t. a named individual. For our example, we can use the CRE "the person who is the mother of John" (as DL concept: Person โ โhasMotherโ .{john}). The reasoning task of generalized instance retrieval refers to computing entailed named and anonymous answers in the form of CREs for a concept instance query, which allows for more comprehensive query results. Usually, CREs are meant to identify a single element (e.g., "Johnโs mother" and not some set of elements as, e.g., the concept Person itself). While some approaches for computing CREs rely on functionality of roles and of role paths in order to guarantee singularity, e.g., [2], we do not DL 2022: 35th International Workshop on Description Logics, August 7โ10, 2022, Haifa, Israel $ moritz.illich@uni-ulm.de (M. Illich); birte.glimm@uni-ulm.de (B. Glimm) ยฉ 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) make this restriction as also the recent work of Toman and Weddell [3]. Another challenge when computing CREs is describing all entailed answers to (concept instance) queries. While the approach of Toman and Weddell based on tree automata addresses this problem, it is not well-suited for an efficient implementation. Moreover, the applied constraints that ensure the construction of only singular CREs are actually too strict, thus neglecting query answers in some cases, while an explicit prevention of duplicate answers referring to different, but semantically equivalent, CREs is also not provided. In this paper, we address these issues and present the first practical method for generalized instance retrieval that uses highly optimized reasoners as a black box for computing CREs. At first, we shortly introduce necessary definitions regarding DLs and CREs. We then present the algorithm for computing answers to generalized instance queries and prove its properties in Section 3. In Section 4, we show the results of our empirical evaluation, followed by a discussion of related work in Section 5 and conclusions in Section 6. 2. Preliminaries Before we can describe our algorithm, we first have to consider some relevant definitions concerning ontologies and CREs. However, since we assume the reader to be familiar with DLs, we only focus on selected aspects. 2.1. Description Logics The syntax of the DL ๐โ๐โ๐ช is defined using a vocabulary consisting of countably infinite disjoint sets NC of atomic concepts, NR of atomic roles, and NI of individuals. A role is either an atomic or an inverse role ๐โ , ๐ โ NR . An ๐โ๐โ๐ช concept is defined as ๐ถ ::= โค | โฅ | ๐ด | {๐} | ยฌ๐ถ | ๐ถ1 โ ๐ถ2 | ๐ถ1 โ ๐ถ2 | โ๐ .๐ถ | โ๐ .๐ถ, where ๐ด โ NC , ๐ โ NI , and ๐ is a role. The DL ๐โ๐ is obtained from ๐โ๐โ๐ช by disallowing the use of nominals ({๐}) and inverse roles. In the remainder, we use ๐, ๐ for individuals, ๐ด, ๐ต for atomic and ๐ถ, ๐ท for (possibly) complex concepts, and ๐ for an atomic and ๐ for a (possibly) non-atomic role. A TBox is a set of concept inclusion axioms ๐ถ โ ๐ท. An ABox is a set of (concept and role) assertions of the form ๐ถ(๐) and ๐ (๐, ๐), respectively. A knowledge base (or ontology) ๐ฆ = ๐ฏ โช ๐ consists of a TBox ๐ฏ and an ABox ๐. We use ๐ถ โก ๐ท to abbreviate ๐ถ โ ๐ท and ๐ท โ ๐ถ. With cons(๐ฆ), rols(๐ฆ), and inds(๐ฆ), we denote the sets of atomic concepts, atomic roles, and individuals occurring in ๐ฆ, respectively. Interpretations and models are defined in the standard way (see e.g. [4]). For a knowledge base ๐ฆ = ๐ฏ โช ๐, we say that ๐ฆ is (concept) materialized if ๐ฆ |= ๐ด(๐) implies ๐ด(๐) โ ๐ฆ for each ๐ด โ cons(๐ฆ) โช {โค} and ๐ โ inds(๐ฆ); ๐ is reduced if there is no ๐โฒ โ ๐ with {๐(๐, ๐) | ๐(๐, ๐) โ ๐} โ ๐โฒ such that ๐โฒ โช {๐ต(๐) | ๐ด(๐) โ ๐ and ๐ฆ |= ๐ด โ ๐ต, ๐ด, ๐ต โ cons(๐ฆ)} is materialized, i.e., a reduced ABox is materialized with the most specific atomic concepts. An ๐โ๐ knowledge base ๐ฆ = ๐ฏ โช ๐ is Horn and in normalized form if ๐ถ โ NC for every ๐ถ(๐) โ ๐ฆ, and every ๐ถ โ ๐ท โ ๐ฆ is in one of the forms โค โ ๐ด, ๐ด โ ๐ต, ๐ด โ โฅ, ๐ด โ ๐ต โ ๐ดโฒ , โ๐.๐ด โ ๐ต, ๐ด โ โ๐.๐ต, ๐ด โ โ๐.๐ต, where ๐ด, ๐ดโฒ , ๐ต โ NC and ๐ โ NR . We use ๐ฆโ to denote the set {โ๐.๐ต | ๐ด โ โ๐.๐ต โ ๐ฏ } and ๐ฆโ to denote {โ๐.๐ต | ๐ด โ โ๐.๐ต โ ๐ฏ }. W.l.o.g., we assume in the remainder that Horn ๐โ๐ knowledge bases are normalized by applying a structural transformation (see e.g. [5]) and that the ABox is reduced. Each consistent Horn ๐โ๐ knowledge base ๐ฆ has a so-called universal model ๐ฐ which is minimal and unique, and contains all the implied facts. In particular, the universal model has a tree-like form of role-connected (anonymous) individuals with named individuals as roots where individuals are only made equal if it is necessarily entailed by the knowledge base. For query answering, it suffices to consider this universal model [3]. 2.2. Concept Referring Expressions Principally, referring expressions serve the purpose of uniquely identifying an object in a certain context through its properties, that also include relations to other objects [6]. In DLs, this can be realized by a conjunction of concepts with relations modeled as existential restrictions over inverse roles, which relate an anonymous individual to a named one (expressed as nominal): Definition 1. Let ๐ฆ be a normalized ontology, ๐ถ = ๐ด1 โ. . .โ๐ด๐ with ๐ด๐ โ cons(๐ฆ), ๐ โ rols(๐ฆ), and ๐ โ inds(๐ฆ), then a concept referring expression ๐ธ is defined as ๐ธ ::= {๐} ๐ธ ::= ๐ถ โ โ๐โ .(๐ธ) and we call ๐ the base individual of ๐ธ. The reason for describing referring expressions this way lies in the tree model property of DLs, which allows us to identify an anonymous object in terms of a named individual (the base individual) and the (inverse) role path that connects the former to the latter [3]. Consider the example taken from Toman and Weddell [3]: Example 1. Let ๐ฆ = ๐ฏ โช ๐ consist of ๐ฏ = {๐ด โ โ๐.๐ถ, ๐ด โ โ๐.๐ท, ๐ด โ โ๐.๐ต} and ๐ = {๐ด(๐)}, with the three (tree) models of ๐ฆ: ๐ด ๐ ๐ด ๐ ๐ด ๐ ๐ ๐ ๐ ๐ ๐ ๐ ๐ต, ๐ถ ๐ต, ๐ถ ๐ต, ๐ท ๐ต, ๐ถ ๐ต, ๐ท ๐ต, ๐ถ, ๐ท The model in the middle is the universal model as it can be embedded into any other model of ๐ฆ. When we consider the generalized instance query for the concept ๐ต, the universal model makes it reasonable to consider the ๐-successor of ๐ labeled ๐ถ and the ๐-successor of ๐ labeled ๐ท as singular certain answers to the query. We can describe these answers as the CREs ๐ถ โ โ๐โ .{๐} and ๐ท โ โ๐โ .{๐}, respectively. Note that we use an ๐โ๐โ๐ช concept to describe the CRE, whereas ๐ฆ is in Horn ๐โ๐. For a CRE to qualify as an answer to a generalized instance query, we impose some require- ments, following the definition of Toman and Weddell [3] for singularity and certainty: Definition 2. A generalized instance query is an atomic concept ๐ โ NC . A set ans(๐) of CREs is an answer to ๐ over a consistent ontology ๐ฆ if, for each ๐ธ, ๐ธ โฒ โ ans(๐), certainty ๐ฆ |= ๐ธ โ ๐ and |๐ธ โ | > 0 for all models โ of ๐ฆ, singularity |๐ธ ๐ฐ | = 1 in the universal model ๐ฐ of ๐ฆ, and uniqueness ๐ธ ๐ฐ ฬธ= ๐ธ โฒ ๐ฐ in the universal model ๐ฐ of ๐ฆ completeness for every CRE ๐น with ๐ฆ |= ๐น โ ๐ there exists some ๐ธ โ ans(๐) s.t. ๐ธ โก ๐น . Here, the additional uniqueness property ensures that we do not obtain any duplicate answers, since we are only interested in the individual (semantically) represented by some CRE rather than the actual (syntactic) form of the latter. Besides, note that the above definition of singularity is a weakening of the property defined by Borgida et al. [2], where a CRE was required to denote a singleton set in all models of the knowledge base. This weakening, however, is essential in DLs that do not support counting (e.g., through number restrictions or functional roles) such as ๐โ๐โ๐ช. Note also that unreachable objects cannot be certain answers. Furthermore, a complex concept may still be considered as query if it can be represented by means of a new atomic concept and additional appropriate axioms in the ontology, like โค โ ๐ด, for instance. Example 2 (Ex. 1 cont.). Consider again the knowledge base ๐ฆ of Example 1. Formally, the set {๐ถ โ โ๐โ .{๐}, ๐ท โ โ๐โ .{๐}} is an answer to the generalized instance query ๐ต as both concept expressions are singular certain answers that identify different anonymous individuals. The CRE โ๐โ .{๐} does not satisfy the singularity requirement and the CRE ๐ถ โ ๐ท โ โ๐โ .{๐} violates the certainty requirement. Note that if we were to add ๐ถ โก ๐ท to ๐ฏ , the universal model would be the right-most model of Example 1. In addition to the two CREs from above, also ๐ถ โ ๐ท โ โ๐โ .{๐} becomes a certain singular answer. All three CREs, however, identify the same anonymous individual in the universal model (in different syntactic variants) and our proposed uniqueness criterion requires that ans(๐) consists of only one of them. 3. Answering Generalized Instance Queries We propose an algorithm for computing CREs for a generalized instance query given by an atomic concept ๐ over a (normalized) Horn ๐โ๐ ontology ๐ฆ = ๐ฏ โช ๐ with reduced ABox ๐. An important aspect of the algorithm is that it employs an OWL reasoner as a black box as only standard reasoning tasks (e.g., as foreseen in the OWL API [7]) are required and the concrete approach being applied to answer generalized instance queries is not known to the reasoner. For example, we retrieve super-concepts of a concept or the most specific concepts (w.r.t. to the subsumption hierarchy) that an individual is an instance of (corresponding to those concepts in a reduced ABox). This provides great flexibility concerning the choice of the applied reasoner and allows for using optimized strategies for different ontologies. To construct CREs serving as answers for ๐ we start with a CRE ๐ธ = {๐} for each ๐ โ inds(๐ฆ) and a corresponding current concept ๐ถ๐ = ๐ด1 โ . . . โ ๐ด๐ using every ๐ด๐ (๐) โ ๐. We then look for suitable existential restrictions โ๐.๐ท which satisfy ๐ฆ |= ๐ถ๐ โ โ๐.๐ท and, thus, allow us to extend the current CRE to ๐ธ โฒ = ๐ท โ โ๐โ .(๐ธ). This procedure is repeated and each time the newly related ๐ท is chosen as next current concept until no more appropriate restrictions can be found to further extend the generated CREs. Here, a current concept ๐ถ does not just enable us Algorithm 1 Answering generalized instance queries procedure GeneralizedInstanceQuery(๐ฆ, ๐) input: ๐ฆ = ๐ฏ โช ๐: an ontology, ๐: a generalized instance query output: ans(๐) 1: ans โ โ 2: for all ๐ โ inds(๐ฆ) do d 3: ๐ถ๐ โ ๐ด(๐)โ๐ ๐ด 4: ans โ ans โช getCREs(๐ฆ, ๐, ๐ถ๐ , {๐}, โ ) 5: return ans to further expand a CRE, but moreover serves as an indicator if the latter represents an answer for the query ๐, which is the case if ๐ฆ |= ๐ถ โ ๐ as shown later in Lemma 1. Initialization: Algorithm 1 describes the main algorithm that constructs, for each individual ๐ occurring in the knowledge base, the initial current concept and then calls the algorithm to generate CREs w.r.t. this (base) individual. Constructing Concept Referring Expressions: Algorithm 2 uses five steps to construct CREs for a given base individual that serve as answers for the considered query: Step 1: At first, in Line 2, we look for suitable existential restrictions โ๐.๐ต that allow us to relate the current concept ๐ถ to the concept ๐ต in order to establish a link to a new anonymous individual, thus requiring ๐ฆ |= ๐ถ โ โ๐.๐ต. Since we are interested in producing singular CREs, a restriction โ๐.๐ต should only be considered if there is no โ๐.๐ด (using the same role ๐) such that ๐ฆ |= ๐ด โ ๐ต. We capture this by introducing reduced (sets of) existential restrictions: Definition 3 (Reduced Existential Restrictions). Let ๐ฆ be a normalized Horn ๐โ๐ ontology and ๐ถ a concept, then ๐ฆโ (๐ถ) = {โ๐.๐ต | ๐ฆ |= ๐ถ โ โ๐.๐ต, ๐ต โ NC , ๐ โ NR }. We denote with ๐ฆโmin (๐ถ) a smallest subset of ๐ฆโ (๐ถ) such that, for each โ๐.๐ต โ ๐ฆโ (๐ถ) โ ๐ฆโmin (๐ถ), there is some โ๐.๐ด โ ๐ฆโmin (๐ถ) s.t. ๐ฆ |= ๐ด โ ๐ต. Note that an instance of โ๐.๐ต โ ๐ฆโ (๐ถ) โ ๐ฆโmin (๐ถ) is connected to more than one instance of ๐ต, which means that a CRE using this existential restriction is non-singular as it represents more than one individual. Moreover, if there are two candidates โ๐.๐ด, โ๐.๐ต โ ๐ฆโ (๐ถ) such that ๐ฆ |= ๐ด โก ๐ต, only one of them should be chosen in order to prevent duplicate answers. Step 2: For each selected existential restriction that induces a successor for an instance of the current concept ๐ถ, we next consider universal restrictions that further specify these successors (see Lines 4โ5) and we add these concepts to the set of successor concepts for the role in the sc relation. After this step, each (๐, ๐) โ sc is such that ๐ฆ |= ๐ถ โ โ๐.(๐ต1 โ . . . โ ๐ต๐ ) for ๐ = {๐ต1 , . . . , ๐ต๐ }. Step 3: In this step of the algorithm, we determine those existential restrictions that can actually be applied to further construct CREs leading to singular, unique answers of the query and collect them in the set next. If we encounter d ๐) โ sc for which we already have d an entry (๐, another entry (๐, ๐ โฒ ) โ next such that ๐ฆ |= ๐ตโ๐ ๐ต โก ๐ต โฒ โ๐ โฒ ๐ต โฒ , we do not consider (๐, ๐) Algorithm 2 Computing CREs for a base individual procedure getCREs(๐ฆ, ๐, ๐ถ, ๐ธ, used) input: ๐ฆ = ๐ฏ โช ๐: an ontology, ๐: a query, ๐ถ: the current concept, ๐ธ: the current CRE, used: a set of already applied existential restrictions output: CREs from ans(๐) that build upon ๐ธ 1: // Step 1: find suitable existential restrictions 2: sc โ {(๐, {๐ต}) | โ๐.๐ต โ ๐ฆโ min (๐ถ)} 3: // Step 2: find related universal restrictions 4: for all (๐, ๐) โ sc do 5: ๐ โ ๐ โช {๐ต | โ๐.๐ต โ ๐ฆโ and ๐ฆ |= ๐ถ โ โ๐.๐ต} 6: // Step 3: filter found existential restrictions 7: next โ โ 8: for all (๐, ๐) โ sc do 9: // check for potential duplicates 10: scโฒ โ {(๐, ๐ โฒ ) โ next | ๐ฆ |= ๐ตโ๐ ๐ต โก ๐ต โฒ โ๐ โฒ ๐ต โฒ } d d 11: if scโฒ = โ then 12: if ๐ธ = {๐} then 13: if {๐ | ๐(๐, ๐) โ ๐, โ๐ต โ ๐ : ๐ฆ |= ๐ต(๐)} = โ then 14: next โ next โช {(๐, ๐)} 15: else 16: // look for cycle in current CRE 17: if (๐, ๐) โ used then 18: ๐ธ โ markCycle(๐ธ, ๐, ๐) 19: else 20: next โ next โช {(๐, ๐)} 21: // Step 4: check if current CRE is an answer 22: ans โ โ 23: if ๐ฆ |= ๐ถ โ ๐ then 24: ans โ ans โช {๐ธ} 25: // Step 5: extend CRE recursively 26: for all (๐, ๐) โ next do 27: ๐ถ โฒ โ ๐ตโ๐ ๐ต d 28: ๐ธ โฒ โ ๐ถ โฒ โ โ๐ โ .(๐ธ) 29: ans โ ans โช getCREs(๐ฆ, ๐, ๐ถ โฒ , ๐ธ โฒ , used โช {(๐, ๐)}) 30: return ans any further to prevent duplicate answers, i.e., if the check in Line 11 fails, the entry (๐, ๐) is not added to the set next. Even though we already consider only the reduced set of existential restrictions in Step 1, this check is still necessary. Consider, for example, that ๐ฆ |= ๐ถ โ โ๐.๐ด and ๐ฆ |= ๐ถ โ โ๐.๐ต for the current concept ๐ถ and ๐ฆ ฬธ|= ๐ด โก ๐ต. Hence, we have two entries (๐, {๐ด}) and (๐, {๐ต}) โ sc. Assume, furthermore, that ๐ฆ |= ๐ถ โ โ๐.๐ด and ๐ฆ |= ๐ถ โ โ๐.๐ต. This means, however, that for both (๐, {๐ด}), (๐, {๐ต}) โ sc, we have ๐ฆ |= ๐ถ โ โ๐.(๐ด โ ๐ต). To guarantee the uniqueness property, an existential restriction should also be ignored for the initial extension of a CRE only consisting of a nominal {๐} if the existential restriction is already satisfied through a (named) individual ๐. This situation is handled in Line 13. Note that due to the tree model property this situation can only occur for the initial CREs. For cyclic knowledge bases, it may be the case that for two, possibly equal, current concepts ๐ถ and ๐ถ โฒ that occur during the algorithm processing, the same existential restriction is applicable. This may lead to an arbitrarily long repetition of the same sequence of existential restrictions that originally led from ๐ถ to ๐ถ โฒ , hence, giving rise to an infinitely large number of possible CREs (of increasing length). Analogous to blocking in tableau algorithms [4], we ensure termination of the algorithm by detecting such cycles and by preventing the reuse of an already processed existential restriction. Instead, we compactly represent such CREs using a cycle notation in form of surrounding square brackets "[...]+ " to mark the sequence that may occur several times repeatedly (indicated by a call to the procedure markCycle in Line 18). For an example, consider ๐ฆ = ๐ฏ โช ๐ with ๐ฏ = {๐ด โ โ๐.๐ด} and ๐ = {๐ด(๐)}, where [๐ด โ โ๐โ .(]+ {๐}) is an answer for the generalized instance query ๐ด and serves as single representative for the infinitely many CREs of the form ๐ด โ โ๐โ .(๐ด โ โ๐โ .(...{๐}). Step 4: In this step, we check if the current CRE already represents an answer for the query ๐. For a standard instance query, this would usually require to check ๐ฆ |= ๐(๐) for a named individual ๐, but since we are considering anonymous individuals represented by some CRE ๐ธ, we want to know whether ๐ฆ |= ๐ธ โ ๐. This is actually realized by checking ๐ฆ |= ๐ถ โ ๐ in Line 23 for the related current concept ๐ถ based on the following lemma. Lemma 1. Given a Horn ๐โ๐ ontology ๐ฆ, a generalized instance query ๐, a CRE ๐ธ and its associated (Horn ๐โ๐) current concept ๐ถ, where ๐ถ = ๐ด1 โ . . . โ ๐ด๐ using every ๐ด๐ (๐) โ ๐ if ๐ธ = {๐}, or ๐ถ = ๐ท if ๐ธ = ๐ท โ โ๐โ .(...), then ๐ฆ |= ๐ธ โ ๐ โ ๐ฆ |= ๐ถ โ ๐. Proof sketch. (โ) ๐ฆ |= ๐ถ โ ๐ requires that ๐ถ is at least as specific as ๐. Therefore, we show that the current concept ๐ถ is actually the most specific (Horn ๐โ๐) concept such that ๐ฆ |= ๐ธ โ ๐ถ and that for any other most specific candidate ๐ท with ๐ฆ |= ๐ธ โ ๐ท, we have ๐ฆ |= ๐ถ โก ๐ท, which is mainly based on the fact that we only work with reduced existential restrictions (see Algorithm 2, Line 2 and Definition 3). (โ) Per definition, a current concept ๐ถ of some CRE ๐ธ always satisfies ๐ฆ |= ๐ธ โ ๐ถ. Step 5: Even if an answer has been identified, the processing of the CRE does still proceed, because it may be possible that the continuing construction leads to another answer later on. Therefore, the algorithm is eventually called recursively using appropriately updated arguments, like an extended CRE ๐ธ โฒ and the new current concept ๐ถ โฒ . 3.1. Properties In the following, we discuss why the described algorithm works properly in the way that for a generalized instance query, only certain, singular, and unique answers are produced w.r.t. Definition 2. Additionally, we show that the algorithm returns every such answer and, furthermore, always terminates. Due to space limitations, we mainly provide proof sketches, while more details are given in an online appendix.1 Certainty. In Line 23 of Algorithm 2, a CRE ๐ธ is selected as answer if ๐ฆ |= ๐ถ โ ๐, which implies ๐ฆ |= ๐ธ โ ๐ according to Lemma 1. In addition, a CRE basically describes a chain of linked individuals where the existence of one individual ensures the presence of the next one being referenced through an existential restriction. Since we start with an explicitly present individual as base for ๐ธ, there is always at least one element in ๐ธ โ for each model โ. Singularity. For an initial CRE ๐ธ = {๐}, we directly get |๐ธ ๐ฐ | = 1 for the universal model ๐ฐ. In the more general case ๐ธ = ๐ถ โ โ๐โ .(๐ธ โฒ ) with ๐ธ โฒ as the (recursive) part leading to the base individual, we show that the non-singular situation |๐ธ ๐ฐ | > 1 is never achieved: This would require that there is some part ๐ถ โฒ โ โ๐โฒโ .(๐ธ โฒโฒ ) in ๐ธ where one individual from ๐ธ โฒโฒ๐ฐ has ๐โฒ -relations to (at least) two different individuals in ๐ถ โฒ๐ฐ . The necessary conditions for this are described in Lemma 2 below (with ๐ถ โฒ as ๐ท), but since the algorithm only selects reduced existential restrictions w.r.t. Definition 3, i.e., some โ๐.๐ต is only chosen if there is no candidate โ๐.๐ด with ๐ฆ |= ๐ด โ ๐ต, this situation never occurs. Lemma 2. Let ๐ฆ be a Horn ๐โ๐ ontology, ๐ฐ the universal model of ๐ฆ, and assume that โจ๐, ๐โฉ, โจ๐, ๐โฉ โ ๐๐ฐ for some role ๐, and (anonymous individuals) ๐, ๐ โ ๐ท๐ฐ for some concept ๐ท. We have ๐ ฬธ= ๐ iff there exist some concepts ๐ด, ๐ต, ๐ถ such that ๐ โ ๐ถ ๐ฐ , ๐ฆ |= ๐ถ โ โ๐.๐ด, ๐ฆ |= ๐ถ โ โ๐.๐ต, ๐ฆ |= ๐ด โ ๐ท, ๐ฆ |= ๐ต โ ๐ท, and ๐ฆ ฬธ|= ๐ด โก ๐ต. Proof sketch. This basically follows from the characteristics of the universal model ๐ฐ. Uniqueness. Considering the tree-like universal model ๐ฐ, having two different CREs ๐ธ, ๐ธ โฒ with ๐ธ ๐ฐ = ๐ธ โฒ๐ฐ means that i) they both describe the same path in the tree using different concepts or ii) one refers to a sub-path of the other leading to the same anonymous individual: i) Because an edge in a tree path relates to some โ๐.๐ต, ๐ธ โฒ might describe the same path as ๐ธ if it uses some โ๐.๐ต โฒ with ๐ต โ ๐ต โฒ (including ๐ต โก ๐ต โฒ ) for the associated edge instead. In Algorithm 2, this is prevented in Lines 2 and 11. ii) Assuming that ๐ and ๐โฒ are the base individuals of ๐ธ and ๐ธ โฒ , respectively, there must be some ๐(๐, ๐โฒ ) โ ๐ such that ๐ธ โฒ refers to a sub-path of ๐ธ. If ๐โฒ๐ฐ โ ๐ต ๐ฐ for some ๐ต, some โ๐.๐ต would describe the same edge as ๐(๐, ๐โฒ ), given that ๐๐ฐ โ (โ๐.๐ต)๐ฐ , which is why such existential restrictions are not chosen (Line 13). Completeness. It is not required to construct every possible CRE that represents an answer for the stated query, but rather to find one unique, singular CRE for every, possibly anonymous, individual that serves as an answer. For that, we assume there may exist some CRE ๐ธ describing a correct, singular and unique answer, but which cannot be constructed by the algorithm and prove that this is not possible: Since the considered ๐ธ is not generated by the algorithm, it must possess a unique subpart that does not occur in any of the constructed answers. As we already consider all individuals during the initialization, this part must be of the form "๐ถ โฒ โ โ๐โ .(๐ถ โ . . .)", where the existential restriction โ๐.๐ถ โฒ was applied for the concept ๐ถ. By 1 https://www.uni-ulm.de/fileadmin/website_uni_ulm/iui.inst.090/Publikationen/2022/IlGl22a_appendix.pdf considering all the cases in the algorithm where ๐ถ is not used as current concept or โ๐.๐ถ โฒ is not selected (or constructed), respectively, we can see that either the certainty, singularity or uniqueness property would be violated by regarding ๐ธ as an answer. Besides, the employed cycle notation may be seen as a compromise to implicitly return a possibly infinite number of CREs in form of just one compact instance in order to ensure the termination of the algorithm. Termination. As the given ontology always consists of a finite number of axioms and concepts, all the applied search and iteration processes come to a halt. Therefore, non-termination may only be realized by the repeated usage of an existential restriction during a recursive call, which, however, is prevented by the cycle detection in Line 17 of Algorithm 2. 4. Implementation and Evaluation A prototypical Java implementation of our algorithm is available online on GitHub2 and also includes some optimizations. One of them refers to the ordering of existential and universal restrictions by means of separate subsumption hierarchies. This allows us to reduce the number of performed subsumption tests in the way that if, for example, ๐ฆ ฬธ|= ๐ถ โ โ๐.๐ต, we do not need to further check ๐ฆ |= ๐ถ โ โ๐โฒ .๐ด for any โ๐โฒ .๐ด that is below โ๐.๐ต in the hierarchy, i.e., for which ๐ฆ |= โ๐โฒ .๐ด โ โ๐.๐ต. Usually, reasoners are only concerned with computing subsumption hierarchies for atomic concepts, referred to as classification (see e.g. [4]). Therefore, we introduce a new atomic concept for each existential (or universal) restriction such that the reasoner can (internally) compute the hierarchies for those new atomic concepts, based on which the actual hierarchies for existential (or universal) restrictions can then be derived. Due to the associated additional overhead, we also implemented the (explicit) Enhanced Traversal Method [8] to efficiently deal with smaller number of elements. Another improvement is achieved by combining base individuals into one set if they share an equivalent initial current concept and furthermore possess analogous given role assertions. Since these properties result in the same computations for each individual, we can avoid unnecessary repetitions by calling the algorithm only once for such a combined set and just generate different versions of the constructed CRE for each base individual in the end. For a general performance assessment, the implementation was tested on different ontologies listed in Table 1 with codinteraction-A and the separate ore_ont_2608/4516/3313 being part of the ORE 2015 Reasoner Competition Corpus [9], while HAO (v2021-03-05), VO (v1.1.171) and DTO (v1.1.1) were taken from BioPortal3 . If necessary, axioms not adhering to Horn ๐โ๐ were removed and ontologies merged with their imports to obtain one complete ontology. Tests showed that subsumption checking has a huge influence on the performance, which is why we considered two different reasoners, namely HermiT4 (v1.3.8) and JFact5 (v5.0.3). In general, HermiT achieved better runtime results than JFact, except for the larger ontologies VO and DTO, where JFact was faster in determining the earlier mentioned subsumption hierarchies. 2 https://github.com/M-Illich/Computing-CREs.git 3 https://bioportal.bioontology.org/ontologies 4 http://www.hermit-reasoner.com/ 5 http://jfact.sourceforge.net/ Table 1 Overview of adapted ontologies with their number of atomic concepts (#con), individuals (#ind), com- bined individual sets (#sets), considered existential/universal restrictions (#โ/#โ), average runtime in ms for โค as query, number of answers (#ans) and their percentage distribution w.r.t. their depth, i.e., number of contained inverse existential restrictions (* no subsumption hierarchies) Ontology #con #ind #sets #โ #โ Runtime #ans depth (%) [ms] 0 1 2-4 5-7 8-10 11-12 codinteraction-A 6 21 13 3 0 71 30 70 30 - - - - ore_ont_2608 453 212 212 10 10 265 279 76 24 - - - - ore_ont_4516 509 217 217 10 10 185 284 76 24 - - - - ore_ont_3313 865 2,070 858 114 0 57,966 9,734 21 8 34 27 9 1 HAO 2,548 2,764 687 522 1 37,965 2,764 100 - - - - - VO 6,828 167 4 1,513 237 460* 176 95 5 - - - - DTO 10,075 3 1 7,958 114 738* 3 100 - - - - - Since both VO and DTO, however, possess many existential restrictions but only few individuals (especially in form of combined sets), the performance was actually best if no subsumption hierarchies are computed. Table 1 shows the obtained runtime measurements (average of five runs) for computing all CREs, i.e., using โค as query, with HermiT as reasoner, based on an AMD Ryzen 7 3700X 3.59 GHz processor with 16 GB RAM on Windows 10 (64-Bit). 5. Related Work Krahmer and van Deemter [6] provide a survey on referring expressions and their computa- tional generation, given by a historic overview of research developments. Here, the problem of generating referring expressions is defined as finding a description in form of combined properties that uniquely identify a given individual in a certain context, which differs from our work in that we intend to describe unknown, anonymous individuals rather than explicitly named ones. A related approach that works with DL is provided by Areces et al. [1] introducing an algorithm that takes as input a model โณ and tries to find formulas (concepts) that represent the individuals in โณ. The basic idea here is that, starting with โค as formula for which the interpretation contains every domain element, new formulas are created by successively refining the already given ones through appending new conjuncts with the intention to reduce the number of elements in the formulasโ interpretations. At first, only atomic concepts are added as conjuncts, before existential restrictions relating to already computed formulas are considered until the formulasโ interpretations are singular or no further adaptations are possible. Unlike in our work, the individuals to be described are already stated and the occurrence of cycles in referring expressions is not considered. However, both approaches make use of relations between individuals, but with the distinction that Areces et al. start from the individual ๐ described by the referring expression and go to another one ๐, while we utilize the inverse case where another (present) individual ๐ refers to ๐, the (anonymous) one being described. The work that comes closest to ours is given by Toman and Weddell [3] dealing with (gen- eralized) instance retrieval queries on (Horn ๐โ๐) ontologies, based on a tree automaton for which a state ๐ is given by a set of atomic concepts, while transitions are defined by so-called matching tuples (๐, ๐0 , ..., ๐๐ ) stating that some ๐๐ with 0 โฉฝ ๐ โฉฝ ๐ can be reached when in the current state ๐ a certain existential restriction โ๐๐ .๐ถ๐ is applied. Starting with a set ๐๐ = {๐ด | ๐ด(๐) โ ๐} for some individual ๐, a tree can be unfolded using those matching tuples. In order to answer a query d ๐ต, this tree is traversed beginning with some ๐๐ until a state ๐๐ is reached where ๐ฆ |= ๐ดโ๐๐ ๐ด โ ๐ต, for which the sequence ๐1 ๐ด1 ...๐๐ ๐ด๐ (called certain path) referring to the applied existential restrictions is then transformed into a CRE ๐ด๐ โ โ๐๐โ .(... ๐ด1 โ โ๐1โ .{๐}) serving as answer for ๐ต. Like in our algorithm, the tree automaton starts the construction of CREs with a given individual and also takes care of occurring cycles. Furthermore, both approaches only use minimal existential restrictions to ensure singularity, although the definitions for minimality are different: While Toman and Weddell rely on the general subsumption of the whole restriction โ๐.๐ถ, we consider subsumption for the inner concept ๐ถ instead w.r.t. Definition 3, because only then singularity can be guaranteed (based on Lemma 2). In this respect, the tree automaton approach does not fulfill the completeness property due to only working with the most specific existential restrictions, even though some other ones might lead to singular CREs, too, as illustrated in Example 3 below. Besides, we are only interested in producing unique CREs, which is not explicitly handled by Toman and Weddell. Example 3. Let ๐ฆ = ๐ฏ โช ๐ consist of ๐ = {๐ด(๐)} and ๐ฏ = {๐ด โ โ๐.๐ถ, โ๐.๐ถ โ ๐ต, ๐ต โ โ๐.๐ท, ๐ถ โ ๐, ๐ท โ ๐}. For the generalized instance query ๐, we get ans(๐) = {๐ถ โ โ๐โ .{๐}, ๐ท โ โ๐โ .{๐}}, even though โ๐.๐ท is not minimal w.r.t. subsumption due to ๐ฆ |= โ๐.๐ถ โ โ๐.๐ท. 6. Conclusion We presented an algorithm that generates concept referring expressions to identify both named and anonymous individuals serving as certain, singular and unique answers for an instance retrieval query on a Horn ๐โ๐ ontology. For that, we start with a named individual and recursively determine appropriate existential restrictions to create a chain of relations that defines an implicitly given individual based on its connection to an explicit one. A crucial factor here is the selection based on reduced existential restrictions (cf. Definition 3), enabling us to construct every desired singular CRE. Besides, the black box consideration of the applied reasoner provides more flexibility in choosing optimal reasoning approaches for different situations. Future Research. In general, the question which reasoner should be selected for the consid- ered ontologies is not an easy one and requires further investigations. On the other side, there may be some advantages of directly integrating the algorithm into a particular reasoner as the algorithm can then utilize the reasonerโs internal data structures and be optimized accordingly. Moreover, it may be of interest to consider extensions of Horn ๐โ๐, too. While some features might be already supported with none or only small adaptations in the algorithm, others prob- ably require more examination, especially since we might lose the existence of the tree-like universal model and thus can no longer make use of our original singularity definition. References [1] C. Areces, A. Koller, K. Striegnitz, Referring Expressions as Formulas of Description Logic, in: Proceedings of the Fifth International Natural Language Generation Conference, 2008, pp. 42โ49. doi:https://doi.org/10.3115/1708322.1708332. [2] A. Borgida, D. Toman, G. Weddell, On Referring Expressions in Query Answering over First Order Knowledge Bases, in: Proceedings of the 15th International Conference on Principles of Knowledge Representation and Reasoning (KRโ16), AAAI Press, 2016, pp. 319โ328. URL: https://www.aaai.org/ocs/index.php/KR/KR16/paper/viewFile/12860/12488. [3] D. Toman, G. Weddell, Finding ALL Answers to OBDA Queries Using Referring Expressions, in: AI 2019: Advances in Artificial Intelligence, Springer International Publishing, Cham, 2019, pp. 117โ129. doi:https://doi.org/10.1007/978-3-030-35288-2_10. [4] F. Baader, D. Calvanese, D. McGuinness, P. Patel-Schneider, D. Nardi, et al., The Description Logic Handbook: Theory, implementation and applications, Cambridge University Press, 2003. [5] Y. Kazakov, Consequence-Driven Reasoning for Horn ๐ฎโโ๐ฌ Ontologies, in: Proc. of the 21st Int. Joint Conf. on Artificial Intelligence, IJCAI 2009, 2009, pp. 2040โ2045. URL: https://www.ijcai.org/Proceedings/09/Papers/336.pdf. [6] E. Krahmer, K. van Deemter, Computational Generation of Referring Expressions: A Survey, Computational Linguistics 38 (2012) 173โ218. doi:https://doi.org/10.1162/COLI_ a_00088. [7] M. Horridge, S. Bechhofer, The OWL API: A Java API for OWL ontologies, Semant. Web 2 (2011) 11โ21. doi:https://doi.org/10.3233/SW-2011-0025. [8] F. Baader, B. Hollunder, B. Nebel, H.-J. Profitlich, E. Franconi, An Empirical Analysis of Optimization Techniques for Terminological Representation Systems, Applied Intelligence 4 (1994) 109โ132. doi:https://doi.org/10.22028/D291-24994. [9] N. Matentzoglu, B. Parsia, ORE 2015 Reasoner Competition Corpus, 2015. URL: https: //doi.org/10.5281/zenodo.18578. doi:10.5281/zenodo.18578.