Computing Concept Referring Expressions with Standard OWL Reasoners Moritz Illich1 , Birte Glimm1 1 Ulm University, James-Franck-Ring, Ulm, 89081, Germany Abstract Classical instance queries over an ontology only consider explicitly named individuals. Concept referring expressions (CREs) also allow for returning answers in the form of concepts that describe implicitly given individuals in terms of their relation to an explicitly named one. Existing approaches, e.g., based on tree automata, can neither be integrated into state-of-the-art OWL reasoners nor are they directly amenable for an efficient implementation. To address this, we devise a novel algorithm that uses highly optimized OWL reasoners as a black box. In addition to the standard criteria of singularity and certainty for CREs, we devise and consider the criterion of uniqueness of CREs for Horn ๐’œโ„’๐’ž ontologies. The evaluation of our prototypical implementation shows that computing CREs for the most general concept (โŠค) can be done in less than one minute for ontologies with thousands of individuals and concepts. Keywords ontologies, instance retrieval, referring expressions 1. Introduction Regarding Description Logic (DL) ontologies, automated reasoning systems derive implicit consequences of explicitly stated information, e.g., when answering queries for concept instances. Classically, answers to such queries consist of individual names that are used in the knowledge base. Consider, for example, a knowledge base that states that "every person has a mother who is a person" and that "John is a person" (in DL syntax: Person โŠ‘ โˆƒhasMother.Person and Person(john)). Obviously, john is an answer to the query for instances of the concept Person and such names are also called referring expressions as they refer to elements in the models of the knowledge base. In each model of the knowledge base, however, there is also an element that represents "the mother of John". This anonymous element can be described by a concept referring expression (CRE) [1, 2, 3], i.e., by a concept that describes an anonymous element w.r.t. a named individual. For our example, we can use the CRE "the person who is the mother of John" (as DL concept: Person โŠ“ โˆƒhasMotherโˆ’ .{john}). The reasoning task of generalized instance retrieval refers to computing entailed named and anonymous answers in the form of CREs for a concept instance query, which allows for more comprehensive query results. Usually, CREs are meant to identify a single element (e.g., "Johnโ€™s mother" and not some set of elements as, e.g., the concept Person itself). While some approaches for computing CREs rely on functionality of roles and of role paths in order to guarantee singularity, e.g., [2], we do not DL 2022: 35th International Workshop on Description Logics, August 7โ€“10, 2022, Haifa, Israel $ moritz.illich@uni-ulm.de (M. Illich); birte.glimm@uni-ulm.de (B. Glimm) ยฉ 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) make this restriction as also the recent work of Toman and Weddell [3]. Another challenge when computing CREs is describing all entailed answers to (concept instance) queries. While the approach of Toman and Weddell based on tree automata addresses this problem, it is not well-suited for an efficient implementation. Moreover, the applied constraints that ensure the construction of only singular CREs are actually too strict, thus neglecting query answers in some cases, while an explicit prevention of duplicate answers referring to different, but semantically equivalent, CREs is also not provided. In this paper, we address these issues and present the first practical method for generalized instance retrieval that uses highly optimized reasoners as a black box for computing CREs. At first, we shortly introduce necessary definitions regarding DLs and CREs. We then present the algorithm for computing answers to generalized instance queries and prove its properties in Section 3. In Section 4, we show the results of our empirical evaluation, followed by a discussion of related work in Section 5 and conclusions in Section 6. 2. Preliminaries Before we can describe our algorithm, we first have to consider some relevant definitions concerning ontologies and CREs. However, since we assume the reader to be familiar with DLs, we only focus on selected aspects. 2.1. Description Logics The syntax of the DL ๐’œโ„’๐’žโ„๐’ช is defined using a vocabulary consisting of countably infinite disjoint sets NC of atomic concepts, NR of atomic roles, and NI of individuals. A role is either an atomic or an inverse role ๐‘Ÿโˆ’ , ๐‘Ÿ โˆˆ NR . An ๐’œโ„’๐’žโ„๐’ช concept is defined as ๐ถ ::= โŠค | โŠฅ | ๐ด | {๐‘œ} | ยฌ๐ถ | ๐ถ1 โŠ“ ๐ถ2 | ๐ถ1 โŠ” ๐ถ2 | โˆ€๐‘….๐ถ | โˆƒ๐‘….๐ถ, where ๐ด โˆˆ NC , ๐‘œ โˆˆ NI , and ๐‘… is a role. The DL ๐’œโ„’๐’ž is obtained from ๐’œโ„’๐’žโ„๐’ช by disallowing the use of nominals ({๐‘œ}) and inverse roles. In the remainder, we use ๐‘Ž, ๐‘ for individuals, ๐ด, ๐ต for atomic and ๐ถ, ๐ท for (possibly) complex concepts, and ๐‘Ÿ for an atomic and ๐‘… for a (possibly) non-atomic role. A TBox is a set of concept inclusion axioms ๐ถ โŠ‘ ๐ท. An ABox is a set of (concept and role) assertions of the form ๐ถ(๐‘Ž) and ๐‘…(๐‘Ž, ๐‘), respectively. A knowledge base (or ontology) ๐’ฆ = ๐’ฏ โˆช ๐’œ consists of a TBox ๐’ฏ and an ABox ๐’œ. We use ๐ถ โ‰ก ๐ท to abbreviate ๐ถ โŠ‘ ๐ท and ๐ท โŠ‘ ๐ถ. With cons(๐’ฆ), rols(๐’ฆ), and inds(๐’ฆ), we denote the sets of atomic concepts, atomic roles, and individuals occurring in ๐’ฆ, respectively. Interpretations and models are defined in the standard way (see e.g. [4]). For a knowledge base ๐’ฆ = ๐’ฏ โˆช ๐’œ, we say that ๐’ฆ is (concept) materialized if ๐’ฆ |= ๐ด(๐‘Ž) implies ๐ด(๐‘Ž) โˆˆ ๐’ฆ for each ๐ด โˆˆ cons(๐’ฆ) โˆช {โŠค} and ๐‘Ž โˆˆ inds(๐’ฆ); ๐’œ is reduced if there is no ๐’œโ€ฒ โŠ‚ ๐’œ with {๐‘Ÿ(๐‘Ž, ๐‘) | ๐‘Ÿ(๐‘Ž, ๐‘) โˆˆ ๐’œ} โŠ‚ ๐’œโ€ฒ such that ๐’œโ€ฒ โˆช {๐ต(๐‘Ž) | ๐ด(๐‘Ž) โˆˆ ๐’œ and ๐’ฆ |= ๐ด โŠ‘ ๐ต, ๐ด, ๐ต โˆˆ cons(๐’ฆ)} is materialized, i.e., a reduced ABox is materialized with the most specific atomic concepts. An ๐’œโ„’๐’ž knowledge base ๐’ฆ = ๐’ฏ โˆช ๐’œ is Horn and in normalized form if ๐ถ โˆˆ NC for every ๐ถ(๐‘Ž) โˆˆ ๐’ฆ, and every ๐ถ โŠ‘ ๐ท โˆˆ ๐’ฆ is in one of the forms โŠค โŠ‘ ๐ด, ๐ด โŠ‘ ๐ต, ๐ด โŠ‘ โŠฅ, ๐ด โŠ“ ๐ต โŠ‘ ๐ดโ€ฒ , โˆƒ๐‘Ÿ.๐ด โŠ‘ ๐ต, ๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ต, ๐ด โŠ‘ โˆ€๐‘Ÿ.๐ต, where ๐ด, ๐ดโ€ฒ , ๐ต โˆˆ NC and ๐‘Ÿ โˆˆ NR . We use ๐’ฆโˆƒ to denote the set {โˆƒ๐‘Ÿ.๐ต | ๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฏ } and ๐’ฆโˆ€ to denote {โˆ€๐‘Ÿ.๐ต | ๐ด โŠ‘ โˆ€๐‘Ÿ.๐ต โˆˆ ๐’ฏ }. W.l.o.g., we assume in the remainder that Horn ๐’œโ„’๐’ž knowledge bases are normalized by applying a structural transformation (see e.g. [5]) and that the ABox is reduced. Each consistent Horn ๐’œโ„’๐’ž knowledge base ๐’ฆ has a so-called universal model ๐’ฐ which is minimal and unique, and contains all the implied facts. In particular, the universal model has a tree-like form of role-connected (anonymous) individuals with named individuals as roots where individuals are only made equal if it is necessarily entailed by the knowledge base. For query answering, it suffices to consider this universal model [3]. 2.2. Concept Referring Expressions Principally, referring expressions serve the purpose of uniquely identifying an object in a certain context through its properties, that also include relations to other objects [6]. In DLs, this can be realized by a conjunction of concepts with relations modeled as existential restrictions over inverse roles, which relate an anonymous individual to a named one (expressed as nominal): Definition 1. Let ๐’ฆ be a normalized ontology, ๐ถ = ๐ด1 โŠ“. . .โŠ“๐ด๐‘› with ๐ด๐‘– โˆˆ cons(๐’ฆ), ๐‘Ÿ โˆˆ rols(๐’ฆ), and ๐‘Ž โˆˆ inds(๐’ฆ), then a concept referring expression ๐ธ is defined as ๐ธ ::= {๐‘Ž} ๐ธ ::= ๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ธ) and we call ๐‘Ž the base individual of ๐ธ. The reason for describing referring expressions this way lies in the tree model property of DLs, which allows us to identify an anonymous object in terms of a named individual (the base individual) and the (inverse) role path that connects the former to the latter [3]. Consider the example taken from Toman and Weddell [3]: Example 1. Let ๐’ฆ = ๐’ฏ โˆช ๐’œ consist of ๐’ฏ = {๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ถ, ๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ท, ๐ด โŠ‘ โˆ€๐‘Ÿ.๐ต} and ๐’œ = {๐ด(๐‘Ž)}, with the three (tree) models of ๐’ฆ: ๐ด ๐‘Ž ๐ด ๐‘Ž ๐ด ๐‘Ž ๐‘Ÿ ๐‘Ÿ ๐‘Ÿ ๐‘Ÿ ๐‘Ÿ ๐‘Ÿ ๐ต, ๐ถ ๐ต, ๐ถ ๐ต, ๐ท ๐ต, ๐ถ ๐ต, ๐ท ๐ต, ๐ถ, ๐ท The model in the middle is the universal model as it can be embedded into any other model of ๐’ฆ. When we consider the generalized instance query for the concept ๐ต, the universal model makes it reasonable to consider the ๐‘Ÿ-successor of ๐‘Ž labeled ๐ถ and the ๐‘Ÿ-successor of ๐‘Ž labeled ๐ท as singular certain answers to the query. We can describe these answers as the CREs ๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž} and ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}, respectively. Note that we use an ๐’œโ„’๐’žโ„๐’ช concept to describe the CRE, whereas ๐’ฆ is in Horn ๐’œโ„’๐’ž. For a CRE to qualify as an answer to a generalized instance query, we impose some require- ments, following the definition of Toman and Weddell [3] for singularity and certainty: Definition 2. A generalized instance query is an atomic concept ๐‘„ โˆˆ NC . A set ans(๐‘„) of CREs is an answer to ๐‘„ over a consistent ontology ๐’ฆ if, for each ๐ธ, ๐ธ โ€ฒ โˆˆ ans(๐‘„), certainty ๐’ฆ |= ๐ธ โŠ‘ ๐‘„ and |๐ธ โ„ | > 0 for all models โ„ of ๐’ฆ, singularity |๐ธ ๐’ฐ | = 1 in the universal model ๐’ฐ of ๐’ฆ, and uniqueness ๐ธ ๐’ฐ ฬธ= ๐ธ โ€ฒ ๐’ฐ in the universal model ๐’ฐ of ๐’ฆ completeness for every CRE ๐น with ๐’ฆ |= ๐น โŠ‘ ๐‘„ there exists some ๐ธ โˆˆ ans(๐‘„) s.t. ๐ธ โ‰ก ๐น . Here, the additional uniqueness property ensures that we do not obtain any duplicate answers, since we are only interested in the individual (semantically) represented by some CRE rather than the actual (syntactic) form of the latter. Besides, note that the above definition of singularity is a weakening of the property defined by Borgida et al. [2], where a CRE was required to denote a singleton set in all models of the knowledge base. This weakening, however, is essential in DLs that do not support counting (e.g., through number restrictions or functional roles) such as ๐’œโ„’๐’žโ„๐’ช. Note also that unreachable objects cannot be certain answers. Furthermore, a complex concept may still be considered as query if it can be represented by means of a new atomic concept and additional appropriate axioms in the ontology, like โŠค โŠ‘ ๐ด, for instance. Example 2 (Ex. 1 cont.). Consider again the knowledge base ๐’ฆ of Example 1. Formally, the set {๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}, ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}} is an answer to the generalized instance query ๐ต as both concept expressions are singular certain answers that identify different anonymous individuals. The CRE โˆƒ๐‘Ÿโˆ’ .{๐‘Ž} does not satisfy the singularity requirement and the CRE ๐ถ โŠ“ ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž} violates the certainty requirement. Note that if we were to add ๐ถ โ‰ก ๐ท to ๐’ฏ , the universal model would be the right-most model of Example 1. In addition to the two CREs from above, also ๐ถ โŠ“ ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž} becomes a certain singular answer. All three CREs, however, identify the same anonymous individual in the universal model (in different syntactic variants) and our proposed uniqueness criterion requires that ans(๐‘„) consists of only one of them. 3. Answering Generalized Instance Queries We propose an algorithm for computing CREs for a generalized instance query given by an atomic concept ๐‘„ over a (normalized) Horn ๐’œโ„’๐’ž ontology ๐’ฆ = ๐’ฏ โˆช ๐’œ with reduced ABox ๐’œ. An important aspect of the algorithm is that it employs an OWL reasoner as a black box as only standard reasoning tasks (e.g., as foreseen in the OWL API [7]) are required and the concrete approach being applied to answer generalized instance queries is not known to the reasoner. For example, we retrieve super-concepts of a concept or the most specific concepts (w.r.t. to the subsumption hierarchy) that an individual is an instance of (corresponding to those concepts in a reduced ABox). This provides great flexibility concerning the choice of the applied reasoner and allows for using optimized strategies for different ontologies. To construct CREs serving as answers for ๐‘„ we start with a CRE ๐ธ = {๐‘Ž} for each ๐‘Ž โˆˆ inds(๐’ฆ) and a corresponding current concept ๐ถ๐‘Ž = ๐ด1 โŠ“ . . . โŠ“ ๐ด๐‘› using every ๐ด๐‘– (๐‘Ž) โˆˆ ๐’œ. We then look for suitable existential restrictions โˆƒ๐‘Ÿ.๐ท which satisfy ๐’ฆ |= ๐ถ๐‘Ž โŠ‘ โˆƒ๐‘Ÿ.๐ท and, thus, allow us to extend the current CRE to ๐ธ โ€ฒ = ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ธ). This procedure is repeated and each time the newly related ๐ท is chosen as next current concept until no more appropriate restrictions can be found to further extend the generated CREs. Here, a current concept ๐ถ does not just enable us Algorithm 1 Answering generalized instance queries procedure GeneralizedInstanceQuery(๐’ฆ, ๐‘„) input: ๐’ฆ = ๐’ฏ โˆช ๐’œ: an ontology, ๐‘„: a generalized instance query output: ans(๐‘„) 1: ans โ† โˆ… 2: for all ๐‘Ž โˆˆ inds(๐’ฆ) do d 3: ๐ถ๐‘Ž โ† ๐ด(๐‘Ž)โˆˆ๐’œ ๐ด 4: ans โ† ans โˆช getCREs(๐’ฆ, ๐‘„, ๐ถ๐‘Ž , {๐‘Ž}, โˆ…) 5: return ans to further expand a CRE, but moreover serves as an indicator if the latter represents an answer for the query ๐‘„, which is the case if ๐’ฆ |= ๐ถ โŠ‘ ๐‘„ as shown later in Lemma 1. Initialization: Algorithm 1 describes the main algorithm that constructs, for each individual ๐‘Ž occurring in the knowledge base, the initial current concept and then calls the algorithm to generate CREs w.r.t. this (base) individual. Constructing Concept Referring Expressions: Algorithm 2 uses five steps to construct CREs for a given base individual that serve as answers for the considered query: Step 1: At first, in Line 2, we look for suitable existential restrictions โˆƒ๐‘Ÿ.๐ต that allow us to relate the current concept ๐ถ to the concept ๐ต in order to establish a link to a new anonymous individual, thus requiring ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต. Since we are interested in producing singular CREs, a restriction โˆƒ๐‘Ÿ.๐ต should only be considered if there is no โˆƒ๐‘Ÿ.๐ด (using the same role ๐‘Ÿ) such that ๐’ฆ |= ๐ด โŠ‘ ๐ต. We capture this by introducing reduced (sets of) existential restrictions: Definition 3 (Reduced Existential Restrictions). Let ๐’ฆ be a normalized Horn ๐’œโ„’๐’ž ontology and ๐ถ a concept, then ๐’ฆโˆƒ (๐ถ) = {โˆƒ๐‘Ÿ.๐ต | ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต, ๐ต โˆˆ NC , ๐‘Ÿ โˆˆ NR }. We denote with ๐’ฆโˆƒmin (๐ถ) a smallest subset of ๐’ฆโˆƒ (๐ถ) such that, for each โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆƒ (๐ถ) โˆ– ๐’ฆโˆƒmin (๐ถ), there is some โˆƒ๐‘Ÿ.๐ด โˆˆ ๐’ฆโˆƒmin (๐ถ) s.t. ๐’ฆ |= ๐ด โŠ‘ ๐ต. Note that an instance of โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆƒ (๐ถ) โˆ– ๐’ฆโˆƒmin (๐ถ) is connected to more than one instance of ๐ต, which means that a CRE using this existential restriction is non-singular as it represents more than one individual. Moreover, if there are two candidates โˆƒ๐‘Ÿ.๐ด, โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆƒ (๐ถ) such that ๐’ฆ |= ๐ด โ‰ก ๐ต, only one of them should be chosen in order to prevent duplicate answers. Step 2: For each selected existential restriction that induces a successor for an instance of the current concept ๐ถ, we next consider universal restrictions that further specify these successors (see Lines 4โ€“5) and we add these concepts to the set of successor concepts for the role in the sc relation. After this step, each (๐‘Ÿ, ๐‘†) โˆˆ sc is such that ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.(๐ต1 โŠ“ . . . โŠ“ ๐ต๐‘› ) for ๐‘† = {๐ต1 , . . . , ๐ต๐‘› }. Step 3: In this step of the algorithm, we determine those existential restrictions that can actually be applied to further construct CREs leading to singular, unique answers of the query and collect them in the set next. If we encounter d ๐‘†) โˆˆ sc for which we already have d an entry (๐‘Ÿ, another entry (๐‘Ÿ, ๐‘† โ€ฒ ) โˆˆ next such that ๐’ฆ |= ๐ตโˆˆ๐‘† ๐ต โ‰ก ๐ต โ€ฒ โˆˆ๐‘† โ€ฒ ๐ต โ€ฒ , we do not consider (๐‘Ÿ, ๐‘†) Algorithm 2 Computing CREs for a base individual procedure getCREs(๐’ฆ, ๐‘„, ๐ถ, ๐ธ, used) input: ๐’ฆ = ๐’ฏ โˆช ๐’œ: an ontology, ๐‘„: a query, ๐ถ: the current concept, ๐ธ: the current CRE, used: a set of already applied existential restrictions output: CREs from ans(๐‘„) that build upon ๐ธ 1: // Step 1: find suitable existential restrictions 2: sc โ† {(๐‘Ÿ, {๐ต}) | โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆƒ min (๐ถ)} 3: // Step 2: find related universal restrictions 4: for all (๐‘Ÿ, ๐‘†) โˆˆ sc do 5: ๐‘† โ† ๐‘† โˆช {๐ต | โˆ€๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆ€ and ๐’ฆ |= ๐ถ โŠ‘ โˆ€๐‘Ÿ.๐ต} 6: // Step 3: filter found existential restrictions 7: next โ† โˆ… 8: for all (๐‘Ÿ, ๐‘†) โˆˆ sc do 9: // check for potential duplicates 10: scโ€ฒ โ† {(๐‘Ÿ, ๐‘† โ€ฒ ) โˆˆ next | ๐’ฆ |= ๐ตโˆˆ๐‘† ๐ต โ‰ก ๐ต โ€ฒ โˆˆ๐‘† โ€ฒ ๐ต โ€ฒ } d d 11: if scโ€ฒ = โˆ… then 12: if ๐ธ = {๐‘Ž} then 13: if {๐‘ | ๐‘Ÿ(๐‘Ž, ๐‘) โˆˆ ๐’œ, โˆ€๐ต โˆˆ ๐‘† : ๐’ฆ |= ๐ต(๐‘)} = โˆ… then 14: next โ† next โˆช {(๐‘Ÿ, ๐‘†)} 15: else 16: // look for cycle in current CRE 17: if (๐‘Ÿ, ๐‘†) โˆˆ used then 18: ๐ธ โ† markCycle(๐ธ, ๐‘Ÿ, ๐‘†) 19: else 20: next โ† next โˆช {(๐‘Ÿ, ๐‘†)} 21: // Step 4: check if current CRE is an answer 22: ans โ† โˆ… 23: if ๐’ฆ |= ๐ถ โŠ‘ ๐‘„ then 24: ans โ† ans โˆช {๐ธ} 25: // Step 5: extend CRE recursively 26: for all (๐‘Ÿ, ๐‘†) โˆˆ next do 27: ๐ถ โ€ฒ โ† ๐ตโˆˆ๐‘† ๐ต d 28: ๐ธ โ€ฒ โ† ๐ถ โ€ฒ โŠ“ โˆƒ๐‘Ÿ โˆ’ .(๐ธ) 29: ans โ† ans โˆช getCREs(๐’ฆ, ๐‘„, ๐ถ โ€ฒ , ๐ธ โ€ฒ , used โˆช {(๐‘Ÿ, ๐‘†)}) 30: return ans any further to prevent duplicate answers, i.e., if the check in Line 11 fails, the entry (๐‘Ÿ, ๐‘†) is not added to the set next. Even though we already consider only the reduced set of existential restrictions in Step 1, this check is still necessary. Consider, for example, that ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ด and ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต for the current concept ๐ถ and ๐’ฆ ฬธ|= ๐ด โ‰ก ๐ต. Hence, we have two entries (๐‘Ÿ, {๐ด}) and (๐‘Ÿ, {๐ต}) โˆˆ sc. Assume, furthermore, that ๐’ฆ |= ๐ถ โŠ‘ โˆ€๐‘Ÿ.๐ด and ๐’ฆ |= ๐ถ โŠ‘ โˆ€๐‘Ÿ.๐ต. This means, however, that for both (๐‘Ÿ, {๐ด}), (๐‘Ÿ, {๐ต}) โˆˆ sc, we have ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.(๐ด โŠ“ ๐ต). To guarantee the uniqueness property, an existential restriction should also be ignored for the initial extension of a CRE only consisting of a nominal {๐‘Ž} if the existential restriction is already satisfied through a (named) individual ๐‘. This situation is handled in Line 13. Note that due to the tree model property this situation can only occur for the initial CREs. For cyclic knowledge bases, it may be the case that for two, possibly equal, current concepts ๐ถ and ๐ถ โ€ฒ that occur during the algorithm processing, the same existential restriction is applicable. This may lead to an arbitrarily long repetition of the same sequence of existential restrictions that originally led from ๐ถ to ๐ถ โ€ฒ , hence, giving rise to an infinitely large number of possible CREs (of increasing length). Analogous to blocking in tableau algorithms [4], we ensure termination of the algorithm by detecting such cycles and by preventing the reuse of an already processed existential restriction. Instead, we compactly represent such CREs using a cycle notation in form of surrounding square brackets "[...]+ " to mark the sequence that may occur several times repeatedly (indicated by a call to the procedure markCycle in Line 18). For an example, consider ๐’ฆ = ๐’ฏ โˆช ๐’œ with ๐’ฏ = {๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ด} and ๐’œ = {๐ด(๐‘Ž)}, where [๐ด โŠ“ โˆƒ๐‘Ÿโˆ’ .(]+ {๐‘Ž}) is an answer for the generalized instance query ๐ด and serves as single representative for the infinitely many CREs of the form ๐ด โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ด โŠ“ โˆƒ๐‘Ÿโˆ’ .(...{๐‘Ž}). Step 4: In this step, we check if the current CRE already represents an answer for the query ๐‘„. For a standard instance query, this would usually require to check ๐’ฆ |= ๐‘„(๐‘Ž) for a named individual ๐‘Ž, but since we are considering anonymous individuals represented by some CRE ๐ธ, we want to know whether ๐’ฆ |= ๐ธ โŠ‘ ๐‘„. This is actually realized by checking ๐’ฆ |= ๐ถ โŠ‘ ๐‘„ in Line 23 for the related current concept ๐ถ based on the following lemma. Lemma 1. Given a Horn ๐’œโ„’๐’ž ontology ๐’ฆ, a generalized instance query ๐‘„, a CRE ๐ธ and its associated (Horn ๐’œโ„’๐’ž) current concept ๐ถ, where ๐ถ = ๐ด1 โŠ“ . . . โŠ“ ๐ด๐‘› using every ๐ด๐‘– (๐‘Ž) โˆˆ ๐’œ if ๐ธ = {๐‘Ž}, or ๐ถ = ๐ท if ๐ธ = ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .(...), then ๐’ฆ |= ๐ธ โŠ‘ ๐‘„ โ‡” ๐’ฆ |= ๐ถ โŠ‘ ๐‘„. Proof sketch. (โ‡’) ๐’ฆ |= ๐ถ โŠ‘ ๐‘„ requires that ๐ถ is at least as specific as ๐‘„. Therefore, we show that the current concept ๐ถ is actually the most specific (Horn ๐’œโ„’๐’ž) concept such that ๐’ฆ |= ๐ธ โŠ‘ ๐ถ and that for any other most specific candidate ๐ท with ๐’ฆ |= ๐ธ โŠ‘ ๐ท, we have ๐’ฆ |= ๐ถ โ‰ก ๐ท, which is mainly based on the fact that we only work with reduced existential restrictions (see Algorithm 2, Line 2 and Definition 3). (โ‡) Per definition, a current concept ๐ถ of some CRE ๐ธ always satisfies ๐’ฆ |= ๐ธ โŠ‘ ๐ถ. Step 5: Even if an answer has been identified, the processing of the CRE does still proceed, because it may be possible that the continuing construction leads to another answer later on. Therefore, the algorithm is eventually called recursively using appropriately updated arguments, like an extended CRE ๐ธ โ€ฒ and the new current concept ๐ถ โ€ฒ . 3.1. Properties In the following, we discuss why the described algorithm works properly in the way that for a generalized instance query, only certain, singular, and unique answers are produced w.r.t. Definition 2. Additionally, we show that the algorithm returns every such answer and, furthermore, always terminates. Due to space limitations, we mainly provide proof sketches, while more details are given in an online appendix.1 Certainty. In Line 23 of Algorithm 2, a CRE ๐ธ is selected as answer if ๐’ฆ |= ๐ถ โŠ‘ ๐‘„, which implies ๐’ฆ |= ๐ธ โŠ‘ ๐‘„ according to Lemma 1. In addition, a CRE basically describes a chain of linked individuals where the existence of one individual ensures the presence of the next one being referenced through an existential restriction. Since we start with an explicitly present individual as base for ๐ธ, there is always at least one element in ๐ธ โ„ for each model โ„. Singularity. For an initial CRE ๐ธ = {๐‘Ž}, we directly get |๐ธ ๐’ฐ | = 1 for the universal model ๐’ฐ. In the more general case ๐ธ = ๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ธ โ€ฒ ) with ๐ธ โ€ฒ as the (recursive) part leading to the base individual, we show that the non-singular situation |๐ธ ๐’ฐ | > 1 is never achieved: This would require that there is some part ๐ถ โ€ฒ โŠ“ โˆƒ๐‘Ÿโ€ฒโˆ’ .(๐ธ โ€ฒโ€ฒ ) in ๐ธ where one individual from ๐ธ โ€ฒโ€ฒ๐’ฐ has ๐‘Ÿโ€ฒ -relations to (at least) two different individuals in ๐ถ โ€ฒ๐’ฐ . The necessary conditions for this are described in Lemma 2 below (with ๐ถ โ€ฒ as ๐ท), but since the algorithm only selects reduced existential restrictions w.r.t. Definition 3, i.e., some โˆƒ๐‘Ÿ.๐ต is only chosen if there is no candidate โˆƒ๐‘Ÿ.๐ด with ๐’ฆ |= ๐ด โŠ‘ ๐ต, this situation never occurs. Lemma 2. Let ๐’ฆ be a Horn ๐’œโ„’๐’ž ontology, ๐’ฐ the universal model of ๐’ฆ, and assume that โŸจ๐‘, ๐‘ŽโŸฉ, โŸจ๐‘, ๐‘โŸฉ โˆˆ ๐‘Ÿ๐’ฐ for some role ๐‘Ÿ, and (anonymous individuals) ๐‘Ž, ๐‘ โˆˆ ๐ท๐’ฐ for some concept ๐ท. We have ๐‘Ž ฬธ= ๐‘ iff there exist some concepts ๐ด, ๐ต, ๐ถ such that ๐‘ โˆˆ ๐ถ ๐’ฐ , ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ด, ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต, ๐’ฆ |= ๐ด โŠ‘ ๐ท, ๐’ฆ |= ๐ต โŠ‘ ๐ท, and ๐’ฆ ฬธ|= ๐ด โ‰ก ๐ต. Proof sketch. This basically follows from the characteristics of the universal model ๐’ฐ. Uniqueness. Considering the tree-like universal model ๐’ฐ, having two different CREs ๐ธ, ๐ธ โ€ฒ with ๐ธ ๐’ฐ = ๐ธ โ€ฒ๐’ฐ means that i) they both describe the same path in the tree using different concepts or ii) one refers to a sub-path of the other leading to the same anonymous individual: i) Because an edge in a tree path relates to some โˆƒ๐‘Ÿ.๐ต, ๐ธ โ€ฒ might describe the same path as ๐ธ if it uses some โˆƒ๐‘Ÿ.๐ต โ€ฒ with ๐ต โŠ‘ ๐ต โ€ฒ (including ๐ต โ‰ก ๐ต โ€ฒ ) for the associated edge instead. In Algorithm 2, this is prevented in Lines 2 and 11. ii) Assuming that ๐‘Ž and ๐‘Žโ€ฒ are the base individuals of ๐ธ and ๐ธ โ€ฒ , respectively, there must be some ๐‘Ÿ(๐‘Ž, ๐‘Žโ€ฒ ) โˆˆ ๐’œ such that ๐ธ โ€ฒ refers to a sub-path of ๐ธ. If ๐‘Žโ€ฒ๐’ฐ โˆˆ ๐ต ๐’ฐ for some ๐ต, some โˆƒ๐‘Ÿ.๐ต would describe the same edge as ๐‘Ÿ(๐‘Ž, ๐‘Žโ€ฒ ), given that ๐‘Ž๐’ฐ โˆˆ (โˆƒ๐‘Ÿ.๐ต)๐’ฐ , which is why such existential restrictions are not chosen (Line 13). Completeness. It is not required to construct every possible CRE that represents an answer for the stated query, but rather to find one unique, singular CRE for every, possibly anonymous, individual that serves as an answer. For that, we assume there may exist some CRE ๐ธ describing a correct, singular and unique answer, but which cannot be constructed by the algorithm and prove that this is not possible: Since the considered ๐ธ is not generated by the algorithm, it must possess a unique subpart that does not occur in any of the constructed answers. As we already consider all individuals during the initialization, this part must be of the form "๐ถ โ€ฒ โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ถ โŠ“ . . .)", where the existential restriction โˆƒ๐‘Ÿ.๐ถ โ€ฒ was applied for the concept ๐ถ. By 1 https://www.uni-ulm.de/fileadmin/website_uni_ulm/iui.inst.090/Publikationen/2022/IlGl22a_appendix.pdf considering all the cases in the algorithm where ๐ถ is not used as current concept or โˆƒ๐‘Ÿ.๐ถ โ€ฒ is not selected (or constructed), respectively, we can see that either the certainty, singularity or uniqueness property would be violated by regarding ๐ธ as an answer. Besides, the employed cycle notation may be seen as a compromise to implicitly return a possibly infinite number of CREs in form of just one compact instance in order to ensure the termination of the algorithm. Termination. As the given ontology always consists of a finite number of axioms and concepts, all the applied search and iteration processes come to a halt. Therefore, non-termination may only be realized by the repeated usage of an existential restriction during a recursive call, which, however, is prevented by the cycle detection in Line 17 of Algorithm 2. 4. Implementation and Evaluation A prototypical Java implementation of our algorithm is available online on GitHub2 and also includes some optimizations. One of them refers to the ordering of existential and universal restrictions by means of separate subsumption hierarchies. This allows us to reduce the number of performed subsumption tests in the way that if, for example, ๐’ฆ ฬธ|= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต, we do not need to further check ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿโ€ฒ .๐ด for any โˆƒ๐‘Ÿโ€ฒ .๐ด that is below โˆƒ๐‘Ÿ.๐ต in the hierarchy, i.e., for which ๐’ฆ |= โˆƒ๐‘Ÿโ€ฒ .๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ต. Usually, reasoners are only concerned with computing subsumption hierarchies for atomic concepts, referred to as classification (see e.g. [4]). Therefore, we introduce a new atomic concept for each existential (or universal) restriction such that the reasoner can (internally) compute the hierarchies for those new atomic concepts, based on which the actual hierarchies for existential (or universal) restrictions can then be derived. Due to the associated additional overhead, we also implemented the (explicit) Enhanced Traversal Method [8] to efficiently deal with smaller number of elements. Another improvement is achieved by combining base individuals into one set if they share an equivalent initial current concept and furthermore possess analogous given role assertions. Since these properties result in the same computations for each individual, we can avoid unnecessary repetitions by calling the algorithm only once for such a combined set and just generate different versions of the constructed CRE for each base individual in the end. For a general performance assessment, the implementation was tested on different ontologies listed in Table 1 with codinteraction-A and the separate ore_ont_2608/4516/3313 being part of the ORE 2015 Reasoner Competition Corpus [9], while HAO (v2021-03-05), VO (v1.1.171) and DTO (v1.1.1) were taken from BioPortal3 . If necessary, axioms not adhering to Horn ๐’œโ„’๐’ž were removed and ontologies merged with their imports to obtain one complete ontology. Tests showed that subsumption checking has a huge influence on the performance, which is why we considered two different reasoners, namely HermiT4 (v1.3.8) and JFact5 (v5.0.3). In general, HermiT achieved better runtime results than JFact, except for the larger ontologies VO and DTO, where JFact was faster in determining the earlier mentioned subsumption hierarchies. 2 https://github.com/M-Illich/Computing-CREs.git 3 https://bioportal.bioontology.org/ontologies 4 http://www.hermit-reasoner.com/ 5 http://jfact.sourceforge.net/ Table 1 Overview of adapted ontologies with their number of atomic concepts (#con), individuals (#ind), com- bined individual sets (#sets), considered existential/universal restrictions (#โˆƒ/#โˆ€), average runtime in ms for โŠค as query, number of answers (#ans) and their percentage distribution w.r.t. their depth, i.e., number of contained inverse existential restrictions (* no subsumption hierarchies) Ontology #con #ind #sets #โˆƒ #โˆ€ Runtime #ans depth (%) [ms] 0 1 2-4 5-7 8-10 11-12 codinteraction-A 6 21 13 3 0 71 30 70 30 - - - - ore_ont_2608 453 212 212 10 10 265 279 76 24 - - - - ore_ont_4516 509 217 217 10 10 185 284 76 24 - - - - ore_ont_3313 865 2,070 858 114 0 57,966 9,734 21 8 34 27 9 1 HAO 2,548 2,764 687 522 1 37,965 2,764 100 - - - - - VO 6,828 167 4 1,513 237 460* 176 95 5 - - - - DTO 10,075 3 1 7,958 114 738* 3 100 - - - - - Since both VO and DTO, however, possess many existential restrictions but only few individuals (especially in form of combined sets), the performance was actually best if no subsumption hierarchies are computed. Table 1 shows the obtained runtime measurements (average of five runs) for computing all CREs, i.e., using โŠค as query, with HermiT as reasoner, based on an AMD Ryzen 7 3700X 3.59 GHz processor with 16 GB RAM on Windows 10 (64-Bit). 5. Related Work Krahmer and van Deemter [6] provide a survey on referring expressions and their computa- tional generation, given by a historic overview of research developments. Here, the problem of generating referring expressions is defined as finding a description in form of combined properties that uniquely identify a given individual in a certain context, which differs from our work in that we intend to describe unknown, anonymous individuals rather than explicitly named ones. A related approach that works with DL is provided by Areces et al. [1] introducing an algorithm that takes as input a model โ„ณ and tries to find formulas (concepts) that represent the individuals in โ„ณ. The basic idea here is that, starting with โŠค as formula for which the interpretation contains every domain element, new formulas are created by successively refining the already given ones through appending new conjuncts with the intention to reduce the number of elements in the formulasโ€™ interpretations. At first, only atomic concepts are added as conjuncts, before existential restrictions relating to already computed formulas are considered until the formulasโ€™ interpretations are singular or no further adaptations are possible. Unlike in our work, the individuals to be described are already stated and the occurrence of cycles in referring expressions is not considered. However, both approaches make use of relations between individuals, but with the distinction that Areces et al. start from the individual ๐‘Ž described by the referring expression and go to another one ๐‘, while we utilize the inverse case where another (present) individual ๐‘ refers to ๐‘Ž, the (anonymous) one being described. The work that comes closest to ours is given by Toman and Weddell [3] dealing with (gen- eralized) instance retrieval queries on (Horn ๐’œโ„’๐’ž) ontologies, based on a tree automaton for which a state ๐‘† is given by a set of atomic concepts, while transitions are defined by so-called matching tuples (๐‘†, ๐‘†0 , ..., ๐‘†๐‘˜ ) stating that some ๐‘†๐‘– with 0 โฉฝ ๐‘– โฉฝ ๐‘˜ can be reached when in the current state ๐‘† a certain existential restriction โˆƒ๐‘Ÿ๐‘– .๐ถ๐‘– is applied. Starting with a set ๐‘†๐‘Ž = {๐ด | ๐ด(๐‘Ž) โˆˆ ๐’œ} for some individual ๐‘Ž, a tree can be unfolded using those matching tuples. In order to answer a query d ๐ต, this tree is traversed beginning with some ๐‘†๐‘Ž until a state ๐‘†๐‘˜ is reached where ๐’ฆ |= ๐ดโˆˆ๐‘†๐‘˜ ๐ด โŠ‘ ๐ต, for which the sequence ๐‘Ÿ1 ๐ด1 ...๐‘Ÿ๐‘˜ ๐ด๐‘˜ (called certain path) referring to the applied existential restrictions is then transformed into a CRE ๐ด๐‘˜ โŠ“ โˆƒ๐‘Ÿ๐‘˜โˆ’ .(... ๐ด1 โŠ“ โˆƒ๐‘Ÿ1โˆ’ .{๐‘Ž}) serving as answer for ๐ต. Like in our algorithm, the tree automaton starts the construction of CREs with a given individual and also takes care of occurring cycles. Furthermore, both approaches only use minimal existential restrictions to ensure singularity, although the definitions for minimality are different: While Toman and Weddell rely on the general subsumption of the whole restriction โˆƒ๐‘Ÿ.๐ถ, we consider subsumption for the inner concept ๐ถ instead w.r.t. Definition 3, because only then singularity can be guaranteed (based on Lemma 2). In this respect, the tree automaton approach does not fulfill the completeness property due to only working with the most specific existential restrictions, even though some other ones might lead to singular CREs, too, as illustrated in Example 3 below. Besides, we are only interested in producing unique CREs, which is not explicitly handled by Toman and Weddell. Example 3. Let ๐’ฆ = ๐’ฏ โˆช ๐’œ consist of ๐’œ = {๐ด(๐‘Ž)} and ๐’ฏ = {๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ถ, โˆƒ๐‘Ÿ.๐ถ โŠ‘ ๐ต, ๐ต โŠ‘ โˆƒ๐‘Ÿ.๐ท, ๐ถ โŠ‘ ๐‘„, ๐ท โŠ‘ ๐‘„}. For the generalized instance query ๐‘„, we get ans(๐‘„) = {๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}, ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}}, even though โˆƒ๐‘Ÿ.๐ท is not minimal w.r.t. subsumption due to ๐’ฆ |= โˆƒ๐‘Ÿ.๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ท. 6. Conclusion We presented an algorithm that generates concept referring expressions to identify both named and anonymous individuals serving as certain, singular and unique answers for an instance retrieval query on a Horn ๐’œโ„’๐’ž ontology. For that, we start with a named individual and recursively determine appropriate existential restrictions to create a chain of relations that defines an implicitly given individual based on its connection to an explicit one. A crucial factor here is the selection based on reduced existential restrictions (cf. Definition 3), enabling us to construct every desired singular CRE. Besides, the black box consideration of the applied reasoner provides more flexibility in choosing optimal reasoning approaches for different situations. Future Research. In general, the question which reasoner should be selected for the consid- ered ontologies is not an easy one and requires further investigations. On the other side, there may be some advantages of directly integrating the algorithm into a particular reasoner as the algorithm can then utilize the reasonerโ€™s internal data structures and be optimized accordingly. Moreover, it may be of interest to consider extensions of Horn ๐’œโ„’๐’ž, too. While some features might be already supported with none or only small adaptations in the algorithm, others prob- ably require more examination, especially since we might lose the existence of the tree-like universal model and thus can no longer make use of our original singularity definition. References [1] C. Areces, A. Koller, K. Striegnitz, Referring Expressions as Formulas of Description Logic, in: Proceedings of the Fifth International Natural Language Generation Conference, 2008, pp. 42โ€“49. doi:https://doi.org/10.3115/1708322.1708332. [2] A. Borgida, D. Toman, G. Weddell, On Referring Expressions in Query Answering over First Order Knowledge Bases, in: Proceedings of the 15th International Conference on Principles of Knowledge Representation and Reasoning (KRโ€™16), AAAI Press, 2016, pp. 319โ€“328. URL: https://www.aaai.org/ocs/index.php/KR/KR16/paper/viewFile/12860/12488. [3] D. Toman, G. Weddell, Finding ALL Answers to OBDA Queries Using Referring Expressions, in: AI 2019: Advances in Artificial Intelligence, Springer International Publishing, Cham, 2019, pp. 117โ€“129. doi:https://doi.org/10.1007/978-3-030-35288-2_10. [4] F. Baader, D. Calvanese, D. McGuinness, P. Patel-Schneider, D. Nardi, et al., The Description Logic Handbook: Theory, implementation and applications, Cambridge University Press, 2003. [5] Y. Kazakov, Consequence-Driven Reasoning for Horn ๐’ฎโ„‹โ„๐’ฌ Ontologies, in: Proc. of the 21st Int. Joint Conf. on Artificial Intelligence, IJCAI 2009, 2009, pp. 2040โ€“2045. URL: https://www.ijcai.org/Proceedings/09/Papers/336.pdf. [6] E. Krahmer, K. van Deemter, Computational Generation of Referring Expressions: A Survey, Computational Linguistics 38 (2012) 173โ€“218. doi:https://doi.org/10.1162/COLI_ a_00088. [7] M. Horridge, S. Bechhofer, The OWL API: A Java API for OWL ontologies, Semant. Web 2 (2011) 11โ€“21. doi:https://doi.org/10.3233/SW-2011-0025. [8] F. Baader, B. Hollunder, B. Nebel, H.-J. Profitlich, E. Franconi, An Empirical Analysis of Optimization Techniques for Terminological Representation Systems, Applied Intelligence 4 (1994) 109โ€“132. doi:https://doi.org/10.22028/D291-24994. [9] N. Matentzoglu, B. Parsia, ORE 2015 Reasoner Competition Corpus, 2015. URL: https: //doi.org/10.5281/zenodo.18578. doi:10.5281/zenodo.18578.