=Paper= {{Paper |id=Vol-3263/paper-14 |storemode=property |title=Computing Concept Referring Expressions with Standard OWL Reasoners |pdfUrl=https://ceur-ws.org/Vol-3263/paper-14.pdf |volume=Vol-3263 |authors=Moritz Illich,Birte Glimm |dblpUrl=https://dblp.org/rec/conf/dlog/IllichG22 }} ==Computing Concept Referring Expressions with Standard OWL Reasoners== https://ceur-ws.org/Vol-3263/paper-14.pdf
Computing Concept Referring Expressions with
Standard OWL Reasoners
Moritz Illich1 , Birte Glimm1
1
    Ulm University, James-Franck-Ring, Ulm, 89081, Germany


                                         Abstract
                                         Classical instance queries over an ontology only consider explicitly named individuals. Concept referring
                                         expressions (CREs) also allow for returning answers in the form of concepts that describe implicitly given
                                         individuals in terms of their relation to an explicitly named one. Existing approaches, e.g., based on tree
                                         automata, can neither be integrated into state-of-the-art OWL reasoners nor are they directly amenable
                                         for an efficient implementation. To address this, we devise a novel algorithm that uses highly optimized
                                         OWL reasoners as a black box. In addition to the standard criteria of singularity and certainty for CREs,
                                         we devise and consider the criterion of uniqueness of CREs for Horn ๐’œโ„’๐’ž ontologies. The evaluation of
                                         our prototypical implementation shows that computing CREs for the most general concept (โŠค) can be
                                         done in less than one minute for ontologies with thousands of individuals and concepts.

                                         Keywords
                                         ontologies, instance retrieval, referring expressions




1. Introduction
Regarding Description Logic (DL) ontologies, automated reasoning systems derive implicit
consequences of explicitly stated information, e.g., when answering queries for concept instances.
Classically, answers to such queries consist of individual names that are used in the knowledge
base. Consider, for example, a knowledge base that states that "every person has a mother
who is a person" and that "John is a person" (in DL syntax: Person โŠ‘ โˆƒhasMother.Person and
Person(john)). Obviously, john is an answer to the query for instances of the concept Person and
such names are also called referring expressions as they refer to elements in the models of the
knowledge base. In each model of the knowledge base, however, there is also an element that
represents "the mother of John". This anonymous element can be described by a concept referring
expression (CRE) [1, 2, 3], i.e., by a concept that describes an anonymous element w.r.t. a named
individual. For our example, we can use the CRE "the person who is the mother of John" (as DL
concept: Person โŠ“ โˆƒhasMotherโˆ’ .{john}). The reasoning task of generalized instance retrieval
refers to computing entailed named and anonymous answers in the form of CREs for a concept
instance query, which allows for more comprehensive query results.
   Usually, CREs are meant to identify a single element (e.g., "Johnโ€™s mother" and not some set
of elements as, e.g., the concept Person itself). While some approaches for computing CREs rely
on functionality of roles and of role paths in order to guarantee singularity, e.g., [2], we do not

  DL 2022: 35th International Workshop on Description Logics, August 7โ€“10, 2022, Haifa, Israel
$ moritz.illich@uni-ulm.de (M. Illich); birte.glimm@uni-ulm.de (B. Glimm)
                                       ยฉ 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    CEUR
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073
                                       CEUR Workshop Proceedings (CEUR-WS.org)
make this restriction as also the recent work of Toman and Weddell [3]. Another challenge
when computing CREs is describing all entailed answers to (concept instance) queries. While
the approach of Toman and Weddell based on tree automata addresses this problem, it is not
well-suited for an efficient implementation. Moreover, the applied constraints that ensure the
construction of only singular CREs are actually too strict, thus neglecting query answers in some
cases, while an explicit prevention of duplicate answers referring to different, but semantically
equivalent, CREs is also not provided. In this paper, we address these issues and present the
first practical method for generalized instance retrieval that uses highly optimized reasoners as
a black box for computing CREs.
   At first, we shortly introduce necessary definitions regarding DLs and CREs. We then present
the algorithm for computing answers to generalized instance queries and prove its properties in
Section 3. In Section 4, we show the results of our empirical evaluation, followed by a discussion
of related work in Section 5 and conclusions in Section 6.


2. Preliminaries
Before we can describe our algorithm, we first have to consider some relevant definitions
concerning ontologies and CREs. However, since we assume the reader to be familiar with DLs,
we only focus on selected aspects.

2.1. Description Logics
The syntax of the DL ๐’œโ„’๐’žโ„๐’ช is defined using a vocabulary consisting of countably infinite
disjoint sets NC of atomic concepts, NR of atomic roles, and NI of individuals. A role is either an
atomic or an inverse role ๐‘Ÿโˆ’ , ๐‘Ÿ โˆˆ NR . An ๐’œโ„’๐’žโ„๐’ช concept is defined as
               ๐ถ ::= โŠค | โŠฅ | ๐ด | {๐‘œ} | ยฌ๐ถ | ๐ถ1 โŠ“ ๐ถ2 | ๐ถ1 โŠ” ๐ถ2 | โˆ€๐‘….๐ถ | โˆƒ๐‘….๐ถ,
where ๐ด โˆˆ NC , ๐‘œ โˆˆ NI , and ๐‘… is a role. The DL ๐’œโ„’๐’ž is obtained from ๐’œโ„’๐’žโ„๐’ช by disallowing
the use of nominals ({๐‘œ}) and inverse roles. In the remainder, we use ๐‘Ž, ๐‘ for individuals, ๐ด, ๐ต
for atomic and ๐ถ, ๐ท for (possibly) complex concepts, and ๐‘Ÿ for an atomic and ๐‘… for a (possibly)
non-atomic role.
   A TBox is a set of concept inclusion axioms ๐ถ โŠ‘ ๐ท. An ABox is a set of (concept and
role) assertions of the form ๐ถ(๐‘Ž) and ๐‘…(๐‘Ž, ๐‘), respectively. A knowledge base (or ontology)
๐’ฆ = ๐’ฏ โˆช ๐’œ consists of a TBox ๐’ฏ and an ABox ๐’œ. We use ๐ถ โ‰ก ๐ท to abbreviate ๐ถ โŠ‘ ๐ท and
๐ท โŠ‘ ๐ถ. With cons(๐’ฆ), rols(๐’ฆ), and inds(๐’ฆ), we denote the sets of atomic concepts, atomic
roles, and individuals occurring in ๐’ฆ, respectively. Interpretations and models are defined in
the standard way (see e.g. [4]).
   For a knowledge base ๐’ฆ = ๐’ฏ โˆช ๐’œ, we say that ๐’ฆ is (concept) materialized if ๐’ฆ |= ๐ด(๐‘Ž)
implies ๐ด(๐‘Ž) โˆˆ ๐’ฆ for each ๐ด โˆˆ cons(๐’ฆ) โˆช {โŠค} and ๐‘Ž โˆˆ inds(๐’ฆ); ๐’œ is reduced if there is no
๐’œโ€ฒ โŠ‚ ๐’œ with {๐‘Ÿ(๐‘Ž, ๐‘) | ๐‘Ÿ(๐‘Ž, ๐‘) โˆˆ ๐’œ} โŠ‚ ๐’œโ€ฒ such that ๐’œโ€ฒ โˆช {๐ต(๐‘Ž) | ๐ด(๐‘Ž) โˆˆ ๐’œ and ๐’ฆ |= ๐ด โŠ‘
๐ต, ๐ด, ๐ต โˆˆ cons(๐’ฆ)} is materialized, i.e., a reduced ABox is materialized with the most specific
atomic concepts.
   An ๐’œโ„’๐’ž knowledge base ๐’ฆ = ๐’ฏ โˆช ๐’œ is Horn and in normalized form if ๐ถ โˆˆ NC for every
๐ถ(๐‘Ž) โˆˆ ๐’ฆ, and every ๐ถ โŠ‘ ๐ท โˆˆ ๐’ฆ is in one of the forms โŠค โŠ‘ ๐ด, ๐ด โŠ‘ ๐ต, ๐ด โŠ‘ โŠฅ, ๐ด โŠ“ ๐ต โŠ‘
๐ดโ€ฒ , โˆƒ๐‘Ÿ.๐ด โŠ‘ ๐ต, ๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ต, ๐ด โŠ‘ โˆ€๐‘Ÿ.๐ต, where ๐ด, ๐ดโ€ฒ , ๐ต โˆˆ NC and ๐‘Ÿ โˆˆ NR . We use ๐’ฆโˆƒ to
denote the set {โˆƒ๐‘Ÿ.๐ต | ๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฏ } and ๐’ฆโˆ€ to denote {โˆ€๐‘Ÿ.๐ต | ๐ด โŠ‘ โˆ€๐‘Ÿ.๐ต โˆˆ ๐’ฏ }. W.l.o.g.,
we assume in the remainder that Horn ๐’œโ„’๐’ž knowledge bases are normalized by applying a
structural transformation (see e.g. [5]) and that the ABox is reduced.
   Each consistent Horn ๐’œโ„’๐’ž knowledge base ๐’ฆ has a so-called universal model ๐’ฐ which is
minimal and unique, and contains all the implied facts. In particular, the universal model has
a tree-like form of role-connected (anonymous) individuals with named individuals as roots
where individuals are only made equal if it is necessarily entailed by the knowledge base. For
query answering, it suffices to consider this universal model [3].

2.2. Concept Referring Expressions
Principally, referring expressions serve the purpose of uniquely identifying an object in a certain
context through its properties, that also include relations to other objects [6]. In DLs, this can
be realized by a conjunction of concepts with relations modeled as existential restrictions over
inverse roles, which relate an anonymous individual to a named one (expressed as nominal):
Definition 1. Let ๐’ฆ be a normalized ontology, ๐ถ = ๐ด1 โŠ“. . .โŠ“๐ด๐‘› with ๐ด๐‘– โˆˆ cons(๐’ฆ), ๐‘Ÿ โˆˆ rols(๐’ฆ),
and ๐‘Ž โˆˆ inds(๐’ฆ), then a concept referring expression ๐ธ is defined as
                              ๐ธ ::= {๐‘Ž}       ๐ธ ::= ๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ธ)
and we call ๐‘Ž the base individual of ๐ธ.
  The reason for describing referring expressions this way lies in the tree model property of
DLs, which allows us to identify an anonymous object in terms of a named individual (the base
individual) and the (inverse) role path that connects the former to the latter [3]. Consider the
example taken from Toman and Weddell [3]:
Example 1. Let ๐’ฆ = ๐’ฏ โˆช ๐’œ consist of ๐’ฏ = {๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ถ, ๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ท, ๐ด โŠ‘ โˆ€๐‘Ÿ.๐ต} and
๐’œ = {๐ด(๐‘Ž)}, with the three (tree) models of ๐’ฆ:

                      ๐ด ๐‘Ž                           ๐ด ๐‘Ž                    ๐ด ๐‘Ž
                     ๐‘Ÿ ๐‘Ÿ ๐‘Ÿ                          ๐‘Ÿ ๐‘Ÿ                     ๐‘Ÿ
               ๐ต, ๐ถ ๐ต, ๐ถ ๐ต, ๐ท                   ๐ต, ๐ถ    ๐ต, ๐ท              ๐ต, ๐ถ, ๐ท

   The model in the middle is the universal model as it can be embedded into any other model of ๐’ฆ.
When we consider the generalized instance query for the concept ๐ต, the universal model makes
it reasonable to consider the ๐‘Ÿ-successor of ๐‘Ž labeled ๐ถ and the ๐‘Ÿ-successor of ๐‘Ž labeled ๐ท as
singular certain answers to the query. We can describe these answers as the CREs ๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}
and ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}, respectively. Note that we use an ๐’œโ„’๐’žโ„๐’ช concept to describe the CRE, whereas
๐’ฆ is in Horn ๐’œโ„’๐’ž.
 For a CRE to qualify as an answer to a generalized instance query, we impose some require-
ments, following the definition of Toman and Weddell [3] for singularity and certainty:
Definition 2. A generalized instance query is an atomic concept ๐‘„ โˆˆ NC . A set ans(๐‘„) of CREs
is an answer to ๐‘„ over a consistent ontology ๐’ฆ if, for each ๐ธ, ๐ธ โ€ฒ โˆˆ ans(๐‘„),
certainty ๐’ฆ |= ๐ธ โŠ‘ ๐‘„ and |๐ธ โ„ | > 0 for all models โ„ of ๐’ฆ,

singularity |๐ธ ๐’ฐ | = 1 in the universal model ๐’ฐ of ๐’ฆ, and

uniqueness ๐ธ ๐’ฐ ฬธ= ๐ธ โ€ฒ ๐’ฐ in the universal model ๐’ฐ of ๐’ฆ

completeness for every CRE ๐น with ๐’ฆ |= ๐น โŠ‘ ๐‘„ there exists some ๐ธ โˆˆ ans(๐‘„) s.t. ๐ธ โ‰ก ๐น .

   Here, the additional uniqueness property ensures that we do not obtain any duplicate answers,
since we are only interested in the individual (semantically) represented by some CRE rather
than the actual (syntactic) form of the latter. Besides, note that the above definition of singularity
is a weakening of the property defined by Borgida et al. [2], where a CRE was required to denote
a singleton set in all models of the knowledge base. This weakening, however, is essential in
DLs that do not support counting (e.g., through number restrictions or functional roles) such as
๐’œโ„’๐’žโ„๐’ช. Note also that unreachable objects cannot be certain answers. Furthermore, a complex
concept may still be considered as query if it can be represented by means of a new atomic
concept and additional appropriate axioms in the ontology, like โŠค โŠ‘ ๐ด, for instance.

Example 2 (Ex. 1 cont.). Consider again the knowledge base ๐’ฆ of Example 1. Formally, the
set {๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}, ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}} is an answer to the generalized instance query ๐ต as both
concept expressions are singular certain answers that identify different anonymous individuals.
The CRE โˆƒ๐‘Ÿโˆ’ .{๐‘Ž} does not satisfy the singularity requirement and the CRE ๐ถ โŠ“ ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}
violates the certainty requirement. Note that if we were to add ๐ถ โ‰ก ๐ท to ๐’ฏ , the universal
model would be the right-most model of Example 1. In addition to the two CREs from above, also
๐ถ โŠ“ ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž} becomes a certain singular answer. All three CREs, however, identify the same
anonymous individual in the universal model (in different syntactic variants) and our proposed
uniqueness criterion requires that ans(๐‘„) consists of only one of them.


3. Answering Generalized Instance Queries
We propose an algorithm for computing CREs for a generalized instance query given by an
atomic concept ๐‘„ over a (normalized) Horn ๐’œโ„’๐’ž ontology ๐’ฆ = ๐’ฏ โˆช ๐’œ with reduced ABox ๐’œ.
An important aspect of the algorithm is that it employs an OWL reasoner as a black box as only
standard reasoning tasks (e.g., as foreseen in the OWL API [7]) are required and the concrete
approach being applied to answer generalized instance queries is not known to the reasoner.
For example, we retrieve super-concepts of a concept or the most specific concepts (w.r.t. to the
subsumption hierarchy) that an individual is an instance of (corresponding to those concepts in
a reduced ABox). This provides great flexibility concerning the choice of the applied reasoner
and allows for using optimized strategies for different ontologies.
   To construct CREs serving as answers for ๐‘„ we start with a CRE ๐ธ = {๐‘Ž} for each ๐‘Ž โˆˆ inds(๐’ฆ)
and a corresponding current concept ๐ถ๐‘Ž = ๐ด1 โŠ“ . . . โŠ“ ๐ด๐‘› using every ๐ด๐‘– (๐‘Ž) โˆˆ ๐’œ. We then look
for suitable existential restrictions โˆƒ๐‘Ÿ.๐ท which satisfy ๐’ฆ |= ๐ถ๐‘Ž โŠ‘ โˆƒ๐‘Ÿ.๐ท and, thus, allow us to
extend the current CRE to ๐ธ โ€ฒ = ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ธ). This procedure is repeated and each time the
newly related ๐ท is chosen as next current concept until no more appropriate restrictions can be
found to further extend the generated CREs. Here, a current concept ๐ถ does not just enable us
Algorithm 1 Answering generalized instance queries
 procedure GeneralizedInstanceQuery(๐’ฆ, ๐‘„)
    input: ๐’ฆ = ๐’ฏ โˆช ๐’œ: an ontology, ๐‘„: a generalized instance query
    output: ans(๐‘„)
 1: ans โ† โˆ…
 2: for all ๐‘Ž โˆˆ inds(๐’ฆ) do
            d
 3: ๐ถ๐‘Ž โ† ๐ด(๐‘Ž)โˆˆ๐’œ ๐ด
 4: ans โ† ans โˆช getCREs(๐’ฆ, ๐‘„, ๐ถ๐‘Ž , {๐‘Ž}, โˆ…)
 5: return ans


to further expand a CRE, but moreover serves as an indicator if the latter represents an answer
for the query ๐‘„, which is the case if ๐’ฆ |= ๐ถ โŠ‘ ๐‘„ as shown later in Lemma 1.

Initialization: Algorithm 1 describes the main algorithm that constructs, for each individual
๐‘Ž occurring in the knowledge base, the initial current concept and then calls the algorithm to
generate CREs w.r.t. this (base) individual.

Constructing Concept Referring Expressions: Algorithm 2 uses five steps to construct
CREs for a given base individual that serve as answers for the considered query:
   Step 1: At first, in Line 2, we look for suitable existential restrictions โˆƒ๐‘Ÿ.๐ต that allow us to
relate the current concept ๐ถ to the concept ๐ต in order to establish a link to a new anonymous
individual, thus requiring ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต. Since we are interested in producing singular CREs,
a restriction โˆƒ๐‘Ÿ.๐ต should only be considered if there is no โˆƒ๐‘Ÿ.๐ด (using the same role ๐‘Ÿ) such
that ๐’ฆ |= ๐ด โŠ‘ ๐ต. We capture this by introducing reduced (sets of) existential restrictions:

Definition 3 (Reduced Existential Restrictions). Let ๐’ฆ be a normalized Horn ๐’œโ„’๐’ž ontology
and ๐ถ a concept, then ๐’ฆโˆƒ (๐ถ) = {โˆƒ๐‘Ÿ.๐ต | ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต, ๐ต โˆˆ NC , ๐‘Ÿ โˆˆ NR }. We denote with
๐’ฆโˆƒmin (๐ถ) a smallest subset of ๐’ฆโˆƒ (๐ถ) such that, for each โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆƒ (๐ถ) โˆ– ๐’ฆโˆƒmin (๐ถ), there is some
โˆƒ๐‘Ÿ.๐ด โˆˆ ๐’ฆโˆƒmin (๐ถ) s.t. ๐’ฆ |= ๐ด โŠ‘ ๐ต.

   Note that an instance of โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆƒ (๐ถ) โˆ– ๐’ฆโˆƒmin (๐ถ) is connected to more than one instance
of ๐ต, which means that a CRE using this existential restriction is non-singular as it represents
more than one individual. Moreover, if there are two candidates โˆƒ๐‘Ÿ.๐ด, โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆƒ (๐ถ) such that
๐’ฆ |= ๐ด โ‰ก ๐ต, only one of them should be chosen in order to prevent duplicate answers.
   Step 2: For each selected existential restriction that induces a successor for an instance of the
current concept ๐ถ, we next consider universal restrictions that further specify these successors
(see Lines 4โ€“5) and we add these concepts to the set of successor concepts for the role in the
sc relation. After this step, each (๐‘Ÿ, ๐‘†) โˆˆ sc is such that ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.(๐ต1 โŠ“ . . . โŠ“ ๐ต๐‘› ) for
๐‘† = {๐ต1 , . . . , ๐ต๐‘› }.
   Step 3: In this step of the algorithm, we determine those existential restrictions that can
actually be applied to further construct CREs leading to singular, unique answers of the query
and collect them in the set next. If we encounter            d ๐‘†) โˆˆ sc for which we already have
                                                 d an entry (๐‘Ÿ,
another entry (๐‘Ÿ, ๐‘† โ€ฒ ) โˆˆ next such that ๐’ฆ |= ๐ตโˆˆ๐‘† ๐ต โ‰ก ๐ต โ€ฒ โˆˆ๐‘† โ€ฒ ๐ต โ€ฒ , we do not consider (๐‘Ÿ, ๐‘†)
Algorithm 2 Computing CREs for a base individual
 procedure getCREs(๐’ฆ, ๐‘„, ๐ถ, ๐ธ, used)
    input: ๐’ฆ = ๐’ฏ โˆช ๐’œ: an ontology, ๐‘„: a query, ๐ถ: the current concept, ๐ธ: the current CRE,
    used: a set of already applied existential restrictions
    output: CREs from ans(๐‘„) that build upon ๐ธ
 1: // Step 1: find suitable existential restrictions
 2: sc โ† {(๐‘Ÿ, {๐ต}) | โˆƒ๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆƒ      min (๐ถ)}

 3: // Step 2: find related universal restrictions
 4: for all (๐‘Ÿ, ๐‘†) โˆˆ sc do
 5: ๐‘† โ† ๐‘† โˆช {๐ต | โˆ€๐‘Ÿ.๐ต โˆˆ ๐’ฆโˆ€ and ๐’ฆ |= ๐ถ โŠ‘ โˆ€๐‘Ÿ.๐ต}
 6: // Step 3: filter found existential restrictions
 7: next โ† โˆ…
 8: for all (๐‘Ÿ, ๐‘†) โˆˆ sc do
 9: // check for potential duplicates
10: scโ€ฒ โ† {(๐‘Ÿ, ๐‘† โ€ฒ ) โˆˆ next | ๐’ฆ |= ๐ตโˆˆ๐‘† ๐ต โ‰ก ๐ต โ€ฒ โˆˆ๐‘† โ€ฒ ๐ต โ€ฒ }
                                       d             d
11: if scโ€ฒ = โˆ… then
12:    if ๐ธ = {๐‘Ž} then
13:      if {๐‘ | ๐‘Ÿ(๐‘Ž, ๐‘) โˆˆ ๐’œ, โˆ€๐ต โˆˆ ๐‘† : ๐’ฆ |= ๐ต(๐‘)} = โˆ… then
14:        next โ† next โˆช {(๐‘Ÿ, ๐‘†)}
15:    else
16:      // look for cycle in current CRE
17:      if (๐‘Ÿ, ๐‘†) โˆˆ used then
18:        ๐ธ โ† markCycle(๐ธ, ๐‘Ÿ, ๐‘†)
19:      else
20:        next โ† next โˆช {(๐‘Ÿ, ๐‘†)}
21: // Step 4: check if current CRE is an answer
22: ans โ† โˆ…
23: if ๐’ฆ |= ๐ถ โŠ‘ ๐‘„ then
24: ans โ† ans โˆช {๐ธ}
25: // Step 5: extend CRE recursively
26: for all (๐‘Ÿ, ๐‘†) โˆˆ next do
27: ๐ถ โ€ฒ โ† ๐ตโˆˆ๐‘† ๐ต
              d
28: ๐ธ โ€ฒ โ† ๐ถ โ€ฒ โŠ“ โˆƒ๐‘Ÿ โˆ’ .(๐ธ)
29: ans โ† ans โˆช getCREs(๐’ฆ, ๐‘„, ๐ถ โ€ฒ , ๐ธ โ€ฒ , used โˆช {(๐‘Ÿ, ๐‘†)})
30: return ans


any further to prevent duplicate answers, i.e., if the check in Line 11 fails, the entry (๐‘Ÿ, ๐‘†) is
not added to the set next. Even though we already consider only the reduced set of existential
restrictions in Step 1, this check is still necessary. Consider, for example, that ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ด
and ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต for the current concept ๐ถ and ๐’ฆ ฬธ|= ๐ด โ‰ก ๐ต. Hence, we have two entries
(๐‘Ÿ, {๐ด}) and (๐‘Ÿ, {๐ต}) โˆˆ sc. Assume, furthermore, that ๐’ฆ |= ๐ถ โŠ‘ โˆ€๐‘Ÿ.๐ด and ๐’ฆ |= ๐ถ โŠ‘ โˆ€๐‘Ÿ.๐ต.
This means, however, that for both (๐‘Ÿ, {๐ด}), (๐‘Ÿ, {๐ต}) โˆˆ sc, we have ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.(๐ด โŠ“ ๐ต).
   To guarantee the uniqueness property, an existential restriction should also be ignored for
the initial extension of a CRE only consisting of a nominal {๐‘Ž} if the existential restriction is
already satisfied through a (named) individual ๐‘. This situation is handled in Line 13. Note that
due to the tree model property this situation can only occur for the initial CREs.
   For cyclic knowledge bases, it may be the case that for two, possibly equal, current concepts ๐ถ
and ๐ถ โ€ฒ that occur during the algorithm processing, the same existential restriction is applicable.
This may lead to an arbitrarily long repetition of the same sequence of existential restrictions
that originally led from ๐ถ to ๐ถ โ€ฒ , hence, giving rise to an infinitely large number of possible CREs
(of increasing length). Analogous to blocking in tableau algorithms [4], we ensure termination
of the algorithm by detecting such cycles and by preventing the reuse of an already processed
existential restriction. Instead, we compactly represent such CREs using a cycle notation in
form of surrounding square brackets "[...]+ " to mark the sequence that may occur several times
repeatedly (indicated by a call to the procedure markCycle in Line 18). For an example, consider
๐’ฆ = ๐’ฏ โˆช ๐’œ with ๐’ฏ = {๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ด} and ๐’œ = {๐ด(๐‘Ž)}, where [๐ด โŠ“ โˆƒ๐‘Ÿโˆ’ .(]+ {๐‘Ž}) is an answer
for the generalized instance query ๐ด and serves as single representative for the infinitely many
CREs of the form ๐ด โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ด โŠ“ โˆƒ๐‘Ÿโˆ’ .(...{๐‘Ž}).
   Step 4: In this step, we check if the current CRE already represents an answer for the query
๐‘„. For a standard instance query, this would usually require to check ๐’ฆ |= ๐‘„(๐‘Ž) for a named
individual ๐‘Ž, but since we are considering anonymous individuals represented by some CRE ๐ธ,
we want to know whether ๐’ฆ |= ๐ธ โŠ‘ ๐‘„. This is actually realized by checking ๐’ฆ |= ๐ถ โŠ‘ ๐‘„ in
Line 23 for the related current concept ๐ถ based on the following lemma.

Lemma 1. Given a Horn ๐’œโ„’๐’ž ontology ๐’ฆ, a generalized instance query ๐‘„, a CRE ๐ธ and its
associated (Horn ๐’œโ„’๐’ž) current concept ๐ถ, where ๐ถ = ๐ด1 โŠ“ . . . โŠ“ ๐ด๐‘› using every ๐ด๐‘– (๐‘Ž) โˆˆ ๐’œ if
๐ธ = {๐‘Ž}, or ๐ถ = ๐ท if ๐ธ = ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .(...), then ๐’ฆ |= ๐ธ โŠ‘ ๐‘„ โ‡” ๐’ฆ |= ๐ถ โŠ‘ ๐‘„.

Proof sketch. (โ‡’) ๐’ฆ |= ๐ถ โŠ‘ ๐‘„ requires that ๐ถ is at least as specific as ๐‘„. Therefore, we
show that the current concept ๐ถ is actually the most specific (Horn ๐’œโ„’๐’ž) concept such that
๐’ฆ |= ๐ธ โŠ‘ ๐ถ and that for any other most specific candidate ๐ท with ๐’ฆ |= ๐ธ โŠ‘ ๐ท, we have
๐’ฆ |= ๐ถ โ‰ก ๐ท, which is mainly based on the fact that we only work with reduced existential
restrictions (see Algorithm 2, Line 2 and Definition 3).
(โ‡) Per definition, a current concept ๐ถ of some CRE ๐ธ always satisfies ๐’ฆ |= ๐ธ โŠ‘ ๐ถ.

   Step 5: Even if an answer has been identified, the processing of the CRE does still proceed,
because it may be possible that the continuing construction leads to another answer later on.
Therefore, the algorithm is eventually called recursively using appropriately updated arguments,
like an extended CRE ๐ธ โ€ฒ and the new current concept ๐ถ โ€ฒ .

3.1. Properties
In the following, we discuss why the described algorithm works properly in the way that
for a generalized instance query, only certain, singular, and unique answers are produced
w.r.t. Definition 2. Additionally, we show that the algorithm returns every such answer and,
furthermore, always terminates. Due to space limitations, we mainly provide proof sketches,
while more details are given in an online appendix.1

Certainty. In Line 23 of Algorithm 2, a CRE ๐ธ is selected as answer if ๐’ฆ |= ๐ถ โŠ‘ ๐‘„, which
implies ๐’ฆ |= ๐ธ โŠ‘ ๐‘„ according to Lemma 1. In addition, a CRE basically describes a chain of
linked individuals where the existence of one individual ensures the presence of the next one
being referenced through an existential restriction. Since we start with an explicitly present
individual as base for ๐ธ, there is always at least one element in ๐ธ โ„ for each model โ„.

Singularity. For an initial CRE ๐ธ = {๐‘Ž}, we directly get |๐ธ ๐’ฐ | = 1 for the universal model
๐’ฐ. In the more general case ๐ธ = ๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ธ โ€ฒ ) with ๐ธ โ€ฒ as the (recursive) part leading to the
base individual, we show that the non-singular situation |๐ธ ๐’ฐ | > 1 is never achieved: This
would require that there is some part ๐ถ โ€ฒ โŠ“ โˆƒ๐‘Ÿโ€ฒโˆ’ .(๐ธ โ€ฒโ€ฒ ) in ๐ธ where one individual from ๐ธ โ€ฒโ€ฒ๐’ฐ
has ๐‘Ÿโ€ฒ -relations to (at least) two different individuals in ๐ถ โ€ฒ๐’ฐ . The necessary conditions for this
are described in Lemma 2 below (with ๐ถ โ€ฒ as ๐ท), but since the algorithm only selects reduced
existential restrictions w.r.t. Definition 3, i.e., some โˆƒ๐‘Ÿ.๐ต is only chosen if there is no candidate
โˆƒ๐‘Ÿ.๐ด with ๐’ฆ |= ๐ด โŠ‘ ๐ต, this situation never occurs.

Lemma 2. Let ๐’ฆ be a Horn ๐’œโ„’๐’ž ontology, ๐’ฐ the universal model of ๐’ฆ, and assume that
โŸจ๐‘, ๐‘ŽโŸฉ, โŸจ๐‘, ๐‘โŸฉ โˆˆ ๐‘Ÿ๐’ฐ for some role ๐‘Ÿ, and (anonymous individuals) ๐‘Ž, ๐‘ โˆˆ ๐ท๐’ฐ for some concept
๐ท. We have ๐‘Ž ฬธ= ๐‘ iff there exist some concepts ๐ด, ๐ต, ๐ถ such that ๐‘ โˆˆ ๐ถ ๐’ฐ , ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ด,
๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต, ๐’ฆ |= ๐ด โŠ‘ ๐ท, ๐’ฆ |= ๐ต โŠ‘ ๐ท, and ๐’ฆ ฬธ|= ๐ด โ‰ก ๐ต.

Proof sketch. This basically follows from the characteristics of the universal model ๐’ฐ.

Uniqueness. Considering the tree-like universal model ๐’ฐ, having two different CREs ๐ธ, ๐ธ โ€ฒ
with ๐ธ ๐’ฐ = ๐ธ โ€ฒ๐’ฐ means that i) they both describe the same path in the tree using different
concepts or ii) one refers to a sub-path of the other leading to the same anonymous individual:
i) Because an edge in a tree path relates to some โˆƒ๐‘Ÿ.๐ต, ๐ธ โ€ฒ might describe the same path as
๐ธ if it uses some โˆƒ๐‘Ÿ.๐ต โ€ฒ with ๐ต โŠ‘ ๐ต โ€ฒ (including ๐ต โ‰ก ๐ต โ€ฒ ) for the associated edge instead.
In Algorithm 2, this is prevented in Lines 2 and 11. ii) Assuming that ๐‘Ž and ๐‘Žโ€ฒ are the base
individuals of ๐ธ and ๐ธ โ€ฒ , respectively, there must be some ๐‘Ÿ(๐‘Ž, ๐‘Žโ€ฒ ) โˆˆ ๐’œ such that ๐ธ โ€ฒ refers to a
sub-path of ๐ธ. If ๐‘Žโ€ฒ๐’ฐ โˆˆ ๐ต ๐’ฐ for some ๐ต, some โˆƒ๐‘Ÿ.๐ต would describe the same edge as ๐‘Ÿ(๐‘Ž, ๐‘Žโ€ฒ ),
given that ๐‘Ž๐’ฐ โˆˆ (โˆƒ๐‘Ÿ.๐ต)๐’ฐ , which is why such existential restrictions are not chosen (Line 13).

Completeness. It is not required to construct every possible CRE that represents an answer
for the stated query, but rather to find one unique, singular CRE for every, possibly anonymous,
individual that serves as an answer. For that, we assume there may exist some CRE ๐ธ describing
a correct, singular and unique answer, but which cannot be constructed by the algorithm and
prove that this is not possible: Since the considered ๐ธ is not generated by the algorithm, it
must possess a unique subpart that does not occur in any of the constructed answers. As
we already consider all individuals during the initialization, this part must be of the form
"๐ถ โ€ฒ โŠ“ โˆƒ๐‘Ÿโˆ’ .(๐ถ โŠ“ . . .)", where the existential restriction โˆƒ๐‘Ÿ.๐ถ โ€ฒ was applied for the concept ๐ถ. By

1
    https://www.uni-ulm.de/fileadmin/website_uni_ulm/iui.inst.090/Publikationen/2022/IlGl22a_appendix.pdf
considering all the cases in the algorithm where ๐ถ is not used as current concept or โˆƒ๐‘Ÿ.๐ถ โ€ฒ is
not selected (or constructed), respectively, we can see that either the certainty, singularity or
uniqueness property would be violated by regarding ๐ธ as an answer. Besides, the employed
cycle notation may be seen as a compromise to implicitly return a possibly infinite number of
CREs in form of just one compact instance in order to ensure the termination of the algorithm.

Termination. As the given ontology always consists of a finite number of axioms and concepts,
all the applied search and iteration processes come to a halt. Therefore, non-termination may
only be realized by the repeated usage of an existential restriction during a recursive call, which,
however, is prevented by the cycle detection in Line 17 of Algorithm 2.


4. Implementation and Evaluation
A prototypical Java implementation of our algorithm is available online on GitHub2 and also
includes some optimizations. One of them refers to the ordering of existential and universal
restrictions by means of separate subsumption hierarchies. This allows us to reduce the number
of performed subsumption tests in the way that if, for example, ๐’ฆ ฬธ|= ๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ต, we do not
need to further check ๐’ฆ |= ๐ถ โŠ‘ โˆƒ๐‘Ÿโ€ฒ .๐ด for any โˆƒ๐‘Ÿโ€ฒ .๐ด that is below โˆƒ๐‘Ÿ.๐ต in the hierarchy,
i.e., for which ๐’ฆ |= โˆƒ๐‘Ÿโ€ฒ .๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ต. Usually, reasoners are only concerned with computing
subsumption hierarchies for atomic concepts, referred to as classification (see e.g. [4]). Therefore,
we introduce a new atomic concept for each existential (or universal) restriction such that the
reasoner can (internally) compute the hierarchies for those new atomic concepts, based on
which the actual hierarchies for existential (or universal) restrictions can then be derived. Due
to the associated additional overhead, we also implemented the (explicit) Enhanced Traversal
Method [8] to efficiently deal with smaller number of elements.
   Another improvement is achieved by combining base individuals into one set if they share an
equivalent initial current concept and furthermore possess analogous given role assertions. Since
these properties result in the same computations for each individual, we can avoid unnecessary
repetitions by calling the algorithm only once for such a combined set and just generate different
versions of the constructed CRE for each base individual in the end.
   For a general performance assessment, the implementation was tested on different ontologies
listed in Table 1 with codinteraction-A and the separate ore_ont_2608/4516/3313 being part of
the ORE 2015 Reasoner Competition Corpus [9], while HAO (v2021-03-05), VO (v1.1.171) and
DTO (v1.1.1) were taken from BioPortal3 . If necessary, axioms not adhering to Horn ๐’œโ„’๐’ž were
removed and ontologies merged with their imports to obtain one complete ontology.
   Tests showed that subsumption checking has a huge influence on the performance, which
is why we considered two different reasoners, namely HermiT4 (v1.3.8) and JFact5 (v5.0.3). In
general, HermiT achieved better runtime results than JFact, except for the larger ontologies VO
and DTO, where JFact was faster in determining the earlier mentioned subsumption hierarchies.

2
  https://github.com/M-Illich/Computing-CREs.git
3
  https://bioportal.bioontology.org/ontologies
4
  http://www.hermit-reasoner.com/
5
  http://jfact.sourceforge.net/
Table 1
Overview of adapted ontologies with their number of atomic concepts (#con), individuals (#ind), com-
bined individual sets (#sets), considered existential/universal restrictions (#โˆƒ/#โˆ€), average runtime in
ms for โŠค as query, number of answers (#ans) and their percentage distribution w.r.t. their depth, i.e.,
number of contained inverse existential restrictions (* no subsumption hierarchies)
Ontology              #con #ind #sets        #โˆƒ #โˆ€ Runtime #ans          depth (%)
                                                      [ms]          0 1 2-4 5-7 8-10 11-12
codinteraction-A          6    21     13     3   0       71    30 70 30   -   -    -     -
ore_ont_2608            453 212      212    10 10       265 279 76 24     -   -    -     -
ore_ont_4516            509 217      217    10 10       185 284 76 24     -   -    -     -
ore_ont_3313            865 2,070    858 114     0   57,966 9,734 21 8 34 27       9     1
HAO                   2,548 2,764    687 522     1   37,965 2,764 100 -   -   -    -     -
VO                    6,828 167        4 1,513 237    460* 176 95 5       -   -    -     -
DTO                  10,075     3      1 7,958 114    738*      3 100 -   -   -    -     -


Since both VO and DTO, however, possess many existential restrictions but only few individuals
(especially in form of combined sets), the performance was actually best if no subsumption
hierarchies are computed. Table 1 shows the obtained runtime measurements (average of five
runs) for computing all CREs, i.e., using โŠค as query, with HermiT as reasoner, based on an AMD
Ryzen 7 3700X 3.59 GHz processor with 16 GB RAM on Windows 10 (64-Bit).


5. Related Work
Krahmer and van Deemter [6] provide a survey on referring expressions and their computa-
tional generation, given by a historic overview of research developments. Here, the problem
of generating referring expressions is defined as finding a description in form of combined
properties that uniquely identify a given individual in a certain context, which differs from
our work in that we intend to describe unknown, anonymous individuals rather than explicitly
named ones.
   A related approach that works with DL is provided by Areces et al. [1] introducing an
algorithm that takes as input a model โ„ณ and tries to find formulas (concepts) that represent
the individuals in โ„ณ. The basic idea here is that, starting with โŠค as formula for which the
interpretation contains every domain element, new formulas are created by successively refining
the already given ones through appending new conjuncts with the intention to reduce the
number of elements in the formulasโ€™ interpretations. At first, only atomic concepts are added as
conjuncts, before existential restrictions relating to already computed formulas are considered
until the formulasโ€™ interpretations are singular or no further adaptations are possible.
   Unlike in our work, the individuals to be described are already stated and the occurrence
of cycles in referring expressions is not considered. However, both approaches make use of
relations between individuals, but with the distinction that Areces et al. start from the individual
๐‘Ž described by the referring expression and go to another one ๐‘, while we utilize the inverse
case where another (present) individual ๐‘ refers to ๐‘Ž, the (anonymous) one being described.
   The work that comes closest to ours is given by Toman and Weddell [3] dealing with (gen-
eralized) instance retrieval queries on (Horn ๐’œโ„’๐’ž) ontologies, based on a tree automaton for
which a state ๐‘† is given by a set of atomic concepts, while transitions are defined by so-called
matching tuples (๐‘†, ๐‘†0 , ..., ๐‘†๐‘˜ ) stating that some ๐‘†๐‘– with 0 โฉฝ ๐‘– โฉฝ ๐‘˜ can be reached when
in the current state ๐‘† a certain existential restriction โˆƒ๐‘Ÿ๐‘– .๐ถ๐‘– is applied. Starting with a set
๐‘†๐‘Ž = {๐ด | ๐ด(๐‘Ž) โˆˆ ๐’œ} for some individual ๐‘Ž, a tree can be unfolded using those matching
tuples. In order to answer a query  d ๐ต, this tree is traversed beginning with some ๐‘†๐‘Ž until a
state ๐‘†๐‘˜ is reached where ๐’ฆ |= ๐ดโˆˆ๐‘†๐‘˜ ๐ด โŠ‘ ๐ต, for which the sequence ๐‘Ÿ1 ๐ด1 ...๐‘Ÿ๐‘˜ ๐ด๐‘˜ (called
certain path) referring to the applied existential restrictions is then transformed into a CRE
๐ด๐‘˜ โŠ“ โˆƒ๐‘Ÿ๐‘˜โˆ’ .(... ๐ด1 โŠ“ โˆƒ๐‘Ÿ1โˆ’ .{๐‘Ž}) serving as answer for ๐ต.
   Like in our algorithm, the tree automaton starts the construction of CREs with a given
individual and also takes care of occurring cycles. Furthermore, both approaches only use
minimal existential restrictions to ensure singularity, although the definitions for minimality are
different: While Toman and Weddell rely on the general subsumption of the whole restriction
โˆƒ๐‘Ÿ.๐ถ, we consider subsumption for the inner concept ๐ถ instead w.r.t. Definition 3, because only
then singularity can be guaranteed (based on Lemma 2). In this respect, the tree automaton
approach does not fulfill the completeness property due to only working with the most specific
existential restrictions, even though some other ones might lead to singular CREs, too, as
illustrated in Example 3 below. Besides, we are only interested in producing unique CREs, which
is not explicitly handled by Toman and Weddell.
Example 3. Let ๐’ฆ = ๐’ฏ โˆช ๐’œ consist of ๐’œ = {๐ด(๐‘Ž)} and ๐’ฏ = {๐ด โŠ‘ โˆƒ๐‘Ÿ.๐ถ, โˆƒ๐‘Ÿ.๐ถ โŠ‘
๐ต, ๐ต โŠ‘ โˆƒ๐‘Ÿ.๐ท, ๐ถ โŠ‘ ๐‘„, ๐ท โŠ‘ ๐‘„}. For the generalized instance query ๐‘„, we get ans(๐‘„) =
{๐ถ โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}, ๐ท โŠ“ โˆƒ๐‘Ÿโˆ’ .{๐‘Ž}}, even though โˆƒ๐‘Ÿ.๐ท is not minimal w.r.t. subsumption due to
๐’ฆ |= โˆƒ๐‘Ÿ.๐ถ โŠ‘ โˆƒ๐‘Ÿ.๐ท.


6. Conclusion
We presented an algorithm that generates concept referring expressions to identify both named
and anonymous individuals serving as certain, singular and unique answers for an instance
retrieval query on a Horn ๐’œโ„’๐’ž ontology. For that, we start with a named individual and
recursively determine appropriate existential restrictions to create a chain of relations that
defines an implicitly given individual based on its connection to an explicit one. A crucial factor
here is the selection based on reduced existential restrictions (cf. Definition 3), enabling us
to construct every desired singular CRE. Besides, the black box consideration of the applied
reasoner provides more flexibility in choosing optimal reasoning approaches for different
situations.

Future Research. In general, the question which reasoner should be selected for the consid-
ered ontologies is not an easy one and requires further investigations. On the other side, there
may be some advantages of directly integrating the algorithm into a particular reasoner as the
algorithm can then utilize the reasonerโ€™s internal data structures and be optimized accordingly.
Moreover, it may be of interest to consider extensions of Horn ๐’œโ„’๐’ž, too. While some features
might be already supported with none or only small adaptations in the algorithm, others prob-
ably require more examination, especially since we might lose the existence of the tree-like
universal model and thus can no longer make use of our original singularity definition.
References
[1] C. Areces, A. Koller, K. Striegnitz, Referring Expressions as Formulas of Description Logic,
    in: Proceedings of the Fifth International Natural Language Generation Conference, 2008,
    pp. 42โ€“49. doi:https://doi.org/10.3115/1708322.1708332.
[2] A. Borgida, D. Toman, G. Weddell, On Referring Expressions in Query Answering over First
    Order Knowledge Bases, in: Proceedings of the 15th International Conference on Principles
    of Knowledge Representation and Reasoning (KRโ€™16), AAAI Press, 2016, pp. 319โ€“328. URL:
    https://www.aaai.org/ocs/index.php/KR/KR16/paper/viewFile/12860/12488.
[3] D. Toman, G. Weddell, Finding ALL Answers to OBDA Queries Using Referring Expressions,
    in: AI 2019: Advances in Artificial Intelligence, Springer International Publishing, Cham,
    2019, pp. 117โ€“129. doi:https://doi.org/10.1007/978-3-030-35288-2_10.
[4] F. Baader, D. Calvanese, D. McGuinness, P. Patel-Schneider, D. Nardi, et al., The Description
    Logic Handbook: Theory, implementation and applications, Cambridge University Press,
    2003.
[5] Y. Kazakov, Consequence-Driven Reasoning for Horn ๐’ฎโ„‹โ„๐’ฌ Ontologies, in: Proc. of
    the 21st Int. Joint Conf. on Artificial Intelligence, IJCAI 2009, 2009, pp. 2040โ€“2045. URL:
    https://www.ijcai.org/Proceedings/09/Papers/336.pdf.
[6] E. Krahmer, K. van Deemter, Computational Generation of Referring Expressions: A Survey,
    Computational Linguistics 38 (2012) 173โ€“218. doi:https://doi.org/10.1162/COLI_
    a_00088.
[7] M. Horridge, S. Bechhofer, The OWL API: A Java API for OWL ontologies, Semant. Web 2
    (2011) 11โ€“21. doi:https://doi.org/10.3233/SW-2011-0025.
[8] F. Baader, B. Hollunder, B. Nebel, H.-J. Profitlich, E. Franconi, An Empirical Analysis of
    Optimization Techniques for Terminological Representation Systems, Applied Intelligence
    4 (1994) 109โ€“132. doi:https://doi.org/10.22028/D291-24994.
[9] N. Matentzoglu, B. Parsia, ORE 2015 Reasoner Competition Corpus, 2015. URL: https:
    //doi.org/10.5281/zenodo.18578. doi:10.5281/zenodo.18578.