=Paper= {{Paper |id=Vol-3263/paper-9 |storemode=property |title=Exact Learning of ELI Queries in the Presence of DL-Lite-Horn Ontologies |pdfUrl=https://ceur-ws.org/Vol-3263/paper-9.pdf |volume=Vol-3263 |authors=Maurice Funk,Jean Christoph Jung,Carsten Lutz |dblpUrl=https://dblp.org/rec/conf/dlog/FunkJL22 }} ==Exact Learning of ELI Queries in the Presence of DL-Lite-Horn Ontologies== https://ceur-ws.org/Vol-3263/paper-9.pdf
Exact Learning of ℰℒℐ Queries in the Presence of
DL-Lite-Horn Ontologies
Maurice Funk1 , Jean Christoph Jung2 and Carsten Lutz1
1
    Leipzig University
2
    University of Hildesheim


                                         Abstract
                                         Learning, in Angluin’s framework of exact learning, a query in the presence of a description logic ontology
                                         often involves as a crucial (iterated) step the generalization of a hypothesis query. This may be achieved,
                                         for example, by constructing a least general generalization of the hypothesis and a counterexample that
                                         was provided by the oracle. In this research note, we observe that it may pay off to resort to a more
                                         liberal construction that uses the counterexample as a guide to produce a generalization of the hypothesis
                                         while not necessarily achieving a generalization of the counterexample. We use this approach to show
                                         polynomial time learnability of ℰℒℐ concept queries (ELIQs) in the presence of ontologies which are
                                         formulated in a mild restriction of DL-Liteℱ  horn .

                                         Keywords
                                         Exact Learning, Least General Generalizations, DL-Lite-Horn




1. Introduction
Various forms of learning description logic (DL) concepts, ontologies, and queries have been
studied in the literature, including PAC learning [1, 2, 3], the construction of the least common
subsumer (LCS) and the most specific concept (MSC) [4, 5, 6, 7, 8], and learning from labeled
data examples [9, 10, 11, 12, 13, 14]. In this research note, we consider Angluin’s framework of
exact learning where a learner interacts in a game-like fashion with an oracle [15, 16]. The main
aim is to find an algorithm that enables the learner to construct the target object in polynomial
time based on queries that they pose to the oracle, even when the oracle does not answer the
queries in the most informative way.
   The interest in exact learning in DLs started with an investigation of ontology learning in
(the conference version of) [17], see also [18, 19] and the survey [20]. This was complemented
by studies of exactly learning DL concepts and queries: learning ℰℒℐ concept queries (ELIQs)
without ontologies is considered in [21] while [22] studies learning ℰℒ concept queries (ELQs),
ELIQs, and restricted forms of conjunctive queries (CQs) in the presence of ℰℒ and ℰℒℐ
ontologies. Very recently, [23, 24] has investigated learning ELIQs in the presence of ontologies
formulated in the DL-Lite dialects DL-Liteℋ and DL-Liteℱ − where the ‘−’ indicates a restriction
on the use of inverse functional roles.

   DL 2022: 35th International Workshop on Description Logics, August 7–10, 2022, Haifa, Israel
$ mfunk@informatik.uni-leipzig.de (M. Funk); jungj@uni-hildesheim.de (J. C. Jung);
clu@informatik.uni-leipzig.de (C. Lutz)
                                       © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    CEUR
    Workshop
    Proceedings
                  http://ceur-ws.org
                  ISSN 1613-0073
                                       CEUR Workshop Proceedings (CEUR-WS.org)
   To explain the contribution of this article, let us introduce exact learning of queries in the
presence of DL ontologies. Learner and oracle both know and agree on the ontology 𝒪 and
they also agree on the target query 𝑞𝑇 to use only concept and role names from 𝒪. There
are two kinds of queries that the learner may pose to the oracle. In a membership query,
the learner provides an ABox 𝒜 and a candidate answer 𝑎      ¯ and asks whether 𝒜, 𝒪 |= 𝑞𝑇 (𝑎  ¯);
the oracle faithfully answers “yes” or “no”. In an equivalence query, the learner provides a
hypothesis query 𝑞𝐻 and asks whether 𝑞𝐻 is equivalent to 𝑞𝑇 under 𝒪; the oracle answers
“yes” or provides a counterexample, that is, an ABox 𝒜 and tuple 𝑎     ¯ such that 𝒜, 𝒪 |= 𝑞𝑇 (𝑎¯)
and 𝒜, 𝒪 ̸|= 𝑞𝐻 (𝑎 ¯) (positive counterexample) or vice versa (negative counterexample). Then,
polynomial time learnability means that there is a learning algorithm that constructs 𝑞𝑇 (𝑥    ¯ ),
up to equivalence w.r.t. 𝒪, such that at any given time, the running time of the algorithm is
bounded by a polynomial in the sizes of 𝑞𝑇 , 𝒪, and the largest counterexample given by the
oracle so far. A weaker requirement is polynomial query learnability where only the sum of
the sizes of the queries posed to the oracle up to the current time has to be bounded by such a
polynomial.
   We next describe, on an informal level, how a typical learning algorithm works. The described
strategy has been used, e.g., to learn CQs and mappings in database theory [25, 26], LTL
formulas [27], as well as ontologies and queries in a DL context [17, 22, 24]. The algorithm
constructs a sequence
                                     𝑞0 ⊆𝒪 𝑞1 ⊆𝒪 𝑞2 ⊆𝒪 · · ·
of increasingly general hypothesis queries, where ‘⊆𝒪 ’ denotes query containment under the
ontology 𝒪. It maintains the invariant that 𝑞𝑖 ⊆𝒪 𝑞𝑇 for all 𝑖 ≥ 0 where 𝑞𝑇 is the target query
to be learned (only known to the oracle). As the initial hypothesis 𝑞0 , one constructs a very
strong query which implies any possible 𝑞𝑇 . If, for example, the query language is unary CQs
and the ontology 𝒪 does not express any disjointnesses between concept and role names, then
this query might be the single-variable query that has atoms 𝐴(𝑥) and 𝑟(𝑥, 𝑥) for all concept
names 𝐴 and role names 𝑟. If a more restricted query class such as ELIQs is used or the ontology
expresses disjointness constraints, then the construction of the initial hypothesis might be more
subtle and also involve interaction with the oracle, see [22, 24].
   To move from hypothesis 𝑞𝑖 to 𝑞𝑖+1 , the algorithm repeatedly employs a suitable generaliza-
tion strategy, which may be viewed as the heart of the learning algorithm. In the literature, one
finds two main such strategies. To describe them, assume that the class of target queries 𝒬 is a
class of CQs such as all CQs or all ELIQs.
   When query and ontology language are sufficiently restricted, it may happen that the set
of all possible least general generalizations of the current hypothesis 𝑞𝐻 can be computed in
polynomial time. Then, membership queries to the oracle can be used to identify a generalization
that implies the target, and the algorithm does not need to use equivalence queries at all. This
strategy has been used, for example, to learn ELIQs in the presence of ontologies formulated
in DL-Liteℋ and DL-Liteℱ − [21, 24], but it already fails for learning ELIQs under DL-Litehorn
ontologies [24]; recall that the ·horn subscript indicates the presence of conjunction. The second
strategy is to pose the current hypothesis as an equivalence query to the oracle, and to then
construct a least general generalization of the hypothesis 𝑞𝐻 and the returned counterexample
    ¯), which must be positive since the hypothesis is contained in the target. What we mean
(𝒜, 𝑎
here is a 𝒬-LGG, that is, a query 𝑝 such that 𝑞𝐻 ⊆𝒪 𝑝, 𝑞𝒜 ⊆𝒪 𝑝 where 𝑞𝒜 is 𝒜 viewed as
a CQ with answer variables 𝑎    ¯, and 𝑝 ⊆𝒪 𝑝′ for every 𝑝′ ∈ 𝒬 with 𝑞𝐻 ⊆𝒪 𝑝′ and 𝑞𝒜 ⊆𝒪 𝑞 ′ .
This approach has been used to learn unrestricted CQs without ontologies [21] and to learn
syntactically restricted CQs under ℰℒ ontologies [22].
   The aim of this note is to introduce a variation of the second approach to generalization,
and to demonstrate its usefulness by devising a polynomial time learning algorithm for ELIQs
in the presence of DL-Liteℱ    −
                             horn ontologies. In theory, a natural way to construct a 𝒬-LGG of
the hypothesis 𝑞𝐻 and counterexample 𝒜 is to build the universal models of 𝑞𝐻 (viewed as an
ABox) and of 𝒜 under the ontology 𝒪, and to then take their direct product.1 If we interpret
‘universal model’ as meaning homomorphism universal,2 then such models are infinite and
thus the described construction cannot be used in a learning algorithm. But homomorphism
universality is not strictly required to obtain a 𝒬-LGG when 𝒬 is not the class of all CQs. We
may then use 𝒬-universal models which only require that for every query 𝑞 in the target query
language 𝒬, the answers to 𝑞 on 𝒜 under 𝒪 coincide with the answers to 𝑞 on the universal
model. Sometimes, it is possible to construct finite 𝒬-universal models, an approach that has
been used successfully to learn a restricted form of CQs in the presence of ℰℒ ontologies [22]. To
learn ELIQs under DL-Lite ontologies, however, this approach fails since finite ELIQ-universal
models are not guaranteed to exist (even for non-branching ELIQs).

Example 1. Let 𝒪 = {⊤ ⊑ ∃𝑟.⊤} and 𝒜 = {𝐴(𝑎)}. The homomorphism-universal model of 𝒜
and 𝒪 is 𝒜 extended with an infinite 𝑟-path 𝑟(𝑎, 𝑎1 ), 𝑟(𝑎1 , 𝑎2 ), . . . . Any ELIQ-universal model
also needs such a path, and thus the only chance to obtain a finite ELIQ-universal model is to reuse
individuals on the path. But such a model cannot be ELIQ-universal: if 𝑎𝑛 = 𝑎𝑚 , with 𝑛 < 𝑚,
then 𝑎 is an answer to the ELIQ ∃𝑟𝑚 .∃(𝑟− )𝑛 .𝐴 on the universal model, but not on 𝒜 under 𝒪.

  Of course, there could potentially be ways to construct a 𝒬-LGG other than taking the direct
product of 𝒬-universal models. In the presence of DL-Lite ontologies, though, the 𝒬-LGG is not
guaranteed to exist when 𝒬 is the class of CQs or the class of non-branching ELIQs extended
with reflexive role atoms. For non-extended ELIQs, the existence of 𝒬-LGGs remains open.

Example 2. Let 𝒪 = {∃𝑟− .⊤ ⊑ ∃𝑟.⊤, ∃𝑟− .⊤ ⊑ ∃𝑠.⊤}. Consider the unary CQs

𝑝(𝑥) = ∃𝑦∃𝑧 𝑟(𝑥, 𝑥) ∧ 𝑠(𝑥, 𝑦) ∧ 𝑠(𝑧, 𝑦) ∧ 𝑟(𝑧, 𝑧) ∧ 𝐴(𝑧)                        and     𝑞(𝑥) = ∃𝑦 𝐴(𝑥) ∧ 𝑟(𝑥, 𝑦).

We claim that no ELIQ-LGG of 𝑝 and 𝑞 exists, and thus also no 𝒬-LGG for any query class 𝒬
that contains all ELIQs. To see this, assume that the CQ 𝑞(𝑥) ̂︀    is an ELIQ-LGG of 𝑝 and 𝑞, and
consider all ELIQs of the form 𝑞𝑛,𝑚 = ∃𝑟𝑛 .∃𝑠.∃𝑠− .∃(𝑟− )𝑚 .𝐴 with 𝑛, 𝑚 ≥ 1. It is easy to see
that 𝑝 ⊆𝒪 𝑞𝑛,𝑚 and 𝑞 ⊆𝒪 𝑞𝑛,𝑚 if and only if 𝑛 = 𝑚, thus 𝑞̂︀ ⊆𝒪 𝑞𝑛,𝑚 if and only if 𝑛 = 𝑚.
For all 𝑖 ≥ 1, take a homomorphism ℎ𝑖 from 𝑞𝑖,𝑖 to the homomorphism universal model 𝒰 of 𝑞̂︀
(viewed as an ABox) and 𝒪; this model is defined in detail in Section 2. If for some 𝑖, ℎ𝑖 maps
two distinct variables in the ‘∃𝑟𝑖 ’ prefix of 𝑞𝑖,𝑖 to the same element of 𝒰, then an easy pumping
argument shows that 𝑞̂︀ ⊆𝒪 𝑞𝑗,𝑖 for some 𝑗 > 𝑖, a contradiction. Otherwise, there is some 𝑖 ≥ 1
such that ℎ𝑖 chooses as the 𝑠-successor required by the ‘∃𝑠’ infix in 𝑞𝑖,𝑖 an element of 𝒰 that was
1
    This may yield a CQ that does not fall within 𝒬, but there are strategies for the learning algorithm to deal with this.
2
    that is, a universal model of an ABox 𝒜 and 𝒪 admits a homomorphism into every model of 𝒜 and 𝒪.
generated by an existential quantifier, that is, it is in the tree-shaped ‘anonymous’ part of 𝒰. Since
𝑞𝑖,𝑖 is rooted, the ℎ𝑖 -homomorphic image of the ‘∃𝑟𝑖 ’ prefix of 𝑞𝑖,𝑖 enters the anonymous part from
the same non-anonymous element 𝑦 where it also leaves it to eventually reach an element that
satisfies 𝐴 (there are no such elements in the anonymous part). Thus 𝑦 is reachable in 𝑞̂︀ from 𝑥
along an 𝑟-path and 𝑥 reaches an instance of 𝐴 along an 𝑟-path, which means that 𝑞̂︀ ⊆𝒪 ∃𝑟𝑘 .𝐴
for some 𝑘 ≥ 1. But this contradicts 𝑝 ⊆𝒪 𝑞.  ̂︀
   In this paper, we propose to replace 𝒬-LGGs by a more liberal construction, which still
achieves a generalization of the hypothesis, though not necessarily of the counterexample. In
fact, we use the counterexample only as a guide to identify in polynomial time a generalization
of the hypothesis that is contained in the target query. In contrast to the construction of 𝒬-LGGs
via products, our construction is asymmetric in that it treats the hypothesis differently from
the counterexample. We use our construction as a central ingredient to prove polynomial
time learnability of ELIQs in the presence of DL-Liteℱ      −
                                                          horn ontologies. For the type of learning
algorithm that we pursue, there is a ‘natural ensemble’ of lemmas that one may use to prove
correctness and termination in polynomial time [21, 22]. Whenever possible, we establish these
lemmas in a general version, namely for rooted CQs in place of ELIQs and for ℰℒℐℱ ontologies
in place of DL-Liteℱ −
                   horn ontologies. This serves to highlight the places where we crucially rely
on ELIQs and DL-Liteℱ     −
                       horn , and in addition it is potentially useful for future proofs where the
lemmas that admit a general formulation do not need to be reproved.
   Missing proof details will be provided in the long version of the paper.


2. Preliminaries
Ontologies and ABoxes. Let NC , NR , and NI be countably infinite sets of concept, role and
individual names. A role 𝑅 is a role name 𝑟 or the inverse 𝑟− of a role name, and 𝑅− denotes
𝑟 when 𝑅 = 𝑟− . A basic concept 𝐵 is of the form ⊤, 𝐴, or ∃𝑅 where 𝐴 ranges over NC and
𝑅 over roles. A DL-Liteℱ    horn ontology is a set of concept inclusions (CIs) 𝐵1 ⊓ · · · ⊓ 𝐵𝑛 ⊑ 𝐵,
concept disjointness constraints 𝐵1 ⊓ · · · ⊓ 𝐵𝑛 ⊑ ⊥, role disjointness constraints 𝑅1 ⊓ 𝑅2 ⊑ ⊥,
and functionality assertions func(𝑅) where 𝐵𝑖 , 𝐵 range over basic concepts and 𝑅1 , 𝑅2 , 𝑅 over
roles. In a DL-Liteℱ  −
                    horn ontology, we additionally require that if ∃𝑅 occurs on the right-hand side
of a CI, then func(𝑅− ) ∈  / 𝒪 [24].
   An ℰℒℐ concept is an expression 𝐶 that is built according to the rule 𝐶 ::= ⊤ | 𝐴 | 𝐶 ⊓ 𝐶 |
∃𝑅.𝐶 where 𝐴 ranges over concept names and 𝑅 over roles. An ℰℒℐℱ ontology 𝒪 is a finite
set of concept inclusions (CIs) 𝐶 ⊑ 𝐷, emptiness constraints 𝐶 ⊑ ⊥, role disjointness constraints
𝑅1 ⊓ 𝑅2 ⊑ ⊥, and functionality assertions func(𝑅) where 𝐶, 𝐷 range over ℰℒℐ concepts and
𝑅1 , 𝑅2 , 𝑅 over roles. The basic concept ∃𝑅 is a different way to write the ℰℒℐ concept ∃𝑅.⊤.
Note that every DL-Liteℱ   horn ontology is an ℰℒℐℱ ontology. An ℰℒℐℱ ontology (and thus also
a DL-Liteℱ horn ontology)  is in normal form if all concept inclusions in it are of the form 𝐴 ⊑ 𝐶
or 𝐶 ⊑ 𝐴, where 𝐴 is a concept name. Every DL-Liteℱ         horn ontology 𝒪 can be transformed in
polynomial time into a DL-Litehorn ontology 𝒪 in normal form such that 𝒪′ is a conservative
                                   ℱ                ′

extension of 𝒪, and the same is true for ℰℒℐℱ ontologies.
   An ABox 𝒜 is a finite set of concept assertions 𝐴(𝑎) and role assertions 𝑟(𝑎, 𝑏) with 𝐴 a
concept name or ⊤, 𝑟 a role name, and 𝑎, 𝑏 individual names. We use ind(𝒜) to denote the
set of individual names used in 𝒜. We admit concept assertions ⊤(𝑎) in order to represent
interpretations and ABoxes in a uniform way.
   The semantics is defined as usual in terms of interpretations ℐ, which we define to be a
(possibly infinite and) non-empty set of concept and role assertions. We use ∆ℐ to denote the
set of individual names in ℐ and set 𝐴ℐ = {𝑎 | 𝐴(𝑎) ∈ ℐ} for all 𝐴 ∈ NC and 𝐶 ℐ for compound
concepts 𝐶 in the usual way [28]. Note that every ABox is a finite interpretation and, vice versa,
every finite interpretation is an ABox. An interpretation ℐ satisfies a concept inclusion 𝐶 ⊑ 𝐷
if 𝐶 ℐ ⊆ 𝐷ℐ , a constraint 𝐶 ⊑ ⊥ if 𝐶 ℐ = ∅, a functionality assertion func(𝑅) if 𝑅ℐ is a partial
function, a concept assertion 𝐴(𝑎) if 𝑎 ∈ 𝐴ℐ , and a role assertion 𝑟(𝑎, 𝑏) if (𝑎, 𝑏) ∈ 𝑟ℐ . An
interpretation is a model of an ℰℒℐℱ ontology or an ABox if it satisfies all concept inclusions,
constraints, and assertions in it. We write 𝒪 |= 𝐶 ⊑ 𝐷 if every model of the ontology 𝒪
satisfies the concept inclusion 𝐶 ⊑ 𝐷 and 𝒜, 𝒪 |= 𝐵(𝑎) if every model of 𝒜 and 𝒪 satisfies the
concept assertion 𝐵(𝑎). An ABox 𝒜 is satisfiable w.r.t. an ℰℒℐℱ ontology 𝒪 if 𝒜 and 𝒪 have a
common model. We use ℐ1 × ℐ2 to denote the direct product of two interpretations ℐ1 , ℐ2 .
   A signature is a set of concept and role names, uniformly referred to as symbols. For a syntactic
object 𝑂 such as an ontology, we use sig(𝑂) to denote the symbols used in 𝑂 and ||𝑂|| to denote
the size of 𝑂, that is, the length of a representation of 𝑂 as a word in a suitable alphabet.

Queries. Every ℰℒℐ concept 𝐶 can be viewed as an ℰℒℐ query (ELIQ). An individual 𝑎 ∈
ind(𝒜) is an answer to 𝐶 on an ABox 𝒜 w.r.t. an ontology 𝒪, written 𝒜, 𝒪 |= 𝐶(𝑎), if 𝑎 ∈ 𝐶 ℐ
for all models ℐ of 𝒜 and 𝒪. We shall often view ELIQs as unary conjunctive queries (CQs)
and also consider CQs that are not ELIQs. A CQ takes the form 𝑞(𝑥                        ¯ , 𝑦¯) with 𝜑 a
                                                                             ¯ ) = ∃𝑦¯ 𝜑(𝑥
conjunction of concept atoms 𝐴(𝑥) and role atoms 𝑟(𝑥, 𝑦) where 𝐴 ∈ NC and 𝑟 ∈ NR . We call
the variables in 𝑥 ¯ answer variables. The arity of 𝑞 is the length |𝑥¯ | of 𝑥
                                                                             ¯ , and a query is Boolean
if it has arity 0. We use var(𝑞) to denote the set of variables that occur in 𝑞. We may view 𝑞 as a
set of atoms whenever convenient and may write 𝑟− (𝑥, 𝑦) in place of 𝑟(𝑦, 𝑥). A CQ is rooted
if in its Gaifman graph 𝐺𝑞 = (var(𝑞), {{𝑦, 𝑧} | 𝑟(𝑦, 𝑧) ∈ 𝑞}) every variable is reachable from
some answer variable. It is well-known that ELIQs are in 1-to-1 correspondence with rooted,
unary CQs whose Gaifman graph is a tree and that contain no self-loops and multi-edges. We
use 𝒜𝑞 to denote the ABox obtained from CQ 𝑞 by viewing variables as individuals and atoms
as assertions. A CQ 𝑞 is satisfiable w.r.t. ontology 𝒪 if 𝒜𝑞 is. For any CQ 𝑞 and set 𝑈 ⊆ var(𝑞),
𝑞|𝑈 is the restriction of 𝑞 to all atoms that only contain variables in 𝑈 .
    The semantics of CQs is given in terms of homomorphisms as usual. As for ELIQs, we will
write 𝒜, 𝒪 |= 𝑞(𝑎   ¯) if the tuple 𝑎                   ¯ ) on 𝒜 w.r.t. 𝒪. For CQs 𝑞1 and 𝑞2 and an
                                    ¯ is an answer to 𝑞(𝑥
ℰℒℐℱ ontology 𝒪, we say that 𝑞1 is contained in 𝑞2 w.r.t. 𝒪, written 𝑞1 ⊆𝒪 𝑞2 , if for all ABoxes
𝒜 and 𝑎  ¯ from ind(𝒜), 𝒜, 𝒪 |= 𝑞1 (𝑎  ¯) implies 𝒜, 𝒪 |= 𝑞2 (𝑎¯). We call 𝑞1 and 𝑞2 equivalent w.r.t.
𝒪, written 𝑞1 ≡𝒪 𝑞2 , if 𝑞1 ⊆𝒪 𝑞2 and 𝑞2 ⊆𝒪 𝑞1 .

Universal Model. Query answering and query containment w.r.t. DL-Liteℱ      horn ontologies can
be conveniently characterized using universal models. Let 𝒪 be an DL-Liteℱ     horn ontology in
normal form and 𝒜 an ABox
                       d that is satisfiable w.r.t. 𝒪. For′ a set 𝑀 of concept names, we write
  𝑀 as a shorthand for 𝐴∈𝑀 𝐴. For  d 𝑎 ∈ ind(𝒜), 𝑀, 𝑀 sets of concept names, and 𝑅 a role,
d
we write 𝑎 ⇝𝑅𝒜,𝒪 𝑀  if 𝒜, 𝒪 |= ∃𝑅.   𝑀 (𝑎) and 𝑀 is maximal with this condition. We write
        𝒪 𝑀 if 𝒪 |=        𝑀 ⊑ ∃𝑅. 𝑀 ′ and 𝑀 ′ is maximal with this.
𝑀 ⇝𝑅          ′
                        d             d
  A trace for 𝒜 and 𝒪 is a sequence 𝑡 = 𝑎𝑅1 𝑀1 𝑅2 𝑀2 . . . 𝑅𝑛 𝑀𝑛 , 𝑛 ≥ 0 where 𝑎 ∈ ind(𝒜),
𝑅1 , . . . , 𝑅𝑛 are roles in sig(𝒪), and 𝑀1 , . . . , 𝑀𝑛 are sets of concept names in sig(𝒪), such that
   (i) 𝑎 ⇝𝑅
          𝒜,𝒪 𝑀1 and there is no 𝑏 ∈ ind(𝒜) with 𝑅1 (𝑎, 𝑏) ∈ 𝒜,
            1


              𝑅
  (ii) 𝑀𝑖 ⇝𝒪𝑖+1 𝑀𝑖+1 and 𝑅𝑖+1 ̸= 𝑅𝑖− , for 1 ≤ 𝑖 < 𝑛.
The set T of all traces for 𝒜 and 𝒪 forms the domain of the universal model 𝒰𝒜,𝒪 , defined as

        𝒰𝒜,𝒪 = 𝒜 ∪ {𝐴(𝑎) | 𝒜, 𝒪 |= 𝐴(𝑎)} ∪ {𝐴(𝑡𝑅𝑀 ) | 𝑡𝑅𝑀 ∈ T and 𝐴 ∈ 𝑀 } ∪
                   {𝑅(𝑡, 𝑡𝑅𝑀 ) | 𝑡𝑅𝑀 ∈ T}.

For a CQ 𝑞, we usually write 𝒰𝑞,𝒪 instead of 𝒰𝒜𝑞 ,𝒪 . The following property of 𝒰𝒜,𝒪 is crucial
for our technical development.
Observation 3. Let 𝒪 be a DL-Liteℱ horn ontology in normal form, 𝒜 an ABox, and ℐ = 𝒰𝒜,𝒪 ∖ 𝒜.
                        ℐ
Then for every role 𝑅, 𝑅 is a partial function.

𝒪-saturatedness and 𝒪-minimality. Let 𝒪 be an ℰℒℐℱ ontology. A CQ 𝑞 is 𝒪-minimal if
there is no 𝑈 ⊊ var(𝑞) such that 𝑞 ≡𝒪 𝑞|𝑈 . A CQ 𝑞 is 𝒪-saturated if 𝒜𝑞 , 𝒪 |= 𝐴(𝑦) implies
𝐴(𝑦) ∈ 𝑞 for all 𝑦 ∈ var(𝑞) and 𝐴 ∈ NC . Every CQ (or ELIQ) can be converted into an equivalent
𝒪-saturated one in polynomial time when an oracle for queries of the form “𝒜, 𝒪 |= 𝐴(𝑎)?”
is available. In DL-Liteℱ
                        horn , such queries can be answered in polynomial time [29] and thus
𝒪-saturatedness can be established in polynomial time.


3. Guided Generalizations
Recall from the introduction that a CQ 𝑞̂︀ is a least general 𝒬-generalization (𝒬-LGG) of CQs
𝑝, 𝑞 under an ontology 𝒪 if 𝑞 ⊆𝒪 𝑞,
                                 ̂︀ 𝑝 ⊆𝒪 𝑞, ̂︀ and 𝑞̂︀ ⊆𝒪 𝑞 ′ for every 𝑞 ′ ∈ 𝒬 with 𝑞 ⊆𝒪 𝑞 ′ and
𝑝 ⊆𝒪 𝑞 . We consider the following weakening of 𝒬-LGGs.
        ′


Definition 4. Let 𝒪 be an ontology and 𝒬 a class of queries, and let 𝑝, 𝑞 be CQs with 𝑝 ̸⊆𝒪 𝑞. A
CQ 𝑞̂︀ is a 𝑝-guided 𝒬-generalization of 𝑞 under 𝒪 if the following conditions are satisfied:
   1. 𝑞 ⊆𝒪 𝑞;
           ̂︀

   2. 𝑞̂︀ ̸⊆𝒪 𝑞;

   3. 𝑞̂︀ ⊆𝒪 𝑞 ′ , for every 𝑞 ′ ∈ 𝒬 with 𝑞 ⊆𝒪 𝑞 ′ and 𝑝 ⊆𝒪 𝑞 ′ .
   Conditions 1 and 3 match the first and the last condition in the definition of a 𝒬-LGG.
Intuitively, they mean that 𝑞̂︀ is a generalization of 𝑞 (Condition 1) which preserves all common
𝒬-consequences of 𝑝 and 𝑞 (Condition 3). Condition 2 weakens the second condition in the
definition of an LGG: instead of requiring 𝑝 ⊆ 𝑞,  ̂︀ we only want 𝑞̂︀ to strictly generalize 𝑞. In the
context of learning, one may view 𝑝 as orthogonal knowledge about how to imply the unknown
target, and the goal of guided generalization is to incorporate some of that knowledge into 𝑞.        ̂︀
Thus, in contrast to LGGs, guided generalizations are an asymmetric notion in that the two
queries 𝑝 and 𝑞 play different roles: 𝑞 is the query to be generalized and 𝑝 acts as the guide for
doing so. We start with observing that 𝑝-guided 𝒬-generalizations are not uniquely defined.

Example 5. Consider 𝑞(𝑥) = 𝐴(𝑥) ∧ 𝐵(𝑥) ∧ 𝐶(𝑥) and 𝑝(𝑥) = 𝐴(𝑥). Then both 𝑞1 (𝑥) = 𝐴(𝑥)
and 𝑞2 (𝑥) = 𝐴(𝑥) ∧ 𝐵(𝑥) are 𝑝-guided ELIQ-generalizations of 𝑞 under the empty ontology.

   It is not by accident that in Example 5 the ELIQ-LGG of 𝑞 and 𝑝 (which is 𝑞1 ) is also a guided
ELIQ-generalization. In fact, it is not difficult to show that each 𝒬-LGG of two CQs 𝑝, 𝑞 under an
ontology 𝒪 is both a 𝑝-guided 𝒬-generalization of 𝑞 under 𝒪 and a 𝑞-guided 𝒬-generalization
of 𝑝 under 𝒪. The subsequent example shows that the converse direction is not true, that is,
there are cases where a guided generalization exists, but LGGs do not.

Example 6. Consider again queries 𝑝 and 𝑞 and the ontology 𝒪 from Example 2, and recall that
there is no CQ-LGG for 𝑝, 𝑞. However, the query

                                   𝑞(𝑥)
                                   ̂︀   = ∃𝑦∃𝑦 ′ 𝑟(𝑥, 𝑦) ∧ 𝑟(𝑥, 𝑦 ′ ) ∧ 𝐴(𝑦 ′ )

is a 𝑝-guided CQ-generalization of 𝑞 under 𝒪. To illustrate the asymmetry of the notion, observe
that 𝑞̂︀ is not a 𝑞-guided CQ-generalization of 𝑝 under 𝒪, since it does not satisfy Condition 1.

  We now give our main result, namely that guided ELIQ-generalizations of ELIQs under
DL-Liteℱ −
       horn ontologies always exist and can be computed in polynomial time.

Theorem 7. Given a DL-Liteℱ       −
                                horn ontology 𝒪 in normal form and ELIQs 𝑝, 𝑞 such that 𝑝, 𝑞 are
satisfiable w.r.t. 𝒪 and 𝑞 is 𝒪-minimal,3 we can compute in polynomial time a 𝑝-guided ELIQ-
generalization 𝑞̂︀ of 𝑞 under 𝒪 such that 𝑞̂︀ is satisfiable w.r.t. 𝒪.

   Let 𝑞(𝑥1 ), 𝑝(𝑥2 ) be ELIQs. We construct a 𝑝-guided ELIQ-generalization 𝑞̂︀ of 𝑞 under 𝒪 in
three steps as follows. We start with the query 𝑞̂︀ = (𝒰𝑞,𝒪 ×𝒰𝑝,𝒪 )|{(𝑥1 ,𝑥2 )} , that is, the restriction
of 𝒰𝑞,𝒪 × 𝒰𝑝,𝒪 to variable (𝑥1 , 𝑥2 ), which will be the answer variable of 𝑞.    ̂︀ This query is then
extended by first exhaustively applying rule (A1) below and then applying rule (A2).

    (A1) For every (𝑧, 𝑡) ∈ var(̂︀
                                 𝑞) with 𝑧 ∈ var(𝑞) and 𝑡 ∈ ∆𝒰𝑝,𝒪 , every atom 𝑅(𝑧, 𝑧 ′ ) in 𝑞, and
         every atom 𝑅(𝑡, 𝑡 ) ∈ 𝒰𝑝,𝒪 , add the atom 𝑅((𝑧, 𝑡), (𝑧 ′ , 𝑡′ )), and all atoms 𝐴(𝑧 ′ , 𝑡′ ) such
                            ′

         that 𝐴(𝑧 ′ ) ∈ 𝒰𝑞,𝒪 and 𝐴(𝑡′ ) ∈ 𝒰𝑝,𝒪 .

    (A2) For every (𝑧, 𝑡) ∈ var(̂︀
                                𝑞) with 𝑧 ∈ var(𝑞) and 𝑡 ∈ ∆𝒰𝑝,𝒪 and every role 𝑅 such that
             𝑞,𝒪 𝑀 for some 𝑀 and there is no atom of the form 𝑅(𝑧, 𝑧 ) in 𝑞, add the atoms
         𝑧 ⇝𝑅                                                        ′


                                                            ̂︀ 𝑅(𝑧 ′ , 𝑧)
                                                  𝑅((𝑧, 𝑡), 𝑧),        ̂︀

          with 𝑧̂︀ a fresh variable, and add a copy 𝑞 ′ of 𝑞 in which the copy of 𝑧 is 𝑧 ′ .

3
    We conjecture that, given an ELIQ, an equivalent 𝒪-minimal ELIQ can be computed in polynomial time by
    extending the techniques for answering tree-shaped queries over DL-Lite knowledge bases in polynomial time [30]
    to DL-Liteℱ
              horn knowledge bases. For the purpose of this paper the statement in the theorem suffices.
Recall that 𝒰𝑞,𝒪 × 𝒰𝑝,𝒪 , when viewed as an infinitary CQ, may serve as a CQ-LGG of 𝑝 and 𝑞.
Intuitively, the above construction may be viewed as producing an approximation of this product
from below, in the sense that the product may be more general. It is easy to see that after having
applied (A1) exhaustively, we have constructed exactly the restriction of the product 𝒰𝑞,𝒪 ×𝒰𝑝,𝒪
to the elements (𝑡, 𝑡′ ) that are reachable from the element (𝑥1 , 𝑥2 ) and satisfy 𝑡 ∈ var(𝑞). We
will show that this is a finite structure and even of polynomial size, which is essentially due to
Observation 3 on the shape of universal models for DL-Liteℱ        −
                                                                 horn ontologies. What is missing is
the infinite part of 𝒰𝑞,𝒪 × 𝒰𝑝,𝒪 determined by elements (𝑡, 𝑡′ ) where 𝑡 is a proper trace, that is,
𝑡 is not a variable from 𝑞. (A2) approximates this part by traveling the traces of 𝒰𝑞,𝒪 (but not
of 𝒰𝑝,𝒪 ) for only one step and then adding copies of 𝑞 as described. Note that for DL-Liteℱ      −
                                                                                                horn
ontologies 𝒪, the first step into the traces of 𝒰𝑞,𝒪 is enough to regenerate via 𝒪 the entire
universal model 𝒰𝑞,𝒪 . Also note that for (A2) to produce a query that is satisfiable w.r.t. 𝒪, we
rely on the restriction to DL-Liteℱ   −
                                    horn : the precondition of (A2) implies that ∃𝑅 appears on the
right-hand side of some concept inclusion in 𝒪 and thus 𝑅− is not functional.
   We demonstrate our construction on two examples that additionally illustrate (1) that (A2) is
indeed needed and (2) that the result 𝑞̂︀ is not necessarily an ELIQ.
Example 8. (1) Consider the ontology 𝒪 = {𝑋 ⊑ ∃𝑟, ∃𝑟 ⊑ 𝑋, ∃𝑟− ⊑ ∃𝑠} and ELIQs
     𝑞(𝑥1 ) = 𝐵(𝑥1 ) ∧ 𝑋(𝑥1 )             𝑝(𝑥2 ) = ∃𝑥′ ∃𝑦 𝑋(𝑥2 ) ∧ 𝑟(𝑥2 , 𝑦) ∧ 𝑟(𝑥′ , 𝑦) ∧ 𝐵(𝑥′ ) ∧ 𝑋(𝑥′ ).
Note that 𝑝 and 𝑞 are 𝒪-saturated. The result of exhaustively applying Step (A1) is 𝑞̂︀0 (𝑥) = 𝑋(𝑥),4
which generalizes 𝑞, but is too general: the ELIQ
                   𝑞𝑇 (𝑥) = ∃𝑥′ ∃𝑦∃𝑦 ′ ∃𝑧 𝑟(𝑥, 𝑦) ∧ 𝑠(𝑦, 𝑧) ∧ 𝑠(𝑦 ′ , 𝑧) ∧ 𝑟(𝑥′ , 𝑦 ′ ) ∧ 𝐵(𝑥′ )
satisfies 𝑞̂︀0 ̸⊆𝒪 𝑞𝑇 , while 𝑝 ⊆𝒪 𝑞𝑇 and 𝑞 ⊆𝒪 𝑞𝑇 . After additionally applying (A2), we obtain
                         𝑞(𝑥)
                         ̂︀   = ∃𝑥′ ∃𝑦 𝑋(𝑥) ∧ 𝑟(𝑥, 𝑦) ∧ 𝑟(𝑥′ , 𝑦) ∧ 𝐵(𝑥′ ) ∧ 𝑋(𝑥′ ),
which is a 𝑝-guided ELIQ-generalization of 𝑞 under 𝒪.
  (2) Consider the following queries 𝑝′ , 𝑞 ′ and the empty ontology.
                             𝑝′ (𝑥) = ∃𝑦1 ∃𝑦2 𝑟(𝑥, 𝑦1 ) ∧ 𝑟(𝑥, 𝑦2 ) ∧ 𝐴(𝑦1 ) ∧ 𝐵(𝑦2 )
                             𝑞 ′ (𝑥) = ∃𝑦∃𝑧 𝑟(𝑥, 𝑦) ∧ 𝑟(𝑧, 𝑦) ∧ 𝐴(𝑦) ∧ 𝐵(𝑦)
The result of (A1) is the direct product 𝑞 ′ × 𝑝′ of 𝑞 ′ and 𝑝′ which is 𝑞̂︀′ (𝑥) = ∃𝑦1 ∃𝑦2 ∃𝑧 𝑟(𝑥, 𝑦1 ) ∧
𝑟(𝑥, 𝑦2 ) ∧ 𝑟(𝑧, 𝑦1 ) ∧ 𝑟(𝑧, 𝑦2 ) ∧ 𝐴(𝑦1 ) ∧ 𝐵(𝑦2 ), which is not an ELIQ.


4. Exact Learning with Membership and Equivalence Queries
We apply the notion of guided generalizations to show that ELIQs are polynomial time learnable
in the presence of DL-Liteℱ  −
                           horn ontologies using membership and equivalence queries. It is
known that both kinds of queries are needed as otherwise polynomial time learnability fails
(already without functional roles) [22]. Our learning algorithm follows the scheme detailed in
the introduction. The main result is as follows.
4
    We have replaced the single answer variable (𝑥1 , 𝑥2 ) with 𝑥 for the sake of readability.
Algorithm 1 Algorithm for learning ELIQs under DL-Liteℱ −
                                                      horn ontologies
  Input A DL-Liteℱ −
                 horn ontology 𝒪 and a unary CQ 𝑞𝐻 satisfiable w.r.t. 𝒪 such that 𝑞𝐻 ⊆𝒪 𝑞𝑇
                                                 0                                 0

  Output An ELIQ 𝑞𝐻 such that 𝑞𝐻 ≡𝒪 𝑞𝑇
  𝑞𝐻 := extract-minimal-ELIQ(𝑞𝐻 0 )

  while the equivalence query “𝑞𝐻 ≡𝒪 𝑞𝑇 ?” returns a counterexample (𝒜, 𝑎) do
     𝑞𝐷 := extract-minimal-ELIQ(𝑞𝒜 ) where 𝑞𝒜 is 𝒜 viewed as a CQ with answer variable 𝑎
      ′ := a 𝑞 -guided ELIQ-generalization of 𝑞 under 𝒪
     𝑞𝐻       𝐷                                𝐻
     𝑞𝐻 := extract-minimal-ELIQ(𝑞𝐻  ′ )

  end while
  return 𝑞𝐻


Theorem 9. ELIQs are polynomial time learnable under DL-Liteℱ −
                                                            horn ontologies using membership
and equivalence queries.

  To prove the theorem, it suffices to consider ontologies in normal form:

Lemma 10. If ELIQs are polynomial time learnable under DL-Liteℱ      −
                                                                   horn ontologies in normal form
using membership and equivalence queries, the same is true for unrestricted DL-Liteℱ  −
                                                                                    horn ontologies.

   Our learning algorithm is listed in Algorithm 1. It takes as input a DL-Liteℱ  −
                                                                                horn ontology 𝒪
in normal form and a seed query 𝑞𝐻 0 with 𝑞 0 ⊆ 𝑞 . A seed query can be obtained in several
                                            𝐻     𝒪 𝑇
ways, depending on the type of disjointness constraints present in 𝒪; we refer to [24] for details.
As explained in the introduction, the algorithm starts with the seed query and constructs a
sequence of increasingly more general hypothesis queries. In each round, the learner asks
whether the current hypothesis 𝑞𝐻 is the target using an equivalence query. If not, they use
the counterexample provided by the oracle as a guide to generalize 𝑞𝐻 via the construction
from the proof of Theorem 7. Since both the input to that construction and the queries posed as
equivalence queries must be ELIQs, the algorithm relies on the subroutine extract-minimal-ELIQ
to generalize a CQ 𝑞 with 𝑞 ⊆𝒪 𝑞𝑇 into an ELIQ 𝑞 ′ with 𝑞 ′ ⊆𝒪 𝑞𝑇 using membership queries.
In order to attain polynomial running time, extract-minimal-ELIQ additionally ensures a strong
minimality condition on 𝑞 ′ , namely that it is (𝑞𝑇 , 𝒪)-minimal, which means that there is no
𝑈 ⊊ var(𝑞 ′ ) with 𝑞 ′ |𝑈 ⊆𝒪 𝑞𝑇 . Importantly, a (𝑞𝑇 , 𝒪)-minimal query may have at most as
many variables as 𝑞𝑇 (provided that it is 𝒪-saturated, a condition that we shall maintain at all
times), and it is 𝒪-minimal. We next detail the extract-minimal-ELIQ subroutine.
   The extract-minimal-ELIQ subroutine takes as input a unary CQ 𝑞 that satisfies 𝑞 ⊆𝒪 𝑞𝑇 .
It computes an ELIQ 𝑞 ′ with 𝑞 ⊆𝒪 𝑞 ′ ⊆𝒪 𝑞𝑇 by repeatedly attaining (𝑞𝑇 , 𝒪)-minimality and
increasing the length of cycles in 𝑞. A cycle in a CQ 𝑞 is a sequence 𝑅1 (𝑥1 , 𝑥2 ), . . . , 𝑅𝑛 (𝑥𝑛 , 𝑥1 )
of distinct role atoms in 𝑞 such that 𝑥1 , . . . 𝑥𝑛 are distinct. Now, extract-minimal-ELIQ computes
the 𝒪-saturation 𝑝 of 𝑞 and then modifies 𝑝 by exhaustively applying the following two rules:
 Drop variable. Choose a variable 𝑦 ∈ var(𝑝) and let 𝑝′ = 𝑝|var(𝑝)∖{𝑦} . If the response to the
membership query 𝒜𝑝′ , 𝒪 |= 𝑞𝑇 (𝑥) is positive, continue with 𝑝′ in place of 𝑝.
  Double cycle. Choose a role atom 𝑟(𝑥, 𝑦) ∈ 𝑝 that is part of a cycle. Then add a disjoint copy
𝑝′ of 𝑝 to 𝑝 and let 𝑥′ , 𝑦 ′ be the copies of 𝑥, 𝑦 in 𝑝′ . Remove the atoms 𝑟(𝑥, 𝑦), 𝑟(𝑥′ , 𝑦 ′ ) and add
the atoms 𝑟(𝑥, 𝑦 ′ ), 𝑟(𝑥′ , 𝑦).
   We give preference to the first rule, that is, the second rule is only applied when the first one
is not applicable. Clearly, if Drop variable is not applicable, then 𝑝 is (𝑞𝑇 , 𝒪)-minimal. Once no
rule is applicable anymore, extract-minimal-ELIQ returns 𝑞 ′ = 𝑝.
   The following lemma collects the relevant properties of extract-minimal-ELIQ. All properties
except termination are essentially consequences of the definition of the subroutine. The proof
of termination after polynomially many steps relies on Theorem 13 below.
Lemma 11. Let 𝑞 be a unary CQ with 𝑞 ⊆𝒪 𝑞𝑇 that is satisfiable w.r.t. 𝒪. Then,
extract-minimal-ELIQ(𝑞) terminates in time polynomial in ||𝒪|| + ||𝑞|| + ||𝑞𝑇 || and returns
an ELIQ 𝑞 ′ that is 𝒪-saturated, (𝑞𝑇 , 𝒪)-minimal, and satisfies 𝑞 ⊆𝒪 𝑞 ′ ⊆𝒪 𝑞𝑇 .
   To show termination and correctness of our algorithm, we first formalize the notion of a
‘sequence of increasingly general hypotheses which are all contained in 𝑞𝑇 ,’ which is underlying
the general scheme described in the introduction.
Definition 12. Let 𝑞𝑇 be a CQ and 𝒪 an ontology. A sequence 𝑞1 , 𝑞2 , . . . of CQs is a generalization
sequence towards 𝑞𝑇 under 𝒪 if for all 𝑖 ≥ 0, 𝑞𝑖 ⊆𝒪 𝑞𝑖+1 ̸⊆𝒪 𝑞𝑖 , 𝑞𝑖 ⊆𝒪 𝑞𝑇 , and sig(𝑞𝑖 ) ⊆ sig(𝒪).
   Let 𝑞1 , 𝑞2 , . . . be the sequence of ELIQs that are assigned to 𝑞𝐻 during the run of the algorithm.
We show inductively that 𝑞1 , 𝑞2 , . . . is a generalization sequence towards the target query 𝑞𝑇
under 𝒪. For the base case, note that extract-minimal-ELIQ (𝑞𝐻         0 ) computes an initial 𝑞 with
                                                                                                 𝐻
𝑞𝐻0 ⊆ 𝑞 ⊆ 𝑞 . For the inductive step, let (𝒜, 𝑎) be a counterexample provided by the oracle
       𝒪 𝐻        𝒪 𝑇
to the equivalence query “𝑞𝐻 ≡𝒪 𝑞𝑇 ?”. We may assume that 𝒜 uses only symbols from 𝒪 (we
can simply drop all assertions mentioning other symbols). Since 𝑞𝐻 ⊆𝒪 𝑞𝑇 , the counterexample
is positive and thus 𝑞𝒜 ̸⊆𝒪 𝑞𝐻 . The subroutine extract-minimal-ELIQ generalizes 𝑞𝒜 into a
query 𝑞𝐷 , hence 𝑞𝐷 ̸⊆𝒪 𝑞𝐻 . Since 𝑞𝐻         ′ is a 𝑞 -guided ELIQ-generalization of 𝑞 , we have
                                                      𝐷                                     𝐻
𝑞𝐻 ⊆ 𝑞𝐻  ′ (Condition 1 of Definition 4), 𝑞 ′ ̸⊆ 𝑞 (Condition 2), and 𝑞 ′ ⊆ 𝑞 (Condition 3). It
                                                𝐻     𝐻                      𝐻   𝒪 𝑇
remains to note that extract-minimal-ELIQ preserves these conditions.
   It has been observed that already for ELIQs that do not use inverse roles and under the empty
ontology, there is no elementary bound on the length of generalization sequences towards a
given query 𝑞𝑇 [31]. However, since Lemma 11 guarantees that all 𝑞𝑖 are (𝑞𝑇 , 𝒪)-minimal and
𝒪-saturated, the next theorem implies that only polynomially many hypotheses are produced.
Theorem 13. Let 𝑞𝑇 be a rooted CQ and 𝒪 an ℰℒℐℱ ontology in normal form, and let 𝑞1 , 𝑞2 , . . .
be a generalization sequence towards 𝑞𝑇 under 𝒪 such that 𝑞1 is satisfiable w.r.t. 𝒪. If all 𝑞𝑖 are
(𝑞𝑇 , 𝒪)-minimal and 𝒪-saturated, then the sequence has length at most |var(𝑞𝑇 )|3 · |sig(𝒪)|.
   It remains to show that the extract-minimal-ELIQ subroutine terminates after polynomially
many steps. For this, consider the sequence 𝑝1 , 𝑝2 , . . . of queries that Double cycle is applied
to during a run of extract-minimal-ELIQ. All these queries are 𝒪-saturated. By the preference
imposed on rule application, they are also (𝑞𝑇 , 𝒪)-minimal. Since an application of Drop Variable
decreases the size of the query, there are at most polynomially many such applications between
𝑝𝑖 and 𝑝𝑖+1 . Thus, it suffices to show the following lemma and apply Theorem 13.
Lemma 14. The sequence 𝑝1 , 𝑝2 , . . . is a generalization sequence towards 𝑞𝑇 under 𝒪.
  We conclude the section with some comments regarding the (limits of) generality of the
central Theorem 13. It has been shown that Theorem 13 holds for unrestricted CQs when one
considers the restriction ℰℒ of ℰℒℐ as ontology language [22] and we conjecture the same to
be true also for many DL-Lite dialects, e.g., DL-Liteℱ
                                                     horn . However, the extension to unrestricted,
that is, possibly non-rooted, CQs is not possible for ℰℒℐ. The subsequent example illustrates
that it fails already for Boolean CQs with a single variable.

Example 15. Let 𝑋𝑖 , 𝑋𝑖 for 1 ≤ 𝑖 ≤ 𝑛 be concept names and 𝑟 a role name. Let 𝒪 be an ℰℒℐ
ontology that contains the following concept inclusions, for all 𝑖 with 1 ≤ 𝑖 ≤ 𝑛:

                                     𝑋𝑖 ⊑ ∃𝑟.⊤
        ∃𝑟− .(𝑋0 ⊓ · · · ⊓ 𝑋𝑖−1 ⊓ 𝑋𝑖 ) ⊑ 𝑋𝑖              ∃𝑟− .(𝑋0 ⊓ · · · ⊓ 𝑋𝑖−1 ⊓ 𝑋𝑖 ) ⊑ 𝑋𝑖
                         ∃𝑟− .𝑋𝑖 ⊓ 𝑋𝑗 ⊑ 𝑋𝑖                                 ∃𝑟− .𝑋𝑖 ⊓ 𝑋𝑗 ⊑ 𝑋𝑖

Each subset of {𝑋𝑖 , 𝑋𝑖 | 1 ≤ 𝑖 ≤ 𝑛} containing exactly one of 𝑋𝑖 , 𝑋𝑖 for each 𝑖 represents a
number between 0 and 2𝑛 − 1 in an obvious way. Let 𝑞𝑖 be the Boolean CQ that corresponds to
number 𝑖. Clearly, all 𝑞𝑖 are 𝒪-saturated and (𝑞2𝑛 −1 , 𝒪)-minimal. The sequence 𝑞1 , 𝑞2 , . . . , 𝑞2𝑛 −1
is a generalization sequence towards 𝑞2𝑛 −1 under 𝒪, but its length is exponential in 𝑛.


5. Conclusions and Future Work
We have introduced a new form of generalizations and proved its applicability in the context
of exact learning of ELIQs in the presence of DL-Liteℱ        −
                                                            horn ontologies. We believe it is worth
investigating the new notion of guided generalizations more thoroughly. On the one hand,
there are basic open questions such as whether there always exists a least general or most
general 𝑝-guided ELIQ-generalization of 𝑞 for all ELIQs 𝑝, 𝑞. On the other hand, we would
like to understand whether Theorem 7 holds for other relevant combinations of query class
and ontology language, e.g., ELIQs and DL-Liteℱ     horn , CQs and any DL-Lite dialect, ELIQs and
DL-Liteℋ , that is, DL-Lite with role hierarchies. The latter will be challenging since there are no
universal models that satisfy Observation 3. We note that Theorem 7 does not extend to ELIQs
and ℰℒℐ ontologies: Our results imply that if we could compute in polynomial time using an
oracle for queries of the form “𝒜, 𝒪 |= 𝐴(𝑎)”, guided ELIQ-generalization of ELIQs under ℰℒℐ
ontologies, then ELIQs would be polynomial query learnable under ℰℒℐ ontologies which is
known not to be the case [22]. In cases where guided generalizations are not guaranteed to exist,
it would be interesting to study the induced existence and verification decision problems [7].
Finally, we are wondering whether guided generalizations have other applications, for example
in learning from labeled data examples.
   As we have shown, positive answers to (some of) these questions would directly lead to
polynomial time learnability results. Here, interesting open (and challenging) questions are
whether CQs are polynomial time learnable in the presence of DL-Liteℱ         horn (or even DL-Lite)
ontologies, and whether ELIQs are efficiently learnable under DL-Litehorn or DL-Liteℋ ontologies.
                                                                         ℱ
References
 [1] W. W. Cohen, H. Hirsh, The learnability of description logics with equality constraints,
     Mach. Learn. 17 (1994) 169–199.
 [2] W. W. Cohen, H. Hirsh, Learning the classic description logic: Theoretical and experimental
     results, in: Proc. of KR, Morgan Kaufmann, 1994, pp. 121–133.
 [3] M. Frazier, L. Pitt, Classic learning, Mach. Learn. 25 (1996) 151–193.
 [4] F. Baader, Least common subsumers and most specific concepts in a description logic with
     existential restrictions and terminological cycles, in: Proc. of IJCAI, Morgan Kaufmann,
     2003, pp. 319–324.
 [5] F. Baader, R. Küsters, R. Molitor, Computing least common subsumers in description logics
     with existential restrictions, in: Proc. of IJCAI, Morgan Kaufmann, 1999, pp. 96–103.
 [6] F. Baader, B. Sertkaya, A. Turhan, Computing the least common subsumer w.r.t. a back-
     ground terminology, J. Appl. Log. 5 (2007) 392–420.
 [7] J. C. Jung, C. Lutz, F. Wolter, Least general generalizations in description logic: Verification
     and existence, in: Proc. of AAAI, 2020, pp. 2854–2861.
 [8] B. Zarrieß, A. Turhan, Most specific generalizations w.r.t. general ℰℒ-TBoxes, in: Proc. of
     IJCAI, 2013, pp. 1191–1197.
 [9] M. Funk, J. C. Jung, C. Lutz, H. Pulcini, F. Wolter, Learning description logic concepts:
     When can positive and negative examples be separated?, in: Proc. of IJCAI, 2019, pp.
     1682–1688.
[10] J. C. Jung, C. Lutz, H. Pulcini, F. Wolter, Logical separability of incomplete data under
     ontologies, in: Proc. of KR, 2020, pp. 517–528.
[11] J. Lehmann, P. Hitzler, Concept learning in description logics using refinement operators,
     Mach. Learn. 78 (2010) 203–250.
[12] J. Lehmann, J. Völker, Perspectives on Ontology Learning, volume 18 of Studies on the
     Semantic Web, IOS Press, 2014.
[13] M. K. Sarker, P. Hitzler, Efficient concept induction for description logics, in: Proc. of
     AAAI, 2019, pp. 3036–3043.
[14] V. Gutiérrez-Basulto, J. C. Jung, L. Sabellek, Reverse engineering queries in ontology-
     enriched systems: The case of expressive Horn description logic ontologies, in: Proc. of
     IJCAI-ECAI, ijcai.org, 2018, pp. 1847–1853.
[15] D. Angluin, Learning regular sets from queries and counterexamples, Inf. Comput. 75
     (1987) 87–106.
[16] D. Angluin, Queries and concept learning, Mach. Learn. 2 (1987) 319–342.
[17] B. Konev, A. Ozaki, F. Wolter, A model for learning description logic ontologies based on
     exact learning, in: Proc. of AAAI, AAAI Press, 2016, pp. 1008–1015.
[18] B. Konev, C. Lutz, A. Ozaki, F. Wolter, Exact learning of lightweight description logic
     ontologies, J. Mach. Learn. Res. 18 (2018) 1–63.
[19] A. Ozaki, C. Persia, A. Mazzullo, Learning query inseparable ℰℒℋ ontologies, in: Proc. of
     AAAI, 2020, pp. 2959–2966.
[20] A. Ozaki, Learning description logic ontologies: Five approaches. where do they stand?,
     KI - Künstliche Intelligenz (2020).
[21] B. ten Cate, V. Dalmau, Conjunctive queries: Unique characterizations and exact learn-
     ability, in: Proc. of ICDT, volume 186 of LIPIcs, Schloss Dagstuhl - Leibniz-Zentrum für
     Informatik, 2021, pp. 9:1–9:24.
[22] M. Funk, J. C. Jung, C. Lutz, Actively learning concept and conjunctive queries under
     ℰℒ𝑟 -ontologies, in: Proc. of IJCAI, 2021, pp. 1887–1893.
[23] M. Funk, J. C. Jung, C. Lutz, Actively learning ELI queries under DL-Lite ontologies, in:
     Proc. of DL 2021), volume 2954 of CEUR Workshop Proceedings, CEUR-WS.org, 2021.
[24] M. Funk, J. C. Jung, C. Lutz, Frontiers and exact learning of ℰℒℐ queries under DL-Lite
     ontologies, in: Proc. of IJCAI, 2022.
[25] B. ten Cate, P. G. Kolaitis, K. Qian, W. Tan, Active learning of GAV schema mappings, in:
     Proc. of PODS, 2018, pp. 355–368. doi:10.1145/3196959.3196974.
[26] B. ten Cate, V. Dalmau, P. G. Kolaitis, Learning schema mappings, ACM Trans. Database
     Syst. 38 (2013) 28:1–28:31.
[27] M. Fortin, B. Konev, V. Ryzhikov, Y. Savateev, F. Wolter, M. Zakharyaschev, Unique
     characterisability and learnability of temporal instance queries, in: Proc. of KR, 2022.
[28] F. Baader, I. Horrocks, C. Lutz, U. Sattler, An Introduction to Description Logics, Cambridge
     University Press, 2017.
[29] A. Artale, D. Calvanese, R. Kontchakov, M. Zakharyaschev, The DL-Lite family and
     relations, J. Artif. Intell. Res. 36 (2009) 1–69.
[30] M. Bienvenu, M. Ortiz, M. Simkus, G. Xiao, Tractable queries for lightweight description
     logics, in: Proc. of IJCAI, 2013, pp. 768–774.
[31] F. Kriegel, Navigating the ℰℒ subsumption hierarchy, in: M. Homola, V. Ryzhikov, R. A.
     Schmidt (Eds.), Proc. of DL, volume 2954 of CEUR Workshop Proceedings, CEUR-WS.org,
     2021.