<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>DL</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Towards Conceptual Clustering in EL with Simulation Graphs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ruud van Bakel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Cochez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Patrick Koopmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Vrije Universiteit Amsterdam</institution>
          ,
          <addr-line>De Boelelaan 1105, 1081 HV Amsterdam</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>38</volume>
      <fpage>3</fpage>
      <lpage>6</lpage>
      <abstract>
        <p>We introduce ℰℒ clustering, a type of conceptual clustering in which each cluster is described by an ℰℒ concept. The cluster then contains all instances of that concept. This contribution can be seen as a form of unsupervised learning of ℰℒ concepts from data, useful for analyzing graph-based data as well as for ontology engineering. Towards a practical method for ℰℒ clustering, we introduce complete simulation graphs, a structure from which all possible ℰℒ clusterings of the given data can be extracted. From this structure, a good ℰℒ clustering is then selected based on a utility function. We evaluate a first prototypical implementation of this idea on ABoxes of ontologies from a known benchmark, and show that bounded simulation graphs and ℰℒ clusterings can often be computed in practice.</p>
      </abstract>
      <kwd-group>
<kwd>unsupervised learning</kwd>
        <kwd>conceptual clustering</kwd>
        <kwd>summarization</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>One of the central functions of an ontology is to specify the meaning of concepts from some domain of
interest. In the case of ontologies formalized in description logics (DLs), these concepts are specified
through axioms, usually producing a subsumption hierarchy of concepts from more general to more
specific ones. Such an ontology can then be used in combination with a dataset (an ABox), and a
reasoner can infer which objects in the data belong to which concepts defined in the ontology, thus
organizing these objects into different categories of differing specificity. This makes ontologies useful
for organizing and accessing data for many use cases. For example, consider a digital health record
containing information about patients and what they are treated for in a hospital. Using a medical
ontology, we can group patients into more high-level categories and e.g. find out how many patients
are treated for heart diseases.</p>
      <p>A major downside of ontologies is that they need to be formalized first, which requires the right
expertise and can be a laborious process. As a solution, we propose to use conceptual clustering as an
approach to mine DL concepts directly from data. This may be useful for different reasons: 1) the
obtained concepts could be relevant concepts that should be added to an ontology, and thus support
ontology engineering, and 2) they can help analyze and explore data that are organized in graphs
(e.g. knowledge graphs), by organizing the objects in the data into categories that can also be described
in an intuitive way.</p>
      <p>
A plethora of research considers learning of DL descriptions, though not much considers the precise
setting we have in mind. Approaches for active learning, e.g. following Angluin’s exact learning
approach, assume that the learning system is interacting with a teacher (e.g. a domain expert) to whom
it can send queries and current formalizations [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ]. Another line of research considers concept learning:
here, we start with a set of individuals from the data grouped into positive and negative examples, and
try to learn a DL concept that describes all of the positive and none of the negative examples [
        <xref ref-type="bibr" rid="ref3 ref4 ref5 ref6 ref7">3, 4, 5, 6, 7</xref>
        ].
Both types of learning are supervised in the sense that additional information from a domain expert is
required.
      </p>
      <p>
        In unsupervised learning, the goal is to learn axioms or concepts with less user input from the data.
Unsupervised learning of GCIs is investigated in [
        <xref ref-type="bibr" rid="ref8">8</xref>
] using an induction-based approach. A limitation is
that a set of interesting concepts has to be known beforehand, which arguably makes
the method less unsupervised. D’Amato et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] developed a method for conceptual clustering to
automatically generate hierarchies of 𝒜ℒ𝒞 concepts from data. The idea of conceptual clustering goes
back to [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ]. Early systems such as COBWEB [12] cluster data represented as vectors with values
for different features into a hierarchy of clusters that can be described based on their features. Clusters
can then be described by “concepts” that associate to the different features a probability distribution
of the weights. This is similar to classical clustering, where objects are typically clustered based on
distance measures. However, instead of distance measures, usually a utility criterion is used that takes
the quality of the cluster description and its predictive power into account. Moreover, the expressivity
of the description language determines which clusters are possible, since every entity in the cluster
must be described by the concept. D’Amato et al. follow this idea using the more expressive 𝒜ℒ𝒞 to
describe the clusters. Unfortunately, the method does not scale well on larger data sets.
      </p>
      <p>
Moving to the more scalable DL ℰℒ, a line of research considers unsupervised learning following the
idea of formal concept analysis (FCA) [13, 14, 15, 16]. Those works focus on finding a finite base of axioms
that fully characterizes a given interpretation, i.e. covers all ℰℒ axioms that hold in this interpretation.
ℰℒ provides nice model-theoretic properties that allow the use of techniques such as products and
simulations to more efficiently determine concepts that are relevant in a given interpretation. [17] uses
the ideas from these FCA methods to apply a form of conceptual clustering for ℰℒ with bounded role
depth. The method was able to cluster 15 objects on a knowledge graph with 1,216 triples into ℰℒ
concepts with role depth two, but could not scale beyond. In this paper, we want to better understand
the general feasibility of such an approach, following more closely the original idea of conceptual clustering
used in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], using an ad-hoc, but well-motivated, first utility measure.
      </p>
<p>Different from classical clustering approaches on graphs that first translate nodes into feature vectors,
e.g. through embeddings [18, 19, 20], our conceptual clustering approach works directly on the graph
structure, which adds to the transparency of the clustering result. The central idea is to use simulation
graphs, a compressed representation of all possible clusterings of a given ABox. Similar structures
were also used in works on FCA with ℰℒ. To obtain scalability, we use an optimized algorithm to
summarize ABoxes based on simulation, identifying objects that cannot be distinguished by ℰℒ concepts.
An extended version of the paper with additional details is provided together with the experimental
data on Zenodo [21].</p>
    </sec>
    <sec id="sec-2">
      <title>2. Preliminaries</title>
      <sec id="sec-2-1">
<title>2.1. The Description Logic ℰℒ</title>
<p>Fix three countably infinite, pair-wise disjoint sets NC, NR and NI of respectively concept names, roles
and individuals. Concept names and roles correspond to unary and binary predicates in first-order
logic, and individuals to constants. An ABox 𝒜 is a set of ground atoms called assertions, which are
of the forms B(a) and r(a, b), where B ∈ NC, r ∈ NR and a, b ∈ NI. We use NI(𝒜)/NC(𝒜)/NR(𝒜) to
refer to the individuals/concept names/roles in 𝒜. ℰℒ concepts C are built based on the syntax rule
C ::= ⊤ | B | C ⊓ C | ∃r.C, where B ∈ NC and r ∈ NR. The depth of an ℰℒ concept is the maximal
nesting depth of role restrictions ∃r.C occurring in it. The semantics of DLs is defined using first-order
interpretations, i.e. tuples ℐ = ⟨Δℐ, ·ℐ⟩ with a non-empty domain Δℐ and an interpretation function ·ℐ
assigning to every individual a ∈ NI an element aℐ ∈ Δℐ, to every concept name B ∈ NC a set Bℐ ⊆ Δℐ,
and to every role r ∈ NR a relation rℐ ⊆ Δℐ × Δℐ. The interpretation function is lifted to ℰℒ concepts
as follows: ⊤ℐ = Δℐ, (C ⊓ D)ℐ = Cℐ ∩ Dℐ, and (∃r.C)ℐ = {d ∈ Δℐ | ∃⟨d, e⟩ ∈ rℐ s.t. e ∈ Cℐ}.
An individual a ∈ NI satisfies a concept C in ℐ, in symbols ℐ |= C(a), if aℐ ∈ Cℐ. ℐ is a model of an
ABox 𝒜, in symbols ℐ |= 𝒜, if aℐ ∈ Bℐ for every B(a) ∈ 𝒜, and ⟨aℐ, bℐ⟩ ∈ rℐ for every r(a, b) ∈ 𝒜.
TBoxes or ontologies put additional constraints on an interpretation; an interpretation that satisfies
these constraints is then a model of this TBox. Their syntax is not relevant in this paper. A knowledge
base (KB) is the union 𝒯 ∪ 𝒜 of a TBox and an ABox. A model of such a KB is an interpretation that is
a model of both 𝒯 and 𝒜. Note that a TBox or an ABox on its own is already a KB. If an assertion
α is satisfied in every model of a KB 𝒦, we write 𝒦 |= α and say α is entailed by 𝒦. In the special case
where α = C(a), we say that a is an instance of C in 𝒦.</p>
        <p>Every ABox has a unique minimal model that has one domain element per individual in the ABox
which can be embedded into every other model of the ABox. For convenience, we may sometimes
identify ABoxes with their minimal models.</p>
        <sec id="sec-2-1-1">
          <title>2.2. Model Theory</title>
<p>Relevant to this work are also the notions of simulations, bisimulations and products of interpretations,
which characterize the expressivity of DLs.</p>
<p>A pointed interpretation is a tuple ⟨ℐ, d⟩ of an interpretation ℐ and a domain element d ∈ Δℐ.
Given two pointed interpretations ⟨ℐ, d⟩ and ⟨𝒥, e⟩, a simulation from ⟨ℐ, d⟩ into ⟨𝒥, e⟩ is a relation
⪯ ⊆ Δℐ × Δ𝒥 that satisfies:
1. d ⪯ e,
2. if d′ ∈ Bℐ and d′ ⪯ e′, then e′ ∈ B𝒥,
3. if ⟨d1, d2⟩ ∈ rℐ and d1 ⪯ e1, then there must be e2 ∈ Δ𝒥 s.t. d2 ⪯ e2 and ⟨e1, e2⟩ ∈ r𝒥.
We sometimes just write ⟨ℐ, d⟩ ⪯ ⟨𝒥, e⟩ to indicate that there exists such a simulation, without
indicating the precise ⪯. If Conditions 2 and 3 also hold for the inverse relation, then ⪯ is called
a bisimulation. Simulations characterize the expressivity of ℰℒ model-theoretically: for countable
interpretations, if there is a simulation from ⟨ℐ, d⟩ into ⟨𝒥, e⟩, then for every ℰℒ concept C, d ∈ Cℐ
implies e ∈ C𝒥. In the same way, bisimulations characterize the expressivity of the classical DL 𝒜ℒ𝒞,
which plays only a minor role in this paper and is therefore not introduced beyond this fact. For
elements d, e in an interpretation ℐ, we say that d is (bi-)similar to e if there exists a (bi-)simulation
⪯ s.t. ⟨ℐ, d⟩ ⪯ ⟨ℐ, e⟩. (Note that similarity is in general not symmetric, contrary to what the name
suggests. Bisimilarity on the other hand forms an equivalence relation.)</p>
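<p>As an illustration (not part of the formal development), the greatest simulation between two finite interpretations can be computed by starting from all pairs allowed by Condition 2 and repeatedly removing pairs violating Condition 3 until a fixpoint is reached. The following Python sketch uses an assumed, illustrative encoding: labels maps each domain element to its set of concept names, and edges maps a pair (element, role) to the set of role-successors.</p>

```python
# Hedged sketch (illustrative encoding, not the authors' implementation).
def greatest_simulation(I, J):
    labels_I, edges_I = I
    labels_J, edges_J = J
    # Start from all pairs permitted by Condition 2 (concept names carry over).
    rel = {(d, e) for d in labels_I for e in labels_J
           if labels_I[d].issubset(labels_J[e])}

    def ok(d, e):
        # Condition 3: every r-successor of d is simulated by some r-successor of e.
        for (x, r), succs in edges_I.items():
            if x == d:
                for d2 in succs:
                    if not any((d2, e2) in rel for e2 in edges_J.get((e, r), ())):
                        return False
        return True

    # Greatest-fixpoint refinement: drop pairs until all conditions hold.
    while True:
        removed = {p for p in rel if not ok(*p)}
        if not removed:
            return rel
        rel -= removed

# ⟨ℐ, d⟩: a Person with a parentOf-successor labelled Man;
# ⟨𝒥, e⟩: a Person whose parentOf-successor carries even more labels.
I = ({"d": {"Person"}, "d2": {"Man"}}, {("d", "parentOf"): {"d2"}})
J = ({"e": {"Person"}, "e2": {"Man", "Doctor"}}, {("e", "parentOf"): {"e2"}})
sim = greatest_simulation(I, J)
print(("d", "e") in sim)  # True: ⟨ℐ, d⟩ can be simulated in ⟨𝒥, e⟩
```

<p>On this toy data, ⟨ℐ, d⟩ is simulated by ⟨𝒥, e⟩ because the successor of e carries at least the labels of the successor of d; the reverse direction fails.</p>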
<p>The product of two pointed interpretations ⟨ℐ1, d⟩ and ⟨ℐ2, e⟩ is a pointed interpretation ⟨𝒫, ⟨d, e⟩⟩
defined inductively as the smallest interpretation that satisfies the following:
1. ⟨d, e⟩ ∈ Δ𝒫,
2. for every ⟨d′, e′⟩ ∈ Δ𝒫 s.t. d′ ∈ Bℐ1 and e′ ∈ Bℐ2, we have ⟨d′, e′⟩ ∈ B𝒫,
3. for every ⟨d1, e1⟩ ∈ Δ𝒫 s.t. ⟨d1, d2⟩ ∈ rℐ1 and ⟨e1, e2⟩ ∈ rℐ2, there is ⟨d2, e2⟩ ∈ Δ𝒫 and
⟨⟨d1, e1⟩, ⟨d2, e2⟩⟩ ∈ r𝒫.</p>
<p>The construction of ⟨𝒫, ⟨d, e⟩⟩ ensures that ⟨𝒫, ⟨d, e⟩⟩ is a maximal interpretation such that
⟨𝒫, ⟨d, e⟩⟩ ⪯ ⟨ℐ1, d⟩ and ⟨𝒫, ⟨d, e⟩⟩ ⪯ ⟨ℐ2, e⟩. As a consequence, ⟨d, e⟩ satisfies exactly those ℰℒ
concepts that both d and e satisfy in their respective interpretations. If Δℐ1 and Δℐ2 are both finite,
then |Δ𝒫| ≤ |Δℐ1| · |Δℐ2|.</p>
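<p>For illustration, the product construction above can be sketched in Python under the same assumed (labels, edges) encoding as before, exploring reachable pairs from ⟨d, e⟩ and intersecting labels per Condition 2 and matching role-successors per Condition 3; all names are illustrative.</p>

```python
# Hedged sketch of the product of two finite pointed interpretations.
def product(I1, d, I2, e):
    (lab1, ed1), (lab2, ed2) = I1, I2
    labels, edges = {}, {}
    todo, seen = [(d, e)], {(d, e)}
    while todo:
        p = todo.pop()
        x, y = p
        labels[p] = lab1[x].intersection(lab2[y])   # Condition 2: shared names
        for (x1, r), succs1 in ed1.items():
            if x1 != x:
                continue
            for x2 in succs1:
                for y2 in ed2.get((y, r), set()):   # Condition 3: paired successors
                    q = (x2, y2)
                    edges.setdefault((p, r), set()).add(q)
                    if q not in seen:
                        seen.add(q)
                        todo.append(q)
    return labels, edges

lab1 = {"a": {"Person", "Male"}, "b": {"Person"}}
ed1 = {("a", "knows"): {"b"}}
lab2 = {"c": {"Person", "Female"}, "d": {"Person", "Male"}}
ed2 = {("c", "knows"): {"d"}}
labels, edges = product((lab1, ed1), "a", (lab2, ed2), "c")
print(labels[("a", "c")])  # {'Person'}: exactly the names both a and c satisfy
```

<p>The pair ⟨a, c⟩ keeps only the concept names shared by a and c, matching the statement that the product point satisfies exactly the ℰℒ concepts both arguments satisfy.</p>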
<p>Lemma 1. Let ⟨𝒫, ⟨d, e⟩⟩ be the product of two pointed interpretations ⟨ℐ1, d⟩ and ⟨ℐ2, e⟩. Then, for
every ℰℒ-concept C such that d ∈ Cℐ1 and e ∈ Cℐ2, ⟨d, e⟩ ∈ C𝒫. Moreover, ⟨d, e⟩ satisfies no other
concepts in 𝒫.
3. ℰℒ Clusterings and Simulation Graphs
This paper is about computing ℰℒ clusterings:
Definition 1. Let 𝒦 be a KB. An ℰℒ clustering for 𝒦 is a finite set of pairs ⟨C, A⟩ where C is an ℰℒ
concept and A is the set of instances of C in 𝒦. We call the pairs ⟨C, A⟩ ℰℒ clusters.</p>
<p>Note that this definition is deliberately unspecific, and even allows empty clusters or clusters that only
differ in the concept but not in the set of individuals. For example, ℰℒ clusters may be pair-wise
disjoint, or they may form a (subsumption) hierarchy. We also do not require the clustering to cover all
individuals in the ABox, to allow for robustness against outliers. We will come to the question of what
constitutes a good ℰℒ clustering later.</p>
<p>At the center of our approach is the simulation graph, a structure that abstracts ABoxes into
different sets of individuals in a way that preserves simulations. Fix an ABox 𝒜. Given sets A, B ⊆ NI(𝒜),
we write A →r B if for every a ∈ A, there exists b ∈ B s.t. r(a, b) ∈ 𝒜.</p>
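<p>The relation A →r B is straightforward to check. A minimal Python sketch, assuming role assertions are stored as triples (a, r, b) (an illustrative representation, not the authors' code):</p>

```python
# Check A →r B: every a in A has an r-successor in B.
def points_to(role_assertions, A, r, B):
    succ = {}
    for (x, role, y) in role_assertions:
        if role == r:
            succ.setdefault(x, set()).add(y)
    # all() requires each a in A to have at least one r-successor inside B
    return all(succ.get(a, set()).intersection(B) for a in A)

roles = [("alice", "parentOf", "bob"), ("carol", "parentOf", "dave")]
print(points_to(roles, {"alice", "carol"}, "parentOf", {"bob", "dave"}))  # True
print(points_to(roles, {"alice", "bob"}, "parentOf", {"bob", "dave"}))    # False
```

<p>The second call fails because bob has no parentOf-successor at all.</p>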
<p>Definition 2 (Simulation graphs). Let 𝒜 be an ABox. A simulation graph for 𝒜 is a directed labeled
graph 𝒢 = ⟨V, E, LV, LE⟩ with vertex labeling LV and edge labeling LE s.t. V ⊆ 2^NI(𝒜), LV : V → 2^NC
and LE : E → 2^NR, and which satisfies the following conditions:
1. LV(A) = {B ∈ NC | B(a) ∈ 𝒜 for all a ∈ A},
2. if A →r B for some r ∈ NR, then ⟨A, B⟩ ∈ E with LE(⟨A, B⟩) = {r ∈ NR | A →r B},
3. for every a ∈ NI(𝒜) and A ∈ V, if B(a) ∈ 𝒜 for every B ∈ LV(A) and {a} →r B for every
⟨A, B⟩ ∈ E s.t. r ∈ LE(⟨A, B⟩), then a ∈ A.</p>
<p>The vertices in the simulation graph correspond to sets of individuals from the ABox, with edges
between vertices if there is a role connection between the corresponding individuals. The last condition
ensures that the graph, intuitively, groups individuals based on simulations. Since simulation graphs
label vertices with concept names and edges with roles, we can identify them with corresponding
interpretations, and lift the definition of products and simulations to also include simulation graphs.
Our definition then ensures that for all A ∈ V, A contains exactly those individuals a ∈ NI(𝒜) for
which ⟨𝒜, a⟩ ⪯ ⟨𝒢, A⟩. This allows us to establish a connection between simulation graphs and ℰℒ
clusterings:
Lemma 2. One can extract from every simulation graph 𝒢 for 𝒜 an ℰℒ clustering 𝒞 for 𝒜 that has an
ℰℒ cluster ⟨C, A⟩ for every node A in 𝒢 and no other clusters. Moreover, for every ℰℒ clustering 𝒞 for 𝒜,
there exists a simulation graph 𝒢 for 𝒜 that has a node A for every cluster ⟨C, A⟩ ∈ 𝒞.</p>
          <p>Note that the simulation graph corresponding to an ℰ ℒ clustering may have more vertices than the
clustering has clusters, in order to deal with role-successors.</p>
<p>We define some desirable properties of simulation graphs for when we want to use them for computing
ℰℒ clusterings.</p>
<p>Definition 3 (Properties). A simulation graph 𝒢 = ⟨V, E, LV, LE⟩ for 𝒜 is
• complete if for every ℰℒ concept C, there exists some A ∈ V that contains exactly the instances of
C in 𝒜,
• n-complete, where n ≥ 0, if for every ℰℒ concept C of depth at most n, there exists some A ∈ V
that contains exactly the instances of C in 𝒜,
• summarizing if for every a ∈ NI(𝒜) there is some vertex A ∈ V that contains exactly the individuals
in NI(𝒜) that are similar to a,
• n-summarizing, where n ≥ 0, if for every a ∈ NI(𝒜), there is some vertex A ∈ V s.t. A = {b ∈
NI(𝒜) | 𝒜 |= C(b) for every ℰℒ-concept C of depth at most n s.t. 𝒜 |= C(a)}.</p>
<p>Completeness implies all the other properties in this definition. Being n-summarizing is an
approximation to being summarizing in the same way as n-completeness approximates completeness: for
this, it is sufficient to recall that, if a is similar to b, then b is an instance of every ℰℒ concept that a
is an instance of. For n ≥ |NI(𝒜)|, the notions n-complete and n-summarizing are equivalent to the
non-approximative versions complete and summarizing.</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>3.1. Summarizing Simulation Graphs</title>
<p>As we will see later, summarizing simulation graphs can be computed efficiently. They have the potential
to be useful not only for conceptual clustering in ℰℒ, but also to improve reasoning times in general
and to be used for other learning problems in DLs up to the expressivity of 𝒜ℒ𝒞. The key property is
that summarizing simulation graphs preserve all the relevant information about the individuals in an
ABox, even if a TBox is added that is expressed in the more expressive DL 𝒜ℒ𝒞. For a given simulation
graph 𝒢 = ⟨V, E, LV, LE⟩, introduce a distinct individual aA for every A ∈ V and define the induced
ABox 𝒜𝒢 as
𝒜𝒢 = {B(aA) | B ∈ LV(A)} ∪ {r(aA, aB) | ⟨A, B⟩ ∈ E and r ∈ LE(⟨A, B⟩)}.</p>
<p>Theorem 1. Let 𝒢 = ⟨V, E, LV, LE⟩ be a summarizing simulation graph for an ABox 𝒜 and 𝒯 an 𝒜ℒ𝒞
TBox. Let a ∈ NI(𝒜) and let A ∈ V be the smallest node in 𝒢 (wrt. ⊆) s.t. a ∈ A. Then, for every 𝒜ℒ𝒞
concept C, 𝒜 ∪ 𝒯 |= C(a) iff 𝒜𝒢 ∪ 𝒯 |= C(aA).</p>
<p>Proof. The definition of summarizing simulation graphs makes sure that aA is not only similar to a, but
indeed bisimilar. Consequently, given a model ℐ of 𝒜 ∪ 𝒯, we can find a model 𝒥 of 𝒜𝒢 ∪ 𝒯 s.t. ⟨ℐ, aℐ⟩
and ⟨𝒥, aA𝒥⟩ are bisimilar, which implies that they satisfy the same 𝒜ℒ𝒞 concepts, and the same holds
for the other direction.</p>
<p>Theorem 1 justifies why we define and compute simulation graphs only for ABoxes: adding a TBox,
even if it is more expressive than ℰℒ, cannot lead to new clusters being introduced. At the same time,
reasoning is usually less computationally expensive without a TBox. As we will see later, we are able to
compute summarizing simulation graphs even for large ABoxes in very short time, and the number
of nodes in such a graph is often significantly smaller than the size of the ABox. This indicates that
simulation graphs also have the potential to speed up reasoning for KBs with large ABoxes. Theorem 1
also indicates potential for using simulation graphs for learning in more expressive logics:
Corollary 1. Let 𝒦 = 𝒯 ∪ 𝒜 be an 𝒜ℒ𝒞 KB and 𝒢 = ⟨V, E, LV, LE⟩ a summarizing simulation graph
for 𝒜. Then, for every 𝒜ℒ𝒞 concept C, the set of instances of C in 𝒦 is the union over some subset V′ ⊆ V.</p>
<p>This means that even for supervised learning problems, which try to find a concept based on a given
set of positive and negative examples, solutions can be obtained from summarizing simulation graphs.
In fact, together with Lemma 2, we obtain that all solutions to concept learning problems in 𝒜ℒ𝒞 can
be expressed as disjunctions over ℰℒ concepts. (Complement and value restrictions may however lead
to more concise concepts.)</p>
        </sec>
        <sec id="sec-2-1-3">
          <title>3.2. Complete Simulation Graphs</title>
<p>It is an easy consequence of our definitions that complete simulation graphs are unique.
Corollary 2. For any ABox 𝒜, there is exactly one complete simulation graph.</p>
<p>We can get to this conclusion also by observing that complete simulation graphs are simply
summarizing simulation graphs that are closed under products. For this, we first need to adapt the definition
of products to simulation graphs. Given a simulation graph 𝒢 = ⟨V, E, LV, LE⟩ and two elements
A, B ∈ V, we can define the product of A and B as a node A × B in a new simulation graph
𝒢A×B = ⟨V′, E′, LV′, LE′⟩, defined inductively as the smallest extension of 𝒢 with new vertices A′ × B′
that satisfies the following conditions:
1. A × B ∈ V′,
2. for any A′ × B′ ∈ V′, A′ ∪ B′ ⊆ A′ × B′,
3. for any A′ × B′ ∈ V′, LV′(A′ × B′) = LV(A′) ∩ LV(B′),
4. for any A1 × B1 ∈ V′ and ⟨A1, A2⟩, ⟨B1, B2⟩ ∈ E, there is A2 × B2 ∈ V′ and ⟨A1 × B1, A2 × B2⟩ ∈
E′ with LE′(⟨A1 × B1, A2 × B2⟩) = LE(⟨A1, A2⟩) ∩ LE(⟨B1, B2⟩).</p>
<p>Note that as a consequence of Item 3 in Definition 2, A′ × B′ may contain more elements than A ∪ B;
we require it to be the smallest extension that satisfies these criteria. The product is well-defined, since
we can compute it using a fixpoint construction. We now say that a simulation graph is closed under
products if applying the product to any pair of vertices produces again the same simulation graph. By
using Lemma 1 and relating products in simulation graphs to products of pointed interpretations, we
obtain the following theorem:
Theorem 2. For all n ≥ 0, any simulation graph that is (n-)summarizing and closed under products is
(n-)complete.</p>
        </sec>
      </sec>
      <sec id="sec-2-2">
<title>3.3. Extracting ℰℒ-Clusterings</title>
<p>Because every ℰℒ concept is represented with its set of instances, complete simulation graphs give an
upper bound on what can be included in an ℰℒ clustering. If we are only interested in ℰℒ concepts of
bounded depth, n-complete simulation graphs are sufficient. Indeed, we can extract for each node in
an n-complete simulation graph the corresponding ℰℒ concept of depth at most n using an inductive
procedure. Fix a simulation graph 𝒢 = ⟨V, E, LV, LE⟩. We associate to every A ∈ V and n ≥ 0 the ℰℒ
concept CA(n) defined inductively as follows:
1. CA(0) = ⨅ LV(A),
2. for n &gt; 0, CA(n) = ⨅ LV(A) ⊓ ⨅⟨A,B⟩∈E ⨅r∈LE(⟨A,B⟩) ∃r.CB(n−1).
It follows directly from our definitions that 𝒜 |= CA(n)(a) for every a ∈ A, and that for an n-complete
simulation graph, we can obtain all possible ℰℒ clusters with concepts of depth at most n. We can use
those clusters also together with an ℰℒ or 𝒜ℒ𝒞 TBox. It can then be that we need fewer clusters: concepts
for different vertices may become equivalent because of the TBox, or may capture the same set of
individuals even if they are not equivalent. The TBox can also be used to obtain more concise descriptions of
the clusters: the method in [22] can compute for a given ℰℒ concept a concept of minimal length that is
equivalent wrt. the TBox. We may use this in future work to obtain more user-friendly representations
of conceptual clusterings.</p>
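<p>The inductive extraction of CA(n) can be sketched as a short recursive procedure. The Python sketch below renders concepts as strings; the dictionary encoding of LV and LE is an assumption made for the example, not the authors' data structures.</p>

```python
# Hedged sketch: build the concept C_A(n) of depth at most n as a string.
def concept_of(node, n, LV, LE):
    parts = sorted(LV[node])                        # conjunction of concept names
    if n > 0:
        for (src, tgt), roles in sorted(LE.items(), key=lambda kv: kv[0]):
            if src == node:
                for r in sorted(roles):             # one ∃r.C_B(n-1) per edge label
                    parts.append("∃" + r + ".(" + concept_of(tgt, n - 1, LV, LE) + ")")
    return " ⊓ ".join(parts) if parts else "⊤"

LV = {"A": {"Person", "Parent"}, "B": {"Person"}}
LE = {("A", "B"): {"parentOf"}}
print(concept_of("A", 1, LV, LE))  # Parent ⊓ Person ⊓ ∃parentOf.(Person)
```

<p>The depth bound n guarantees termination even when the simulation graph contains cycles.</p>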
        <p>To determine which subset  ′ ⊆  constitutes a good ℰ ℒ-clustering, we introduce quality measures
for ℰ ℒ-clusterings, which we here define in the most general way.</p>
<p>Definition 4. An ℰℒ-cluster quality measure is a function mapping pairs of KBs and ℰℒ clusters to real
numbers. An ℰℒ-clustering quality measure is a function that assigns to every KB and set of ℰℒ clusters
a real number.</p>
<p>Quality measures may take into account different aspects such as the number of individuals in the
cluster, the length of the concept, or try to minimize the overlap between different clusters. What
constitutes a good quality measure will ultimately depend on the use case: for instance, if we want
to use clusterings to support ontology engineering, we may also want to take into account whether
introducing a concept name for a cluster will allow us to describe individuals more concisely. We keep a
deeper investigation of quality measures as future work, and use for now a quality measure based on the
idea of category utility already used in the conceptual clustering algorithm COBWEB [12]. Intuitively,
the utility of a cluster captures its predictive power: how does knowing that an individual belongs to a
cluster help us in predicting which concepts it satisfies? This is in particular helpful if we want to
use ℰℒ clusterings to organize and explore the individuals in a KB: for this, we may organize the ℰℒ
clustering in a subsumption hierarchy, and navigate it starting from the most general concept. Assume
we want to explore a dataset about employees in a company and the gender of an employee has no
correlation to any of the other properties in the dataset. Then, organizing the data by gender does not
bring any advantage, since it does not have any predictive power. Rather, we would like to organize the
employees based on concepts that let us predict additional information.</p>
<p>Simulation graphs also help with defining this predictive power. Fix a KB 𝒦 = 𝒯 ∪ 𝒜. Given two
sets A, B ⊆ NI(𝒜), the conditional probability of A given B is defined as P(A | B) = |A ∩ B| / |B|, and
the probability of A as P(A) = P(A | NI(𝒜)). Given a concept name B, we use B also to refer to the
instances of B in 𝒦. By abuse of notation, given A ⊆ NI(𝒜) and r ∈ NR, we use ∃r.A to denote the
individuals that have an r-successor in A:
∃r.A = {a ∈ NI(𝒜) | r(a, b) ∈ 𝒜, b ∈ A}.</p>
<p>Definition 5. Let 𝒦 = 𝒯 ∪ 𝒜 be a KB and 𝒢 = ⟨V, E, LV, LE⟩ a simulation graph for 𝒜. The utility of a
cluster ⟨C, A⟩ in context X ⊆ NI(𝒜), relative to 𝒢, is defined as
U(A | X) = P(A | X) · ( ∑B∈NC ( P(B | A ∩ X)² − P(B | X)² ) + ∑r∈NR ∑B∈V ( P(∃r.B | A ∩ X)² − P(∃r.B | X)² ) ).
The utility of ⟨C, A⟩ ∈ 𝒞 is defined as U(A | NI(𝒜)).</p>
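<p>As an illustration, the measure of Definition 5 amounts to a sum of squared conditional probabilities over concept-name extensions and ∃r.B sets. A minimal Python sketch under assumed inputs (extensions given directly as sets; names are illustrative):</p>

```python
# Hedged sketch of U(A | X) with P(A | B) = |A ∩ B| / |B|.
def P(A, B):
    return len(A.intersection(B)) / len(B) if B else 0.0

def utility(A, X, name_ext, succ_ext):
    # name_ext: concept name -> instance set; succ_ext: (r, node) -> the set ∃r.B
    extensions = list(name_ext.values()) + list(succ_ext.values())
    AX = A.intersection(X)
    total = sum(P(ext, AX) ** 2 - P(ext, X) ** 2 for ext in extensions)
    return P(A, X) * total

inds = {1, 2, 3, 4}
# "Man" holds exactly on the cluster {1, 2}: membership predicts it perfectly.
u = utility({1, 2}, inds, {"Man": {1, 2}}, {})
print(u)  # 0.375 = P(A | X) * (1^2 - 0.5^2)
```

<p>Knowing that an individual lies in {1, 2} raises the probability of guessing "Man" from 0.5 to 1, which the squared-probability gain rewards.</p>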
<p>Note that the concept C used to describe the cluster is not relevant here. 𝒢 determines which subsets
of NI(𝒜) can be distinguished by the measure. In the case where 𝒢 is complete, these are indeed the
subsets of NI(𝒜) that can be distinguished using ℰℒ concepts. Individuals can be described based on
which concept names they satisfy, and which ℰℒ concepts their r-successors satisfy. The intuitive idea
of the utility of a vertex A is that guessing whether an individual a belongs to A gives us an advantage
in guessing any of these characteristics.</p>
<p>Looking at the utility of vertices in isolation is not sufficient for extracting a good ℰℒ clustering,
because there may be vertices with high utility that cover almost the same set of individuals. We therefore
consider also the relative utility of one vertex to another: assuming we know that an individual a
belongs to A or B, does guessing which one it belongs to give any advantage in guessing its other
properties? This corresponds to U(A | A ∪ B). If A and B help predicting the same things, then
this value is going to be lower than if they predict different things, so that minimizing this value will
lead to a more useful set of ℰℒ clusters. We can thus define the relative utility of a cluster ⟨C, A⟩ in a
given ℰℒ clustering 𝒞 as
RU(A | 𝒞) = min{ U(A | A ∪ B) | ⟨D, B⟩ ∈ 𝒞, B ≠ A }.</p>
<p>Intuitively, if the relative utility of a cluster is low, then dropping it from the clustering is not going to
affect the overall predictive power, as the predictive power of that cluster is anyway low in the presence of
other clusters.</p>
<p>We now define the utility of an ℰℒ clustering 𝒞 (relative to a simulation graph 𝒢) as
U(𝒞) = ∑⟨C,A⟩∈𝒞 RU(A | 𝒞).
4. Computing ℰℒ-Clusterings in Practice
We first compute a summarizing simulation graph 𝒢 = ⟨V, E, LV, LE⟩ from the given ABox 𝒜. For ease
of presentation, we assume in the following that every concept assertion B(a) is represented as a role
assertion rB(a, a*) with a* some specific fresh individual. The backbone of our algorithm is the process
of partition refinement [23]. The intuition here is that we cluster many individuals together at first; then,
as we consider deeper ℰℒ concepts, we further refine the clusters into smaller ones. We use the following
notation to refer to our partitions on the individuals:
Definition 6 (Partition on the individuals). P(i) is the partition on the set of individuals NI(𝒜),
generated by ℰℒ concepts of depth i.</p>
<p>[Figure 1: the extracted ℰℒ clusters organized in a subsumption hierarchy, with nodes Thing, Man, Grandson, Married Man, Married Woman, Married Parent, Married Father, Married Mother, Married Grandfather, and Married Grandmother.]</p>
<p>We will now explain what it means for ℰℒ concepts to generate a partition. The first partition always
clusters all individuals together in one set: P(0) = {NI(𝒜)}. For any subsequent partition we have the
following rule:
Definition 7 (Refinement). Any two individuals a and b are clustered together in P(i) (with i ≥ 1) if all of
the following conditions hold:
• a and b are clustered together in P(i−1),
• for every r(a, c) ∈ 𝒜 there exists an r(b, d) ∈ 𝒜, such that c and d are clustered together in P(i−1),
• for every r(b, d) ∈ 𝒜 there exists an r(a, c) ∈ 𝒜, such that c and d are clustered together in P(i−1).</p>
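<p>The refinement rule above can be implemented by splitting each block according to the outgoing (role, target-block) signatures of its members: two individuals share a signature exactly when the two successor conditions hold. The Python sketch below assumes role assertions as triples (a, r, b); per the text, concept assertions can be encoded as roles to a fresh individual.</p>

```python
# Hedged sketch of partition refinement (one round plus fixpoint iteration).
def refine(partition, roles):
    block_of = {a: i for i, block in enumerate(partition) for a in block}

    def signature(a):
        # outgoing (role, block-of-successor) pairs of a
        return frozenset((r, block_of[b]) for (x, r, b) in roles if x == a)

    new = {}
    for i, block in enumerate(partition):
        for a in block:
            new.setdefault((i, signature(a)), set()).add(a)
    return list(new.values())

def partitions(individuals, roles):
    part = [set(individuals)]          # P(0): all individuals in one set
    while True:
        nxt = refine(part, roles)
        if len(nxt) == len(part):      # refinement only splits, so equal size
            return part                # means the fixed point P(m+1) = P(m)
        part = nxt

roles = [("a", "parentOf", "b"), ("c", "parentOf", "d")]
final = partitions({"a", "b", "c", "d"}, roles)
print(sorted(sorted(block) for block in final))  # [['a', 'c'], ['b', 'd']]
```

<p>Here a and c end up together (both have a parentOf-successor in the same block), and likewise b and d (no outgoing roles).</p>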
<p>If we assume a finite ABox, then when computing the partitions, at some point the algorithm will
find a fixed point m, i.e. P(m+1) = P(m). From this point onward all the partitions will stay the same.
Therefore, when the fixed point is found, our algorithm stops. We can now obtain V as the union of all the
partitions: V = ⋃0≤i≤m P(i).</p>
<p>For the edges E we have the following rule:
Definition 8 (Simulation graph edges). Any triple ⟨S, r, O⟩ is part of E if and only if there exists at least
one level i that satisfies the following:
• S ∈ P(i),
• O ∈ P(i−1),
• there exist individuals a ∈ S and b ∈ O, such that r(a, b) ∈ 𝒜.</p>
<p>With the vertices and edges generated, we can now simply take the labelings as defined in Definition 2
to create 𝒢. It is easy to verify that this results in a summarizing simulation graph. Moreover, if we
replace V by V(n) = ⋃0≤i≤n P(i), we obtain an n-summarizing simulation graph.</p>
<p>To extend the summarizing simulation graph to a complete one, which at the same time serves as the
initial clustering, we add products of pairs of vertices in rounds until a fixpoint is reached. To optimize
the utility, we follow a greedy approach, and remove the worst cluster according to U(A | A ∪ B)
until a termination criterion is met. In our experiments, we used the desired number of clusters as the
criterion.</p>
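<p>The greedy selection step can be sketched generically: repeatedly drop the cluster with the lowest relative utility until the desired number k remains. In the sketch below, pairwise_u is a callable standing in for U(A | A ∪ B) from Definition 5; the toy measure used to exercise it is NOT the paper's utility, just an assumption for the example.</p>

```python
# Hedged sketch of the greedy cluster-selection loop.
def greedy_select(clusters, pairwise_u, k):
    clusters = list(clusters)
    while len(clusters) > k:
        def rel_utility(A):
            # relative utility: minimum over the other clusters B of U(A | A ∪ B)
            return min(pairwise_u(A, A.union(B)) for B in clusters if B is not A)
        worst = min(clusters, key=rel_utility)
        clusters.remove(worst)
    return clusters

# Toy stand-in measure: fraction of the context covered by the cluster.
toy_u = lambda A, ctx: len(A) / len(ctx)
result = greedy_select([{1, 2, 3}, {1, 2}, {4, 5}], toy_u, 2)
print(len(result))  # 2
```

<p>With a faithful implementation of Definition 5 plugged in for pairwise_u, this reproduces the termination criterion used in the experiments (a fixed number of clusters).</p>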
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Evaluation</title>
      <p>
        We applied the implementation on the family-history ontology that was used in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] to evaluate concept
learning, which is an ontology with a very simple TBox (no complex concepts), and whose ABox contains
family relations as roles and concepts for different types of family members. The ABox contained 1780
assertions with 202 individuals, which were summarized into 244 nodes in the summarizing simulation
graph, and 30 in the 1-summarizing graph. We computed the 1-complete graph after 4 rounds of
building products of all nodes, thus obtaining all relevant ℰ ℒ concepts without nested role restrictions.
This graph contained 53 nodes, from which we extracted 10 ℰ ℒ clusters using the greedy algorithm.
      </p>
      <p>When computing the utility (here and in later experiments), we used the complete summarizing graph,
and not just the 1-summarizing one. While the corresponding concepts were relatively complex, they
mostly correspond to simple concepts in human language, e.g. “Person ⊓ Male” corresponding to
“Man”. Figure 1 shows the clusters organized into the subsumption hierarchy, where we represent each
ℰℒ concept with the simple concept in natural language (the full list of concepts is in the appendix). We
find more or less the concepts one would expect as central concepts in a family tree, with the exception
of “Woman”, which one would have expected together with “Man”, and the fact that “Grandson” is the
only cluster representing children. This might indicate a bias in the data, but also shows that more
research is needed into utility measures and alternative methods for computing the clustering. Note that
the most general cluster “Thing” is needed to allow for higher relative utility of the other clusters.</p>
      <p>The next experiment aimed at getting a first idea of the feasibility of ℰℒ clustering, and used
ontologies from the OWL EL materialization track of the OWL reasoner competition 2015 [24, 25]. This
corpus contains 110 ontologies in the OWL EL profile with non-empty ABoxes. We removed ABox
axioms that were not supported by our method (e.g. equivalence assertions or assertions involving
complex concepts), which left 16 ontologies with an empty ABox and thus 94 ontologies to
be used in the experiment. For each ontology, we computed the summarizing simulation graphs and
attempted to compute 2-complete simulation graphs, from which we then extracted ℰℒ clusterings of size
10 using the greedy algorithm. We used a time limit of 10 minutes for each ontology, which resulted in
12 timeouts. The detailed results are shown in the appendix. Section 5 visualizes how the ABox sizes
relate to the sizes of the 2-complete graphs and to the computation times. The number of nodes in
the summarizing simulation graph is often much smaller than the number of individuals, and the
computation of clusters, even though so far only the summarizing step is really optimized, is possible in
a short time in most cases. Quite remarkably, many very large ABoxes resulted in summarizing simulation
graphs with under 10 nodes, including the largest one with almost 750,000 individuals, which had only
5 nodes in the summarizing graph. This indicates the potential of summarizing simulation graphs also
for speeding up ABox reasoning.</p>
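      <p>Why such collapses occur can be illustrated with a small bisimulation-style refinement sketch. The paper's summarizing simulation graphs are based on simulations and differ in detail; the function <monospace>summarize</monospace>, its interface, and the toy data below are illustrative only. Individuals start grouped by their concept labels, and groups are split by their (role, target-group) edges until the partition is stable, so highly regular ABoxes end up with very few summary nodes.</p>

```python
def summarize(labels, edges):
    # Bisimulation-style summarization sketch (illustrative, not the paper's
    # exact construction).  `labels` maps each individual to a frozenset of
    # concept names; `edges` is a list of (source, role, target) role
    # assertions.  Returns a mapping from individuals to block ids of the
    # stable partition.
    colour = dict(labels)
    while True:
        # signature of a node: its current block plus its outgoing
        # (role, target-block) edges
        signature = {
            v: (colour[v],
                frozenset((r, colour[t]) for (s, r, t) in edges if s == v))
            for v in labels
        }
        # renumber the distinct signatures into fresh block ids
        ids, new = {}, {}
        for v in sorted(labels):
            if signature[v] not in ids:
                ids[signature[v]] = len(ids)
            new[v] = ids[signature[v]]
        if len(ids) == len(set(colour.values())):
            return new  # no block was split: partition is stable
        colour = new
```

      <p>On a toy ABox with two parents each having one child, the two parents land in one summary node and the two children in another, regardless of how many such families the ABox contains.</p>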
    </sec>
    <sec id="sec-4">
      <title>6. Conclusion</title>
      <p>
        While our summarization method is already quite efficient, we want to investigate more dedicated
algorithms for computing the clusterings. One option is to integrate the computation of products
directly into the summarization implementation, to be able to compute products more efficiently on the
different levels of the simulation graphs. Another way could be to use an incremental approach as done
in the conceptual clustering algorithm DL-Link [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Our selection of good clusterings is currently done
in a greedy fashion; we want to investigate whether encodings into answer set programming might
allow us to compute globally optimal solutions. Future evaluations should look into different corpora
than the ORE corpus, for instance knowledge graphs without an ontology. To understand whether our
utility function makes sense and to compare it with alternatives, user studies will be required.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
      <p>[12] D. H. Fisher, Knowledge acquisition via incremental conceptual clustering, Machine Learning 2 (1987) 139–172.
[13] F. Distel, Learning description logic knowledge bases from data using methods from formal concept analysis, Ph.D. thesis, Dresden University of Technology, 2011. URL: https://nbn-resolving.org/urn:nbn:de:bsz:14-qucosa-70199.
[14] D. Borchmann, F. Distel, F. Kriegel, Axiomatisation of general concept inclusions from finite interpretations, J. Appl. Non Class. Logics 26 (2016) 1–46. URL: https://doi.org/10.1080/11663081.2016.1168230. doi:10.1080/11663081.2016.1168230.
[15] F. Kriegel, Efficient axiomatization of OWL 2 EL ontologies from data by means of formal concept analysis, in: M. J. Wooldridge, J. G. Dy, S. Natarajan (Eds.), Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2024, February 20-27, 2024, Vancouver, Canada, AAAI Press, 2024, pp. 10597–10606. URL: https://doi.org/10.1609/aaai.v38i9.28930. doi:10.1609/AAAI.V38I9.28930.
[16] R. Guimarães, A. Ozaki, C. Persia, B. Sertkaya, Mining ℰℒ⊥ bases with adaptable role depth, J. Artif. Intell. Res. 76 (2023) 883–924. URL: https://doi.org/10.1613/jair.1.13777. doi:10.1613/JAIR.1.13777.
[17] F. Hodo, S. Pranav, B. Sertkaya, Clustering knowledge graphs using concept lattices (extended abstract), in: O. Kutz, C. Lutz, A. Ozaki (Eds.), Proceedings of the 36th International Workshop on Description Logics (DL 2023) co-located with the 20th International Conference on Principles of Knowledge Representation and Reasoning and the 21st International Workshop on Non-Monotonic Reasoning (KR 2023 and NMR 2023), Rhodes, Greece, September 2-4, 2023, volume 3515 of CEUR Workshop Proceedings, CEUR-WS.org, 2023. URL: https://ceur-ws.org/Vol-3515/abstract-12.pdf.
[18] F. Martel, A. Zouaq, Taxonomy extraction using knowledge graph embeddings and hierarchical clustering, in: C. Hung, J. Hong, A. Bechini, E. Song (Eds.), SAC '21: The 36th ACM/SIGAPP Symposium on Applied Computing, Virtual Event, Republic of Korea, March 22-26, 2021, ACM, 2021, pp. 836–844. URL: https://doi.org/10.1145/3412841.3441959. doi:10.1145/3412841.3441959.
[19] L. Guo, Q. Dai, Graph clustering via variational graph embedding, Pattern Recognit. 122 (2022) 108334. URL: https://doi.org/10.1016/j.patcog.2021.108334. doi:10.1016/J.PATCOG.2021.108334.
[20] Q. Wang, Z. Mao, B. Wang, L. Guo, Knowledge graph embedding: A survey of approaches and applications, IEEE Trans. Knowl. Data Eng. 29 (2017) 2724–2743. URL: https://doi.org/10.1109/TKDE.2017.2754499. doi:10.1109/TKDE.2017.2754499.
[21] P. Koopmann, R. van Bakel, M. Cochez, Towards conceptual clustering in ℰℒ with simulation graphs — experimental data and extended version, 2025. URL: https://doi.org/10.5281/zenodo.16794054. doi:10.5281/zenodo.16794054.
[22] N. Nikitina, P. Koopmann, Small is beautiful: Computing minimal equivalent ℰℒ concepts, in: S. Singh, S. Markovitch (Eds.), Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA, AAAI Press, 2017, pp. 1206–1212. URL: https://doi.org/10.1609/aaai.v31i1.10684. doi:10.1609/AAAI.V31I1.10684.
[23] R. Paige, R. E. Tarjan, Three partition refinement algorithms, SIAM Journal on Computing 16 (1987) 973–989. URL: https://doi.org/10.1137/0216062. doi:10.1137/0216062.
[24] B. Parsia, N. Matentzoglu, R. S. Gonçalves, B. Glimm, A. Steigmiller, The OWL reasoner evaluation (ORE) 2015 competition report, J. Autom. Reason. 59 (2017) 455–482. URL: https://doi.org/10.1007/s10817-017-9406-8. doi:10.1007/S10817-017-9406-8.
[25] N. Matentzoglu, B. Parsia, ORE 2015 reasoner competition corpus, 2015. URL: https://doi.org/10.5281/zenodo.18578. doi:10.5281/zenodo.18578.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] B. Konev, C. Lutz, A. Ozaki, F. Wolter, Exact learning of lightweight description logic ontologies, J. Mach. Learn. Res. 18 (2017) 201:1–201:63. URL: https://jmlr.org/papers/v18/16-256.html.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] M. R. C. Duarte, B. Konev, A. Ozaki, ExactLearner: A tool for exact learning of ℰℒ ontologies, in: M. Thielscher, F. Toni, F. Wolter (Eds.), Principles of Knowledge Representation and Reasoning: Proceedings of the Sixteenth International Conference, KR 2018, Tempe, Arizona, 30 October - 2 November 2018, AAAI Press, 2018, pp. 409–414. URL: https://aaai.org/ocs/index.php/KR/KR18/paper/view/18006.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] J. Lehmann, DL-Learner: Learning concepts in description logics, J. Mach. Learn. Res. 10 (2009) 2639–2642. URL: https://dl.acm.org/doi/10.5555/1577069.1755874. doi:10.5555/1577069.1755874.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] G. Rizzo, N. Fanizzi, C. d'Amato, Class expression induction as concept space exploration: From DL-Foil to DL-Focl, Future Gener. Comput. Syst. 108 (2020) 256–272. URL: https://doi.org/10.1016/j.future.2020.02.071. doi:10.1016/J.FUTURE.2020.02.071.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] S. Heindorf, L. Blübaum, N. Düsterhus, T. Werner, V. N. Golani, C. Demir, A. N. Ngomo, EvoLearner: Learning description logics with evolutionary algorithms, in: F. Laforest, R. Troncy, E. Simperl, D. Agarwal, A. Gionis, I. Herman, L. Médini (Eds.), WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25 - 29, 2022, ACM, 2022, pp. 818–828. URL: https://doi.org/10.1145/3485447.3511925. doi:10.1145/3485447.3511925.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] A. N. Ngomo, C. Demir, N. J. Kouagou, S. Heindorf, N. Karalis, A. Bigerl, Class expression learning with multiple representations, in: P. Hitzler, M. K. Sarker, A. Eberhart (Eds.), Compendium of Neurosymbolic Artificial Intelligence, volume 369 of Frontiers in Artificial Intelligence and Applications, IOS Press, 2023, pp. 272–286. URL: https://doi.org/10.3233/FAIA230145. doi:10.3233/FAIA230145.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] B. ten Cate, M. Funk, J. C. Jung, C. Lutz, SAT-based PAC learning of description logic concepts, in: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 19th-25th August 2023, Macao, SAR, China, ijcai.org, 2023, pp. 3347–3355. URL: https://doi.org/10.24963/ijcai.2023/373. doi:10.24963/IJCAI.2023/373.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] V. Sazonau, U. Sattler, G. Brown, General terminology induction in OWL, in: M. Arenas, O. Corcho, E. Simperl, M. Strohmaier, M. d'Aquin, K. Srinivas, P. Groth, M. Dumontier, J. Heflin, K. Thirunarayan, S. Staab (Eds.), The Semantic Web - ISWC 2015, Springer International Publishing, Cham, 2015, pp. 533–550.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] C. d'Amato, S. Staab, N. Fanizzi, F. Esposito, DL-Link: a conceptual clustering algorithm for indexing description logics knowledge bases, Int. J. Semantic Comput. 4 (2010) 453–486. URL: https://doi.org/10.1142/S1793351X10001085. doi:10.1142/S1793351X10001085.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] R. S. Michalski, R. E. Stepp, Learning from observation: Conceptual clustering, Machine Learning: An Artificial Intelligence Approach (1983) 331–363.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] A. P. Suárez, J. F. M. Trinidad, J. A. Carrasco-Ochoa, A review of conceptual clustering algorithms, Artif. Intell. Rev. 52 (2019) 1267–1296. URL: https://doi.org/10.1007/s10462-018-9627-1. doi:10.1007/S10462-018-9627-1.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>