<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Induction of Terminological Cluster Trees</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Preliminaries</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Model</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Method</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Perspectives</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Giuseppe Rizzo</institution>
          ,
          <addr-line>Claudia d'Amato, Nicola Fanizzi, and Floriana Esposito</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Via Orabona 4</institution>
          ,
          <addr-line>70125, Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we tackle the problem of clustering individual resources in the context of the Web of Data, that is characterized by a huge amount of data published in a standard data model with a well-de ned semantics based on Web ontologies. In fact, clustering methods o er an e ective solution to support a lot of complex related activities, such as ontology construction, debugging and evolution, taking into account the inherent incompleteness underlying the representation. Web ontologies already encode a hierarchical organization of the resources by means of the subsumption hierarchy of the classes, which may be expressed explicitly, with proper subsumption axioms, or it must be detected indirectly, by reasoning on the available axioms that dene the classes (classi cation). However it frequently happens that such classes are sparsely populated as the hierarchy often re ect a view of the knowledge engineer prior to the actual introduction of assertions involving the individual resources. As a result, very general classes are often loosely populated, but this may happen also to speci c subclasses, making it more di cult to check the types of a resource (instance checking ), even through reasoning services. Among the large number of algorithms proposed in the Machine Learning literature, we propose a clustering method that is able to organize groups of resources hierarchically. Specifically, in this work, we introduce a conceptual clustering approach that combines a distance measure between individuals in a knowledge base in a divide-and-conquer solution that is intended to elicit ex post the underlying hierarchy based on the actual distributions of the instances.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction and Motivation</title>
      <p>
        With the growth of the Web of Data, along with the Linked Data initiative [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ],
a large number of datasets/vocabularies are being published on the ground of a
standard data model and connected within a uniform semantic space exploiting
RDF and the Web infrastructure. However, this huge quantity of data is also
known as inherently uncertain, for being often inconsistent /con ictual, vague
and, especially, incomplete [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>Web ontologies already encode a hierarchical organization of the data by
means of the subsumption hierarchy of the classes, which may be expressed
explicitly, with proper subsumption axioms, or it has to be determined indirectly,
by reasoning on the available axioms de ning the various classes (classi cation
task). However it frequently happens that such classes are sparsely populated
as the hierarchy often re ects a view of the knowledge engineer prior to the
actual introduction of assertions involving the individual resources. As a result,
very general classes are often loosely populated, but this may happen also to
speci c subclasses, hindering or, at least, making it di cult to check the types
of a resource (instance checking), even employing reasoning services. This may
have a strong negative impact on services based on query answering, which often
only rely on explicit assertions regarding the data.</p>
      <p>
        Machine Learning methods can be employed to elicit implicit pieces of
knowledge from the datasets, also in the face of such cases of inherent incompleteness,
and provide non-standard inference services [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. In particular, in this paper we
will address conceptual clustering [18] as a means to exploit the data to detect
further classes that may arise with the underlying hierarchical structure.
Specifically, we introduce an approach that combines semantic distance measures over
the space of individuals together with a divide-and-conquer solution that is
intended to elicit in retrospection an additional class hierarchy to re ect the actual
distributions of the instances.
      </p>
      <p>
        Clustering is an unsupervised learning task aiming at partitioning a
collection of objects into subsets or clusters, so that those within each cluster are
more closely related to one another than to the objects assigned to di erent
clusters [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In the Web of Data context, clustering can enable the de nition of
new emerging concepts (concept formation) on the grounds of those employed in
a knowledge base; supervised methods can exploit these clusters to induce new
concept de nitions or to re ne existing ones (ontology evolution); intensionally
de ned groupings may speed-up the task of search and discovery; clustering may
also induce criteria for ranking the retrieved resources [
        <xref ref-type="bibr" rid="ref10 ref9">10, 9</xref>
        ].
      </p>
      <p>An object is usually described by a xed set of feature (attribute values )
and the most common notion of similarity between the objects is expressed in
terms of a distance function based on this set; for example, datasets made up of
objects described by tuples of numeric features and (extensions of) the Euclidean
distance are adopted to determine object and cluster similarity.</p>
      <p>
        An important di erence among the various clustering techniques is related
to the type of membership that is adopted. In the simplest (crisp) case, e.g.
kMeans [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], cluster membership can be exclusive: each object belongs exactly to
one cluster. Extensions, such as fuzzy c-means [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] or EM [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], admit an overlap
between the clusters and a degree of membership (responsibility ) of each object
to a cluster. Further extensions include non- at clustering structures such as
those induced via hierarchical clustering [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>In this work, moving on to conceptual clustering, we will require that classes
with an intensional de nition may account for and de ne these clusters. To
this purpose the algorithms should be able to exploit a background knowledge
that can be expressed using expressive representation languages1, such as
Description Logics. Again, in such methods, clustering requires the de nition of
(dis)similarity measure between the set of the objects to be clustered.</p>
      <p>
        We propose a generalized solution based on logical trees [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], namely
Terminological Cluster Trees, as an extension of terminological decision trees [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. They
adopt a pseudo-metric de ned over the space of individuals as a criterion to
separate groups of objects rather than information gain or other notions of purity
devised for supervised concept learning and inductive classi cation methods,
terminological decision trees. The proposed solution provides intensional de nitions
that can be used for describing the individuals in the cluster, and unlike other
methods [
        <xref ref-type="bibr" rid="ref11 ref13">13, 11</xref>
        ], does not have to resort to complex approximations such as the
most speci c concept [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] as the representative of individuals on the conceptual
level. Even more so, terminological cluster trees can determine autonomously
the number of clusters to be generated, which is a required parameter for several
other methods having a strong impact on the quality of the partitions obtained.
      </p>
      <p>The rest of the paper is organized as follows: the next section illustrates
the basic notions about the underlying Description Logics foundations for the
intended representation and reasoning services; Sect. 3 introduces the problem
of clustering individuals of a knowledge base while Sect. 4 presents the approach
required for inducing terminological cluster trees. Finally, Sect. 5 illustrates the
conclusions and some possible extensions.
2</p>
    </sec>
    <sec id="sec-2">
      <title>Basics</title>
      <p>
        In the following, we will borrow notation and terminology from Description
Logics (DLs) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] as the representation and reasoning services for the knowledge
bases in the Web of Data ultimately rely on such a family of languages. Hence
we will use the terms concept (description) and role as synonyms of class and
property 2, respectively.
      </p>
      <p>In these languages, a domain is modeled through atomic concepts (classes)
NC and roles (relations) NR, which can be used to build complex descriptions
regarding individuals (instances, objects), by using speci c operators (complement,
conjunction and disjunction between concepts) that depend on the adopted
language. A knowledge base is a couple K = (T ; A) where the TBox T contains
axioms concerning concepts and roles (typically inclusion axioms such as C v D)
and the ABox A contains assertions, i.e. axioms regarding the individuals (C(a),
resp. R(a; b)). The set of individuals occurring in A is denoted by Ind(A).</p>
      <p>The semantics of individuals, concepts, and roles is de ned through
interpretations. An interpretation is a couple I = ( I ; I ) where I is the domain of
the interpretation and I is a mapping such that, for each individual a, aI 2 I ,
for each concept C, CI I and for each role R, RI I I . The semantics
1 In the past, various fragments of First-Order Logic have been adopted, such as</p>
      <p>Clausal Logics, especially in Inductive Logic Programming.
2 Datatype properties, i.e. roles ranging on concrete domains will not be considered
in this study.
of complex descriptions descends from the interpretation of the primitive
concepts/roles and of the operators employed, depending on the adopted language.
I satis es an axiom C v D (C is subsumed by D) when CI DI and an
assertion C(a) (resp. R(a; b)) when aI 2 CI (resp. (aI ; bI ) 2 RI ). I is a model
for K i it satis es each axiom/assertion in K, denoted with I j= . When
is satis ed w.r.t. these models, we write K j= .</p>
      <p>We will be interested in the instance-checking inference service: given an
individual a and a concept description C determine if K j= C(a). Due to the
Open World Assumption (OWA), answering to a class-membership query is more
di cult w.r.t. ILP settings where the closed-world reasoning is the standard
form. Indeed, one may not be able to prove the truth of either K j= C(a) or
K j= :C(a), as there may be possible to nd di erent interpretations that satisfy
either cases.
3</p>
    </sec>
    <sec id="sec-3">
      <title>Conceptual Clustering for DL Knowledge Bases</title>
      <p>As we are targeting conceptual clustering for DL Knowledge Bases, the problem,
in a simple formulation, may be formalized as follows:</p>
      <sec id="sec-3-1">
        <title>De nition 3.1 (conceptual clustering { at case).</title>
      </sec>
      <sec id="sec-3-2">
        <title>Given:</title>
      </sec>
      <sec id="sec-3-3">
        <title>Find:</title>
        <p>{ a knowledge base K = hT ; Ai
{ a set of training individuals TI</p>
        <p>Ind(A)
{ a partition of TI in n pairwise disjoint clusters fC1; : : : ; Cng
{ for each i = 1; : : : ; n, a concept description Di that accounts for Ci, i.e.
such that 8a 2 Ci :
1. K j= Di(a)
2. K j= :Dj (a)</p>
        <p>8j 2 f1; : : : ; ng; j 6= i</p>
        <p>
          Note that in this setting the number of clusters (n) is not required as a
parameter. Condition 2. may be relaxed (e.g. K 6j= Dj (a)) to allow some overlap
between the clusters/concepts that may be further extended towards
probabilistic clustering methods and models [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>This problem can be regarded as a recursive one, as each cluster, in its
turn, might yield its internal partitioning and each sibling sub-cluster would be
characterized intensionally by its sub-class.</p>
        <p>The decision on whether to partition recursively a given cluster or not
generally depends on cohesion metrics, assessing a measure of intra-cluster similarity
(within the cluster) and the w.r.t. the inter-cluster dissimilarity (w.r.t. the
sibling partitions).</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Terminological Cluster Trees</title>
      <p>
        The notion of terminological cluster tree extends logical clustering trees
introduced in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and learned through C0.5, a system derived from Tilde [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The
induction of the model combines elements of logical decision trees induction (i.e.
the approach based on recursive partitioning and exploiting of re nement
operator for specializing concept descriptions) with other elements of instance-based
learning (i.e. the employment of a distance measure over the instance space).
De nition 4.1 (Terminological Cluster Trees). Given a knowledge base K,
a terminological cluster tree (TCT) is a (binary) logical tree where:
{ each leaf node stands for a cluster of individuals, C
{ each node contains a concept description D (over the signature of K);
{ each edge from an internal node corresponds to the outcome of the
membership test of individuals w.r.t. the concept installed in the node3.
      </p>
      <p>Hence, a tree-node can be represented by a quadruple hD; C; Tleft; Trighti,
indicating the two subtrees connected by either edge.</p>
      <p>Person</p>
      <sec id="sec-4-1">
        <title>Person u 9hasPublication:&gt;</title>
        <p>C4</p>
      </sec>
      <sec id="sec-4-2">
        <title>Person u 9hasPublication:(SWJ)</title>
        <p>C3
C1</p>
        <p>C2
Example 4.1 (a TCT). Fig. 1 illustrates a simple example of a TCT for
describing individuals in the academic domain. The root node of the tree contains the
concept Person. Two edges depart from this node: the left branch is used to
denote a positive membership of an individual w.r.t. the concept Person, while
3 By convention the left branch is for positive instances and the right one is for negative
instances.
the right branch denote a negative membership w.r.t. the concept. On one hand,
the right child of this node is a leaf that contains a cluster C4 composed by
individuals that are not instances of the concept Person. On the other hand, the
left child of the root node contains a further complex concept description. Again,
there are two edges departing from this node: the right edge links the node to
another leaf containing the cluster C3 of individuals denoting individuals that
have no publication. tu</p>
        <p>The details of the algorithms for (a) growing a TCT and (b) deriving
intensional de nitions are reported in the sequel.
4.1</p>
        <p>
          A Method for Inducing Terminological Cluster Trees
A terminological cluster tree T is induced by means of a recursive strategy, which
follows the same schema proposed for terminological decision trees (TDTs) [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
The sketch is reported in Alg. 1.
        </p>
        <p>Algorithm 1 Routines for inducing a TCT</p>
        <p>The main routine is to be invoked as induceTCT(TI; C), where C may be
&gt; or any other general concept the individuals in TI are known to belong to by
default.</p>
        <p>In this recursive algorithm, the base case depends on a test (via
stopCondition) on a threshold over the heuristics employed for growing the tree, measuring
the cohesion of the cluster of individuals routed to the node in terms of a given
metric. If this value exceeds then the branch is marked as completed and the
cluster is stored in a leaf node. Further details about the heuristics and the stop
condition will be reported later on.</p>
        <p>In the inductive step, the current (parent) concept description C has to be
specialized by means of an re nement operator ( ) exploring the search space of
specializations of C. A set S of candidate specializations (C) is obtained. For
each E 2 S, the set of positive and negative individuals, denoted, resp., by P
and N are retrieved.</p>
        <p>A tricky situation may occur (line 24) when either N or P is empty for a
given concept (e.g. due to a total lack of disjointness axioms). In such a case,
the algorithm can re-assign individuals I to N (resp. P) based on the distance
between them and the prototype of P (resp. N) when this goes beyond a given
threshold .</p>
        <p>For P and N, a representative element is determined as a prototype, i.e. their
medoid, a central element in a cluster, having a minimal average distance w.r.t.
the other elements. Then, function selectBestConcept evaluates the
specializations of the concept according to the closeness w.r.t. the medoids, determined
according to a given distance measure (discussed later). The best concept E 2 S
is the one for which the distance between the medoids for positive and negative
instance set is maximized. Then E is installed in the current node.</p>
        <p>After the assessment of the best concept E , the individuals are partitioned
by split to be routed along the left or right branch. Di erently from TDTs,
the routine does not decide the branch where the individuals will be sorted
according to a concept membership test (instance check): rather it decides to
split individuals according to the distance w.r.t. the prototypes of positive and
negative membership w.r.t. E , i.e. the medoids of P and N. This
divide-andconquer strategy is applied recursively until the instances routed to a node satisfy
the stop condition. Note that, the number of the clusters is not required as an
input but it depends on the number of paths generated in the growing phase:
the algorithm is able to determine it naturally following the data distribution.
Re nement Operator The proposed approach relies on a downward re
nement operator that can generate the concepts to be installed in child-nodes
performing a specialization process on the concept, say C, installed in a
parentnode:
1 by adding a concept atom (or its complement) as a conjunct: C0 = C u (:)A;
2 by adding a general existential restriction (or its complement) as a conjunct:</p>
        <p>C0 = C u (:)(9)R:&gt;;
3 by adding a general universal restriction (or its complement) as a conjunct:</p>
        <p>C0 = C u (:)(8)R:&gt;;
4 by replacing a sub-description Ci in the scope of an existential restriction in</p>
        <p>C with one of its re nements: 9R:Ci0 2 (9R:Ci) ^ Ci0 2 (Ci);
5 by replacing a sub-description Ci in the scope of a universal restriction with
one of its re nements: 8R:Ci0 2 (8R:Ci) ^ Ci0 2 (Ci).</p>
        <p>Note that the cases of 4 and 5 are recursive.</p>
        <p>
          Prototypes Despite the common schema of the algorithms employed for
growing TCTs and TDTs, the latter are obtained by selecting the best test in terms
of information gain maximization [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], while the clustering procedure for TCTs
resorts to a distance-based criterion on the individuals in the knowledge base.
Speci cally, the heuristic adopted for selecting the best concept description that
will be installed as new node can be de ned as follows:
        </p>
        <p>E
= arg max d (p(P); p(N))</p>
        <p>
          D2 (C)
where P and N are sub-clusters obtained from splitting I w.r.t. D, d( ; ) is a
distance measure between individuals and p( ) is a function which maps a set of
individuals to its prototype. As previously mentioned, the algorithm computes
the medoids of both the set of positive and negative instances w.r.t. the test.
Distance Measure The computation of medoids requires a (possibly)
languageindependent measure for individuals whose de nition should capture aspects of
their semantics in the context of the knowledge base. However individuals don't
have an algebraic structure that can be exploited directly. In the TCT induction
algorithm a language-independent dissimilarity measure [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] has been adopted.
        </p>
        <p>Given a knowledge base K, the idea is to use the behavior of an individual
w.r.t. a set of concepts C = fC1; C2; : : : ; Cmg that is dubbed context or committee
of features. After C has been chosen, a projection function for each Ci 2 C can
be de ned as a mapping i : Ind(A) ! f0; 12 ; 1g such that
8 a 2 Ind(A)
i(a) =
8 1 if K j= Ci(a)
&lt;</p>
        <p>
          0 if K j= :Ci(a)
: 12 otherwise
For the value of i associated to the uncertain membership case (owing to the
OWA) we adopted a uniform prior probability. It can be set more accurately
according to the membership to Ci, i : Ind(A) ! [0; 1], with x 2 Ind(A) 7!
i(x) = P[K j= Ci(x)], if such a value can be estimated. For example, for densely
populated ontologies this may be estimated as jrA(Ci)j=jInd(A)j, where rA()
indicates the retrieval of the argument concept w.r.t. A, i.e. the set of individuals
in that are known to belong to the concept [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]: rA(C) = fa 2 Ind(A) j K j=
C(a)g. For largely populated concept C, this may be estimated on the ground
of the assertions contained in A avoiding the bottleneck of reasoning.
        </p>
        <p>
          Through the projection function, it is possible to de ne a family of distance
measures fdpCgp2N for individuals as follows: dpC : Ind(A) Ind(A) ! [0; 1] such
that
In previous papers, e.g. [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], we also used the form: ppPi wij[ i(a) i(b)jp.
The vector of weights w can be set according to the entropy of the considered
concepts computed over the individuals occurring in K [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ].
        </p>
        <p>To speed up the algorithm, the projection functions can be pre-computed
(once) before the learning phase.</p>
        <p>Stop Condition The growth of a TCT can be stopped by resorting to a
heuristic that is similar to the one employed for selecting the best concept description.
This can be made by introducing a threshold 2 [0; 1], for the value of d( ; ). If
the value is lower than the threshold, the branch growth is stopped.
4.2</p>
        <p>Extracting Concepts from TCTs
Alg. 2 reports the sketch of the function for deriving the concept descriptions
describing the clusters obtained through a TCT. Given a TCT T , essentially
Algorithm 2 Routines for deriving concept de nitions from a TCT
1 function DeriveConcepts(C; T ): set of concepts
2 input C: concept name
3 T : TCT
4 begin
5 let T = hD; I; Tleft; Trighti
6 if Tleft = Tright = null then f leaf g
7 return fCg
8 else
9 CSleft DeriveConcepts(C u D; Tleft)
10 CSright DeriveConcepts(:C u :D; Tright)
11 return (CSleft [ CSright)
12 end
function deriveConcepts traverses the tree structure to collect the concept
descriptions that are used as parents of the leaf-nodes. In this phase, it generates
a set of concept descriptions CS.</p>
        <p>Example 4.2. The set CS that can be obtained from the tree reported in Fig. 1
contains the following concept descriptions (corresponding to the four clusters):
CS = f D1; D2; D3; D4 g =
f Person u 9hasPublication:&gt; u 9hasPublication:(SWJ);
(Person u 9hasPublication:&gt;) u :(Person u 9hasPublication:(SWJ));
Person u :(Person u 9hasPublication:&gt;);
:Person g
Such concepts might be also simpli ed before being output.
tu</p>
        <p>Note that also the internal nodes contain concept descriptions that account
for the individuals routed to such nodes. Hence it is straightforward to extend the
extraction procedure, in order to produce a list of subsumption axioms between
couples of the node concepts which might be submitted to a knowledge engineer
for validation.
5</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusions, Ongoing and Future Work</title>
      <p>
        In this work, we have proposed an extension of terminological decision trees [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
which were originally employed for (supervised) concept learning, in order to
solve (unsupervised) conceptual clustering problems in the context of the datasets
belonging to the Web of Data.
      </p>
      <p>The algorithm essentially adopts a divide-and-conquer strategy that
generates concept descriptions to be installed in the inner nodes via a re nement
operator and selects the most promising ones to represent clusters of similar
individuals using a suitable distance measure.</p>
      <p>
        This preliminary work can be extended along various possible directions
(some extensions are already being carried out), which can be listed as follows:
{ a comparison to other clustering methods in order to understand the
feasibility of the proposed solution;
{ new distance measures between the individuals : in this work we adopted a
language-independent distance measure between the individuals of a
knowledge base. In this perspective, it may be interesting to investigate other
distance measures (e.g. less computationally expensive functions);
{ new re nement operators : we adopted a re nement operator used for solving
supervised learning problems. Further extensions of this work may consider
new re nement operators that can be borrowed from other machine learning
algorithms, e.g. DL-Learner [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] or other methods devised for the speci c
method;
{ new heuristics : the approach installs concept descriptions so that the overlap
between the corresponding sets of positive and negative individuals is
minimized. It may be interesting to investigate a di erent heuristic that allows
to quantify the degree of overlap between the two set of individuals;
{ discovery of disjointness axioms in order to enrich a knowledge base through
the induction of terminological cluster trees and evaluation of the approach;
{ strategies to obtain simpler trees : we plan to investigate the e ectiveness of
post-pruning procedures
{ inducing di erent kind of clusters : we focused on crisp clustering in this
work. Another possible extension concerns the integration of theories for
uncertainty management such as probabilistic models or the Dempster-Shafer
theory allowing overlapping clusters;
{ scalability : another extension is to adopt solutions to make the proposed
method scalable. Some possible solutions span from the implementation of
distributed version of TCT induction algorithm and the employment of
approximate reasoners in order to cope with the computational costs related
to the underlying reasoning services adopted by the algorithm.
18. Stepp, R.E., Michalski, R.S.: Conceptual Clustering: Inventing Goal Oriented
Classi cations of Structured Objects. In: Machine Learning: An Arti cial Intelligence
Approach, Vol II. Morgan Kaufmann (1986)
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Aggarwal</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reddy</surname>
            ,
            <given-names>C.K.</given-names>
          </string-name>
          :
          <article-title>Data Clustering: Algorithms and Applications</article-title>
          . Chapman &amp; Hall/CRC, 1st edn. (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Baader</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Calvanese</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McGuinness</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nardi</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Patel-Schneider</surname>
            ,
            <given-names>P</given-names>
          </string-name>
          . (eds.):
          <article-title>The Description Logic Handbook</article-title>
          . Cambridge University Press, 2nd edn. (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bezdek</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          :
          <article-title>Pattern Recognition with Fuzzy Objective Function Algorithms</article-title>
          . Kluwer Academic Publishers (
          <year>1981</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Blockeel</surname>
          </string-name>
          , H.,
          <string-name>
            <surname>De Raedt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Top-down induction of rst-order logical decision trees</article-title>
          .
          <source>Artif. Intell</source>
          .
          <volume>101</volume>
          (
          <issue>1-2</issue>
          ),
          <volume>285</volume>
          {
          <fpage>297</fpage>
          (
          <year>1998</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>d'Amato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fanizzi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esposito</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Query Answering and Ontology Population: An Inductive Approach</article-title>
          . In: Bechhofer,
          <string-name>
            <surname>S.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of ESWC 2008. LNCS</source>
          , vol.
          <volume>5021</volume>
          , pp.
          <volume>288</volume>
          {
          <fpage>302</fpage>
          . Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>De Raedt</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blockeel</surname>
          </string-name>
          , H.:
          <article-title>Using logical decision trees for clustering</article-title>
          . In: Lavrac, N., Dzeroski, S. (eds.)
          <source>Proceedings of ILP 1997. LNAI</source>
          , vol.
          <volume>1297</volume>
          , pp.
          <volume>133</volume>
          {
          <fpage>140</fpage>
          . Springer (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Dempster</surname>
            ,
            <given-names>A.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laird</surname>
            ,
            <given-names>N.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rubin</surname>
            ,
            <given-names>D.B.</given-names>
          </string-name>
          :
          <article-title>Maximum likelihood from incomplete data via the EM algorithm</article-title>
          .
          <source>Journal of the Royal Statistical Society, Series B</source>
          <volume>39</volume>
          (
          <issue>1</issue>
          ),
          <volume>1</volume>
          {
          <fpage>38</fpage>
          (
          <year>1977</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Fanizzi</surname>
          </string-name>
          , N.,
          <string-name>
            <surname>d'Amato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esposito</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Induction of Concepts in Web Ontologies through Terminological Decision Trees</article-title>
          . In: Balcazar,
          <string-name>
            <surname>J.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of ECML/PKDD2010. LNAI</source>
          , vol.
          <volume>6321</volume>
          , pp.
          <volume>442</volume>
          {
          <fpage>457</fpage>
          . Springer (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Fanizzi</surname>
          </string-name>
          , N.,
          <string-name>
            <surname>d'Amato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>A Hierarchical Clustering Method for Semantic Knowledge Bases</article-title>
          . In: Apolloni,
          <string-name>
            <given-names>B.</given-names>
            ,
            <surname>Howlett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.J.</given-names>
            ,
            <surname>Jain</surname>
          </string-name>
          , L.C. (eds.)
          <source>Proceedings of KES</source>
          <year>2007</year>
          ,
          <article-title>Part III</article-title>
          .
          <source>LNCS</source>
          , vol.
          <volume>4694</volume>
          , pp.
          <volume>653</volume>
          {
          <fpage>660</fpage>
          . Springer (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Fanizzi</surname>
          </string-name>
          , N.,
          <string-name>
            <surname>d'Amato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esposito</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <source>Evolutionary Conceptual Clustering Based on Induced Pseudo-Metrics. Int. J. Semantic Web Inf. Syst</source>
          .
          <volume>4</volume>
          (
          <issue>3</issue>
          ),
          <volume>44</volume>
          {
          <fpage>67</fpage>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Fanizzi</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iannone</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Palmisano</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Semeraro</surname>
          </string-name>
          , G.:
          <article-title>Concept Formation in Expressive Description Logics</article-title>
          . In: Boulicaut,
          <string-name>
            <surname>J.</surname>
          </string-name>
          , et al. (eds.)
          <source>Proceedings of ECML 2004. LNAI</source>
          , vol.
          <volume>3201</volume>
          , pp.
          <volume>99</volume>
          {
          <fpage>110</fpage>
          . Springer (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Heath</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bizer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Linked Data: Evolving the Web into a Global Data Space</article-title>
          .
          <source>Synthesis Lectures on the Semantic Web</source>
          , Morgan &amp; Claypool Publishers (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Kietz</surname>
            ,
            <given-names>J.U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Morik</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          :
          <article-title>A Polynomial Approach to the Constructive Induction of Structural Knowledge</article-title>
          .
          <source>Mach Learn</source>
          <volume>14</volume>
          ,
          <issue>193</issue>
          {
          <fpage>217</fpage>
          (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Laskey</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costa</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kokar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lukasiewicz</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Uncertainty Reasoning for the World Wide Web</article-title>
          .
          <source>Tech. rep., URW3</source>
          W3C Incubator Group (
          <year>2008</year>
          ), http://www.w3.org/2005/Incubator/urw3/XGR-urw3-
          <fpage>20080331</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Lehmann</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>DL-Learner: Learning Concepts in Description Logics</article-title>
          .
          <source>Journal of Machine Learning Research (JMLR) 10</source>
          ,
          <fpage>2639</fpage>
          {
          <fpage>2642</fpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>MacQueen</surname>
          </string-name>
          , J.B.:
          <article-title>Some Methods for Classi cation and Analysis of MultiVariate Observations</article-title>
          . In: Le Cam,
          <string-name>
            <given-names>L.M.</given-names>
            ,
            <surname>Neyman</surname>
          </string-name>
          ,
          <string-name>
            <surname>J</surname>
          </string-name>
          . (eds.)
          <source>Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability</source>
          . vol.
          <volume>1</volume>
          , pp.
          <volume>281</volume>
          {
          <fpage>297</fpage>
          . University of California Press (
          <year>1967</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Rettinger</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , Losch, U.,
          <string-name>
            <surname>Tresp</surname>
          </string-name>
          , V.,
          <string-name>
            <surname>d'Amato</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fanizzi</surname>
          </string-name>
          , N.:
          <article-title>Mining the Semantic Web - Statistical learning for next generation knowledge bases</article-title>
          .
          <source>Data Min. Knowl. Discov</source>
          .
          <volume>24</volume>
          (
          <issue>3</issue>
          ),
          <volume>613</volume>
          {
          <fpage>662</fpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>