<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Structural Similarity in Expressive Description Logics: An Extended Family of Kernels for OWL</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nicola Fanizzi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudia d'Amato</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Floriana Esposito</string-name>
          <email>espositog@di.uniba.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Concept Similarity in Ontologies</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Informatica, Universita degli studi di Bari</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the context of the Semantic Web many applications of inductive inference ultimately rely on a notion of similarity for the standard knowledge representations of the ontologies. We have tackled the problem of statistical learning with ontologies proposing a family of structural kernels for ALCN that has been integrated with support vector machines. Here we extend the de nition of the kernels to more expressive languages of the family, namely those backing OWL.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>
        As such the kernels are designed for comparing concept descriptions. In order
to apply them to instance comparison w.r.t. real ontologies, that likely exploit
the full extent of the most expressive DL languages, (upper) approximations of
the most speci c concepts that cover the single individuals had to be computed
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This exposes the method to a number of problems. Firstly, a (partially)
structural normal form may fail to fully capture the semantic similarity between the
individuals. Scaling up to more complex languages wold require speci c normal
forms. Moreover, the existence of a most speci c concept is not guaranteed [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Coupling valid kernels with e cient algorithms such as the support vector
machines, many tasks based on inductive classi cation can be tackled.
Particularly, we demonstrate how to perform important inferences on semantic
knowledge bases, namely concept retrieval and query answering. These tasks are
generally grounded on merely deductive procedures which easily fail in case of
(partially) inconsistent or incomplete knowledge. The methods has been shown to
work comparably well w.r.t. standard reasoners [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ], allowing the suggestion
of new knowledge that was not previously logically derivable.
      </p>
      <p>
        Moreover, these kernels naturally induce distance measures which allow for
extensions to further metric-based learning methods such as nearest-neighbor
classi cation [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] and conceptual clustering [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. However these extensions are
beyond the scope of this work.
2
      </p>
      <p>
        Preliminaries on Representation and Inference
The basics of the DLs will be brie y recalled. The reader may refer to the DL
handbook [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for a thorough reference. Such representations provide the basic
constructors adopted by the standard ontology languages employed in the SW,
such as the Ontology Markup Language OWL. We will focus on the ALCQ logic
which may represent a tradeo between expressiveness and reasoning e ciency.
2.1
      </p>
    </sec>
    <sec id="sec-2">
      <title>Knowledge Bases in Description Logics</title>
      <p>Let us consider a triple hNC ; NR; NI i made up. respectively, by a set of primitive
concept names NC , to be interpreted as sets of objects in a certain domain, a set
of primitive role names NR, to be interpreted as binary relationships between the
mentioned objects, and a set of individual names NI for the objects themselves.</p>
      <p>The semantics of the descriptions is de ned by an interpretation I = ( I ; I ),
where I is a non-empty set, the domain of the interpretation, and I is the
interpretation function that maps each individual a 2 NI to a domain object
aI 2 I , each primitive concept name A 2 NC to its extension AI I and
for each R 2 NR the extension is a binary relation RI I I .</p>
      <p>
        Complex descriptions can be built in ALCQ using the language constructors
listed in Table 1, along with their semantics derived from the interpretation of
atomic concepts and roles [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The top concept &gt; is interpreted as the whole
domain I , while the bottom concept ? corresponds to ;. Complex descriptions
can be built in ALCQ using the following constructors. The language supports
full negation: a concept negation :C has an extension that amounts to the
complement of CI w.r.t. the domain. The conjunction and disjunction of two
concepts are simply interpreted as the intersection and union of their extensions.
Concepts can be also de ned as restrictions on the roles. The existential
restriction constrains the elements of its extension to be related via a certain role R to
some instances of concept C while the value (or universal ) restriction comprises
those elements who are related through R only to instances of concept C (if
any). Finally, quali ed numeric restrictions de ne concepts by constraining its
instances with the minimal or maximal number of instances of concept C related
through R.
      </p>
      <p>
        OWL o ers further constructors that extend the expressiveness of this
language. Its DL equivalent is SHOIQ(D) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], that extends ALCQ with individual
classes, role hierarchies, transitive and inverse roles (I). Besides concrete
domains (D), i.e. well-founded external data types (such as numerical types, tuples
of the relational calculus, spatial regions, or time intervals), can be dealt with.
      </p>
      <p>The main inference employed with these representations is assessing whether
a concept subsumes another concept based on their semantics:</p>
    </sec>
    <sec id="sec-3">
      <title>De nition 2.1 (subsumption). Given two descriptions C and D, C is sub</title>
      <p>sumed by D, denoted by C v D, i for every interpretation I it holds that
CI DI . When C v D and D v C then they are equivalent, denoted with
C D.</p>
      <p>Note that this naturally induces a generality relationship on the space of
concepts.</p>
      <p>Generally subsumption is not assessed in isolation but rather related to the
models of a system of axioms representing the background knowledge and the
world state:
De nition 2.2 (knowledge base). A knowledge base K = hT ; Ai contains
a TBox T and an ABox A. T is the set of terminological axioms of concept
descriptions C D, where C is the concept name and D is its description. A
contains assertions on individuals C(a) and R(a; b).</p>
      <p>An interpretation that satis es all its axioms is a model of the knowledge base.</p>
      <p>Note that de ned concepts should have a unique de nition. However de
ning concepts through inclusions axioms (C v D) is also generally admitted.
Moreover, such de nitions are assumed to be unfoldable.</p>
      <p>Example 2.1 (royal family). This example shows a knowledge base modeling
concepts and roles related to the British royal family:
T = f Male :Female,</p>
      <p>Woman Human u Female,
Man Human u Male,
Mother Woman u 9hasChild.Human,
Father Man u 9hasChild.Human,
Parent Father t Mother,
Grandmother Mother u 9hasChild.Parent,
Mother-w/o-daughter Mother u 8hasChild.:Female,</p>
      <p>Super-mother Mother u 3 hasChild.Human g
A = f Woman(elisabeth), Woman(diana), Man(charles), Man(edward),
Man(andrew), Mother-w/o-daughter(diana),
hasChild(elisabeth, charles), hasChild(elisabeth,edward),
hasChild(elisabeth, andrew), hasChild(diana, william),
hasChild(charles, william) g</p>
      <p>Note that the Open World Assumption (OWA) is made in the underlying
semantics, since normally knowledge representation systems are applied in
situations where one cannot assume that the knowledge in the base is complete.
Although this is quite di erent from the standard settings considered in machine
learning and knowledge discovery, it is convenient for the Semantic Web context
where new resources (e.g. Web pages, Web services) may be continuously made
available.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Inference Services</title>
      <p>The basic inference services for DL knowledge bases can be viewed as
entailments as they amount to verifying whether a generic relationship is a logical
consequence of the knowledge bases axioms.</p>
      <p>
        Typical reasoning tasks for TBoxes regards the satis ability of a concept,
subsumption, equivalence or disjointness between two concepts. For example
in the knowledge base reported in Ex. 2.1, Father is subsumed by Man and is
disjoint with Woman. These inferences can be reduced to each of them, hence
normally reasoners support only one of them. Reasoning is performed though
tableau algorithms [
        <xref ref-type="bibr" rid="ref1 ref9">9, 1</xref>
        ].
      </p>
      <p>Inference problems concerning ABoxes are mainly related to checking its
consistency, that is, whether it has a model, especially w.r.t. those of the TBox.
For example, an ABox like the one in Ex. 2.1 containing also the assertion
Father(elisabeth) is not consistent.</p>
      <p>
        Since we aim at crafting inductive methods that manipulate individuals, a
prototypical inference is instance checking [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], that amounts to checking whether
an assertion is entailed by the knowledge base (K j= ). If is a role assertion
then it is quite easy to perform the inference as normally roles do not have a
de nition in the TBox. However, we will focus on deciding whether an individual
is an instance of a given concept. This can be easily reduced to a consistency
problem: K j= C(a) i K [ f:C(a)g is inconsistent.
      </p>
      <p>The adopted open world semantics has important consequences on the way
queries are answered. While the closed-world semantics identi es a database
with a single model an ABox represents possibly in nitely many interpretations.
Hence a reasoner might be unable to answer certain queries because it may be
actually able to build models for both a positive (K [ fC(a)g) and a negative
(K [ f:C(a)g) answer.</p>
      <p>Example 2.2 (open world semantics).</p>
      <p>Given the ABox is Ex. 2.1, while it is explicitly stated that diana is a
Mother-w/odaughter, it cannot be proven for elisabeth because she may have daughters that
are simply not known in the knowledge base. Analogously, elisabeth is an instance
of Super-mother while this cannot be concluded for diana as only two children
are known in the current knowledge base. Although these instance checks fail
because of the inherent incompleteness of the knowledge state, this does not
imply that the individuals belong to the negated concepts, as could be inferred
in case the CWA were made.</p>
      <p>Another related inference is retrieval which consists in querying the
knowledge base to know the individuals that belong to a given concept:
De nition 2.3 (retrieval). Given an knowledge base K and a concept C, nd
all individuals a such that K j= C(a).</p>
      <p>A straightforward algorithm for a retrieval query can be realized via instance
checking on each individual occurring in the ABox, testing whether it is an
instance of the concept.</p>
      <p>Dually, it may be necessary to nd the (most speci c) concepts which an
individual belongs to. This is called a realization problem. One especially seeks
for the most speci c one (up to equivalence):</p>
      <sec id="sec-4-1">
        <title>De nition 2.4 (most speci c concept). Given an ABox A and an individual</title>
        <p>a, the most speci c concept of a w.r.t. A is the concept C, denoted MSCA(a),
such that A j= C(a) and for any other concept D such that A j= D(a), it holds
that C v D.</p>
        <p>
          This is a typical way to lift individuals to the conceptual level [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ].
        </p>
        <p>
          For some languages, the MSC may not be expressed by a nite description [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ],
yet it may be approximated by a more general concept [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. Generally
approximations up to a certain depth k of nested levels are considered, denoted MSCk.
We will generically indicate a maximal depth approximation with MSC . For
further details see also [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
2.3
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>A Normal Form for ALCQ</title>
        <p>
          Many semantically equivalent (yet syntactically di erent) descriptions can be
given for the same concept. Equivalent concepts can be reduced to a normal
form by means of rewriting rules that preserve their equivalence [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]. We will
adopt a normal form extending one given for ALC descriptions [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ], for concepts
that are already in negation normal normal form.
        </p>
        <p>Preliminarily, some notation is necessary for naming the various nested
subparts (levels) of a concept description D:
{ prim(D) is the set of all the primitive concepts (or their negations) at the
top-level of D;
{ valR(D) is the1 concept in ALCQ normal form in the scope of the value
restriction (if any) at the top-level of D (otherwise valR(D) = &gt;);
{ exR(D) is the set of the descriptions C0 appearing in existential restrictions
9R:C0 at the top-level conjunction of D;
{ minR:C (D) = maxfn 2 IN j D v nR:Cg (always a nite number);
{ maxR:C (D) = minfn 2 IN j D v nR:Cg (if unlimited, maxR:C (D) = 1).</p>
        <p>A normal form may be recursively de ned as follows:</p>
      </sec>
      <sec id="sec-4-3">
        <title>De nition 2.5 (ALCQ normal form). A concept description C is in ALCQ</title>
        <p>normal form i C = ? or C = &gt; or if C = C1 t t Cn with</p>
        <p>l
Ci =</p>
        <p>P u
P 2prim(Ci) R2NR:</p>
        <p>8
l &lt;8R:valR(Ci) u l 9R:E u</p>
        <p>E2exR(Ci)</p>
        <p>l h
C2NC</p>
        <p>i
mR:C R:C u</p>
        <p>9
M Ri:C R:Ci=
;
where miR:C = minR:C (Ci), M Ri:C = maxR:C (Ci) and, for all R 2 NR, valR:C(Ci)
and every sub-description in exR:C(Ci) are, in their turn, in ALCQ normal
form.</p>
        <p>Example 2.3 (ALCQ normal form). The concept description
C (:A1 u A2) t (9R1:B1 u 8R2:(9R3:(:A3 u B2)))
is in normal form, whereas the following is not:
D A1 t B2 u :(A3 u 9R3:B2) t 8R2:B3 u 8R2:(A1 u B3)
where Ai's and Bj 's are primitive concept names and the Rk's are role names.</p>
        <p>
          This normal form can be obtained by means of repeated applications of
equivalence preserving operations, namely replacing de ned concepts with their
de nition as in the TBox and pushing the negation into the nested levels
(negation normal form). This normal form induces an AND-OR tree structure for the
concept descriptions in this language. This structure can be used for comparing
di erent concepts and asses their similarity.
1 A single one because multiple value restrictions can be gathered using the equivalence
8R:C u u 8R:Cn 8R:(C1 u u Cn). Then the nested descriptions can be
transposed in normal form using further rewriting rules (distributiveness, etc.. . . )
[
          <xref ref-type="bibr" rid="ref1">1</xref>
          ].
        </p>
        <p>
          Structural Kernels for ALCQ
in ALCQ normal form:
disjunctive descriptions:
A simple way to de ne a kernel function for concept descriptions in normal form
would require to adapt a tree kernel [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] where similarity between trees depends
on the number of similar subtrees (or paths unraveled from such trees) which
does not fully capture the semantic nature of expressive DLs languages.
        </p>
        <p>The kernel function de nition should not be based only on structural
elements but also (partly) on their semantics, since di erent descriptions may have
similar extensions and vice-versa, especially with expressive DL languages such
as ALCQ.</p>
        <p>Normal form descriptions can be decomposed level-wise into sub-descriptions.
For each level, there are three possibilities: the upper level is dominated by the
disjunction (1) of concepts that, in turn, are made up of a conjunction (2) of
complex or primitive (3) concepts. In the following the de nition of the ALCQ
kernel is reported.</p>
        <p>Y kId (valR(C1); valR(C2))
R2NR</p>
        <p>Y
R2NR CCij12 22 eexxRR((CC12))</p>
        <p>X
kd (Ci1; Cj2)</p>
        <p>I
R2NR C2NC</p>
        <p>Y</p>
        <p>Y kn((minR:C (C1); maxR:C (C1)); (minR:C (C2); maxR:C (C2)))</p>
        <p>I
numeric restrictions:
if min(MC ; MD) &gt; max(mC ; mD)
kn((mC ; MC ); (mD; MD)) =</p>
        <p>I
min(MC ; MD)
max(MC ; MD)
max(mC ; mD) + 1
min(mC ; mD) + 1
2 We use the superscripts k for more clarity.
otherwise kn((mC ; MC ); (mD; MD)) = 0.</p>
        <p>I
primitive concepts:
kp (P1; P2) = kset(P1I ; P2I ) = jP1I \ P2I j</p>
        <p>
          I
where kset is the kernel for set structures de ned in [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. This case includes also
the negation of primitive concepts: (:P )I = I n P I
Preliminarily, the extension of the concepts can be approximated by the
cardinality of their retrieval. Note also that this is a family of kernels parameterized
on an interpretation and on a real number .
        </p>
        <p>The rst kernel function (kd) computes the similarity between disjunctive
descriptions as the sum of the cross-similarities between all couples of disjuncts
from the top-level of either description. The term is employed to lessen the
contribution coming from the similarity of the sub-descriptions (i.e. amount of
indirect similarity between concepts that are related to those at this level) on
the grounds of the level where they occur.</p>
        <p>The conjunctive kernel (kc) computes the similarity between two input
descriptions, distinguishing primitive concepts, those referred in value restrictions
and those referred in existential restrictions. These values are multiplied re
ecting the fact that all the restrictions have to be satis ed at a conjunctive level.</p>
        <p>The similarity of the quali ed numeric restrictions is simply computed (by
kn) as a measure of the overlap between the two intervals. Namely it is the ratio
of the amounts of individuals in the overlapping interval and those the larger
one, whose extremes are minimum and maximum. Note that some intervals may
be unlimited above: max = 1. In this case we may approximate with an upper
limit N greater than j I j + 1.</p>
        <p>
          The similarity between primitive concepts is measured (by kp) in terms of
the intersection of their extension. Since the extension is in principle unknown,
we will epistemically approximate it recurring to the notion of retrieval (see
Def. 2.3). Making the unique names assumption on the names of the individual
occurring in the ABox A, one can consider the canonical interpretation [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] I,
using Ind(A) as its domain ( I := Ind(A)). Note that the ABox may be thought
of as a (partially complete) graph structure where multiple instances are located
accounting for a number of possible worlds.
        </p>
        <p>Besides, the kernel can be normalized as follows: since the kernel for primitive
concepts is essentially a set kernel it may be multiplied by a constant p =
1= I so that the cardinality of the intersection is weighted by the number of
individuals occurring in the overall ABox. Alternatively, another choice could be
parameterized on the primitive concepts of the kernel de nition P = 1=jP1I [P2I j
which would weight the rate of similarity (the extension intersection) measured
by the kernel with the size of the concepts measured in terms of the individuals
belonging to their extensions.</p>
        <p>Discussion. Being partially based on the concept structure and only ultimately
on the extensions of the concepts at the leaves of the tree, it may be objected that
the proposed kernel functions may only roughly capture the semantic similarity
of the concepts. This may be well revealed by the case of input concepts that
are semantically almost equivalent yet structurally di erent. However, it must
be also pointed out that the rewriting process for putting the concepts in normal
form tends to eliminate these di erences. More importantly, the ultimate goal
for de ning a kernel is comparing individuals rather than concepts. This will
be performed recurring to the most speci c concepts of the individuals w.r.t.
the same ABox (see Sect. 4). Hence, it was observed that semantically similar
individuals tend to share the same structures as elicited from the same source.</p>
        <p>
          The validity of a kernel depends on the fact that the function is de nite
positive. Yet validity can be also proven exploiting some closure properties of the
class of kernel functions w.r.t. several operations [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Namely, the multiplication
of a kernel by a constant, the addition or multiplication of kernels yields valid
kernels. Thus one can demonstrate that the functions introduced above are indeed
valid kernels for the given space of hypotheses. Then, exploiting these closure
properties it can be proven that:
Proposition 3.1. Given an interpretation I, the function kI is a valid kernel
for the space X of ALCQ descriptions in normal form.
        </p>
        <p>
          Observe that the core function is the one on primitive concept extensions. It is
essentially a set kernel [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. The kernel functions for top-level conjunctive and
disjunctive descriptions are also positive de nite being essentially based on the
primitive kernel. Descending through the levels there is an interleaving of the
employment of these function up the the basic case of the function for primitive
descriptions.
        </p>
        <p>
          As regards the computational complexity, it is possible to show that the
kernel function can be computed in time O(jN1jjN2j) where jNij, i = 1; 2, is
the number of nodes of the concept AND-OR trees. It can computed by means
of dynamic programming. It is also worthwhile to note that Knowledge Base
Management Systems, especially those dedicated to storing instances [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ],
generally maintain information regarding concepts and instances which may further
speed-up the computation.
4
        </p>
        <sec id="sec-4-3-1">
          <title>Extensions</title>
          <p>It has been objected that the kernels in this familty would not work for concepts
that are equivalent yet syntactically di erent. However, they are not intended for
assessing a concept similarity: ultimately kernel machines employed in inductive
tasks need to be applied to instances described in this representation, therefore
the most important extension is towards the case of individuals.</p>
          <p>Indeed, the kernel function can be extended to the case of individuals a; b 2
Ind(A) by taking into account the approximations of their MSCs (see Sect. 2.2).
In this way, we move from a graph representation like the ABox portion
containing an individual to an intensional tree-structured representation:
kI (a; b) = kI (MSC (a); MSC (b))</p>
          <p>Note that before applying the kernel functions a sort of completion of the
input descriptions is necessary, substituting the de ned concepts with the concept
descriptions corresponding to their de nitions, so to make explicit the relevant
knowledge concerning either individual (example).</p>
          <p>The extension of the kernel function to more expressive DL is not trivial.
DLs allowing normal form concept de nitions can only be considered. Moreover,
for each constructor not included in the ALCQ logic, a kernel de nition has to
be provided.</p>
          <p>Another extension for the kernel function could be made taking into
account the similarity between di erent relationships in a more selective way. This
would amount to considering each couple of existential and value restrictions
with one element from each description (or equivalently from each related
ANDOR tree) and the computing the convolution of the sub-descriptions in the
restriction. As previous suggested for , this should be weighted by a measure of
similarity between the roles measured on the grounds of the available
semantics. We propose therefore the following weight: given two roles R; S 2 NR:
RS = jRI \ SI j=j I I j.</p>
          <p>All of these weighting factors need not to be mere constants. Another
possible extension is considering them as functions of the depth of the nodes to
be compared: : N 7! ]0; 1] (e.g. (n) = 1=n). In this way one may control the
decay of impact of the similarity of related individuals/concepts located ad more
deeply nested levels.</p>
          <p>As suggested before, the intersection could be measured on the grounds of
the relative role extensions with respect to the whole domain of individuals, as
follows:</p>
          <p>RS = jRI \ SI j</p>
          <p>
            jRI [ SI j
It is also worthwhile to recall that some DLs knowledge bases contain also an
R-box [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ] with axioms concerning the roles, one knows beforehand that, for
instance, R v S and compute their similarity consequently.
          </p>
          <p>In order to increase the applicability of precision of the structural kernels and
tackle the DL languages supporting the OWL versions, they should be able to
work with inverse roles and nominals. The former may be easily accommodated
by considering, in the normal form and kernels, a larger set of role names NR =
NR [ fS j 9R 2 NR : S = R g. The latter can be dealt with with a set kernel,
as in the sub-kernel for primitive concepts. Given the set of individual names O1
and O2: ko (O1; O2) = kset(O1I ; O2I ) = jO1I \ O2I j.</p>
          <p>I</p>
          <p>
            Finally, as discussed in [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ], related distance measures can also be derived
from kernel functions which essentially encode a notion of similarity between
concepts and between individuals. This can enable the de nition of various
distance-based methods for these complex representations spanning from
relational clustering [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ] to instance-based methods [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ].
          </p>
        </sec>
        <sec id="sec-4-3-2">
          <title>Conclusions and Outlook</title>
          <p>Kernel functions have been de ned for OWL descriptions which was integrated
with a SVM for inducing a statistical classi er working with the complex
representations. The resulting classi er could be tested on inductive retrieval and
classi cation problems.</p>
          <p>The induced classi er can be exploited for predicting or suggesting missing
information about individuals, thus completing large ontologies. Speci cally, it
can be used to semi-automatize the population of an ABox. Indeed, the new
assertions can be suggested to the knowledge engineer that has only to validate
their inclusion. This constitutes a new approach in the SW context, since the
e ciency of the statistical and numerical approaches and the e ectiveness of a
symbolic representation have been combined.</p>
          <p>The derivation of distance measures from the kernel function may enable a
series of further distance-based data mining techniques such as clustering and
instance-based classi cation. Conversely, new kernel functions can be de ned
transforming newly proposed distance functions for these representations, which
are not language dependent and allow the related data mining methods to better
scale w.r.t. the number of individuals in the ABox.</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nardi</surname>
          </string-name>
          , and P. Patel-Schneider, editors.
          <source>The Description Logic Handbook</source>
          . Cambridge University Press,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bloehdorn</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sure</surname>
          </string-name>
          .
          <article-title>Kernel methods for mining instance data in ontologies</article-title>
          . In K. Aberer et al., editors,
          <source>In Proceedings of the 6th International Semantic Web Conference, ISWC2007</source>
          , volume
          <volume>4825</volume>
          <source>of LNCS</source>
          , pages
          <volume>58</volume>
          {
          <fpage>71</fpage>
          . Springer,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Brandt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ku</surname>
          </string-name>
          <article-title>sters, and</article-title>
          <string-name>
            <given-names>A.-Y.</given-names>
            <surname>Turhan</surname>
          </string-name>
          .
          <article-title>Approximation and di erence in description logics</article-title>
          . In D. Fensel et al., editors,
          <source>Proceedings of the 8th International Conference on Principles of Knowledge Representation and Reasoning, KR02</source>
          , pages
          <fpage>203</fpage>
          {
          <fpage>214</fpage>
          . Morgan Kaufmann,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Buitelaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          , and B. Magnini, editors.
          <source>Ontology Learning from Text: Methods</source>
          ,
          <article-title>Evaluation And Applications</article-title>
          . IOS Press,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>W.</given-names>
            <surname>Cohen</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Hirsh</surname>
          </string-name>
          .
          <article-title>Learning the CLASSIC description logic</article-title>
          . In P. Torasso et al., editors,
          <source>Proceedings of the 4th International Conference on the Principles of Knowledge Representation and Reasoning</source>
          , pages
          <volume>121</volume>
          {
          <fpage>133</fpage>
          . Morgan Kaufmann,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Cumby</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          .
          <article-title>On kernel methods for relational learning</article-title>
          . In T. Fawcett and N.Mishra, editors,
          <source>Proceedings of the 20th International Conference on Machine Learning, ICML2003</source>
          , pages
          <fpage>107</fpage>
          {
          <fpage>114</fpage>
          . AAAI Press,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>C. d'Amato</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Fanizzi</surname>
            , and
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Esposito</surname>
          </string-name>
          .
          <article-title>Query answering and ontology population: An inductive approach</article-title>
          . In S. Bechhofer et al., editors,
          <source>Proceedings of the 5th European Semantic Web Conference, ESWC2008</source>
          , volume
          <volume>5021</volume>
          <source>of LNCS</source>
          , pages
          <volume>288</volume>
          {
          <fpage>302</fpage>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>C. d'Amato</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Staab</surname>
            , and
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Fanizzi</surname>
          </string-name>
          .
          <article-title>On the in uence of description logics ontologies on conceptual similarity</article-title>
          .
          <source>In A. Gangemi and J</source>
          . Euzenat, editors,
          <source>Proceedings of the 16th EKAW Conference, EKAW2008</source>
          , volume
          <volume>5268</volume>
          <source>of LNAI</source>
          , pages
          <volume>48</volume>
          {
          <fpage>63</fpage>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Donini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lenzerini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nardi</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Schaerf</surname>
          </string-name>
          .
          <article-title>Deduction in concept languages: From subsumption to instance checking</article-title>
          .
          <source>Journal of Logic and Computation</source>
          ,
          <volume>4</volume>
          (
          <issue>4</issue>
          ):
          <volume>423</volume>
          {
          <fpage>452</fpage>
          ,
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fanizzi</surname>
          </string-name>
          and
          <string-name>
            <surname>C.</surname>
          </string-name>
          <article-title>d'Amato. A declarative kernel for ALC concept descriptions</article-title>
          . In F. Esposito et al., editors,
          <source>In Proceedings of the 16th International Symposium on Methodologies for Intelligent Systems, ISMIS2006</source>
          , volume
          <volume>4203</volume>
          of Lecture Notes in Computer Science, pages
          <volume>322</volume>
          {
          <fpage>331</fpage>
          . Springer,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fanizzi</surname>
          </string-name>
          , C. d'Amato, and
          <string-name>
            <given-names>F.</given-names>
            <surname>Esposito</surname>
          </string-name>
          .
          <article-title>Randomized metric induction and evolutionary conceptual clustering for semantic knowledge bases</article-title>
          . In M. Silva et al., editors,
          <source>Proceedings of the ACM International Conference on Knowledge Management, CIKM2007</source>
          , pages
          <fpage>51</fpage>
          {
          <fpage>60</fpage>
          , Lisbon, Portugal,
          <year>2007</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fanizzi</surname>
          </string-name>
          , C. d'Amato, and
          <string-name>
            <given-names>F.</given-names>
            <surname>Esposito</surname>
          </string-name>
          .
          <article-title>Learning with kernels in Description Logics</article-title>
          . In F. Zelezny and N. Lavrac, editors,
          <source>Proceedings of the 18th International Conference on Inductive Logic Programming, ILP2008</source>
          , volume
          <volume>5194</volume>
          <source>of LNAI</source>
          , pages
          <volume>210</volume>
          {
          <fpage>225</fpage>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>N.</given-names>
            <surname>Fanizzi</surname>
          </string-name>
          , C. d'Amato, and
          <string-name>
            <given-names>F.</given-names>
            <surname>Esposito</surname>
          </string-name>
          .
          <article-title>Statistical learning for inductive query answering on OWL ontologies</article-title>
          . In A. Sheth et al., editors,
          <source>Proceedings of the 7th International Semantic Web Conference, ISWC2008</source>
          , volume
          <volume>5318</volume>
          <source>of LNCS</source>
          , pages
          <volume>195</volume>
          {
          <fpage>212</fpage>
          . Springer,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Ga</surname>
          </string-name>
          <article-title>rtner</article-title>
          , J. Lloyd, and
          <string-name>
            <given-names>P.</given-names>
            <surname>Flach</surname>
          </string-name>
          .
          <article-title>Kernels and distances for structured data</article-title>
          .
          <source>Machine Learning</source>
          ,
          <volume>57</volume>
          (
          <issue>3</issue>
          ):
          <volume>205</volume>
          {
          <fpage>232</fpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>D.</given-names>
            <surname>Haussler</surname>
          </string-name>
          .
          <article-title>Convolution kernels on discrete structures</article-title>
          .
          <source>Technical Report UCSCCRL-99-10</source>
          , Department of Computer Science, University of California { Santa Cruz,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>I. R.</given-names>
            <surname>Horrocks</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Turi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Bechhofer</surname>
          </string-name>
          .
          <article-title>The instance store: DL reasoning with large numbers of individuals</article-title>
          . In V. Haarslev and R. Moller, editors,
          <source>Proceedings of the 2004 Description Logic Workshop</source>
          ,
          <string-name>
            <surname>DL</surname>
          </string-name>
          <year>2004</year>
          , volume
          <volume>104</volume>
          <source>of CEUR Workshop Proceedings</source>
          , pages
          <volume>31</volume>
          {
          <fpage>40</fpage>
          .
          <string-name>
            <surname>CEUR</surname>
          </string-name>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mantay</surname>
          </string-name>
          .
          <article-title>Commonality-based ABox retrieval</article-title>
          .
          <source>Technical Report FBI-HH-M291/2000</source>
          , Department of Computer Science, University of Hamburg, Germany,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Shawe-Taylor</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Cristianini</surname>
          </string-name>
          .
          <article-title>Kernel Methods for Pattern Analysis</article-title>
          . Cambridge University Press,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>