<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Ontology Alignment Through Instance Negotiation: A Machine Learning Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ignazio Palmisano</string-name>
          <email>ignazio@csc.liv.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Luigi Iannone</string-name>
          <email>luigi@csc.liv.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Domenico Redavid</string-name>
          <email>redavid@di.uniba.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Semeraro</string-name>
          <email>semeraro@di.uniba.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Department, Liverpool University</institution>
          ,
          <addr-line>Ashton Building, Ashton Street, L69 BX Liverpool</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Informatica, Universita` degli Studi di Bari</institution>
          ,
          <addr-line>Via Orabona</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>The Semantic Web needs methodologies to reach actual commitment on shared ontologies among the different actors in play. In this paper, we propose a machine learning approach to this issue, relying on classified instance exchange and inductive reasoning. The approach is based on the idea that, whenever two (or more) software entities need to align their ontologies (which amounts, from the point of view of each entity, to adding one or more new concept definitions to its own ontology), the new concept definitions can be learnt starting from shared individuals, i.e. individuals already described in terms of both ontologies, for which the entities have statements about classes and related properties. These individuals, arranged into two sets of positive and negative examples for the target definition, are used to solve a learning problem whose solution is the definition of the target concept in terms of the ontology used for the learning process. The method has been applied in a preliminary prototype for a small multi-agent scenario (where the two entities cited above are instantiated as two software agents). Following the prototype presentation, we report on the experimental results we obtained and then draw some conclusions.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. MOTIVATION</title>
      <p>
        The Semantic Web (SW), being an evolution of the World
Wide Web, will inherit the Web's decentralized architecture.
Decentralization, in its turn, was one of the success factors of the
Internet, endowing its structure with scalability, the absence of a
single point of failure, and so on. However, in order to implement the SW
scenario envisioned in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], semantic convergence is crucial.
This point has always been identified with the employment of
ontologies that, even before the rise of the SW, were considered
as "shared formalizations of conceptualizations" [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>In the SW vision, different actors that want to benefit
from interoperating should be able to converge on shared
ontologies in order to communicate. This problem has
turned out to be crucial and very difficult to work out. Indeed,
each party involved has its own view of the domain it
shares with the other parties. Very often, different applications are
interested in different aspects of the same world state; e.g. in
a typical B2B scenario, suppliers and final customers
are usually interested in different aspects of the goods that
are dealt with. Both quantitative and, more likely, qualitative
concepts often turn out to be fuzzy, depending on the actors
interested in assessing them in order to interoperate. Suppliers are
likely to consider features like materials, provenance, and standard
processes, whereas customer satisfaction may also depend on
other factors, e.g. warranties, comfort, and support, which may
turn out to be only marginal for the suppliers, unexpressed,
or simply difficult to acquire.</p>
      <p>In such cases, partially overlapping visions of the same
world should be integrated in value chains in order to provide
goods. However, the wide range of possible B2B
scenarios hinders a centralized approach to ontology sharing for
semantic convergence, which contrasts with the architectural
paradigm of the SW. Indeed, such a scenario requires
flexibility, since it basically gives rise to open situations where the
interacting actors cannot be precisely identified in advance.
Hence, a central ontology would have to foresee a high number of
particular situations, and would therefore be detailed enough,
and thus useful, only for restricted cases and toy domains.
Moreover, single application ontologies are expected to vary
over time, an issue that is not easily manageable in centralized
approaches.</p>
      <p>We agree that semantic convergence has to
be addressed at the ontology level, but we argue that
approaches to this issue should fulfil the following
properties in order to be effectively applied in the mentioned scenarios:
• they have to be automatable, so that applications can
integrate with each other without human intervention
(or with as little human intervention as possible);
• they have to be flexible: it should be possible to apply
the same convergence schema to any domain on which
the actors have to communicate their own (partially
overlapping) conceptualizations;
• they have to be on-demand and limited in scope: the aim
is not to fully map the actors' conceptualizations, which is
often impossible or even useless, but to reconcile only
those parts that are necessary for the dialogue.</p>
      <p>Developing platforms that enable SW systems to communicate
with each other without committing to a superimposed
ontology ensures a very loose coupling among them, which allows
for independent evolution and maintenance of the applications.
In our opinion, other SW technologies, spanning from Web
Service discovery, orchestration, and choreography up to
software agents immersed in the SW, could profit from the
solutions that research will find to these issues.</p>
      <p>In this paper, we suggest a Machine Learning approach
to accomplish semantic integration, taking into account the
requirements listed above. The remainder of the paper is
organized as follows: the next section briefly summarizes the
state of the art (to our knowledge) on these problems. Sect. III
illustrates our approach to Ontology Alignment together with
a proposed algorithm. Sect. IV explains in detail the Machine
Learning techniques used to implement the algorithm, while
Sect. V presents the implemented prototype and the preliminary
empirical evaluation carried out with it.
Finally, in Sect. VI, some conclusions are drawn, outlining
future work directions.</p>
    </sec>
    <sec id="sec-2">
      <title>II. RELATED WORK</title>
      <p>
        In this section, we present a short discussion of some recent
relevant research describing different approaches to ontology
alignment. Ontology Alignment is a broad umbrella for a
variety of subtasks that try to solve some of the issues pointed
out in the previous section. To keep the subject simple
and general, we might define alignment as any process that
enables communication among two or more systems adopting
two or more different ontologies. Following Noy [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], there
are (at least) two types of alignment processes, classified
according to their output:
• Ontology Merging, in which the outcome of the
alignment of n ontologies is a single new one, embedding the
knowledge coming from the n sources;
• Ontology Mapping, in which the result of the alignment
among n different ontologies consists of the starting ones
plus the mappings between similar entities of the input
sources.
      </p>
      <p>In the following, we briefly discuss a non-exhaustive list of
some remarkable research approaches pursuing such alignment
strategies.</p>
    </sec>
    <sec id="sec-3">
      <title>A. GLUE.</title>
      <p>
        GLUE [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] exploits Machine Learning techniques to
find semantic mappings between concepts stored in distinct
and autonomous ontologies. Basically, its strategy falls into
the Ontology Merging category, as its output is a mediator
ontology (or database schema, in its earlier version). The
process relies on an initial manual bootstrapping phase, in
which the user provides some sample mappings between some
entities of the ontologies to be merged. GLUE then
tries to infer a set of rules (or classifiers) that synthesizes
the mapping strategy followed by the user for the initial
mappings, and applies what it has learnt to find further
semantic mappings. One of its strong points is the possibility
of using any kind of probabilistic similarity measure (measure
neutrality); furthermore, the weighting of the single
similarity assessments is also learnt by the system. In [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] the
authors underline that many kinds of mappings can be found
by comparing entities from different ontologies;
however, GLUE focuses on finding 1-1 mappings between
concepts belonging to two taxonomies.
      </p>
      <sec id="sec-3-1">
        <title>B. PROMPT Suite.</title>
        <p>
          The PROMPT Suite by Stanford Medical Informatics [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ],
[
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] provides two tools for alignment: iPROMPT and
ANCHORPROMPT. iPROMPT is an interactive tool performing ontology
merging. It starts from initial mappings between two entities
(provided either by users or by an automatic process based
on linguistic similarity) and generates suggestions for further
mappings between other entities. After the user accepts one
of the proposed suggestions (or performs some changes
on the resulting ontology), iPROMPT applies the required
changes, computes the possible cases of inconsistency, trying
to suggest fixes for them, and produces new suggestions for
further mappings.
        </p>
        <p>ANCHORPROMPT, on the contrary, is an ontology mapping
tool. It can be used to support iPROMPT in identifying
new related concepts during the iterations. Unlike
iPROMPT, it also exploits non-local contexts for the concept
pairs whose similarity has to be assessed. Indeed, it
represents ontologies as graphs and compares the paths in which
concepts are involved, both within the taxonomy and in slot
dependencies and domain/range constraints.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>C. APFEL.</title>
      <p>
        APFEL is really a whole framework rather than a single
alignment technique. In [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], Ehrig et al. propose a very
interesting classification of alignment strategies, dividing them
into manually predefined automatic methods and methods that
learn from instance representations. The former are very
complex to design, since all the possible peculiarities of the
domains to which they will be applied must be considered; the latter,
on the contrary, can be used only in the presence of instances (and
sometimes this is not the case), though they show more
flexibility as the underlying ontology domain changes. The proposed
framework (PAM - Parameterizable Alignment Methodology)
is parameterizable to the extent of being able to reproduce
the strategies used in other methodologies by just adjusting
some parameters. APFEL tries to learn, from pre-validated
example mappings, the right parameters that would instantiate
the fittest PAM for the particular alignment problem. Such
parameters are:
• the features, i.e. the smallest set of features on which to
compute similarity among entities;
• the selection criteria for the next pair of entities to compare;
• the similarity measures, aggregation, and interpretation
strategies;
• the iteration, that is, the extent to which neighbour entity pairs
are influenced by the current pair (whose similarity has
been evaluated).
      </p>
    </sec>
    <sec id="sec-5">
      <title>III. ONTOLOGY ALIGNMENT ALGORITHM</title>
      <p>According to Noy’s classification cited in the previous
section, the approach described in this work falls into the
ontology mapping category. Indeed, without loss of
generality, we assume to work on two ontologies dealing with
the same domain. Besides, we also assume that some of the
concepts within the input ontologies may partially overlap
in their semantics. We intentionally do not define formally
what this semantic overlap means, due to the large variety of
practical situations in which it may occur in real-world
applications. We might have concepts defined with different
names, structures, etc. that share exactly the same meaning.
On the other hand, we can have concepts that, though sharing
their name, differ in their actual meaning, and it
could be necessary for an application to understand what its
peer exactly means by such a concept, possibly in terms of
its own ontological primitives.</p>
      <p>A couple of examples can better explain these situations.
The former case can be exemplified as follows. Suppose that
there are two applications dealing with two ontologies about
cars, differing only in language: one uses Italian names
and the other English names. The concept of, say, Wheel is
equivalent to the concept Ruota in the Italian version of the car
ontology. The latter case, instead, may arise when two
ontologies describe a domain from two different
perspectives. Suppose that the ontology FoodRestaurant deals
with food from the typical point of view of a restaurant and
that the ontology FoodNutritionist deals with the same subject
from the side of a dietician. The concept of HighQualityFood
depends on different aspects of food according to the adopted
viewpoint. Indeed, when defining this concept, a dietician
would care about the nutritional values of the ingredients, whereas a
chef would take into account the provenance of the ingredients,
their rarity, and so on. In such a case, we have two legitimate
HighQualityFood concepts; an agent for dinner arrangements
dealing with restaurants and people could be more effective if
it knows both senses.</p>
      <p>The intuition behind our approach is that a useful mapping
is one in which at least one of the two parties can
understand what the other means in terms of its own ontology.
Indeed, if an application is to use information coming from
another system, it should be able to frame this information
in its own knowledge model; hence, in our opinion, it has to
grasp its meaning using primitive concepts and properties
from its own ontology. Though this may seem straightforward,
many approaches limit themselves to finding a correspondence
between equivalent (or very similar) entities, provided that
such an equivalence exists. Yet such approaches disregard the
importance of maintaining different definitions for the same
entity (the viewpoints discussed in the previous example) and
of having one’s own definitions, in order to communicate with
different parties.</p>
      <p>
        Therefore, a way is needed to allow a system to build
a representation of a concept belonging to another ontology
using its own terminology. We argue that the inductive
reasoning techniques of Machine Learning are a suitable way
to accomplish this objective. In particular, we employ the
Concept Learning paradigm [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] to treat this problem.
Indeed, Concept Learning aims at inducing an intensional
description of a concept starting from its examples and
(possibly) counterexamples.
      </p>
      <p>In particular, we intend to implement the following scenario.
Suppose there are two applications A and B, with
respective ontologies OA and OB, and suppose that application
A wants to communicate with B about some concept, say a : C
(we indicate concepts with the usual URI notation,
i.e. namespace:conceptLocalName), that
does not appear in OB. Then, in our scenario, B could
ask A to provide some instances (examples) of a : C and,
possibly, also some counterexamples (i.e. instances of the
complement class of a : C). If such examples occur as
instances within OB, then B, by means of a Concept Learning
algorithm, can infer a definition of a : C in terms of the
descriptions of those instances contained in OB. In this way
B obtains a definition of a : C using its own terminology;
in other words, B understands a : C according
to what it knows (its ontology). In the following section
(Sect. IV) the particular Concept Learning algorithm employed
is described; however, for the purposes of this section the
particular algorithm is irrelevant, since the only requirement is
that it solves a generic Concept Learning problem, described
formally in the following definition:</p>
      <sec id="sec-5-1">
        <title>Concept Learning problem</title>
        <p>Given a knowledge base K and a set of instances, divided into
examples and counterexamples E = E+ ∪ E− and described
in K,
Learn the definition of a concept C such that all elements of
E+ and none of E− are instances of C.</p>
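        <p>For illustration only, the success criterion of this definition can be sketched in a few lines, modelling concepts as Boolean predicates over individuals (the toy knowledge base, names, and data below are invented, not taken from our implementation):</p>

```python
# Illustrative sketch: a concept solves the learning problem iff it
# covers every positive example and no negative one.

def solves_learning_problem(concept, e_pos, e_neg):
    """True iff `concept` covers all of e_pos and none of e_neg."""
    covered = {e for e in e_pos | e_neg if concept(e)}
    return e_pos.issubset(covered) and covered.isdisjoint(e_neg)

# Toy knowledge base: individuals described by sets of atomic properties.
kb = {
    "margherita": {"Food", "Cheap"},
    "truffle_risotto": {"Food", "RareIngredients"},
    "kobe_steak": {"Food", "RareIngredients"},
}

e_pos = {"truffle_risotto", "kobe_steak"}
e_neg = {"margherita"}

# Candidate definition: HighQualityFood == Food AND RareIngredients
candidate = lambda x: {"Food", "RareIngredients"}.issubset(kb[x])
print(solves_learning_problem(candidate, e_pos, e_neg))  # True
```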
        <p>The result of the learning problem can be further revised in
the following way. B identifies a suitable subset of instances
(different from those provided by A as a training set), classifies
them as belonging or not to the learnt concept, and passes them
to A, asking the peer to do the same. If A and B disagree
on the classification of some individuals, B properly refines
the learnt definition in order to fix the
discrepancy with the classification provided by A.</p>
        <p>
          After learning a definition of a : C, let us call it b : C,
supposing there are no clashes with pre-existing names, B
has to compare it with the concepts in OB, evaluating
the similarity of b : C to pre-existing concepts. This can be
accomplished using suitable similarity measures, such as the
one proposed in [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], that score the semantic affinity/diversity of
a pair of concepts within an ontology. If the similarity between
b : C and another concept in OB exceeds a given threshold
or, better, if the equivalence can be proved by a reasoner, then
an equivalence axiom can be added; otherwise the definition
of b : C is left unchanged within OB.
        </p>
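        <p>A minimal sketch of this revision step follows. Since we do not reproduce the measure of [9] here, a simple Jaccard overlap of concept extensions stands in for it, and all names and the threshold value are illustrative assumptions:</p>

```python
# Illustrative sketch only: a Jaccard overlap of concept extensions
# stands in for the similarity measure of [9].

def jaccard(ext_a, ext_b):
    union = ext_a | ext_b
    if not union:
        return 0.0
    return len(ext_a.intersection(ext_b)) / len(union)

THRESHOLD = 0.8  # arbitrary illustrative value

def maybe_add_equivalence(learnt_name, learnt_ext, ontology_exts, axioms):
    """Add an equivalence axiom when a known concept is similar enough."""
    for name, ext in ontology_exts.items():
        sim = jaccard(learnt_ext, ext)
        # sim reaches THRESHOLD exactly when min(sim, THRESHOLD) == THRESHOLD
        if min(sim, THRESHOLD) == THRESHOLD:
            axioms.append((learnt_name, "equivalentClass", name))

axioms = []
maybe_add_equivalence("b:C", {"i1", "i2", "i3"},
                      {"b:Gourmet": {"i1", "i2", "i3"}}, axioms)
print(axioms)  # [('b:C', 'equivalentClass', 'b:Gourmet')]
```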
        <p>Careful readers will have already noticed the assumption
on which the whole process relies: that there exists
a subset of individual URIs in common between OA and OB.
This corresponds to the initial knowledge that, in other systems,
is provided during the bootstrapping phase. To cite only the
related work above, GLUE needs a complete example of mapping
to start the learning phase, whereas iPROMPT relies on
initial suggestions based on linguistic similarity, assuming that
concept names are expressed in their natural language form
(or very close to it). ANCHORPROMPT, on the contrary,
exploits the graph structure, assuming that, for ontologies
partially overlapping on the same domain, the graph structures
should be similar. Our assumption seems at least as reasonable
as these, also because it can be relaxed: instead of having some
identical individual URIs across ontologies, we may have
mechanisms to map some URIs into others (either manually or
automatically).</p>
        <p>Consider, for instance, the ontologies of paper indexing
systems, e.g. DBLP and the ACM Digital Library, and suppose
there exist OWL (or RDF)¹ versions of them. Each system has
automatic strategies for generating identifiers, and there is a
large subset of research works that are indexed by both of
them. It would then be easy to provide a URI translator between
the two systems. Indeed, ACM and DBLP bibliographic
information is used in one of the test cases we devised for
our system; more details are given in Sect. V-C.</p>
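        <p>Such a URI translator can be sketched as follows; the identifier formats and the title-based matching key are purely hypothetical, chosen only to illustrate the idea:</p>

```python
# Hypothetical sketch of a URI translator between two indexing systems;
# the identifier formats below are invented for illustration.

def normalize_title(title):
    return "".join(ch for ch in title.lower() if ch.isalnum())

def build_translator(dblp_entries, acm_entries):
    """Map DBLP URIs to ACM URIs for works indexed by both systems."""
    acm_by_title = {normalize_title(t): uri for uri, t in acm_entries.items()}
    mapping = {}
    for uri, title in dblp_entries.items():
        key = normalize_title(title)
        if key in acm_by_title:
            mapping[uri] = acm_by_title[key]
    return mapping

dblp = {"dblp:conf/x/Palmisano06": "Ontology Alignment Through Instance Negotiation"}
acm = {"acm:1234567": "Ontology alignment through instance negotiation"}
print(build_translator(dblp, acm))
# {'dblp:conf/x/Palmisano06': 'acm:1234567'}
```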
      </sec>
    </sec>
    <sec id="sec-6">
      <title>IV. MACHINE LEARNING IN DESCRIPTION LOGIC</title>
      <p>
        In this section, we describe in more detail the learning
algorithm we developed for solving a Concept Learning
problem in Description Logic. In particular, we focus on concept
descriptions in the ALC [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] Description Logic, as it is the
least expressive subset of OWL-DL allowing for disjunction
and ensuring a reasonable expressiveness without features (such as
role hierarchies) that would make learning intractable.
      </p>
      <p>First we cast the generic Concept Learning problem
reported in Sect. III into the Description Logic setting.</p>
      <p>learning/tuning problem In a search space of concept
definitions (S, ⊑) ordered by subsumption
Given a knowledge base K = ⟨T, A⟩ and a set of positive and
negative assertions AC = AC+ ∪ AC− regarding the membership
(or non-membership) of some individuals to a concept C such
that: A ⊭T AC</p>
      <p>Find a new T-box² T′ = (T \ {C ≡ D}) ∪ {C ≡ D′} such
that: A ⊨T′ AC</p>
      <p>A Concept Tuning (or Revision) problem is similar, but for
the fact that the learner already knows an incorrect definition of
C which has to be revised. This means that there are some
assertions within AC that are not logical consequences of the
knowledge base and the current definition of C. Such a
definition can be incomplete, i.e. there are some individuals
that are said to belong to C in AC but this cannot be entailed,
and/or it can be incorrect, meaning that there are individuals
that are deduced to belong to C from its definition,
while they actually do not belong to it.</p>
      <p>In general, Concept Learning relies on the intuition that
the solution of any learning problem can be found by traversing
a proper search space.</p>
      <p>A learning algorithm, then, is cast as a search that moves
across such a space with two kinds of possible movements,
called refinements:
• from specific towards general (generalization);
• from general towards specific (specialization).</p>
      <p>We provide here the formal (though generic) definitions for
both kinds of refinement.</p>
      <p>downward refinement operator - ρ Let (S, ⊑) be a search
space of concept descriptions ordered by the subsumption
relation ⊑. We define the function ρ : S → 2^S such that
ρ(C) ⊆ {E ∈ S | E ⊑ C}.</p>
      <p>¹ http://www.w3.org/2004/OWL/, http://www.w3.org/RDF/</p>
      <p>² The T-box (terminological box) is the portion of the knowledge base
containing the theory (T), while the A-box (assertional box) is the portion
containing the ground statements.</p>
      <p>upward refinement operator - δ Let (S, ⊑) be a search space
of concept descriptions ordered by the subsumption relation ⊑.
We define the function δ : S → 2^S such that δ(C) ⊆
{E ∈ S | C ⊑ E}.</p>
      <p>Notice that, in general, it does not hold that:</p>
      <p>E ⊑ C → E ∈ ρ(C) or C ⊑ E → E ∈ δ(C)</p>
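        <p>A toy illustration of the two operators (not our actual implementation) over a small, explicitly enumerated subsumption lattice:</p>

```python
# Sketch: refinement operators over a small, explicitly enumerated
# subsumption lattice (concept names are invented for illustration).

# SUBSUMERS[c] lists every concept that strictly subsumes c.
SUBSUMERS = {
    "Thing": set(),
    "Food": {"Thing"},
    "HighQualityFood": {"Food", "Thing"},
}

def subsumed_by(c, d):
    return c == d or d in SUBSUMERS[c]

def rho(c):
    """Downward refinement: a subset of the concepts strictly below c."""
    return {e for e in SUBSUMERS if e != c and subsumed_by(e, c)}

def delta(c):
    """Upward refinement: a subset of the concepts strictly above c."""
    return SUBSUMERS[c]

print(rho("Food"))    # {'HighQualityFood'}
print(delta("Food"))  # {'Thing'}
```

        <p>As stated in the definitions, ρ and δ are only required to return a subset of the subsumees and subsumers; this toy version happens to return all of them.</p>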
      <p>
        Refinement has been thoroughly studied in the literature (see
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]), also specifically for Description Logic knowledge
representations [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. For the sake of brevity we will not
expose here the whole theoretical framework for refinement. It
suffices to say that, unlike many other approaches, we try
to use examples and counterexamples of the concept to be
learnt to bias the search and, hence, the refinement strategies,
rather than using them just for testing intermediate candidate
refinements. Indeed, during learning, as the algorithm searches
for a solution of the problem, it may find various intermediate
hypotheses that satisfy some of the required constraints³ but not all
of them. Hence refinement operators must be applied to fulfil
such constraints. The problem is that at each refinement step
there are many alternative directions that can be chosen.
      </p>
      <p>Examples come into play in this choice, guiding the refinement
process towards the most promising candidates, thus
pruning the search space and, for the sake of efficiency,
sparing the learner backtracking.</p>
      <p>
        Our solution to the learning problem is the algorithm
implemented in the learning system named YINYANG, which is
an evolution of the system described in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In YINYANG,
examples are not processed at the individual description level; rather,
for each of them, a concept-level representative is computed.
      </p>
      <p>This is very frequent in Machine Learning when the example
and hypothesis languages do not coincide, and is called the Single
Representation Trick. Concept-level representatives should
adhere as closely as possible to the examples they stand for.</p>
      <p>In Description Logic there exists a non-standard inference
called the Most Specific Concept (denoted msc) that fulfils
such a requirement but, unfortunately, it need not exist for
many Description Logics (such as ALC). Hence it is
approximated, provided that it remains possible to distinguish
between the concept-level representatives of positive and negative
examples. In other words, there cannot exist a unique concept-level
representative for both a positive and a negative example.</p>
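        <p>The following sketch illustrates one conceivable approximation of a concept-level representative, collecting an individual's asserted types and unfolding its role fillers up to a fixed depth; the assertion format and all names are our own illustrative assumptions, not the msc approximation actually used by YINYANG:</p>

```python
# Sketch of an approximate concept-level representative for an individual
# (a stand-in for the msc approximation; names are illustrative).

ASSERTIONS = {
    "truffle_risotto": {("type", "Food"), ("hasIngredient", "truffle")},
    "truffle": {("type", "RareIngredient")},
}

def representative(individual, depth=1):
    """Collect the individual's types, unfolding role fillers up to `depth`."""
    rep = set()
    for role, filler in ASSERTIONS.get(individual, set()):
        if role == "type":
            rep.add(filler)
        elif depth != 0:
            # an existential restriction approximated by the filler's types
            for t in representative(filler, depth - 1):
                rep.add(("some", role, t))
    return rep

# contains 'Food' and ('some', 'hasIngredient', 'RareIngredient')
print(representative("truffle_risotto"))
```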
      <p>The algorithm starts by choosing a seed, that is, a concept-level
representative of an example. Then, it keeps generalizing it
while there are residual positives to be covered, or until the
generalization covers some negative example (overgeneralization).</p>
      <p>In the former case, the algorithm exits, returning a complete
and correct generalization. In the latter, specialization is
necessary in order to correct the overly general hypothesis (concept
definition).</p>
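        <p>The covering loop just described can be sketched as follows, with representatives modelled as sets of features and generalization as set intersection (a crude stand-in for the actual DL generalization operator; all data are illustrative):</p>

```python
# Sketch of the generalization loop: start from a seed representative and
# generalize to cover further positives, stopping on overgeneralization.

def learn_disjunct(pos_reps, neg_reps):
    """Return a hypothesis and a status flag; set intersection plays the
    role of generalization, subset test the role of coverage."""
    reps = list(pos_reps)
    hyp = set(reps[0])                       # the seed's representative
    for p in reps[1:]:
        candidate = hyp.intersection(p)      # generalize to also cover p
        if any(candidate.issubset(n) for n in neg_reps):
            return hyp, "overgeneral"        # specialization needed
        hyp = candidate
    return hyp, "complete_and_correct"

pos = [{"Food", "RareIngredients", "Expensive"}, {"Food", "RareIngredients"}]
neg = [{"Food", "Cheap"}]
hyp, status = learn_disjunct(pos, neg)
print(sorted(hyp), status)  # ['Food', 'RareIngredients'] complete_and_correct
```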
      <p>³ An intermediate hypothesis covers a subset of the examples and
excludes part of the counterexamples, but some examples may remain
uncovered and/or some counterexamples may remain covered.</p>
      <p>
        Specialization can take place in two different ways. The first
one is based on the notion of counterfactuals [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. The idea
behind is that if we have a concept that covers some negative
examples one can single out the parts of such definitions that
are responsible for the coverage and rule them out by means
of negation. There is a well known non-standard inference
in DLs called concept difference or residual [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] used for
the aforementioned identifications of part of definitions which
are responsible for the incorrect coverage (also known as
blame assignment). For eliminating in one single negation
all the different parts that are responsible for the coverage
of each negative, a generalization of the blamed parts is
needed in order to obtain a single description to be negated
(i.e. a counterfactual). Thus, again generalization is needed,
for the negative residuals with respect to the overly general
hypothesis. Such negative residuals, will now become positive
examples in a new nested learning problem and the solution
of such problem will be our counterfactual. Since its negation
has to be conjoined to the starting incorrect hypothesis in
order to specialize, one do not want to incur in the dual error,
i.e. overspecialization. Indeed, the refined resulting concept
definition must keep covering the positive examples it covered
before specialization. Hence the counterfactual should not
cover the residual of covered positive example representatives
w.r.t. the covering hypothesis to specialize. This means that
they represent negative examples for the new nested learning Fig. 1. SELA Alignment Protocol
problem solved by the recursive call.
      </p>
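        <p>The counterfactual step can be sketched as follows, with set difference standing in for the concept difference of [16] and a trivial nested solution replacing the recursive call; all names and data are illustrative:</p>

```python
# Sketch of specialization by counterfactuals: single out the parts of the
# covered negatives' representatives responsible for the wrong coverage
# (the residuals) and rule them out by negation.

def residual(rep, hypothesis):
    """What the representative says beyond the (overly general) hypothesis."""
    return frozenset(rep - hypothesis)

def counterfactual(hypothesis, covered_neg_reps, covered_pos_reps):
    # residuals of the covered negatives become positives of a nested problem
    nested_pos = {residual(n, hypothesis) for n in covered_neg_reps}
    nested_neg = {residual(p, hypothesis) for p in covered_pos_reps}
    # trivial nested solution: the common part of the negatives' residuals
    solution = frozenset.intersection(*nested_pos)
    if any(solution.issubset(r) for r in nested_neg):
        return None                  # would also exclude positives: failure
    return ("not", solution)         # to be conjoined to the hypothesis

hyp = {"Food"}
neg_covered = [{"Food", "Cheap"}]
pos_covered = [{"Food", "RareIngredients"}]
print(counterfactual(hyp, neg_covered, pos_covered))
# ('not', frozenset({'Cheap'}))
```

        <p>A None result here corresponds to the failure case in which the Add Conjunct strategy takes over.</p>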
      <p>
        Specialization by means of counterfactuals can be proved
not to be complete in the sense that it sometimes fails agents that have their own knowledge bases and wish to
in finding a correct counterfactual for properly specializing dialogue with each other. In such environments, it is very likely
overly general hypotheses. That is why another specialization that there is no common ontology to commit to; this can offer
strategy was introduced, named Add Conjunct, which is plenty of possible scenarios in which ontology alignment can
applied whenever the Counterfactual routine fails. Such a be evaluated.
strategy aims at finding a conjunct per positive example that
does not yet appear in the hypothesis to be specialized, nor in
any of the negatives that are currently covered. In order to
minimize the number of conjuncts, each positive provides such
a conjunct only if no conjunct already provided by some other
positive covers it. After such conjuncts have been computed,
they are generalized into their least common
subsumer (lcs) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and this lcs is conjoined to the overly
general hypothesis. Since the requirements for a conjunct to
be chosen for each positive are very tight, it may sometimes
happen that this specialization strategy fails, as the
counterfactual strategy did before it. In such cases, the overly general
hypothesis is simply replaced by the lcs of its covered positive
example representatives, this lcs is added as a disjunct to the
main generalization, and the algorithm chooses another seed,
starting the inner loop again.
      </p>
      <p>V. PROTOTYPE DESCRIPTION AND PRELIMINARY EVALUATION</p>
      <p>We developed a preliminary prototype that implements the
methodology in Sect. III. We thought that one of the most
suitable settings could be a Multi-Agent System: indeed, it
seems natural to think of communicating actors as autonomous
agents. Here we present the prototype in terms of its architecture
and we propose an evaluation on some simple alignment
problems that are, in our opinion, characteristic of
situations that can arise in such settings. We conducted these
simple tests in order to single out the direction in which
a mature system should move, as well as the practical issues
that arise. These tests are therefore not to be considered a
thorough empirical evaluation, which will follow once robust
solutions for the issues described below have been devised.</p>
      <p>Our prototype is called SELA (Self Explaining and Learning
Agents) and consists of a software agent (namely
SELAAgent) that has two main tasks: it can either explain
concepts, providing examples and validating classifications,
or it can learn concepts from examples acquired by some
other SELAAgent and ask for validation of what it learned.
The agent has been developed within the framework provided
by the Java Agent Development Environment (JADE,
http://jade.tilab.com) and the alignment methodology has
been designed as a protocol, as sketched in Fig. 1.</p>
      We can describe it briefly as follows. A SELAAgent A (the
teacher) initiates a conversation with another SELAAgent B
(the learner), asking whether B knows the concept a : C, which
is what A wants to speak about. B can confirm it knows a : C
or disconfirm, stating it does not. In case of disconfirmation
A provides some individual names that are instances of a : C
and some names of counterexamples of a : C (individuals in
the extension of the complement of a : C). After receiving the
names of the instances (both examples and counterexamples)
B takes into account only the subset of individuals it owns
in its ontology. Then B starts learning as described in the
previous Section. Notice that instance descriptions are not
provided by A; they are built from the assertions that
B has about such individuals. When learning terminates, B
has a definition of a : C in terms of its ontology vocabulary.</p>
      <p>As we shall see in the following, this need not correspond exactly
to the definition of a : C that is actually stored
in the ontology of A. B then has somehow to make sure that
its idea of a : C corresponds to that of A. This can be
accomplished, to some extent, by means of the cyclic remainder
of the protocol. B chooses some individuals and classifies
them using the definition of a : C it has learned. Then
it sends the classification results to A, which validates them. A
answers with the validation results, and B can then fine-tune (see
Def. IV) its definition of a : C. The loop stops when the number
of discordances between B's classifications and A's validations
falls below a given threshold.</p>
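      <p>The negotiation loop just described can be sketched as follows. This is a toy, closed-world simulation: the Teacher and Learner classes, their method names, and the trivial "learning" step are our illustrative assumptions, not the actual JADE-based SELAAgent implementation.</p>

```python
# Illustrative sketch of the SELA teach/learn protocol loop; all class
# and method names here are simplifications, not the JADE implementation.

class Teacher:
    """Agent A: knows the extension of a:C and validates B's labels."""
    def __init__(self, extension, universe):
        self.extension, self.universe = set(extension), set(universe)

    def sample_instances(self):
        # Names of examples and counterexamples of a:C.
        return sorted(self.extension), sorted(self.universe - self.extension)

    def validate(self, labels):
        # A answers each classification with the correct verdict.
        return {name: name in self.extension for name in labels}

class Learner:
    """Agent B: learns a:C over the names it owns, then fine-tunes."""
    def __init__(self, owned):
        self.owned, self.hypothesis = set(owned), set()

    def learn(self, pos, neg):
        # Toy 'learning': deliberately start from a partial hypothesis,
        # so the validation loop has something left to correct.
        self.hypothesis = set(pos[:1])

    def classify(self, batch):
        return {name: name in self.hypothesis for name in batch}

    def fine_tune(self, verdicts):
        for name, member in verdicts.items():
            if member:
                self.hypothesis.add(name)
            else:
                self.hypothesis.discard(name)

def negotiate(teacher, learner, threshold=0.0, max_rounds=5):
    pos, neg = teacher.sample_instances()
    # B keeps only the individuals it owns in its own ontology.
    learner.learn([n for n in pos if n in learner.owned],
                  [n for n in neg if n in learner.owned])
    for _ in range(max_rounds):
        batch = sorted(learner.owned)
        labels = learner.classify(batch)
        verdicts = teacher.validate(labels)
        wrong = [n for n in batch if labels[n] != verdicts[n]]
        if len(wrong) > threshold * len(batch):
            learner.fine_tune(verdicts)   # discordance still too high
        else:
            break                         # agreement reached: stop
    return learner.hypothesis
```

      <p>The loop terminates exactly as in the protocol: B repeatedly classifies, A validates, and B fine-tunes until the fraction of discordances drops to the threshold.</p>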
      <sec id="sec-6-1">
        <title>A. Artificial Ontologies</title>
        <p>We prepared two sample pairs of ontologies. The first pair
is made of two ontologies that are structurally identical except for
concept and property names. In fact, we prepared an ontology
in the academic domain with an English nomenclature and
then its Italian translation (see Fig. 2 for its layout in
Protégé).</p>
        <p>The second pair presents a different situation. We started
from a common ontology in the domain of fancy restaurants
that deals with food, meals, and courses. Actually, it is a
simplified version of the well-known food ontology sample on the
W3C website.4 From it we derived two ontologies, both containing
a concept called HighQuality, defined, however, in
two different namespaces (in order to represent two different
points of view on it). One deals with HighQuality from the
perspective of a particular restaurant, defining it as:
HighQuality ≡ Meal ⊓ ((∃hasCourse.Starter ⊓
∃hasCourse.MainCourse ⊓ ∃hasCourse.Dessert) ⊔
∃hasCourse.(∃hasDrink.Alcoholic))
4http://www.w3.org/TR/2002/WD-owl-guide-20021104/food.owl</p>
        <p>The other one defines its notion of HighQuality from the
hypothetical perspective of a dietician as:
HighQuality ≡ Meal ⊓
(∀hasCourse.(∀hasDrink.(¬Alcoholic) ⊓
∃hasFood.(LowCaloric ⊔ NegligibleCaloric ⊔
ModerateCaloric)))</p>
        <p>It is obvious that there are two different alignment purposes
and situations. Indeed, in the former case, the process should detect
that there is a complete correspondence between the entities of
the two ontologies. In the latter, there is no 1-1 mapping
between the two conceptualizations of HighQuality, but it could
be interesting if each party (restaurant owner and dietician)
could build up, in their respective knowledge bases, the point
of view of their counterpart during the dialogue.</p>
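        <p>The mismatch between the two views can be made concrete with a closed-world, purely illustrative encoding (plain Python data rather than DL reasoning); the dict-based meal representation below is our own assumption, not the actual OWL model.</p>

```python
# Closed-world toy check of the two HighQuality views; the dict-based
# meal encoding is an illustrative assumption, not the OWL model.

LIGHT = {"LowCaloric", "NegligibleCaloric", "ModerateCaloric"}

def restaurant_high_quality(meal):
    # A meal with a full Starter/MainCourse/Dessert menu, or with
    # some course that has an alcoholic drink.
    kinds = {course["kind"] for course in meal["courses"]}
    full_menu = {"Starter", "MainCourse", "Dessert"}.issubset(kinds)
    has_alcohol = any("Alcoholic" in course.get("drinks", ())
                      for course in meal["courses"])
    return full_menu or has_alcohol

def dietician_high_quality(meal):
    # Every course has only non-alcoholic drinks and some food that is
    # low-, negligible-, or moderate-caloric.
    return all("Alcoholic" not in course.get("drinks", ())
               and any(food in LIGHT for food in course.get("foods", ()))
               for course in meal["courses"])

boozy = {"courses": [{"kind": "MainCourse", "drinks": ["Alcoholic"],
                      "foods": ["HighCaloric"]}]}
light = {"courses": [{"kind": "MainCourse", "drinks": ["Water"],
                      "foods": ["LowCaloric"]}]}
```

        <p>Here the boozy meal satisfies the restaurant's definition but not the dietician's, while the light meal does the opposite: neither definition subsumes the other, which is why no 1-1 mapping exists.</p>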
        <p>As concerns the mapping between the two academic
ontologies, we report briefly that for each concept the learning
SELAAgent was able to build a definition that was
equivalent to the corresponding translated concept in its own
ontology, even when only a few instances were exchanged. Such
a good result obviously depends on the structural similarity
of the ontologies: the individuals shared by the
two ontologies have the same structure except for the names of
the relations; moreover, the A-box contains all the relevant
information about the individuals, which in general may not be
the case (see Sect. V-C for further details).</p>
        <p>
          In the restaurant domain the results were not as satisfactory
as in the previous case. Consider the more likely scenario, in
which the agent with the restaurant owner's version of
HighQuality tries to learn the dietician's concept of HighQuality.
It learns a definition that is more specific than the desired one,
though it never required tuning in any of the iterations we ran
through. This is likely due to the limited number
and, especially, the limited variety of examples. It is indeed well
known in inductive learning that too few examples, not
sufficiently different from each other, may yield a phenomenon called
overfitting. Overfitting occurs when the learned definition
adheres too closely to the training set used for building it and
hence suffers from poor predictive accuracy when classifying future
unseen examples. Future experiments will also aim to devise
suitable strategies for choosing examples and, above all,
counterexamples. As a matter of fact, a careful selection of
counterexamples could improve the effectiveness of learning.
Winston [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] claimed the usefulness of near-miss
counterexamples, that is, instances that are very close to the concept to be
learnt but do not belong to its extension. In our case, we could
exploit the taxonomy to identify the most likely
near-misses: e.g., the instances of the superconcepts of the concept
to be learned (or those of its sibling concepts) that do not
belong to it.
        </p>
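        <p>As a sketch of this idea, near-misses can be drawn from the superconcepts and siblings of the target concept; the toy taxonomy and instance tables below are illustrative assumptions, not data from our test ontologies.</p>

```python
# Sketch of taxonomy-based near-miss selection; the toy taxonomy and
# instance tables are illustrative assumptions.

def near_misses(target, parents, instances):
    """Instances of target's superconcepts or sibling concepts that are
    not instances of target itself."""
    target_ext = instances.get(target, set())
    supers = parents.get(target, set())
    # Siblings share at least one direct superconcept with the target.
    siblings = {c for c, ps in parents.items()
                if ps & supers and c != target}
    candidates = set()
    for concept in supers | siblings:
        candidates |= instances.get(concept, set())
    return candidates - target_ext

parents = {"Starter": {"Course"}, "Dessert": {"Course"}, "Course": set()}
instances = {"Course": {"soup", "cake", "salad"},
             "Starter": {"soup", "salad"},
             "Dessert": {"cake"}}
```

        <p>With this toy taxonomy, the near-misses for Starter are the Course and Dessert instances outside Starter's extension, i.e. the cake: close to the target concept, yet outside it.</p>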
      </sec>
      <sec id="sec-6-2">
        <title>B. Ontology Alignment Evaluation Initiative Ontologies</title>
        <p>Our second test is based on two ontologies taken from the
Ontology Alignment Evaluation Initiative test set for 2005,
namely onto101 and onto2065; onto101 is the reference
ontology for the contest, which consists of a set of ontologies
that are modifications of onto101 (e.g., hierarchy flattening,
deletion of concepts, translation of names, URIs, comments,
or deletion of comments). In particular, onto206 is the
French translation of onto101, where all local names of
the defined classes and properties have been translated from
English into French. These ontologies consist of 39 named
classes, 24 properties and over 70 individuals, of which about
50 are identified by a URI; these individuals have the same
URI in both onto101 and onto206, which enables us
to apply our algorithm to them. In particular, we assigned
onto101 to one of our agents (the teacher) and onto206 to
the other agent; then we made the teacher agent try to teach
the definition of the http://oaei.inrialpes.fr/
2005/benchmarks/101/onto.rdf\#Institution6
concept. However, the resulting definition:</p>
        <p>∃adresse.Adresse ⊓ Éditeur
is poor w.r.t. the original definition (Fig. 3) and, moreover, it is
overfitting (being an Éditeur is not necessary for being an Institution).
While the second issue only depends on the available
individuals (only two individuals of class Éditeur were available in
the ontology), the first issue depends on the targeted DL, since
the expressivity of the ontologies we use here is greater than
that of the language our learner uses (specifically, cardinality
restrictions are not handled). This example clearly shows
that at least cardinality restrictions should be added to the
representation language of the algorithm in order for it to be
useful in real world scenarios.</p>
        <p>Fig. 3. inria206:Institution Definition</p>
      </sec>
      <sec id="sec-6-3">
        <title>C. ACM and DBLP Test Case</title>
        <p>Our last test case has been built from the ACM SIGMOD
dataset7 and from the DBLP RDF dump8; the ACM dataset is
an XML file with an associated DTD. In order to translate it into
OWL, we designed a very small ontology reflecting the DTD,
and then produced the OWL ontology we needed.</p>
        <p>In order to find common individuals, we analyzed the
available data and found that a good way to identify matching
individuals in this setting is to look at the paper titles;
this enabled us to choose a subset of the DBLP individuals
(amounting to roughly 700 individuals) that were described
both in terms of the DBLP ontology and of our homemade
ACM SIGMOD ontology (both sketched in Fig. 4)9. The
learning problem here consisted in finding the definition for the
concept Article in the ACM SIGMOD ontology. The definition
we obtain for Article in the original ACM SIGMOD ontology
is:</p>
        <p>sigmod:Article ⊓ ∃sigmod:author.⊤
while the definition w.r.t. the DBLP ontology is:</p>
        <p>dblp:Article ⊓ ∃dblp:url.⊤ ⊓ ∃dblp:ee.⊤</p>
        <p>The definitions appear to be very short; in fact, this depends
on the two chosen ontologies being very small, and having
mainly datatype properties, which are not useful in the learning
algorithm.</p>
        <p>At the time of writing, no quantitative tests have been run
on this dataset; the reason is that, since all the individuals in
DBLP and ACM SIGMOD are almost identical from the structural
point of view, a very small number of examples is enough to
converge on the presented definitions. These definitions score
100% precision on the presented datasets, but the reason for
this high performance lies in the regularity of the dataset (the
actual number of examples used in the tests is 10 positive
examples and 5 negative examples out of 700 available
individuals). Also, the fact that so many properties in real ontologies
are defined as datatype properties, even when they in fact represent
entities (e.g. dblp:author is a datatype property, while usually
the author of a paper is a person, and so is typically modeled
as an individual), suggests that building more structured
datasets is necessary before attempting a complete empirical
evaluation of the system.
5Full test set at http://oaei.ontologymatching.org/2005/; as
a side note, it is necessary to make some corrections to the ontologies, since
they contain some small errors: the literals for year, month and day should be
typed with the corresponding XSD types, but they are plain literals in both
ontologies, and the DIG reasoner we use (Pellet, http://www.mindswap.
org/2003/pellet/) finds the ontology inconsistent.
6Namespaces will be shortened in the following.
7Available at http://www.cs.washington.edu/research/
xmldatasets/www/repository.html
8Available at http://www.semanticweb.org/library/
9It is worth noting that the DBLP RDF dump needed a little data cleaning,
since it was not completely conformant RDF; in particular, many URIs in
the data contained spaces, which needed handling before being submitted to the
reasoner.</p>
        <p>From the computational complexity point of view, it is well
known that OWL DL reasoning algorithms have exponential
worst-case complexity10; however, most real world ontologies
do not exploit all DL constructors, and therefore reasoning
with these ontologies can be done in reasonable time. We
conducted a very preliminary test on the DBLP dataset presented above,
in which we sliced the available examples into 10 disjoint sets and
ran the learning algorithm on each separately; each learning problem
was made up of 50 positive examples (randomly chosen) and
5 negative examples, and the elapsed time for building the
definition was around one minute for each of the ten trials.
A last trial was made using all individuals at once, so that
there were 500 positive examples and 50 negative examples;
in this case, the required time was 80 seconds. So far, then, the
number of examples does not seem to hamper the practical use of
the algorithm.11</p>
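        <p>The slicing used in this timing test can be sketched as follows; the function name, the shuffling, and the synthetic example names are illustrative assumptions, not our actual test harness.</p>

```python
# Sketch of slicing the available examples into disjoint learning
# problems, as in the timing test (10 slices of 50 positives and
# 5 negatives each); names and sizes here are illustrative.
import random

def disjoint_slices(positives, negatives, n_slices, n_pos, n_neg, seed=0):
    rng = random.Random(seed)
    pos, neg = list(positives), list(negatives)
    rng.shuffle(pos)          # random choice of examples per slice
    rng.shuffle(neg)
    return [(pos[k * n_pos:(k + 1) * n_pos],
             neg[k * n_neg:(k + 1) * n_neg])
            for k in range(n_slices)]

# Ten disjoint problems of 50 positives and 5 negatives each.
problems = disjoint_slices([f"p{i}" for i in range(500)],
                           [f"n{i}" for i in range(50)], 10, 50, 5)
```

        <p>Each slice then becomes an independent learning problem, so per-problem running times can be compared with the single run over all 500 positives and 50 negatives.</p>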
      </sec>
    </sec>
    <sec id="sec-7">
      <title>VI. CONCLUSIONS AND FUTURE WORK</title>
      <p>Providing solutions to the ontology alignment problem is one
of the most compelling issues in the roadmap to realizing the
Semantic Web as a real-world infrastructure. We recalled the
motivations for investigating this subject and underlined
our own viewpoint on alignment. In this vision, we prefer
to stray slightly from the widespread idea of a rigid,
centralized (top-level) ontology and to adopt on-demand partial
mappings among the ontologies owned by the parties in play. We
have proposed a concept learning algorithm to accomplish this,
which works under assumptions justified throughout the paper.
We have illustrated the prototype implementing these ideas,
together with some preliminary results on limited datasets.</p>
      <p>We plan to improve our work along the following directions.
First, we should evaluate suitable concept similarity measures
for assessing the closeness of learned concept definitions to
the pre-existing conceptualizations in the ontologies to be
aligned. We will also investigate methodologies for aligning
properties, and not only concepts, using techniques that are
very similar to those employed in tools like GLUE (see
Sect. II). Last, but not least, we will evaluate the impact of
different strategies for example selection on the quality
(effectiveness and/or efficiency) of the induced definitions and,
therefore, of the computed alignments themselves.
10http://www.cs.man.ac.uk/~ezolin/logic/complexity.html
11Note that this last test was designed as a ten-fold cross validation
experiment; however, the results seem too influenced by the specific
dataset to be taken as a measure of the performance of the algorithm
in terms of precision and recall, which is why they are not presented
here in detail.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          , and
          <string-name>
            <given-names>O.</given-names>
            <surname>Lassila</surname>
          </string-name>
          , “The Semantic Web,” Scientific American, May
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Gruber</surname>
          </string-name>
          , “
          <article-title>A translation approach to portable ontology specifications</article-title>
          ,”
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Noy</surname>
          </string-name>
          , “
          <article-title>Tools for mapping and merging ontologies.” in Handbook on Ontologies, ser</article-title>
          .
          <source>International Handbooks on Information Systems, S. Staab and R</source>
          . Studer, Eds. Springer,
          <year>2004</year>
          , pp.
          <fpage>365</fpage>
          -
          <lpage>384</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Doan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Madhavan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Domingos</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Halevy</surname>
          </string-name>
          , “
          <article-title>Learning to map between ontologies on the semantic web</article-title>
          .”
          in WWW
          ,
          <year>2002</year>
          , pp.
          <fpage>662</fpage>
          -
          <lpage>673</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Doan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Domingos</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Halevy</surname>
          </string-name>
          , “
          <article-title>Learning to match the schemas of data sources: A multistrategy approach</article-title>
          .”
          <source>Machine Learning</source>
          , vol.
          <volume>50</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>279</fpage>
          -
          <lpage>301</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>N. F.</given-names>
            <surname>Noy</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Musen</surname>
          </string-name>
          , “
          <article-title>The prompt suite: interactive tools for ontology merging</article-title>
          and mapping,”
          <source>International Journal of HumanComputer Studies</source>
          , vol.
          <volume>59</volume>
          , pp.
          <fpage>983</fpage>
          -
          <lpage>1024</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ehrig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sure</surname>
          </string-name>
          , “
          <article-title>Bootstrapping ontology alignment methods with apfel</article-title>
          .” in International Semantic Web Conference, ser. Lecture Notes in Computer Science,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Gil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Motta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. R.</given-names>
            <surname>Benjamins</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Musen</surname>
          </string-name>
          , Eds., vol.
          <volume>3729</volume>
          . Springer,
          <year>2005</year>
          , pp.
          <fpage>186</fpage>
          -
          <lpage>200</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Mitchell</surname>
          </string-name>
          , Machine Learning. New York:
          <string-name>
            <surname>McGraw-Hill</surname>
          </string-name>
          ,
          <year>1997</year>
          , ch. Concept Learning and General to Specific Ordering, pp.
          <fpage>20</fpage>
          -
          <lpage>51</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>dAmato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Fanizzi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Esposito</surname>
          </string-name>
          , “
          <article-title>A semantic similarity measure for expressive description logics</article-title>
          ,”
          <source>in Proceedings of CILC</source>
          <year>2005</year>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>F.</given-names>
            <surname>Baader</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Calvanese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>McGuinness</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Nardi</surname>
          </string-name>
          , and P. PatelSchneider, Eds.,
          <source>The Description Logic Handbook</source>
          . Cambridge University Press,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P. R. J. van der</given-names>
            <surname>Laag</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.-H.</given-names>
            <surname>Nienhuys-Cheng</surname>
          </string-name>
          , “
          <article-title>Existence and nonexistence of complete refinement operators.” in ECML, ser</article-title>
          . Lecture Notes in Computer Science,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bergadano</surname>
          </string-name>
          and
          <string-name>
            <given-names>L. D.</given-names>
            <surname>Raedt</surname>
          </string-name>
          , Eds., vol.
          <volume>784</volume>
          . Springer,
          <year>1994</year>
          , pp.
          <fpage>307</fpage>
          -
          <lpage>322</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Nienhuys-Cheng and R. de Wolf</surname>
          </string-name>
          ,
          <article-title>Foundations of Inductive Logic Programming, ser</article-title>
          .
          <source>LNAI</source>
          . Springer,
          <year>1997</year>
          , vol.
          <volume>1228</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>L.</given-names>
            <surname>Badea</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.-H.</given-names>
            <surname>Nienhuys-Cheng</surname>
          </string-name>
          ,
          <article-title>“A refinement operator for description logics</article-title>
          ,”
          <source>in Proceedings of the 10th International Conference on Inductive Logic Programming</source>
          , ser. LNAI,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cussens</surname>
          </string-name>
          and
          <string-name>
            <surname>A</surname>
          </string-name>
          . Frisch, Eds., vol.
          <source>1866</source>
          . Springer,
          <year>2000</year>
          , pp.
          <fpage>40</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>F.</given-names>
            <surname>Esposito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Fanizzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Iannone</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Palmisano</surname>
          </string-name>
          , and G. Semeraro, “
          <article-title>Knowledge-intensive induction of terminologies from metadata,” in Proceedings of the 3rd International Semantic Web Conference, ISWC2004, ser</article-title>
          . LNCS,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>McIlraith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Plexousakis</surname>
          </string-name>
          , and F. van Harmelen, Eds., vol.
          <volume>3298</volume>
          . Springer,
          <year>2004</year>
          , pp.
          <fpage>411</fpage>
          -
          <lpage>426</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Vere</surname>
          </string-name>
          , “
          <article-title>Multilevel counterfactuals for generalizations of relational concepts and productions</article-title>
          ,
          <source>” Artificial Intelligence</source>
          , vol.
          <volume>14</volume>
          , pp.
          <fpage>139</fpage>
          -
          <lpage>164</lpage>
          ,
          <year>1980</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>G.</given-names>
            <surname>Teege</surname>
          </string-name>
          , “
          <article-title>A subtraction operation for description logics</article-title>
          ,”
          <source>in Proceedings of the 4th International Conference on Principles of Knowledge Representation and Reasoning</source>
          , P. Torasso,
          <string-name>
            <given-names>J.</given-names>
            <surname>Doyle</surname>
          </string-name>
          , and E. Sandewall, Eds. Morgan Kaufmann,
          <year>1994</year>
          , pp.
          <fpage>540</fpage>
          -
          <lpage>550</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>P.</given-names>
            <surname>Winston</surname>
          </string-name>
          ,
          <article-title>Learning Structural Descriptions from Examples</article-title>
          ,
          <source>Ph.D. dissertation</source>
          , MIT,
          <year>1970</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>