<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Multimedia Information Extraction in Ontology-based Semantic Annotation of Product Catalogues</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Roberto Bartolini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emiliano Giovannetti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simone Marchi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simonetta Montemagni</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudio Andreatta</string-name>
          <email>andreatta@itc.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roberto Brunelli</string-name>
          <email>brunelli@itc.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rodolfo Stecher</string-name>
          <email>stecher@l3s.de</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paolo Bouquet</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Istituto di Linguistica Computazionale (ILC-CNR)</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ITC-irst</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Fraunhofer IPSI</institution>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>DIT-University of Trento</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The demand for efficient methods for extracting knowledge from multimedia content has led to a growing research community investigating the convergence of multimedia and knowledge technologies. In this paper we describe a methodology for extracting multimedia information from product catalogues empowered by the synergetic use and extension of a domain ontology. The methodology was implemented in the Trade Fair Advanced Semantic Annotation Pipeline of the VIKE-framework.</p>
      </abstract>
      <kwd-group>
        <kwd>Semantic Web Technologies</kwd>
        <kwd>Ontology Creation</kwd>
        <kwd>Ontology Extraction</kwd>
        <kwd>Ontology Evolution</kwd>
        <kwd>Semantic Annotation of Multimedia Content</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>I. INTRODUCTION</title>
      <p>Effective acquisition, organization, processing, use and
sharing of the knowledge embedded in textual and multimedia content
play a major role for competitiveness in the modern information
society and for the emerging knowledge economy. However, the wealth of
knowledge implicitly conveyed in the vast amount of available digital
content is nowadays accessible only if considerable manual effort has
been invested in its interpretation and semantic annotation, which is
feasible for only a small fraction of the available content.</p>
      <p>The field of semi-automatic information extraction from
multimedia corpora is central for overcoming the so-called
“knowledge acquisition bottleneck”. Multimedia sources of
information, such as product catalogues, contain text
(captions) and images (pictures of the products) thus requiring
information extraction approaches combining several different
techniques, ranging from Natural Language Processing to
Image Analysis and Understanding. In our approach we have
three main aspects to consider: 1) the information extraction
per se, 2) the ontology, its use and creation, and 3) the usage
of the ontology in the information extraction process and the
synergy between different kinds of extraction processes.</p>
      <p>The development of adequate ontologies is itself one of the
knowledge acquisition bottlenecks: the use of (semi-)automatic tools
for semantic information extraction from multimedia corpora is very
promising but, to be efficiently exploited, such tools must have
access to a formal representation of the given domain, i.e., an
ontology. We support the ontology creation process in two different
and complementary ways: ontology learning and reuse of existing
ontologies. The ontology learning approach takes advantage of the
results of the extraction to enrich the ontology, while the reuse
support provides methods and tools to re-use already existing
ontologies which capture the target domain under a modelling
perspective similar to the one of interest for the extraction task.
This (apparent) vicious circle (between the need to have the domain
represented in the ontology for an extraction process and the
enrichment of the ontology based on the results obtained from the
extraction) can be turned into a virtuous circle if the necessary
conditions are set to let the evolving ontology and the information
extraction tool interact in a synergetic way.</p>
      <p>After a brief introduction to the VIKE-framework in section II,
the general methodology is described in section III, including
specific details about the four components of the system pipeline.
Conclusions are presented in section IV.</p>
    </sec>
    <sec id="sec-2">
      <title>II. THE VIKE-FRAMEWORK</title>
      <p>The methodology we present has been developed within the VIKEF
project (Virtual Information and Knowledge Environment
Framework, IST-2002-507173 - http://www.vikef.net/), which
creates an advanced software framework enabling the
integrated development of semantic-based Information,
Content, and Knowledge (ICK) management systems. Apart
from the scientific and academic interest in these fields
of research, we have also registered a growing need from
industrial parties for automated knowledge elicitation tools to
be applied to their commercial resources, such as product
catalogues.</p>
      <p>VIKEF bridges the gap between the partly implicit
knowledge and information conveyed in scientific and
business content resources (e.g. text, speech, images) and the
explicit representation of knowledge required for a targeted
and effective access, dissemination, sharing, use, and
annotation of ICK resources by scientific and business
communities and their information- and knowledge-based
work processes.</p>
      <p>R&amp;D within VIKEF builds on and significantly extends the
current Semantic Web efforts by addressing crucial
operationalisation and application challenges in building up
real-world semantically enriched virtual information and
knowledge environments.</p>
    </sec>
    <sec id="sec-3">
      <title>III. THE METHODOLOGY</title>
      <p>The task of (semi) automatically annotating content objects
with semantic information requires a multi-phased process,
where multimedia entities discovered within a content object
are coupled with domain knowledge represented by an
ontology. For effective semantic annotation support, linguistic,
image-related and knowledge representation aspects,
approaches, and formats, have to be combined in a synergetic
way. The proposed methodology can be presented as a
pipeline (together with the employed representation formats
within the pipeline), which supports semantic annotation in a
flexible and pragmatic way.</p>
      <p>The pipeline has been implemented as a prototype
developed as part of the VIKEF project and evaluated for
content from the Trade Fair domain.</p>
      <p>The pipeline has four main components, which can be
functionally summarized as follows:</p>
      <p>A) Annotation of text – the ontology based semantic
annotation of the textual part of the catalogue;</p>
      <p>B) Annotation of images – the ontology based semantic
annotation of the images appearing in the catalogue;</p>
      <p>C) Elicitation and refinement – to make information
extracted by the annotation component
machineunderstandable and to enrich the ontology for further
annotations;</p>
      <p>D) Reuse of existing ontologies – to support the ontology
creation and refinement by exploiting existing ontologies.</p>
      <p>The approach has been designed and implemented to make it
possible to trigger a “virtuous circle”: once the information
extracted in the annotation steps is integrated into the ontology, the
whole process can be restarted, allowing the textual and image
annotators to exploit the novel information added to the ontology
during the previous run.</p>
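        <p>The “virtuous circle” above can be sketched as a fixed-point loop: annotate with the current ontology, feed newly discovered terms back in, and stop when the ontology no longer grows. The following Python sketch is purely illustrative; the function names and the fixed candidate-term pool are assumptions, not part of the VIKEF implementation, where new terms come from the text and image annotators.</p>

```python
# Illustrative sketch of the annotate/enrich loop; substring matching
# stands in for the real text and image extraction components.

def annotate(text, ontology):
    """Return the subset of known ontology terms found in the text."""
    return {term for term in ontology if term in text.lower()}

def enrich(ontology, new_terms):
    """Add terms discovered by extraction to the ontology."""
    return ontology | new_terms

def run_pipeline(corpus, ontology, candidate_terms):
    """Re-run annotation until the ontology stops growing."""
    while True:
        annotations = {doc: annotate(doc, ontology) for doc in corpus}
        # In the real system new terms come from the annotators; here we
        # simulate them with a fixed candidate pool.
        new_terms = {t for t in candidate_terms
                     if any(t in doc.lower() for doc in corpus)} - ontology
        if not new_terms:
            return ontology, annotations
        ontology = enrich(ontology, new_terms)

ontology, annotations = run_pipeline(
    corpus=["SANELA cushion with a cotton cover"],
    ontology={"cushion"},
    candidate_terms={"cotton", "cover", "table"},
)
```

On the second pass the annotator finds “cotton” and “cover”, which it could not see during the first run.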
      <sec id="sec-3-1">
        <title>A. Annotation of Text</title>
        <p>Semantic annotation of content is a crucial task (probably
the most important) for processing documents to be accessed
inside the Semantic Web. To semantically annotate a text it is
necessary to develop (semi) automatic Information Extraction
techniques capable of overcoming the so-called “knowledge
acquisition bottleneck” typical of Semantic Web related
applications.</p>
        <p>
          Semantic annotation of product catalogues poses different
challenges at different levels. Concerning the textual part, relative
to product descriptions, catalogues do not contain linguistically
sound text: very often, sentences consist of strictly nominal
descriptions, thus discouraging recourse to traditional NLP
techniques. On the other hand, product descriptions appear as
semi-structured texts where product features occur in a fixed (or at
least regular) order. Semantic annotation of product catalogues is
therefore a complex task requiring the combination of different types
of techniques. Previous work on semantic annotation of product
information is quite scarce, the two main efforts being the European
project CROSSMARC [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] and the Czech national project Rainbow [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ]. The CROSSMARC project aims at electronic-retail product
comparison, using a combination of language engineering, machine
learning and user modelling, where a domain ontology is used as
“semantic glue” to link together the various analysis modules.
        </p>
        <p>Within the Rainbow project, a multi-layered ontology has
been defined to integrate the more abstract (domain-neutral) aspects
of the domain, relative to web sites in general, with the more
specific (domain-dependent) ones, relative to concepts found in the
sites of small organizations offering products or services. Concerning
the information extraction task, Rainbow makes use of lexical
indicators and, depending on the document to analyze, applies
HTML-centred or free-text-centred extractors, in the latter case using
shallow parsing techniques.</p>
        <p>
          The hybrid methodology we propose (which has been applied to
Italian product catalogues belonging to the furniture domain) makes
use of two different approaches: first, pattern matching techniques
are used to isolate individual product descriptions within the textual
flow and to identify their basic building blocks (e.g. the product
name, its price, as well as its natural language description). Then,
for each identified product, the natural language description is
processed by a battery of NLP tools ([
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]) in charge of identifying relevant entities (e.g. colour,
material, parts of a given product) and the relations holding between
them (e.g. part_of, colour_of, which can refer either to the product
itself or to individual parts).
        </p>
        <p>
          The architecture in Fig. 1 includes two main components,
the Product catalogue Italian Semantic Annotator (PISA), and
the Product catalogues Terminology Processor (PTP), both
exploiting the battery of NLP modules, the former to
linguistically analyze the free text part of the product
descriptions, the latter to obtain the TermBank [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] upon which
a simple application ontology can be constructed, to be
exploited for disambiguation. This “proto”-ontology forms the
terminological basis for the development of the final (project)
ontology, to be populated using the information derived from
the multimedia semantic annotation tasks.
        </p>
        <p>Once this proto-ontology has been constructed using the
PTP module, it is possible to run the PISA component (Fig. 2):
each product description is firstly extracted by pattern
matching starting from a set of regular expressions, each one
matching a particular product structure. Once a description has
been isolated, some of its components, interpreted on the basis
of particular “groups” of the matching regular expressions
(“name”, “type”, “product id”, etc.) can be detected.</p>
        <p>The remaining “free text” part of the description is then
processed by the NLP Manager module, which is able to access the NLP
tools and the ontology, the former to linguistically analyze the text,
the latter to resolve possible syntactic ambiguities found during the
analysis.</p>
        <p>Consider the example in Fig. 3, which relates to the annotation
of the product description shown in the box. Through pattern matching
it is possible to extract the product name (“Sanela”), the type
(cushion), the price (€12,95), its dimensions (40 cm of width and 60
cm of length) and the product unique identifier (900.582.56), as well
as the relations between this information and the product itself
(i.e. name_of, price_of, etc.). The natural language description
identified at this stage is then passed to the NLP Manager, which is
in charge of acquiring, with the support of the application ontology,
further information about the product: in this example, the system
detects a part (“cover”) and a material (“cotton”), as well as a
relation holding between them (“made_of”). The box below contains a
snippet of the (XML-style) final annotated product description, where
some of the extracted features of the product are listed, including
the fact, for instance, that the cover of the cushion is made of
cotton.</p>
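        <p>The pattern-matching stage can be sketched with a regular expression whose named groups mirror the “groups” mentioned above (“name”, “type”, “product id”, etc.). The pattern below and the flattened one-line caption it matches are illustrative assumptions; the actual regular expressions used by PISA are not reproduced here.</p>

```python
import re

# Hypothetical pattern for a flattened product caption such as the
# "Sanela" example; each named group corresponds to one building block.
PRODUCT_RE = re.compile(
    r"(?P<name>[A-ZÅÄÖ]+)\s+"               # product name, e.g. SANELA
    r"(?P<type>[a-z]+)\s+"                  # product type, e.g. cushion
    r"€(?P<price>\d+,\d{2})\s+"             # price, Italian decimal comma
    r"(?P<description>.*?)\s+"              # free text, left to the NLP tools
    r"(?P<product_id>\d{3}\.\d{3}\.\d{2})"  # unique product identifier
)

m = PRODUCT_RE.match("SANELA cushion €12,95 cover in cotton 900.582.56")
fields = m.groupdict()
```

The `description` group is deliberately lazy, so everything between the price and the product identifier is handed over to the NLP Manager as the free-text part.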
        <p>A preliminary evaluation of the system has been carried out by
analyzing the semantic annotation of the “IKEA 2006” Italian furniture
catalogue. Two main levels of evaluation have been taken into
consideration, relative to the two main components of the system: PTP
and PISA. Due to the lack of a “gold standard” furniture ontology to
compare with the one obtained with the help of the PTP component, a
“task based” evaluation technique has been adopted, where the coverage
of the ontology has been indirectly evaluated on the basis of the
quality of the obtained annotation.</p>
        <p>To sum up the results of the preliminary evaluation, the
ontology makes it possible to detect approximately 70% of the terms
appearing in the free text descriptions of the extracted products and
to put them in the correct relation, thus scoring a recall of 0.7. On
the other hand, considering that only terms included in the ontology
can be extracted, the system scores a precision of 1. Concerning the
disambiguation functionality, our analysis has shown that whenever two
terms are correctly detected and recognized, the disambiguation works.
Overall, we can assert that the quality of the linguistic analysis is
strongly related to the ontology coverage.</p>
        <p>From the pattern matching point of view, the system has scored
a precision of 0.9 and a recall of 0.8, extracting 800 products out of
1000 by applying 9 different regular expressions.</p>
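        <p>For clarity, the pattern-matching figures above can be recomputed from raw counts. The 800-of-1000 figure is taken from the text; the number of spurious matches below is an illustrative assumption chosen to reproduce the reported precision of about 0.9, since that count is not given in the paper.</p>

```python
# Standard precision/recall from true positives, false positives and
# false negatives.

def precision_recall(true_pos, false_pos, false_neg):
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall

# 800 of 1000 catalogue products correctly extracted (200 missed),
# with an assumed ~89 spurious matches.
p, r = precision_recall(true_pos=800, false_pos=89, false_neg=200)
```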
        <p>Concerning the task of semantic text annotation, future
directions of research include, on the one hand, the application of
the presented technique to different product catalogues (not
necessarily in Italian) and, on the other hand, the evolution of the
methodology through the integration of a more sophisticated ontology
learning from text system currently under development within the
DylanLab of the ILC-CNR in Pisa.</p>
      </sec>
      <sec id="sec-3-2">
        <title>B. Annotation of Images</title>
        <p>The information conveyed by a multimedia document is
analyzed and extracted at two different levels: the document
level, in which the document (geometrical) layout is
investigated considering both text and images, and the image
level, in which pictures are examined in order to describe their
visual content and to recognize the depicted objects. Two
interacting methodologies were adopted to analyze the
document at the two different levels, both of them exploiting a
domain ontology.</p>
        <p>
          A catalogue document usually provides a rich source of
structure. The first algorithm is based on a modified version of the
data mining system SUBDUE ([
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]). SUBDUE is a system that discovers interesting substructures
in structural data based on inexact graph matching techniques and a
computationally constrained beam search algorithm guided by
heuristics. In particular, a substructure is evaluated based on how
well it can compress the graph, according to the minimum description
length principle. Highly compressing substructures can be considered
as building blocks of the entire data set.
        </p>
        <p>To apply the algorithm to a document, we construct a graph
representing the page layout (Fig.4). The graph representation
consists of two types of vertices, image (classified in three
categories: highlight, scene and miscellaneous) and text
paragraph, and one type of arc representing the spatial
relationships between the vertices (e.g. top-left, overlapping).</p>
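        <p>The page-layout graph described above can be sketched with plain Python data structures: vertices are images (classified as highlight, scene or miscellaneous) or text paragraphs, and each arc carries a spatial-relation label. The vertex identifiers and the particular relations below are illustrative assumptions, not the actual SUBDUE input format.</p>

```python
# Minimal layout-graph sketch: two vertex kinds and labelled arcs for
# spatial relationships between page elements.

layout_graph = {
    "vertices": {
        "img1": {"kind": "image", "category": "scene"},
        "img2": {"kind": "image", "category": "highlight"},
        "txt1": {"kind": "text"},
    },
    "arcs": [
        ("img1", "txt1", "top-left"),     # image is top-left of its caption
        ("img1", "img2", "overlapping"),  # highlight overlaps the scene shot
    ],
}

def arcs_from(graph, vertex):
    """All (target, relation) pairs leaving a given vertex."""
    return [(dst, rel) for src, dst, rel in graph["arcs"] if src == vertex]
```

A substructure-discovery system such as SUBDUE would then look for frequently recurring image/text patterns in graphs of this shape.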
        <p>
          The algorithm can provide abstract structured components,
resulting in a hierarchical view of the document that can be analyzed
at many levels of detail and focus. In catalogues, a common recurring
structure is formed by an image and its caption (Fig.5). It is a
remarkable fact that, while text and images may be separately
ambiguous, jointly they tend not to be. Establishing meaningful links
between images and text paragraphs from the catalogue structure makes
it possible to exploit the semantic annotation of the textual part to
semantically annotate the images, or to guide image processing
algorithms in order to recognize the depicted object or to infer
correspondences between words and particular image structures (as in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ][
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]).
Domain knowledge can also be added to guide the discovery process and
to separate the important substructures from the irrelevant ones.
        </p>
        <p>
          The methodology adopted to describe the image content relies on
MEMORI ([
          <xref ref-type="bibr" rid="ref9">9</xref>
          ][
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]), a system for the detection and recognition of objects in
digital images using pre-stored visual information obtained from shots
or 3D models. Object recognition plays a crucial role in Computer
Vision, especially in the semantic description of visual content.
Although object recognition has been intensely studied, it still
remains a hard and computationally expensive problem. The main
difficulty in the description of image content is the lack of
information about the kind and the number of objects possibly present.
Moreover, objects can appear at different locations in the image and
they can be deformed, rotated, rescaled, differently illuminated or
even occluded with respect to a reference view. In order to simplify
object detection and to reduce computational cost, many systems
(e.g. [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]) limit the recognition to specific classes of objects. In
these cases, a priori knowledge makes it possible to select the most
descriptive features for the objects at hand and to circumscribe the
search space. However, even under this restriction, high
classification performance is seldom reached. Moreover, many object
recognition systems rely on user interaction to label the returned
items as correct or wrong, or to improve the system response
[
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
        <p>The MEMORI system tackles the object recognition
problem by segmenting the input image into regions, applying
a region grouping algorithm, which interacts with an object
classifier, producing a set of object candidates and filtering the
candidate list (Fig.6). The object segmentation and recognition
modules need domain knowledge in the form of object
snapshots from multiple view points.</p>
        <p>Supporting the extraction and recognition process with a
domain ontology allows the development of context-aware strategies to
guide and focus the multimedia semantic analysis. Knowledge of the
image context permits the object recognition module to restrict the
search to a limited subset of objects and to refine the heuristics
weighting object hypotheses according to their own context.
Conversely, content-based image analysis allows the acquisition and
exploitation of similarity relations among multimedia entities, making
it possible to refine and enrich the knowledge representation modeled
in the domain ontology.</p>
        <p>
          The experiments and tests conducted so far show that the
proposed approach is a promising method for the detection and
recognition of objects and for image annotation. An extensive
assessment of the performance was conducted on a synthetic test-bed. A
set of synthetic images has been created by drawing rotated and
rescaled versions of objects taken from COIL-100 [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ], a database widely adopted in the object recognition
community, on a non-uniform background.
        </p>
        <p>
          The achieved recognition rates were compared with other
state-of-the-art recognition methods ([
          <xref ref-type="bibr" rid="ref14">14</xref>
          ][
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]); MEMORI performs best in all experiments, regardless of the
number of training views. The recognition rate is over 96% with as few
as 4 training views, demonstrating the robustness of the method. New
techniques to make the system robust with respect to illumination
changes and partial occlusions are currently under development. Future
work will extend the evaluation to more complex and non-synthetic
images. The potential synergy between visual similarity and semantic
similarity measures based on ontologies will also be investigated and
exploited.
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>C. Elicitation and Refinement</title>
        <p>Once information has been extracted from text and images,
and stored in the XML-based format partially described in
Figure 3, the next task is to make its meaning explicit (and thus
machine-understandable) by transforming it into a suitable
RDF/OWL representation.</p>
        <p>This transformation has two parts:</p>
        <p>The first step, called syntactic elicitation, aims at
producing a collection of RDF statements, namely an
explicit representation of the knowledge content of the
XML elements in the schema. For example, we want to
construct the fact (expressed by an RDF statement) that
the string “900.582.56” is the ID of “SANELA” (actually,
of some entity which happens to be a product named
“SANELA”).</p>
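        <p>Syntactic elicitation can be sketched as a direct mapping from extracted fields to subject-predicate-object statements. Plain tuples stand in for a real RDF library here, and the predicate names (`has_name`, `has_id`, etc.) and blank-node identifier are illustrative assumptions, not the vocabulary used in VIKEF.</p>

```python
# Map each extracted XML field to one RDF-like triple about the product.

def elicit_triples(product_fields, subject="_:product1"):
    """Return (subject, predicate, object) triples for the known fields."""
    predicate_for = {
        "name": "has_name",
        "type": "rdf:type",
        "product_id": "has_id",
        "price": "has_price",
    }
    return [(subject, predicate_for[k], v)
            for k, v in product_fields.items() if k in predicate_for]

triples = elicit_triples(
    {"name": "SANELA", "type": "cushion",
     "product_id": "900.582.56", "price": "12,95"}
)
```

The triple stating that “900.582.56” is the ID of the SANELA entity comes out as `("_:product1", "has_id", "900.582.56")`.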
        <p>The second step, called semantic elicitation and
refinement, aims at leveraging the RDF representation to a
full semantic level, where entities are assigned a proper
identifier (a URI), properties are associated to some data
type or object property in some domain ontology (more
precisely, are replaced by the URI of their ontological
counterpart), and finally entities are assigned to the most
appropriate class (e.g. “SANELA” should be assigned to
the class of cushions, which belongs to a hierarchy of
classes which is very likely to include among its ancestors
the class of products and of physical entity).</p>
        <p>
          The first step, in our implementation, is rather simple, as it
is performed via a simple XSL transformation from the XML schema
depicted in Figure 3 to a collection of RDF statements expressed in
the RDF XML syntax. The only tricky part is the decision of which
statements are to be produced from the XML file (in fact, we observe
that the simple snippet in Figure 3 contains a large number of
implicit statements, and therefore one needs to select the most useful
ones). The outcome of this step is an RDF file containing a
potentially large number of facts about the product catalog. More
details are provided in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], where the entire VIKEF knowledge pipeline is described
thoroughly.
        </p>
        <p>The second step is by far more interesting. Indeed, the goal is
to qualify the RDF statements generated in the previous step by
linking their constituents to some pre-existing ontology (see the next
section for the support we provide to build an ontology by reusing
existing ontologies); this is what we call semantic elicitation. In
our approach, this task can be decomposed into two different
sub-tasks: (1) entity-level elicitation, and (2) class/property-level
elicitation.</p>
        <p>The first sub-task is implemented as a problem of matching
entity descriptions onto the entities stored in a repository called
OKKAM. A full description of OKKAM is beyond the scope of this paper;
the main idea is that it creates and stores URIs (which can then be
reused across multiple applications) together with additional
information, including any known description of the entity itself.
Once a new entity is recognized in any digital document (plain text
documents, relational databases, HTML pages, and so on), OKKAM can be
queried to check whether that entity already has a known URI, which is
returned for reuse; if no match is found, a new entity is created and
stored with its URI and all available descriptions.</p>
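        <p>The query-or-create behaviour described above can be sketched with an in-memory repository: resolving a known description returns the stored URI, while an unknown description mints and stores a new one. This is only a sketch of the pattern; the real OKKAM API, its URI scheme and its approximate matching of entity descriptions are not reproduced here.</p>

```python
import uuid

# In-memory stand-in for an entity repository with OKKAM-like behaviour.

class EntityRepository:
    def __init__(self):
        self._by_description = {}

    def resolve(self, description):
        """Return the URI of a known entity, or mint and store a new one."""
        if description in self._by_description:
            return self._by_description[description]
        uri = "urn:okkam:" + str(uuid.uuid4())  # hypothetical URI scheme
        self._by_description[description] = uri
        return uri

repo = EntityRepository()
uri1 = repo.resolve("SANELA, cushion, 900.582.56")
uri2 = repo.resolve("SANELA, cushion, 900.582.56")  # same entity, same URI
```

Because both lookups resolve to the same URI, RDF graphs produced from different catalogues can later be merged on shared entities.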
        <p>
          The second sub-task uses a tool for schema and ontology matching
called CtxMatch2.0 [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ], a VIKEF-motivated extension of a pre-existing tool. The
details of semantic elicitation with CtxMatch2.0 are provided in other
papers (see e.g. [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]). In the scenario described in this paper, the tool is used to
match categories and relations extracted from catalogs to classes and
properties found in any available domain ontology. The matching method
uses two sources of information: the hierarchical structure of
elements (which is particularly important in product catalogs) and
lexical information associated with both catalog labels and ontology
elements.
        </p>
        <p>The outcome of this second step is a refined RDF file where
linguistic descriptions of entities are replaced by unique URIs
(which can be used later to merge RDF graphs produced from
different catalogs or in general from different collections of
documents), category names are replaced by the URIs of
ontological classes (if available), and relation names are
replaced by the URIs of data type or object properties (if
available). We notice that entities may be classified using
complex concepts, compositionally constructed from their
linguistic descriptions (for example, “SANELA” will
correspond to a cushion made of cotton).</p>
        <p>A final remark is that the outcome of this elicitation and
refinement phase includes a mixture of knowledge coming
from multimedia sources, namely text and pictures, in a single
representation.</p>
      </sec>
      <sec id="sec-3-4">
        <title>D. Reuse of existing ontologies</title>
        <p>
          Ontology engineering is a very time consuming and
subjective task. It is time consuming because it is hard to
model not only new, but also already known domains, and
subjective since in most of the cases the same domains are
2 See next section for what concerns the support we provide to build an
ontology by reusing existing ontologies.
modeled in different ways by different ontology engineers.
Even if the domains are the same, the perspectives under
which they are modeled are usually different. One way to
detect the differences (or similarities) of the modeling
perspectives is by analyzing the relations existing between the
concepts. Due to the already mentioned intrinsic difficulties of
the ontology engineering task, advanced research is being
performed in order to find a solution to (partially) overcome
these problems. The method and tools aim to support the
ontology engineer in finding existing Semantic Web ontologies
written in OWL (or parts of them) that model the targeted
domain in a similar way; this is, with the same modeling
perspective. The approach relies on the existence of searchable
pool(s) of ontologies where candidate ontologies can be
searched for and pre-selected based on a user-specified set of
desired ontological classes (this set is called fragment). The
pre-selection process searches for a specific percentile match
of labels and synonyms of them appearing in the user specified
fragment and ontologies in the pool of ontologies. Currently
we are using the SWOOGLE (http://swoogle.umbc.edu/)
ontology repository, but any other similar repository can be
easily incorporated. The pre-selected ontologies are then
further analyzed in detail by analyzing the labels of classes and
relations in combination with a lexical resource (currently
WordNet). Labels are analyzed and word meanings are related
by various means in
order to compute the
likelihood of one or the
other possible sense of a
word to hold in the
given context. This is
then represented in two
ways, in a logical
formulae (following the
[
          <xref ref-type="bibr" rid="ref18">18</xref>
          ] approach) used for
detecting similarity of
concepts using a
reasoner and as weights
of relevance for each
possible sense in order
to compare concepts
and decide if they are
similar and to which aFpigp.ro7a.chThefoarrchtihteecturree-uosef thoef deesxcirsitbinedg
degree this similarity ontologies.
holds. The relevance
measures are combined with the logical results in order to give
a measure of the closeness of the considered concepts. The
architecture of this approach is presented in Fig. 7. The
ontology engineer can then decide to use any of the proposed
ontologies as a basis for performing the required extension of
the initially specified fragment. There exist several different
techniques for matching ontologies, using very different
approaches. Ontology mapping, ontology alignment and
ontology matching all fall under a broader research area in which
correspondences between two schemata need to be discovered.
In some cases rules for mapping schemata need to be created;
in others it is enough to know the relations existing between
two or more elements of different schemata. The alignment or
matching of schemata has long been studied, and many
approaches exist for integrating database schemata, XML
schemata in general, query mediation, etc. The approaches
closest to ontology alignment are those for schema alignment,
especially of taxonomic or classification hierarchies in
general. A comprehensive study is
presented in [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. In the following we will present some
approaches that are more closely related to the one presented
in this section.
        </p>
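        <p>The fragment-based pre-selection described above can be sketched as follows (a minimal, hypothetical sketch: the ontology pool, names and threshold are invented, and a real system would first retrieve candidate ontologies from a repository such as SWOOGLE):
```python
# Sketch of the fragment-based pre-selection step (hypothetical names;
# a real system would query an ontology repository first).

def label_match_ratio(fragment_labels, ontology_labels, synonyms=None):
    """Fraction of fragment labels (or one of their synonyms) found in an ontology."""
    synonyms = synonyms or {}
    onto = {label.lower() for label in ontology_labels}
    hits = sum(
        1 for label in fragment_labels
        if ({label.lower()} | {s.lower() for s in synonyms.get(label, ())}) & onto
    )
    return hits / len(fragment_labels)

def preselect(fragment_labels, pool, threshold=0.5, synonyms=None):
    """Keep only ontologies whose label match ratio reaches the threshold."""
    return [name for name, labels in pool.items()
            if label_match_ratio(fragment_labels, labels, synonyms) >= threshold]

# Toy pool of ontologies, keyed by (invented) name.
pool = {
    "furniture.owl": ["Chair", "Table", "Sofa", "Leg"],
    "vehicles.owl": ["Car", "Wheel", "Engine"],
}
print(preselect(["chair", "table", "couch"], pool,
                synonyms={"couch": ["sofa"]}))  # → ['furniture.owl']
```
An ontology is retained as a candidate once the required percentage of fragment labels, directly or via a synonym, appears among its labels.</p>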
        <p>
          The iPrompt [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] tool suite contains tools for, among others, ontology
merging and alignment. Its input is a set of
pairs of related/similar concepts from different ontologies;
based on these, it proceeds to find further pairs of similar
concepts, relying heavily on the structure of the ontologies.
        </p>
        <p>
          The GLUE approach [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] uses a machine learning
approach to analyze instances. The similarity of concept
meaning is defined based on the joint probability
distribution of the involved concepts.
        </p>
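        <p>GLUE's instance-based similarity can be illustrated with one commonly cited choice of measure, the Jaccard coefficient P(A and B) / P(A or B), estimated from the concepts' instance sets (the instance identifiers below are invented):
```python
# Instance-based concept similarity in the spirit of GLUE: the joint
# probability of two concepts is estimated from their instance sets,
# here via the Jaccard coefficient P(A and B) / P(A or B).

def jaccard_similarity(instances_a, instances_b):
    a, b = set(instances_a), set(instances_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

chairs = {"item1", "item2", "item3", "item4"}
seats = {"item2", "item3", "item4", "item5"}
print(jaccard_similarity(chairs, seats))  # → 0.6 (3 shared of 5 distinct)
```
</p>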
        <p>
          The QOM [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ] approach is based on a combination of
several techniques that analyze features of the different
ontologies. The similarity of concepts is then computed
by aggregating the individual similarity measures. The approach is
iterative; every iteration uses results obtained in the
previous ones in order to enhance the quality of the results.
        </p>
        <p>
          Cupid [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ] presents an approach that considers schema
rather than instance information, mainly for matching XML
Schemata. It uses linguistic information and the
structure of the schema. The approach also uses keys,
referential constraints and other schema information to
derive more precise matching results. Cupid additionally handles
context-dependent matches of shared type definitions that
are used in several larger structures of the schema.
        </p>
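        <p>The aggregation step used by such feature-combination matchers can be sketched as a weighted average of per-feature similarities (the trigram label similarity and the weights below are illustrative choices, not QOM's actual functions):
```python
# Sketch of similarity aggregation in a feature-combination matcher.
# The label similarity and weights are illustrative assumptions.

def label_sim(a, b):
    """Crude label similarity via shared character trigrams."""
    grams = lambda s: {s[i:i + 3] for i in range(len(s) - 2)}
    ga, gb = grams(a.lower()), grams(b.lower())
    return len(ga & gb) / len(ga | gb) if ga | gb else 0.0

def aggregate(sims, weights):
    """Weighted average of the individual similarity scores."""
    return sum(weights[k] * sims[k] for k in sims) / sum(weights.values())

sims = {
    "label": label_sim("ArmChair", "Armchair"),  # identical after lowercasing
    "structure": 0.7,  # e.g. super-/sub-concept overlap (assumed given)
    "instances": 0.5,  # e.g. shared instances (assumed given)
}
score = aggregate(sims, {"label": 0.5, "structure": 0.3, "instances": 0.2})
print(round(score, 2))  # → 0.81
```
In an iterative matcher, such aggregated scores from one iteration would feed the feature comparisons of the next.</p>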
        <p>
          The MoA [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ] is an approach to merge and align OWL
ontologies. This approach uses linguistic methods to
disambiguate the meaning of concepts and proposes an
algorithm to detect and represent in a Semantic Bridge
semantic equivalence between concepts. The Semantic
Bridge can represent equivalence of concepts and
properties, subconcept and subproperties and identity of
instances. A merging algorithm is also presented.
        </p>
        <p>
          The OMEN [
          <xref ref-type="bibr" rid="ref25">25</xref>
          ] approach uses Bayesian Networks (BN)
to decide on matches of concepts, starting from an initial
hand-made match. Based on the known matches it analyzes
the structure (e.g. the domain and range of properties) to
derive further matches. The algorithm operates iteratively,
producing in every iteration new matches which are used
in the following iterations.
        </p>
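        <p>The iterative, structure-driven propagation of matches can be sketched without the actual Bayesian-network machinery (all concept names, the parent links, the boost value and the iteration count below are invented for illustration):
```python
# Illustrative sketch of iterative match propagation: starting from
# hand-made seed matches, each iteration uses structural links (here,
# parent relations) to raise the belief that children of matched
# concepts also match. Names and parameters are assumptions.

def propagate(seeds, parents_a, parents_b, pairs, boost=0.3, iterations=2):
    """Return belief scores for candidate pairs, boosted when parents match."""
    belief = {pair: (1.0 if pair in seeds else 0.0) for pair in pairs}
    for _ in range(iterations):
        for a, b in pairs:
            pa, pb = parents_a.get(a), parents_b.get(b)
            if pa is not None and belief.get((pa, pb), 0.0) > 0.5:
                belief[(a, b)] = min(1.0, belief[(a, b)] + boost)
    return belief

parents_a = {"ArmChair": "Chair"}        # hierarchy of ontology A
parents_b = {"Fauteuil": "Siege"}        # hierarchy of ontology B
pairs = [("Chair", "Siege"), ("ArmChair", "Fauteuil")]
belief = propagate({("Chair", "Siege")}, parents_a, parents_b, pairs)
print(belief[("ArmChair", "Fauteuil")])  # → 0.6 after two iterations
```
</p>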
      </sec>
    </sec>
    <sec id="sec-4">
      <title>IV. CONCLUSIONS</title>
      <p>In this paper we presented a methodology for supporting the use of background ontologies in the task of information extraction from multimedia sources, in particular from product
catalogues. Our methodology tries to enable a virtuous circle
by which domain ontologies are used in the extraction process,
and at the same time the extraction process becomes a way for
creating or extending the available ontologies. The result of
the extraction process is a semantically rich representation of
the content of catalogues, where knowledge extracted from texts
(e.g. product descriptions) is integrated with knowledge
extracted from pictures, and made available for any service
one may want to build on top of it.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Pazienza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Stellato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vindigni</surname>
          </string-name>
          ,
          <article-title>Cross-lingual multi-agent retail comparison</article-title>
          ,
          <source>AWSS 2003 Workshop on Applications, Products and Services of Web-based Support Systems</source>
          ,
          <volume>13</volume>
          Oct.
          <year>2003</year>
          ,
          <article-title>Halifax Canada in conjunction with Web-Intelligence international</article-title>
          <source>Conference WI</source>
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>V.</given-names>
            <surname>Svatek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kosek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Labsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Braza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kavalec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vacura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vavra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Snasel</surname>
          </string-name>
          ,
          <article-title>Rainbow - multiway semantic analysis of websites</article-title>
          ,
          <source>in Proceedings of the 14th International Workshop on Database and Expert Systems Applications (DEXA'03)</source>
          ,
          <year>2003</year>
          , p.
          <fpage>635</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lenci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Montemagni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Pirrelli</surname>
          </string-name>
          ,
          <article-title>Chunk-it. An Italian shallow parser for robust syntactic annotation</article-title>
          . In A. Zampolli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Calzolari</surname>
          </string-name>
          , L. Cignoni, (eds.),
          <source>Computational Linguistics in Pisa - Linguistica Computazionale a Pisa</source>
          . Linguistica Computazionale, Special Issue, XVI-XVII, (
          <year>2003</year>
          ). Pisa-Roma, IEPI.
          Tomo I
          ,
          <year>2003</year>
          , pp.
          <fpage>353</fpage>
          -
          <lpage>386</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bartolini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lenci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Montemagni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Pirrelli</surname>
          </string-name>
          ,
          <article-title>Hybrid Constraints for Robust Parsing: First Experiments and Evaluation</article-title>
          ,
          <source>in Proceedings of LREC 2004: Fourth International Conference on Language Resources and Evaluation</source>
          . Lisbon, Portugal,
          <year>26th</year>
          ,
          <source>27th &amp; 28 May</source>
          <year>2004</year>
          ,
          Volume III
          , Paris, pp.
          <fpage>859</fpage>
          -
          <lpage>862</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bartolini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Giorgetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lenci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Montemagni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Pirrelli</surname>
          </string-name>
          ,
          <article-title>Automatic Incremental Term Acquisition from Domain Corpora</article-title>
          ,
          <source>in Proceedings of the 7th International conference on Terminology and Knowledge Engineering (TKE2005)</source>
          , Copenhagen Business School, Copenhagen, Denmark,
          <year>2005</year>
          , pp.
          <fpage>17</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Coble</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Rathi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. J.</given-names>
            <surname>Cook</surname>
          </string-name>
          and
          <string-name>
            <given-names>L. B.</given-names>
            <surname>Holder</surname>
          </string-name>
          ,
          <article-title>Iterative Structure Discovery in Graph-Based Data</article-title>
          ,
          <source>International Journal on Artificial Intelligence Tools</source>
          ,
          <volume>14</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>101</fpage>
          -
          <lpage>124</lpage>
          ,
          <year>2005</year>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Duygulu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Barnard</surname>
          </string-name>
          , N. de Freitas, and
          <string-name>
            <given-names>D.</given-names>
            <surname>Forsyth</surname>
          </string-name>
          .
          <article-title>Object Recognition as Machine Translation: Learning a lexicon for a fixed image vocabulary</article-title>
          .
          <source>In European Conference on Computer Vision</source>
          (ECCV) Copenhagen,
          <year>2002</year>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Carson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Belongie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Greenspan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Malik</surname>
          </string-name>
          . Blobworld:
          <article-title>Image segmentation using expectation-maximization and its application to image querying</article-title>
          .
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          ,
          <volume>24</volume>
          (
          <issue>8</issue>
          ):
          <fpage>1026</fpage>
          -
          <lpage>1038</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lecca</surname>
          </string-name>
          ,
          <article-title>Object Recognition in Color Images by the Self Configuring System MEMORI</article-title>
          ,
          <source>International Journal of Signal Processing</source>
          , Vol.
          <volume>3</volume>
          , No.
          <issue>3</issue>
          , pp.
          <fpage>176</fpage>
          -
          <lpage>185</lpage>
          , World Enformatika Society,
          <year>2006</year>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>C.</given-names>
            <surname>Andreatta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lecca</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Messelodi</surname>
          </string-name>
          ,
          <article-title>Memory-based Object Recognition in digital Images</article-title>
          , 10th International Fall Workshop - Vision, Modeling, and Visualization - VMV
          <year>2005</year>
          , Erlangen, Germany, November 16-
          <issue>18</issue>
          ,
          <year>2005</year>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Hoogs</surname>
          </string-name>
          , R. Collins,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kaucic</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Mundy</surname>
          </string-name>
          .
          <article-title>A common set of perceptual observables for grouping, figure-ground discrimination, and texture classification</article-title>
          .
          <source>IEEE Transaction on Pattern Analysis and Machine Intelligence</source>
          , (
          <volume>4</volume>
          ):
          <fpage>458</fpage>
          -
          <lpage>474</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B.</given-names>
            <surname>Ko</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Byun</surname>
          </string-name>
          .
          <article-title>Extracting Salient Regions And Learning Importance Scores In Region-Based Image Retrieval</article-title>
          .
          <source>International Journal of Pattern Recognition and Artificial Intelligence</source>
          , (
          <volume>17</volume>
          (
          <issue>8</issue>
          )):
          <fpage>1349</fpage>
          -
          <lpage>1367</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Nene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. K.</given-names>
            <surname>Nayar</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Murase</surname>
          </string-name>
          .
          <article-title>Columbia object image library (COIL-100)</article-title>
          .
          <source>In Technical Report CUCS-006-96</source>
          , Columbia University,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Obdrzalek</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Matas</surname>
          </string-name>
          .
          <article-title>Object recognition using local affine frames on distinguished regions</article-title>
          . In Paul L. Rosin and David Marshall, editors,
          <source>Proceedings of the British Machine Vision Conference</source>
          , volume
          <volume>1</volume>
          , pages
          <fpage>113</fpage>
          -
          <lpage>122</lpage>
          , London, UK,
          <year>September 2002</year>
          . BMVA.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Ming-Hsuan</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Roth</surname>
          </string-name>
          and
          <string-name>
            <given-names>N.</given-names>
            <surname>Ahuja</surname>
          </string-name>
          .
          <article-title>Learning to Recognize 3D Objects with SNoW</article-title>
          .
          <source>In ECCV-2000, The Proceedings of the Sixth European Conference on Computer Vision</source>
          , volume
          <volume>1</volume>
          , pages
          <fpage>439</fpage>
          -
          <lpage>454</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Bouquet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serafini</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zanobini</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Peer-to-peer semantic coordination</article-title>
          .
          <source>Journal of Web Semantics</source>
          <volume>2</volume>
          (
          <year>2005</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Bouquet</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serafini</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zanobini</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sceffer</surname>
            <given-names>S.:</given-names>
          </string-name>
          <article-title>Bootstrapping semantics on the web. meaning elicitation from schemas</article-title>
          .
          <source>WWW2006</source>
          ,
          <string-name>
            <surname>Edinburgh</surname>
          </string-name>
          (Scotland, UK),
          <article-title>May 2006</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R.</given-names>
            <surname>Stecher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Niederee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bouquet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Jaquin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ait-Mokhtar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Montemagni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Brunelli</surname>
          </string-name>
          , G. Demetriou,
          <article-title>"Enabling a knowledge supply chain: from content resources to ontologies"</article-title>
          ,
          <source>in Proceedings of the ESWC2006 workshop on "</source>
          <article-title>Mastering the Gap: from information extraction to semantic representation"</article-title>
          ,
          <source>CEUR workshop proceedings</source>
          , vol.
          <volume>187</volume>
          ,
          <string-name>
            <surname>Budva</surname>
          </string-name>
          (Montenegro),
          <source>11 June</source>
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Pavel</given-names>
            <surname>Shvaiko</surname>
          </string-name>
          , Jérôme Euzenat:
          <article-title>“A Survey of Schema-Based Matching Approaches”</article-title>
          ,
          <source>in Journal of Data Semantics IV</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>N.</given-names>
            <surname>Noy</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Musen</surname>
          </string-name>
          <article-title>: “The PROMPT suite: Interactive tools for ontology merging and mapping”</article-title>
          ,
          <source>in Technical report, SMI</source>
          , Stanford University, CA, USA (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Doan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Domingos</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Halevy</surname>
          </string-name>
          <article-title>: “Learning to match the schemas of data sources: A multistrategy approach</article-title>
          ”,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Marc</given-names>
            <surname>Ehrig</surname>
          </string-name>
          and
          <article-title>Steffen Staab: "QOM - Quick Ontology Mapping"</article-title>
          ,
          <source>in Proceedings of the International Semantic Web Conference</source>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Jayant</given-names>
            <surname>Madhavan</surname>
          </string-name>
          ,
          <article-title>Philip A. Bernstein and Erhard Rahm: “Generic Schema Matching with Cupid”</article-title>
          ,
          <source>in The VLDB Journal</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Jaehong</given-names>
            <surname>Kim</surname>
          </string-name>
          , Minsu Jang,
          <string-name>
            <surname>Young-Guk</surname>
            <given-names>Ha</given-names>
          </string-name>
          ,
          <article-title>Joo-Chan Sohn and Sang Jo Lee: “MoA: OWL Ontology Merging and Alignment Tool for the Semantic Web</article-title>
          ” in
          <source>Innovations in Applied Artificial Intelligence: 18th International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Prasenjit</given-names>
            <surname>Mitra</surname>
          </string-name>
          , Natasha F.
          <article-title>Noy and Anuj Jaiswal: “OMEN: A Probabilistic Ontology Mapping Tool”</article-title>
          ,
          <source>in Proceedings of the International Semantic Web Conference</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>