<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An extensible platform for semantic classification and retrieval of multimedia resources</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Paolo Pellegrino</string-name>
          <email>paolo.pellegrino@polito.it</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fulvio Corno</string-name>
          <email>fulvio.corno@polito.it</email>
        </contrib>
        <aff>Politecnico di Torino, C.so Duca degli Abruzzi, Torino, Italy</aff>
      </contrib-group>
      <pub-date>
        <year>2005</year>
      </pub-date>
      <abstract>
        <p>This paper introduces a possible solution to the problem of semantically indexing, searching and retrieving heterogeneous resources, ranging from textual documents, as handled by most modern search engines, to multimedia. The “anchor” is introduced as an information unit that allows resources to be viewed from different perspectives and gives access to existing resources and metadata archives. Moreover, the platform uses an ontology as a conceptual representation of a well-defined domain in order to semantically classify and retrieve anchors (and the related resources). The architecture of the proposed platform aims to be as modular and easily extensible as possible, so as to permit the inclusion of state-of-the-art techniques for the classification and retrieval of multimedia resources. Finally, the adoption of Web Services as the interface technology exposes the semantic functionalities and content management to web application designers and users without adding any overhead to the content creation and maintenance workflow.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Resources on the Internet are growing exponentially in number
and are slowly but increasingly shifting from text to multimedia.
The increase in storage capacity and computational power, as well as
in connectivity and bandwidth, permits and stimulates the creation of
multimedia digital archives, both for the sake of resource
preservation and for ease of use. However, current methodologies for
the classification of multimedia resources are still mostly based on
tags and metadata that are usually added manually to the resources
during the archival phases. Approaches oriented to the automatic
discovery or extraction of information from multimedia resources are
under study, but they are still rather inefficient and immature,
especially when compared with the consolidated algorithms and
technologies currently applied to textual documents. In addition, the
Semantic Web initiative is pushing the classic knowledge paradigms
even further, shifting from a keyword-based view of web documents to
a more articulated representation of knowledge based on concepts and
relations between concepts (ontologies). These new technologies have
recently been growing in popularity and seem very promising, but they
are still mostly exploited for textual resources only. Indeed, text is
the simplest form of knowledge representation that can be both
understood by humans and quickly processed by machines. So, while text
itself is already used effectively as a knowledge representation for
machines, digital multimedia resources such as images and videos
cannot in general be used as they are for efficient classification and
retrieval. Some intensive processing or manual metadata creation is
necessary to extract semantic information and associate it with a
given multimedia resource.</p>
      <p>In this context, the proposed work aims to provide a
flexible, lightweight and ready-to-use platform for the
semantic classification, search and retrieval of
heterogeneous resources. In particular, the proposed
approach gives high priority to the reuse of existing
digital repositories and metadata, and provides means for
exploiting and testing recent knowledge extraction
technologies applied to multimedia resources. A simple
information container and collector, called an anchor,
is therefore introduced as the information unit that wraps
knowledge about given resources and permits a
classification based on the concepts defined in the domain
ontology. It is thus possible to exploit semantics-based
technologies, commonly used for textual resources, for an
enhanced use of multimedia content as well, both in
terms of classifying and searching for the desired resources
in the available repositories and in terms of retrieving and
composing the requested material in a form suitable
for the users. Semantic-oriented technologies can be
integrated into the platform easily and uniformly,
transparently enabling web application designers to enhance
their applications with little effort and offering users
improved flexibility and access to resources.</p>
      <p>The rest of this paper is organized as follows: section
2 presents related works, section 3 describes the design
principles, and section 4 the architectural design. Section 5
presents the tests performed and, finally, section 6
summarizes the entire work as a conclusion.</p>
    </sec>
    <sec id="sec-related">
      <title>2. Related Works</title>
      <p>Three main areas are particularly relevant to this work:
technologies for extracting knowledge from multimedia resources,
metadata exploitation and merging, and knowledge
representation issues.</p>
      <p>
        Various projects have recently proposed solutions to
the problem of maintaining repositories of digital
resources. The cataloguing of resources is generally
handled manually or semi-automatically. Usually, each
repository contains only a specific type of resource
(books, videos, etc.), and specific metadata
schemas are used for the classification. Some approaches
try to automatically extract and exploit audiovisual
features for the classification, such as the SCHEMA
project [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and Caliph &amp; Emir [10], which are mainly
focused on images.
      </p>
      <p>
        The DSpace project [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] proposes the archival of
heterogeneous resources through the exploitation of new
emerging standards for resource identification and
handling such as DOI and HANDLE [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Here, however,
the attention is less focused on the semantics of the
resource content. The Simile project [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] builds on the
DSpace work to efficiently manage distributed metadata
in the form of RDF triples, which requires converting
existing schemas into this W3C-recommended format (or
an equivalent one).
      </p>
      <p>Our approach instead aims to provide an easily
manageable, extensible and ready-to-use
semantic-oriented platform. The existing H-DOSE platform has
already been successfully and easily integrated into a
number of different applications (Moodle [11], Muffin
[12]), offering semantic services alongside the
existing ones. Although specific tests are yet to be
published, the authors believe that the improvements
added to this platform are worth the effort; they are
explained in this paper, starting from the next
section.</p>
      <p>
        As far as metadata is concerned, the need for
diversification and for fine-grained levels of detail
has led to the creation of a wide variety of
schemas for collecting document metadata.
However, it is extremely difficult to find a single schema
that is both popular and capable of describing any type of
resource. In fact, the existing metadata schemas are either
too generic, like Dublin Core (DC) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], or too specific,
like MPEG-7 [13]. Half-way approaches are not yet
widely accepted as standards; one example is the Harmony project for
the ABC ontology [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which integrates DC with MPEG-7
to provide a detailed description of an audiovisual
resource. The main difficulty in such cases lies in
identifying possible mappings between similar elements
of the joined schemas, and in managing the
increased complexity of the resulting schema. It therefore
becomes difficult to map the metadata of existing digital
archives onto a new schema, while many tags of composite
schemas may remain empty even for newly created
archives unless they are filled in manually, an operation that
often does not scale. While the adoption of a single, simple
and widespread standard would still be desirable, this
project adopts a flexible approach. In particular,
as explained in the next sections, a simple
XML container has been chosen. In this way, different
schemas can be included independently and
therefore reused as they are. This also means that available
software and technologies that can parse and use existing
schemas can be reused in the platform with minimal
adaptation.
      </p>
    </sec>
    <sec id="sec-2">
      <title>3. Design Principles</title>
      <p>
        The platform architecture is based on the existing
H-DOSE platform [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. H-DOSE currently provides semantic
functionalities for web applications through an easy to
access interface, allowing rapid inclusion of services into
the existing development workflow and trying to
maximize the benefit/cost ratio for the inclusion of
semantics in web applications. In particular, H-DOSE is
focused on semantic search and indexing services,
providing means for classifying a textual web resource
with respect to a conceptual model represented as an
ontology. It also transparently stores conceptual
information (annotations) about indexed resources and
retrieves such resources in response to user queries,
according to the semantic similarity between the queries
and the annotated resources. The relations specified in the
ontology make it possible to expand the search for documents through
correlated concepts, as explained in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The main
functionalities that are offered are therefore semantic
indexing, search and deep-search of textual resources.
      </p>
      <p>The new architecture aims to maintain backward
compatibility with, and the same functionalities as, the existing
platform, while adding support for the semantic classification
and retrieval of heterogeneous resources. The main
novelty proposed here is therefore a modular substructure
that makes it easy to extend the
semantic capabilities of the platform with state-of-the-art
algorithms suitable for classifying any given type
of resource. The semantic framework that ties the
whole architecture together can then be exploited to retrieve
resources with minimal effort, based simply on the
conceptual relevance of the resource content. Two elements
play a particularly relevant role in the architecture of the
new multimedia framework and are
described in more detail in the next sections: anchors,
which associate resources with descriptions (rather than
using the textual document itself as a self-describing
search object), and mapping modules, which exploit the
semantics of given descriptions to generate links to the
domain ontology (annotations). The remaining
components mainly cover the automatic creation of anchors,
the management of semantic annotations and resource
retrieval.</p>
    </sec>
    <sec id="sec-3">
      <title>3.1. Anchors and Resources</title>
      <p>Generally, any multimedia resource may be
“described” from different perspectives or views, depending
on the context, the level of detail, its subcomponents,
etc. For instance, a video can be considered as a whole or
as a temporal sequence of chapters. A particular frame
could also be described in detail, and so on. Hence
resources, which are usually stored as atomic items, can be
considered as more versatile sources of information. For
this reason, each perspective of a given resource is
associated with one anchor (Figure 1), which replaces the
resource as the information unit. Each anchor is composed of
two types of elements. The first indicates the target
location of the resource and the precise part of interest
within it, thus “physically” defining the perspective, e.g.,
as a temporal or spatial restriction, or as a text fragment.
The second type of element may appear
multiple times in an anchor and represents a semantic
description of the perspective so defined. Both
types of elements are simple containers that are
to be customized according to particular needs.
A simple perspective that refers to a web resource as
a whole may be targeted just by specifying its URL,
whereas a spatiotemporal restriction of an audiovisual clip
requires more articulated target details. In any case, the
anchor target element should simply specify the
information necessary to retrieve the targeted resource
perspective. The description elements, by contrast, are
intended to collect data, metadata or
features useful for the classification of the anchor. For
instance, plain text may be used as a description for
audiovisual resources in order to exploit textual classifiers.
Furthermore, Dublin Core (DC) metadata (or other more
specific schemas) can be used to indicate additional
information such as author, date of creation, etc., which
provide multidimensional views for the given anchors.
</p>
      <p>[Figure 1: an anchor, consisting of a target resource view and one or more view descriptions (metadata), represents one perspective on a resource.]</p>
      <p>Finally, descriptions specific to certain types of media
can be adopted for classification with feature- or
case-based approaches. Customized targets and descriptions are
parsed through simple ad-hoc modules that can be easily
added as platform extensions. Similarly, classification
algorithms can be added to the platform to fully exploit
the flexibility introduced with the descriptions. In this
case, the main task is the interpretation of a description
based on the conceptual domain provided to the platform
as an ontology. As a result, each description may be
associated with one or more concepts defined in the
ontology, thus contributing to the classification of the
containing anchor. To summarize, the anchors (information
units) may be used as generic yet uniform containers for
various forms of descriptions, referring to the targeted
resource perspective.</p>
      <p>An XML descriptor is used to collect various anchors
for the same resource, or even for multiple resources. The
XML descriptors can therefore be used as a sort of
distributed repository, with the stored information kept
in the same location as the described resources.
Alternatively, the descriptors can be used as a central local
cache of existing metadata coming from the resource
archives, which therefore remain separate from the related
anchors. The platform keeps track of all the annotated
descriptors and anchors, and of the described resources, so that
they can be easily retrieved or updated.</p>
      <p>For example, a target element for a simple fragment of
an XHTML web page can be expressed as follows:</p>
      <preformat>&lt;target type="urn:dose:target:xhtml"
        src="http://www.eg.org/index.html#/body/h1" /&gt;</preformat>
      <p>In this case the type attribute specifies how to handle
and parse the target element, as well as how to subsequently
retrieve the resource specified through the src attribute.</p>
      <p>For a simple video, instead, the MPEG-7 schema could
be used to specify which part of it we are focusing on:</p>
      <preformat>&lt;target type="urn:dose:target:mpeg7"
        src="http://www.example.org/video.mpg"&gt;
  &lt;Mpeg7 xmlns="urn:mpeg:mpeg7:schema:2001"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:mpeg7="urn:mpeg:mpeg7:schema:2001"&gt;
    &lt;Description xsi:type="ContentEntityType"&gt;
      &lt;MultimediaContent xsi:type="AudioVisualType"&gt;
        &lt;AudioVisual&gt;
          &lt;TemporalDecomposition&gt;
            &lt;AudioVisualSegment id="cha01"&gt;
              &lt;MediaTime&gt;
                &lt;!--from beginning...--&gt;
                &lt;MediaTimePoint&gt;T00:00:00:00F25&lt;/MediaTimePoint&gt;
                &lt;!--to 12'25sec &amp; 17/25sec--&gt;
                &lt;MediaDuration&gt;PT0H12M40S17N25F&lt;/MediaDuration&gt;
              &lt;/MediaTime&gt;
            &lt;/AudioVisualSegment&gt;
          &lt;/TemporalDecomposition&gt;
        &lt;/AudioVisual&gt;
      &lt;/MultimediaContent&gt;
    &lt;/Description&gt;
  &lt;/Mpeg7&gt;
&lt;/target&gt;</preformat>
      <p>In this case an appropriate module is necessary to parse
the MPEG-7 elements and to possibly retrieve the desired
resource fragment(s).</p>
      <p>Similar examples can be given for description
elements. For instance, a description element can
simply contain plain inline text:</p>
      <preformat>&lt;description type="urn:dose:description:plainText"&gt;
  This is a sample textual description
&lt;/description&gt;</preformat>
      <p>Unlike target elements, description elements
are used to classify an anchor. A specific module
parses and handles each specific type of description and exploits
its content to associate the anchor with concepts in the
domain ontology (which abstracts the relevant concepts
from the content of the archived resources).</p>
      <p>In this example, the Dublin Core schema is exploited:</p>
      <preformat>&lt;description type="urn:dose:description:DublinCore"
             xmlns:dc="http://purl.org/dc/elements/1.1/"&gt;
  &lt;dc:title xml:lang="en"&gt;Wildlife and nature&lt;/dc:title&gt;
  &lt;dc:creator&gt;John Doe&lt;/dc:creator&gt;
  &lt;dc:type&gt;Documentary&lt;/dc:type&gt;
  &lt;dc:language&gt;en&lt;/dc:language&gt;
&lt;/description&gt;</preformat>
      <p>Other well-known schemas may of course be exploited,
such as MARC [14], CIDOC [15] and the Open Archives
Initiative Protocol for Metadata Harvesting (OAI-PMH)
[16].</p>
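      <p>The descriptor structure described in this section can be illustrated with a short Python sketch that parses a descriptor and lists its anchors. This is only a minimal illustration under stated assumptions: the descriptor, anchor, target and description element names and the anchorID attribute follow the examples given in this paper, whereas the read_anchors function and the sample anchor ID "intro" are hypothetical and not part of the platform.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor: element names follow the examples in the text,
# the anchor ID "intro" is made up for illustration.
DESCRIPTOR_XML = """
<descriptor>
  <anchor anchorID="intro">
    <target type="urn:dose:target:xhtml"
            src="http://www.eg.org/index.html#/body/h1" />
    <description type="urn:dose:description:plainText">
      This is a sample textual description
    </description>
  </anchor>
</descriptor>
"""


def read_anchors(xml_text):
    """Return one (anchor_id, target_src, descriptions) tuple per anchor.

    descriptions is a list of (description_type, text) pairs.
    """
    root = ET.fromstring(xml_text)
    anchors = []
    for anchor in root.iter("anchor"):
        target = anchor.find("target")
        descriptions = [
            (d.get("type"), "".join(d.itertext()).strip())
            for d in anchor.findall("description")
        ]
        anchors.append((anchor.get("anchorID"), target.get("src"), descriptions))
    return anchors


anchors = read_anchors(DESCRIPTOR_XML)
for anchor_id, src, descriptions in anchors:
    print(anchor_id, src, descriptions)
```
]]></preformat>
      <p>Because the target and the descriptions are read independently, a parser of this kind can hand each description to whichever module supports its type while leaving the target untouched until retrieval.</p>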
    </sec>
    <sec id="sec-4">
      <title>3.2. Anchor Life-cycle</title>
      <p>The role of the anchors in the platform is to serve as
identifiers for resource perspectives and as collectors of
metadata for indexing and search. Their use in the
platform is explained in this section, and basically covers
the creation of anchors and descriptors, the indexing
phase, including the creation of semantic annotations, and
the search phase, which describes the composition of
advanced queries and the retrieval of anchors and
resources through the previously indexed annotations.</p>
      <p>First of all, the anchors must be created (step 1 in
Figure 2, on the left). An automatic procedure may
generate anchors by extracting metadata from a given
resource (steps 2 and 3). As explained in more detail in the
next sections, the Resource Inspector is one of the types of
modules that can be extended and seamlessly integrated
with new implementations capable of generating
descriptors (i.e., anchor containers) for different resource
types. The main idea here is to provide a service
capable of automatically indexing a given (and supported)
resource. A centralized repository guarantees that
resources are not indexed multiple times, unless they have been
modified since the last indexing operation. In such a case the older
anchors and their annotations must be updated or
removed.</p>
      <p>
        Once some anchors have been created, they can be
examined for indexing (step 4). For each anchor, the
indexing process is based on the creation of weighted
links (generally referred to as semantic annotations) to the
concepts of the domain ontology (Figure 3). The
annotations are first created for each description
contained in a given anchor and then combined to yield
the final weighted links (spectrum) for that anchor (steps
5 and 6 in Figure 2). As the descriptions generally
vary in type and content, a number of different
modules (which can be easily added as extensions to the
platform) perform the mapping between metadata and
ontology concepts in order to create the annotations.
In particular, each mapping module can implement a
different strategy and exploit specific “samples” attached
to the ontology concepts. For instance, textual descriptions
may be mapped through a tf/idf algorithm exploiting
multilingual synsets for each concept (i.e., sets of words
that identify the concept) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ][
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Instead, descriptions with
visual features could be mapped through a case-based
reasoning technique, and so on.
      </p>
      <p>[Figure 2: the indexing (left) and search (right) workflows, with numbered steps; the recoverable labels are Spectra, Ranked List of Anchors, Annotation Repository and Anchor Browser(s).]</p>
      <p>Finally, the semantic annotations (spectrum) for
each anchor are stored in the annotation repository for
later retrieval (step 7). A semantic
annotation for an anchor is composed of the URL of the
containing descriptor, the unique ID of the anchor within that
descriptor, and the weights that correlate the anchor with each
ontology concept. This last operation concludes the indexing
phase.</p>
      <p>The search process starts from a user query. In the
search phase, shown in Figure 2 on the right, the user
query is converted into a logical expression of descriptions
(step 1 in the figure) similar to those normally included in
the anchors. In this way, it is possible to support different
forms of query, not necessarily textual. For instance, one
description type might represent a
user’s sketch, another the audio features
extracted from a freshly recorded hummed tune, and so on. These
descriptions, representing the query, are then mapped to
the ontology concepts by the platform search engine (steps
2 and 3) through query-oriented mapping strategies (which
may actually be the same as those used in the indexing phase). In
addition, thanks to the relations between concepts, it is
possible to infer correlated concepts so that a wider range
of related resources can be retrieved.</p>
      <p>Once a definitive spectrum has been computed by
merging all the description spectra, a spectrum-based
distance function is used to rank the annotated anchors
(steps 4 and 5). The list of anchors (which actually
corresponds to a list of descriptor URLs including anchor
IDs) is then returned to the web application (step 6). At
this point the textual links to the anchors can already be
displayed as they are, leaving to the user the burden of
retrieving the anchors and the related resources.
Alternatively, the platform may be requested to retrieve the
anchors from the containing descriptors; the resources
are then obtained by exploiting the information included
in each target element. In this way a more practical
preview can be offered to the user, even as a
combination of multiple multimedia channels (steps 7 and
8). At the application level, in addition to the
existing textual visualization of search results, the new
architecture also allows the automatic composition of clips
or snapshots from the retrieved results. The
composition of the fragments retrieved through the anchor
targets, especially when they are heterogeneous, can thus be arranged
according to the context and to the user preferences. For
instance, a query for images could return either a grid of
thumbnails or a slideshow, whereas textual documents
may be briefly described in a list, possibly indicating the
most relevant fragments. An appropriate form or interface
will let users personalize their semantics-based
multimedia environment so as to improve comfort
and satisfaction.</p>
    </sec>
    <sec id="sec-5">
      <title>4. Architecture</title>
      <p>In this section, the logical deployment of the new
platform is discussed with respect to the types of services
offered. The most relevant components of the platform are
identified and described, and the tasks performed by each
service are explained in the context of typical usage
scenarios.</p>
      <p>The entire architecture can be logically divided into three
main functional levels. Each level includes a set of web
services (based on the SOAP standard) or modules, from
the most user-oriented (indexing and search) to the most
platform-specific ones (database management, etc.).</p>
      <p>The topmost layer exposes the main services offering
indexing and search functionalities to web application
developers wishing to include semantics in their work.
The interfaces for these services are inherited from the
H-DOSE platform to guarantee the maximum possible
compatibility with the older architecture, thus facilitating the
migration to the one proposed here. However, additional
functionalities are provided to the user, especially as regards
the management of non-textual resources.</p>
      <p>[Figure 3: a descriptor for the resource myResource (listed below) whose anchor descriptions are processed by Semantic Mapper modules, using module-specific mapping models, to produce the weights (e.g., desc2_w1, desc2_w2) that link the anchor to the ontology concepts, i.e., the annotation spectrum.]</p>
      <preformat>&lt;descriptor&gt;
  &lt;anchor anchorID="example"&gt;
    &lt;target src="http://sample.org/myResource"/&gt;
    &lt;description type="descType1"&gt;
      Textual description
    &lt;/description&gt;
    &lt;description type="descType2"&gt;
      &lt;MPEG7 ... &gt; ... &lt;/MPEG7&gt;
    &lt;/description&gt;
  &lt;/anchor&gt;
  &lt;anchor anchorID="example2"&gt;
    ...
  &lt;/anchor&gt;
&lt;/descriptor&gt;</preformat>
      <p>The kernel layer includes the modules and services that
actually implement most of the techniques for semantic
classification, search and retrieval. General interfaces are
also provided to ensure the necessary functionalities in all
the extended modules. So, for instance, every Semantic
Mapper implementation must receive descriptions and
return a spectrum.</p>
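      <p>The contract stated above (descriptions in, spectrum out) can be sketched as an abstract Python class. This is a hedged illustration only: the paper does not publish the actual interface, so the SemanticMapper method names (supports, map), the Spectrum type alias and the toy KeywordMapper are all assumptions introduced here.</p>
      <preformat><![CDATA[
```python
from abc import ABC, abstractmethod
from typing import Dict, List

Spectrum = Dict[str, float]  # concept identifier -> weight (assumed representation)


class SemanticMapper(ABC):
    """Hypothetical interface: descriptions in, conceptual spectrum out."""

    @abstractmethod
    def supports(self, description_type: str) -> bool:
        """Tell whether this mapper can handle the given description type."""

    @abstractmethod
    def map(self, descriptions: List[str]) -> Spectrum:
        """Return a weight for every concept related to the descriptions."""


class KeywordMapper(SemanticMapper):
    """Toy mapper: weight = share of words that belong to a concept's word set."""

    def __init__(self, concept_words: Dict[str, set]):
        self.concept_words = concept_words

    def supports(self, description_type: str) -> bool:
        return description_type == "urn:dose:description:plainText"

    def map(self, descriptions: List[str]) -> Spectrum:
        words = " ".join(descriptions).lower().split()
        spectrum: Spectrum = {}
        for concept, vocabulary in self.concept_words.items():
            hits = sum(1 for word in words if word in vocabulary)
            if hits:
                spectrum[concept] = hits / len(words)
        return spectrum


mapper = KeywordMapper({"nature": {"wildlife", "nature"}, "sport": {"football"}})
spectrum = mapper.map(["Wildlife and nature"])
print(spectrum)
```
]]></preformat>
      <p>Keeping the interface this narrow means that the kernel only needs to know which description types a module supports; how the weights are computed stays entirely inside the module.</p>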
      <p>The last layer hides the management of complex data from
the higher levels: data stored and retrieved through
storage devices (databases, files, etc.) are wrapped by
simpler functions or services.</p>
      <p>In the next sections, the most significant components of
the platform are described in more detail. Module
extension issues are discussed afterwards.</p>
    </sec>
    <sec id="sec-6">
      <title>4.1. Indexing</title>
      <p>The indexing service receives requests for the
classification of a given resource. Its primary task is to
obtain a descriptor whose anchors are to be annotated. If
the resource is not already a descriptor, it is passed to any
specific Resource Inspector module that is possibly
capable of automatically or semi-automatically generating
a valid descriptor. If no descriptor can be obtained, the
indexing of the requested resource is interrupted and the
error is logged for later debugging.</p>
    </sec>
    <sec id="sec-7">
      <title>4.2. Resource Inspectors</title>
      <p>Each Resource Inspector is an implementation of a
simple predefined interface, which defines the
functionalities for this type of module. Given a resource,
any implementation should provide two main tasks:
deciding whether it can process the resource, and generating a
descriptor (if no error occurs).</p>
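      <p>The two tasks can be sketched in Python as a two-method protocol plus a selection loop. The names used here (ResourceInspector, can_process, inspect, WholePageInspector, build_descriptor) are hypothetical, since the paper does not list the actual interface; the sketch only assumes that an inspector receives a resource URI and returns a descriptor as XML text.</p>
      <preformat><![CDATA[
```python
from typing import List, Optional, Protocol
from urllib.parse import urlparse
from xml.sax.saxutils import quoteattr


class ResourceInspector(Protocol):
    """Hypothetical interface with the two tasks described in the text."""

    def can_process(self, uri: str) -> bool: ...

    def inspect(self, uri: str) -> str: ...  # returns a descriptor as XML text


class WholePageInspector:
    """Toy inspector: one anchor that targets an (X)HTML page as a whole."""

    def can_process(self, uri: str) -> bool:
        return urlparse(uri).path.endswith((".html", ".xhtml"))

    def inspect(self, uri: str) -> str:
        # quoteattr escapes the URI and adds the surrounding quotes.
        return (
            '<descriptor><anchor anchorID="whole">'
            f'<target type="urn:dose:target:xhtml" src={quoteattr(uri)} />'
            "</anchor></descriptor>"
        )


def build_descriptor(uri: str, inspectors: List[ResourceInspector]) -> Optional[str]:
    """Use the first inspector that accepts the resource; None if none does."""
    for inspector in inspectors:
        if inspector.can_process(uri):
            return inspector.inspect(uri)
    return None  # the caller would log the error and stop indexing


print(build_descriptor("http://www.eg.org/index.html", [WholePageInspector()]))
```
]]></preformat>
      <p>The cheap can_process check lets the indexing service skip unsuitable implementations quickly, which matches the stated purpose of the first task.</p>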
      <p>The first task is not strictly necessary, but is intended to
facilitate and quicken the selection of the appropriate
implementation for a resource.</p>
      <p>The second task is actually responsible for most of the
work. Basically, a resource inspector is fed with the URI
of a resource (that should be the resource to be indexed)
and returns a valid descriptor containing a series of
anchors describing whichever parts of the resource that
may be relevant (or even the whole resource).</p>
      <p>The module may exploit external libraries, such as
filters, to extract features and fill in appropriate descriptions.
Modules of this type are expected to become
rather complex, especially in the case of multimedia
resources where, for example, one might wish to
automatically process a video stream, its audio or even its
captions to split the entire sequence into smaller clips, each
identified by an anchor (the target element may be used to
specify the desired restriction). Indeed, such scenarios
require the integration of state-of-the-art techniques from
numerous research areas, not to mention the relatively
simpler issue of understanding the various resource
formats.</p>
      <p>An easier case is the retrieval of special metadata, from
the resource itself or from linked resources, that was
provided during the authoring phase, i.e., directly by the
creator of the resource. These sources of information
should in principle represent a rather reliable description of
the resource, even though they are often incomplete or
inaccurate, especially when only inadequate tools are available
in the authoring phase.</p>
      <p>In any case, the information captured from a given
resource (perspective) should be appropriately organized
in anchor descriptions with the precise purpose of
providing relevant hints about how to classify the anchor
with respect to the domain ontology and to the mapping
modules that will be used for such descriptions.</p>
      <p>One may of course also decide not to depend on this
“descriptor factory” by creating the descriptor(s) through
some other external tools, and then feeding the
descriptions directly to the indexing service in place of the
resources.</p>
    </sec>
    <sec id="sec-8">
      <title>4.3. Semantic Mappers</title>
      <p>The mapping modules (or Semantic Mappers, SMs) are
used to associate anchor descriptions with one or more
ontology concepts, according to a predefined interface.
Specifically, the information stored in a description is used
by an SM to determine how strongly the description is
associated with each ontology concept. These weighted
associations, also known as semantic annotations, are
represented as a conceptual spectrum (a sort of histogram
that indicates how strongly each concept is correlated with a
given object) and classify an anchor description according
to the semantics of its content.</p>
      <p>An advantage of using spectra for managing sets of
annotations is the possibility of combining different
spectra into a single spectrum. So, the weights
corresponding to the same concept can be simply averaged
among the given spectra.</p>
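      <p>The averaging step can be written in a few lines of Python. In this sketch a spectrum is assumed to be a dictionary from concept identifier to weight, and a concept that is missing from a spectrum is assumed to contribute a weight of zero; both choices, like the merge_spectra name, are illustrative assumptions rather than details given in the paper.</p>
      <preformat><![CDATA[
```python
def merge_spectra(spectra):
    """Average the weights of each concept over all the given spectra.

    Assumption: a concept absent from a spectrum counts as weight 0.0.
    """
    if not spectra:
        return {}
    concepts = set().union(*spectra)  # every concept seen in any spectrum
    return {
        concept: sum(s.get(concept, 0.0) for s in spectra) / len(spectra)
        for concept in sorted(concepts)
    }


text_spectrum = {"nature": 1.0, "animal": 0.5}
metadata_spectrum = {"nature": 0.0}
print(merge_spectra([text_spectrum, metadata_spectrum]))
```
]]></preformat>
      <p>Treating missing concepts as zero keeps the merged weights comparable across anchors with different numbers of descriptions; averaging only over the spectra that mention a concept would be an equally plausible reading of the text.</p>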
      <p>Each description element may contain metadata in the
desired format and can be exploited by all the SMs that
provide support for it. Consequently, more than one
spectrum can be obtained for each description.</p>
      <p>
        The techniques used to compute a spectrum from a set
of descriptions may vary: each implementation may
employ the most diverse methodologies and
external aids. For instance, in the case of textual
descriptions a tf/idf strategy may be utilized in
conjunction with multilingual synsets linked to the
ontology concepts, as explained in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. For
descriptions containing visual features, instead, case-based
reasoning techniques might be useful to determine the
most appropriate topics. The modularity of the
platform presented in this paper makes it possible to explore and
integrate innovative approaches with relatively little effort.
      </p>
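      <p>As a rough illustration of a tf/idf-based Semantic Mapper for textual descriptions, the sketch below sums the tf/idf scores of the synset words linked to each concept. It does not reproduce the method of [3]: the function names, the regular-expression tokenizer, the smoothed idf formula and the restriction to single-word, lowercase synset entries are assumptions chosen to keep the example short.</p>
      <preformat>
```python
import math
import re
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens (simplification)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_spectrum(description: str,
                   corpus: List[str],
                   synsets: Dict[str, Set[str]]) -> Dict[str, float]:
    """Map a textual description to a conceptual spectrum.

    synsets maps each concept to its (possibly multilingual) words.
    The weight of a concept is the sum of tf * idf over its synset words,
    with a smoothed idf (an assumption, not taken from the paper).
    """
    tokens = tokenize(description)
    if not tokens:
        return {}
    documents = [set(tokenize(doc)) for doc in corpus]
    spectrum: Dict[str, float] = {}
    for concept, words in synsets.items():
        weight = 0.0
        for word in words:
            tf = tokens.count(word) / len(tokens)
            df = sum(1 for doc in documents if word in doc)
            idf = math.log((1 + len(documents)) / (1 + df)) + 1.0
            weight += tf * idf
        if weight > 0:
            spectrum[concept] = weight
    return spectrum


if __name__ == "__main__":
    corpus = ["heavy rain and storm clouds", "wind speed measurement", "rain gauge"]
    synsets = {"Precipitation": {"rain", "rainfall", "pioggia"}, "Wind": {"wind", "vento"}}
    print(tfidf_spectrum("rain and more rain", corpus, synsets))
```
      </preformat>
      <p>Because each synset may list words in several languages, the same concept can be reached from descriptions written in any of them.</p>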
    </sec>
    <sec id="sec-9">
      <title>4.4. Search Engine</title>
      <p>The search service receives queries as descriptions and
returns a ranked list of anchors, each identified by the
container descriptor URL followed by the anchor XPath or
ID. In more detail, the process is as follows.
The web application presents the user with one or
more input fields and interfaces that collect the query
information and convert it into one or more descriptions,
depending on the type and complexity of the query
components. The descriptions can be composed
into a logical expression, which is then passed to the search
service. The service uses the appropriate Semantic Mappers to
convert the descriptions into spectra, which are then used to
retrieve the most relevant anchors through the Annotation
Repository. Finally, the anchors are ranked, also taking
into account the logical expression.</p>
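      <p>The ranking step might look like the sketch below, which starts from query spectra that have already been produced by the Semantic Mappers. Several details are assumptions rather than facts from the paper: the Annotation Repository is modelled as an in-memory mapping from anchor identifier to spectrum, relevance is measured with cosine similarity, and the logical expression is reduced to a single AND or OR operator evaluated as the minimum or maximum of the per-description scores.</p>
      <preformat>
```python
import math
from typing import Dict, List, Tuple

Spectrum = Dict[str, float]


def cosine(a: Spectrum, b: Spectrum) -> float:
    """Cosine similarity between two spectra (0.0 if either one is empty)."""
    dot = sum(weight * b.get(concept, 0.0) for concept, weight in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm > 0 else 0.0


def search(query_spectra: List[Spectrum],
           repository: Dict[str, Spectrum],
           operator: str = "AND") -> List[Tuple[str, float]]:
    """Rank anchors against one or more query spectra.

    Assumption: AND keeps the lowest per-description score and OR the
    highest, a fuzzy-logic style reading of the logical expression.
    """
    if not query_spectra:
        return []
    combine = min if operator == "AND" else max
    ranked = []
    for anchor_id, anchor_spectrum in repository.items():
        score = combine(cosine(q, anchor_spectrum) for q in query_spectra)
        if score > 0:
            ranked.append((anchor_id, score))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    # Hypothetical anchor identifiers: descriptor URL followed by anchor ID.
    repository = {
        "http://example.org/d1.xml#a1": {"Rain": 1.0},
        "http://example.org/d1.xml#a2": {"Wind": 1.0},
        "http://example.org/d2.xml#a1": {"Rain": 1.0, "Wind": 1.0},
    }
    print(search([{"Rain": 1.0}, {"Wind": 1.0}], repository, "AND"))
```
      </preformat>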
    </sec>
    <sec id="sec-10">
      <title>4.5. Anchor Retriever</title>
      <p>Once a search has been performed, the anchor retriever
service can be used to extract the requested anchors from
the respective descriptors, so that the web application
designer can actually access the targeted resources and
compose a response for the user. More precisely, the target
element extracted from an anchor specifies the exact part
of the resource that should be shown to the user. The web
application designer may, however, summarize it in
different ways according to the resource type.</p>
    </sec>
    <sec id="sec-11">
      <title>4.6. Extensibility directions</title>
      <p>This section outlines the components directly involved
in the extensibility of the platform. Depending
on the need, functionality can be added by providing
a single new implementation of just one module,
since the existing modules can be reused to
support more articulated scenarios.</p>
      <p>A new description type is generally necessary when a
new metadata schema is to be used. In this case it may be
sufficient to implement an appropriate Description
Handler module that wraps the data into an internal
format, which is exploited by a Semantic Mapper for
classification. In this way multiple schemas can be
transparently managed by various Semantic Mapper
modules.</p>
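      <p>One way to picture such a plug-in point is the sketch below. The class and function names, the registry, and the "key: value" input syntax are all hypothetical; the paper does not specify the Description Handler interface or the internal format, which is assumed here to be a mapping from field name to a list of values.</p>
      <preformat>
```python
from abc import ABC, abstractmethod
from typing import Dict, List

# Assumption: the internal format is a mapping from field name to values.
InternalDescription = Dict[str, List[str]]


class DescriptionHandler(ABC):
    """Hypothetical interface: wraps one metadata schema into the internal format."""

    description_type: str

    @abstractmethod
    def to_internal(self, raw: str) -> InternalDescription:
        ...


class KeyValueHandler(DescriptionHandler):
    """Handler for a toy schema made of 'key: value' lines."""

    description_type = "key-value"

    def to_internal(self, raw: str) -> InternalDescription:
        fields: InternalDescription = {}
        for line in raw.splitlines():
            key, separator, value = line.partition(":")
            if separator and value.strip():
                fields.setdefault(key.strip().lower(), []).append(value.strip())
        return fields


# Registry that lets new schemas be added without touching existing modules.
HANDLERS: Dict[str, DescriptionHandler] = {}


def register(handler: DescriptionHandler) -> None:
    HANDLERS[handler.description_type] = handler


def wrap(description_type: str, raw: str) -> InternalDescription:
    """Dispatch a raw description to the handler registered for its type."""
    return HANDLERS[description_type].to_internal(raw)


if __name__ == "__main__":
    register(KeyValueHandler())
    print(wrap("key-value", "Title: Clouds\nSubject: weather\nSubject: rain"))
```
      </preformat>
      <p>With this arrangement, supporting a new schema amounts to registering one more handler, while the Semantic Mappers keep consuming the same internal format.</p>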
      <p>If the platform does not yet provide Semantic Mapper
modules capable of handling the new description types, an
ad hoc implementation of a Semantic Mapper is probably
necessary in order to effectively exploit all the
information when creating annotations to the
ontology concepts. In addition, a distinction can be made
between query-related descriptions and resource-oriented
descriptions, so that simple user queries can be
appropriately enhanced in the former case, for instance by
inferring correlated concepts through the domain
ontology.</p>
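      <p>Query enhancement through the ontology could, for example, propagate part of a concept's weight to its directly related concepts. The sketch below is one possible reading only: the function name expand_query, the adjacency-list view of the ontology, the single-hop propagation and the decay factor of 0.5 are assumptions, not details given in the paper.</p>
      <preformat>
```python
from typing import Dict, List

Spectrum = Dict[str, float]


def expand_query(spectrum: Spectrum,
                 related: Dict[str, List[str]],
                 decay: float = 0.5) -> Spectrum:
    """Add concepts directly related to those in the query spectrum.

    Each related concept receives the source weight multiplied by decay,
    unless it already has a higher weight. Only one hop is followed.
    """
    expanded = dict(spectrum)
    for concept, weight in spectrum.items():
        for neighbour in related.get(concept, []):
            inferred = weight * decay
            if inferred > expanded.get(neighbour, 0.0):
                expanded[neighbour] = inferred
    return expanded


if __name__ == "__main__":
    # Hypothetical fragment of a meteorology ontology.
    related = {"Rain": ["Precipitation", "Cloud"]}
    print(expand_query({"Rain": 1.0}, related))
```
      </preformat>
      <p>Applying the expansion only to query-related descriptions leaves the annotations of the indexed resources unchanged.</p>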
      <p>Additionally, different target types can be specified in
order to support a wide variety of resource formats. In fact,
one may wish to consider only a well-defined segment of
a resource, be it spatial, temporal or both. In these cases, a
new Target Handler implementation can handle, in addition
to the predefined tags, even highly complex
structures. For instance, it is possible to support the
MPEG-7 standard to specify spatiotemporal segments of
video resources, which could then prove very useful
to an anchor browser on the application side. Similarly, it
would even be possible to specify resources that have no
real URL, as in the case of real-world objects, e.g., a book
on a particular shelf. Depending on the implementation,
one may decide either to provide methods to retrieve the
actual resource (or part of it) or to leave this burden to the
calling program.</p>
      <p>Finally, a new implementation of a Resource
Inspector can be useful to automatically generate a
descriptor from a resource. In this case the programmer
should be aware of all of the above modules, in order to
meet all the schema specifications for the content of the
target and description elements. As in the case of the
Semantic Mappers, external filters or programming
libraries may facilitate the integration and reuse of
state-of-the-art techniques.</p>
    </sec>
    <sec id="sec-12">
      <title>5. Test case</title>
      <p>The platform has been tested using videos from the
Open Video (OV) Project [17], which collects and makes
available a repository of digitized video content for the
digital video, multimedia retrieval, digital library, and
other research communities. The unavailability of video
test sets classified with a well-defined ontology motivated
the creation of an ontology suitable for a subset of the OV
archive, namely the NASA K-16 Science Education
Programs Special Collection. At the time the collections
were analyzed, this was the largest subset covering a
reasonably restricted domain (Table 1). The textual
descriptions associated with each video were
automatically extracted to create one anchor per
video, for a total of 555 anchors. The most frequent words
were then used to manually build the
ontology and the extra information (i.e., synsets)
necessary to map text to concepts with a tf/idf-based
Semantic Mapper. The ontology contains 169 concepts,
mostly about meteorology. In these preliminary tests, the
platform classified the anchors, producing 1685
annotations to 105 of the 169 ontology concepts.
Some anchors could not be annotated because
their descriptions were either empty or contained words
that had not been considered for concept mapping. The
ontology covers a somewhat heterogeneous
domain, yet it still lacks a few branches, unrelated to
meteorology, that would cover the anchors lacking
annotations.
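</p>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Collection subset</th><th>Videos</th></tr>
          </thead>
          <tbody>
            <tr><td>Internet Moving Images Archive</td><td>1121</td></tr>
            <tr><td>NASA K-16 Science Education Programs</td><td>555</td></tr>
            <tr><td>The Informedia Project at Carnegie Mellon University</td><td>321</td></tr>
            <tr><td>CHI Video Retrospective</td><td>121</td></tr>
            <tr><td>University of Maryland HCIL Open House Video Reports</td><td>52</td></tr>
            <tr><td>Digital Himalaya Project</td><td>34</td></tr>
            <tr><td>2001 TREC Video Retrieval Test Collection</td><td>26</td></tr>
          </tbody>
        </table>
      </table-wrap>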
      <p>Search results show generally good recall, thanks to
the relations defined in the ontology, but they are of course
strongly biased, as the ontology was built from the
dataset itself. They are therefore not reported here. Ongoing
work focuses on improving the ontology so that it can
be shared with other researchers, and on integrating
feature-based technologies to enhance video
classification and retrieval.</p>
    </sec>
    <sec id="sec-13">
      <title>6. Conclusions</title>
      <p>In this paper we have proposed a modular
implementation of a semantic platform that can be
used for any resource type. The idea of the anchor is
introduced as an information unit. Each anchor may be used
to represent a particular aspect (or perspective, or view) of a
given resource through any desired set of metadata,
without requiring burdensome metadata conversions or
adaptations. Each resource perspective is
defined within an anchor through a customizable target
element, which allows multiple digital archives to be reused
seamlessly as they are, independently of their format or
location. Automatic or semi-automatic approaches are
supported to generate anchors from a resource, and
multiple mapping strategies can be exploited to associate
anchors with ontology concepts. The resulting architecture
can therefore offer semantically enhanced access to
existing memories and resource repositories through
unified semantic indexing, search and retrieval services.
Finally, the entire platform uses widely adopted web
technologies, such as web services and standard formats, to
expose minimal complexity to a web application
designer wishing to offer these semantically enabled
services to end users.</p>
      <p>Preliminary tests have been conducted with a set of
videos annotated through textual descriptions, and the results
seem encouraging. Further testing is being prepared and
will include the adoption of feature-based techniques for
semantic classification and retrieval of images, videos and
audio resources. The ontology developed during this work
will be publicly released to stimulate further research, as
well as the platform source code.</p>
    </sec>
    <sec id="sec-14">
      <title>7. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bonino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bosca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Corno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Farinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Pescarmona</surname>
          </string-name>
          ,
          <article-title>HDOSE: an Holistic Distributed Open Semantic Elaboration Platform</article-title>
          ,
          <source>SWAP2004: 1st Italian Semantic Web Workshop</source>
          , 10th December
          <year>2004</year>
          , Ancona, Italy
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bonino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Corno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Farinetti</surname>
          </string-name>
          ,
          <article-title>Domain Specific Searches using Conceptual Spectra</article-title>
          ,
          <source>ICTAI 2004, the IEEE International Conference on Tools with Artificial Intelligence</source>
          , 15-17 Nov
          <year>2004</year>
          , Boca Raton, Florida, USA, pp.
          <fpage>680</fpage>
          -
          <lpage>687</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Bonino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Corno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Farinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ferrato</surname>
          </string-name>
          ,
          <article-title>Multilingual Semantic Elaboration in the DOSE platform</article-title>
          ,
          <source>SAC 2004, ACM Symposium on Applied Computing</source>
          , March 14-17,
          <year>2004</year>
          , Nicosia, Cyprus
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>C.</given-names>
            <surname>Lagoze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hunter</surname>
          </string-name>
          ,
          <article-title>The ABC Ontology and Model (Version 3)</article-title>
          ,
          <source>Journal of Digital Information</source>
          , Special Issue - selected papers from Dublin Core 2001 Conference
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] Dublin Core Metadata Initiative, http://purl.org/dc/</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>V.</given-names>
            <surname>Mezaris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Doulaverakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Herrmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lehane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>O'Connor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kompatsiaris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Strintzis</surname>
          </string-name>
          ,
          <article-title>The SCHEMA Reference System: An Extensible Modular System for Content-Based Information Retrieval</article-title>
          ,
          <source>Proc. Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS)</source>
          , April
          <year>2005</year>
          , Montreux, Switzerland
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Tansley</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bass</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <article-title>DSpace as an Open Archival Information System: Current Status and Future Directions</article-title>
          ,
          <source>Lecture Notes in Computer Science: Research and Advanced Technology for Digital Libraries, LNCS 2769</source>
          . pp.
          <fpage>446</fpage>
          -
          <lpage>460</lpage>
          ,
          <year>2003</year>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Sam X.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Establishing Persistent Identity Using the Handle System</article-title>
          ,
          <source>Tenth International World Wide Web Conference</source>
          , Hong Kong, May 2001
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] Simile: Semantic Interoperability of Metadata and Information in unLike Environments, http://simile.mit.edu/</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Mathias</given-names>
            <surname>Lux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Jutta</given-names>
            <surname>Becker</surname>
          </string-name>
          and
          <string-name>
            <given-names>Harald</given-names>
            <surname>Krottmaier</surname>
          </string-name>
          ,
          <article-title>Caliph &amp; Emir: Semantic Annotation and Retrieval in Personal Digital Photo Libraries</article-title>
          ,
          <source>Proceedings of CAiSE '03 Forum at 15th Conference on Advanced Information Systems Engineering</source>
          , pp.
          <fpage>85</fpage>
          -
          <lpage>89</lpage>
          , June 16th-20th
          <year>2003</year>
          , Velden, Austria
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] Moodle, http://moodle.org/</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] Muffin, http://muffin.doit.org/</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] Moving Picture Expert Group Standards, http://www.chiariglione.org/mpeg/</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] MARC Standards, Library of Congress Network Development and MARC Standards Office, http://lcweb.loc.gov/marc/marc.html</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] CIDOC Conceptual Reference Model (CRM), http://cidoc.ics.forth.gr/index.html</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), http://www.openarchives.org/</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] The Open Video Project, http://www.open-video.org/</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>