<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Context and a Genetic Algorithm for Knowledge-Assisted Image Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stamatia Dasiopoulou</string-name>
          <email>dasiop@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>George Th. Papadopoulos</string-name>
          <email>papad@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Phivos Mylonas</string-name>
          <email>fmylonas@image.ntua.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Avrithis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ioannis Kompatsiaris</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>S. Dasiopoulou and G.Th. Papadopoulos are with the Information Processing Laboratory, Electrical and Computer Engineering Department, Aristotle University of Thessaloniki</institution>
          ,
          <addr-line>Greece, and the Informatics and Telematics Institute</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this poster, we present an approach to contextualized semantic image annotation as an optimization problem. Ontologies are used to capture general and contextual knowledge of the domain considered, and a genetic algorithm is applied to realize the final annotation. Experiments with images from the beach vacation domain demonstrate the performance of the proposed approach and illustrate the added value of utilizing contextual information.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Index Terms</title>
      <p>Knowledge-assisted image analysis, ontologies,
context modelling, genetic algorithm.</p>
    </sec>
    <sec id="sec-2">
      <title>I. INTRODUCTION</title>
      <p>Given the continuously increasing information flow,
providing tools and methodologies for (semi-) automatically
extracting content descriptions at a conceptual level is a key factor
for enabling users to effectively access the information needed.
Incorporating knowledge has been acknowledged as probably
the only viable way to overcome the limitations of approaches
that imitate the way users assess similarity by defining suitable
automatically extractable descriptors and appropriate metrics. Going
through the relevant literature, one can distinguish two main
streams in the reported knowledge-driven approaches with
respect to the adopted knowledge acquisition and processing
strategy: implicit, realized by machine learning methods, and
explicit, realized by model-based approaches. The former
provide relatively powerful methods for discovering complex and
hidden relationships between image data and corresponding
conceptual descriptions. Model-based analysis approaches on
the other hand make use of explicitly defined prior knowledge,
attempting to provide a coherent domain model to support
symbolic inference; as a result issues are raised with respect
to the entailed complexity and the completeness of knowledge
acquisition and construction.</p>
      <p>In order to benefit from the advantages of each category and
overcome their individual limitations, we propose an approach
coupling explicit prior knowledge and a genetic algorithm for
domain-specific semantic image analysis. Ontologies, being
the leading edge technology for knowledge sharing and reuse
and providing well-defined inferences, have been selected
for representation. The employed knowledge considers both
high- and low-level information in order to provide the means
for driving the semantic analysis and annotation. High-level
knowledge includes the domain concepts of interest, their
relations and contextual knowledge in terms of fuzzy ontological
relations. Low-level knowledge, on the other hand, consists of
the low-level visual and structural descriptions required for the
actual analysis process.</p>
      <p>Fig. 1 (overall architecture) illustrates the processing chain: segmentation; extraction of low-level descriptors and spatial relations; descriptor matching and context analysis producing hypothesis sets and refined hypothesis sets; and the genetic algorithm yielding the final semantic annotation, all supported by an ontology infrastructure comprising multimedia ontologies, domain ontologies, DOLCE and a context ontology.</p>
    </sec>
    <sec id="sec-3">
      <title>II. OVERALL ARCHITECTURE</title>
      <p>
        The overall architecture of the proposed framework is
illustrated in Fig. 1. To represent the required knowledge
components, the ontology infrastructure introduced in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] has
been employed and appropriately extended to provide support
for contextual knowledge modelling. Analysis starts by
segmentation, and subsequently low-level descriptors and spatial
relations are extracted for the generated image segments. An
extension of the Recursive Shortest Spanning Tree (RSST)
algorithm has been used for segmenting the image [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], while
descriptors extraction is based on the guidelines given by the
MPEG-7 eXperimentation Model (XM). The image
preprocessing stage concludes with the extraction of spatial relations
between adjacent image segments, where fuzzy definitions
have been employed [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Once the low-level descriptors are
available, an initial set of hypotheses is generated for each
image segment, based on the distance between the segment's
extracted descriptors and the prototypical
descriptors of the domain concepts included in the knowledge base. Thereby,
a set of plausible annotations with corresponding degrees of
confidence is produced for each segment, which is subsequently refined
based on the provided fuzzy contextual knowledge.
The refined hypothesis sets, along with the segments' spatial
relations, are eventually passed to the genetic algorithm, which,
based on the provided domain knowledge, decides on the optimal
semantic interpretation.
      </p>
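      <p>The hypothesis-generation step described above can be sketched in code. The following is an illustrative sketch only, not the authors' implementation: the prototype descriptor values, the three-element descriptor vectors and the confidence mapping 1 / (1 + distance) are hypothetical assumptions.</p>
      <preformat>
```python
# Illustrative sketch: build an initial hypothesis set for a segment by
# comparing its descriptor vector against prototype descriptors per concept.
# All values below are hypothetical, not from the authors' knowledge base.
import math

PROTOTYPES = {                      # hypothetical prototype descriptors
    "Sea":  [0.1, 0.7, 0.2],
    "Sky":  [0.2, 0.8, 0.9],
    "Sand": [0.8, 0.5, 0.3],
}

def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hypothesis_set(segment_descriptor):
    """Map each concept to a degree of confidence in [0, 1]:
    the closer the segment to a concept's prototype, the higher."""
    return {
        concept: 1.0 / (1.0 + distance(segment_descriptor, proto))
        for concept, proto in PROTOTYPES.items()
    }

hyp = hypothesis_set([0.15, 0.75, 0.25])
best = max(hyp, key=hyp.get)  # most plausible annotation before refinement
```
      </preformat>
      <p>The hypothesis set keeps every plausible annotation with its degree of confidence, rather than committing to a single label, so that the context analysis and the genetic algorithm can revise the initial ranking.</p>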
    </sec>
    <sec id="sec-3b">
      <title>III. CONTEXT MODELLING AND ANALYSIS</title>
      <p>A fuzzified ontology, defined on top of the domain one, has
been built to represent the modelled contextual knowledge. To
provide a sufficiently descriptive context model, we selected a
set of diverse relations from the ones included in the
MPEG-7 specification (Table ??) and extended their definition to
support uncertainty. More specifically, the defined fuzzified
contextual domain knowledge consists of the set</p>
      <p>
        OF = { C, {rci,cj } },
(1)
where OF forms a domain-specific “fuzzified” ontology, C
is the set of all possible concepts it describes, rci,cj =
F (Rci,cj ) : C × C → [0, 1], with Rci,cj : C × C → {0, 1}, i, j ∈ N,
denotes a fuzzy ontological relation between two concepts
ci, cj , and Rci,cj is the corresponding crisp semantic relation. For
the representation, the RDF language has
been selected and reification was used in order to achieve the
desired expressiveness. The refinement of the initial
hypotheses’ degrees is performed based on the readjustment algorithm
presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], where the notion of overall context relevance
of a concept to the root element is employed to tackle cases
where a concept is related to multiple ones.
      </p>
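      <p>A minimal sketch of how the fuzzified relations rci,cj and a context-based confidence readjustment could be represented follows. The relation degrees and the reinforcement rule are hypothetical and deliberately simpler than the readjustment algorithm of [3]:</p>
      <preformat>
```python
# Minimal sketch of fuzzified contextual relations (hypothetical degrees):
# each ordered pair of concepts maps to a membership degree in [0, 1].
FUZZY_RELATIONS = {
    ("Sea", "Sky"):  0.9,   # Sea strongly co-occurs with Sky
    ("Sea", "Sand"): 0.8,
    ("Sky", "Sand"): 0.6,
}

def relation(ci, cj):
    """Symmetric lookup of the fuzzy relation degree r(ci, cj)."""
    return FUZZY_RELATIONS.get((ci, cj), FUZZY_RELATIONS.get((cj, ci), 0.0))

def refine(hypotheses, neighbour_hypotheses):
    """Readjust a segment's confidence degrees using a neighbouring
    segment's hypotheses: concepts compatible with the neighbour's
    plausible labels are reinforced, incompatible ones are damped."""
    refined = {}
    for concept, degree in hypotheses.items():
        support = max(relation(concept, c) * d
                      for c, d in neighbour_hypotheses.items())
        refined[concept] = 0.5 * degree + 0.5 * degree * support
    return refined

seg = {"Sea": 0.7, "Sand": 0.6}
neigh = {"Sky": 0.9}
out = refine(seg, neigh)  # Sea is reinforced more than Sand next to Sky
```
      </preformat>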
    </sec>
    <sec id="sec-4">
      <title>IV. GENETIC ALGORITHM</title>
      <p>Under the proposed approach, each chromosome represents
a possible solution. To determine the degree to which each
interpretation is plausible, the employed fitness function has
been defined as follows:
f (C) = λ × FSnorm + (1 − λ) × SCnorm, (2)
where C denotes a particular chromosome, FSnorm refers
to the degree of low-level descriptor similarity, and SCnorm
stands for the degree of consistency with respect to the
provided spatial domain knowledge. The variable λ is introduced
to adjust the degree to which visual similarity and spatial
consistency affect the final outcome. SCnorm and
FSnorm are computed as follows:</p>
      <p>FSnorm = ( Σi=0..N−1 IM(gi) − Imin ) / ( Imax − Imin ), (3)
where Imin is the sum of the minimum degrees of confidence
in each region's hypothesis set and Imax the sum of the
maximum degrees of confidence, respectively.</p>
      <p>SCnorm = ( SC + 1 ) / 2, with SC = (1/M) Σr=1..M IMsr(gi, gj), (4)
where M denotes the number of relations in the constraints
that had to be examined.</p>
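      <p>The fitness computation f(C) = λ × FSnorm + (1 − λ) × SCnorm can be sketched as follows, under stated assumptions: IM(gi) is taken as the confidence of the concept the chromosome assigns to segment gi, each spatial score IMsr(gi, gj) is assumed to lie in [−1, 1], and SC is averaged over the M constraints before being mapped to [0, 1]. The data values are hypothetical:</p>
      <preformat>
```python
# Sketch of the fitness function; the data and the exact normalisation
# of SC are assumptions, not the authors' implementation.

def fs_norm(assigned, hypothesis_sets):
    """Normalised degree of low-level descriptor similarity.
    'assigned' maps segment -> concept chosen by the chromosome;
    'hypothesis_sets' maps segment -> {concept: confidence}."""
    total = sum(hypothesis_sets[s][c] for s, c in assigned.items())
    i_min = sum(min(h.values()) for h in hypothesis_sets.values())  # Imin
    i_max = sum(max(h.values()) for h in hypothesis_sets.values())  # Imax
    return (total - i_min) / (i_max - i_min)

def sc_norm(spatial_scores):
    """Spatial consistency: each score is assumed in [-1, 1], so the
    average SC is mapped to [0, 1] via (SC + 1) / 2."""
    sc = sum(spatial_scores) / len(spatial_scores)
    return (sc + 1.0) / 2.0

def fitness(assigned, hypothesis_sets, spatial_scores, lam=0.5):
    """Weighted combination of visual similarity and spatial consistency."""
    return (lam * fs_norm(assigned, hypothesis_sets)
            + (1.0 - lam) * sc_norm(spatial_scores))

# Hypothetical two-segment example.
hyps = {"s1": {"Sea": 0.9, "Sky": 0.1}, "s2": {"Sea": 0.3, "Sky": 0.8}}
chromosome = {"s1": "Sea", "s2": "Sky"}
score = fitness(chromosome, hyps, spatial_scores=[1.0, 0.5], lam=0.5)
```
      </preformat>
      <p>A chromosome whose labels both match the strongest descriptor hypotheses and satisfy the spatial constraints scores close to 1, which is what the genetic algorithm maximises over interpretations.</p>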
      <p>V. EXPERIMENTAL RESULTS AND CONCLUSIONS
The proposed semantic image analysis framework was
tested on the beach vacations domain on the following
concepts: Sea, Sky, Sand, Plant, Cliff and Person. The employed
descriptors are Scalable Color, Homogeneous Texture, Edge
Histogram and Region Shape, while the implemented fuzzy
spatial relations include the eight directional relations, i.e., left,
above, above-left etc. Indicative results are given in Fig. 2,
illustrating the added value of utilizing contextual information
and the performance of the genetic algorithm.</p>
    </sec>
    <sec id="sec-5">
      <title>ACKNOWLEDGMENT</title>
      <p>The work presented in this paper was partially supported by the European Commission under contracts FP6-001765 aceMedia and FP6-027026 K-Space.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>T.</given-names>
            <surname>Adamek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>O'Connor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Murphy</surname>
          </string-name>
          ,
          <article-title>Region-based Segmentation of Images Using Syntactic Visual Features</article-title>
          .
          <source>Workshop on Image Analysis for Multimedia Interactive Services</source>
          , (WIAMIS), Montreux, Switzerland,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bloehdorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Petridis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Saathoff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Simou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tzouvaras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Avrithis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kompatsiaris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. G.</given-names>
            <surname>Strintzis</surname>
          </string-name>
          ,
          <article-title>Semantic Annotation of Images and Videos for Multimedia Analysis</article-title>
          .
          <source>2nd European Semantic Web Conference (ESWC)</source>
          , Heraklion, Greece, May
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Ph.</given-names>
            <surname>Mylonas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Th.</given-names>
            <surname>Athanasiadis</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Avrithis</surname>
          </string-name>
          ,
          <article-title>Improving image analysis using a contextual approach</article-title>
          .
          <source>International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS)</source>
          , Seoul,
          <year>April 2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Panagi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dasiopoulou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. Th.</given-names>
            <surname>Papadopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Kompatsiaris</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.G.</given-names>
            <surname>Strintzis</surname>
          </string-name>
          ,
          <article-title>A Genetic Algorithm Approach to Ontology-Driven Semantic Image Analysis</article-title>
          .
          <source>IEE International Conference on Visual Information Engineering (VIE)</source>
          , Bangalore, India,
          <year>Sept 2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>