=Paper=
{{Paper
|id=Vol-233/paper-19
|storemode=property
|title=Rule-based Reasoning for Semantic Image Segmentation and Interpretation
|pdfUrl=https://ceur-ws.org/Vol-233/p39.pdf
|volume=Vol-233
|dblpUrl=https://dblp.org/rec/conf/samt/BerkaAA06
}}
==Rule-based Reasoning for Semantic Image Segmentation and Interpretation==
<pdf width="1500px">https://ceur-ws.org/Vol-233/p39.pdf</pdf>
<pre>
             Rule-based Reasoning for Semantic Image
                  Segmentation and Interpretation
                                     Petr Berka, Thanos Athanasiadis and Yannis Avrithis


   Abstract— In this paper, we propose the application of rule-              II. I NITIAL S EGMENTATION AND R EGION L ABELING
based reasoning for knowledge assisted image segmentation and
object detection. A region merging approach is proposed based
                                                                             Our intention is to operate on a higher level of information
on fuzzy labeling and not on visual descriptors, while reasoning          where image regions are linked to possible labels rather than
is used in evaluation of dissimilarity between adjacent regions           only to their visual features. For the representation of an image
according to rules applied on local information.                          we adopt the Attributed Relational Graph (ARG) [3]. An ARG
   Index Terms— Knowledge-assisted analysis, rule-based reason-           is a graph structure that holds a region-based representation of
ing, semantic region merging.                                             the image, where the set of vertices V corresponds to image
                                                                          regions and the set of edges E to links between adjacent
                                                                          regions. The ARG is constructed based on an initial color
                        I. I NTRODUCTION                                  RSST segmentation that produces a few tens of regions.
                                                                          Each vertex of the graph holds the Dominant Color, Region
A      UTOMATIC segmentation of images is a very challeng-
       ing task in computer vision and one of the most crucial
steps toward image understanding. Although a great effort
                                                                          Shape and Homogeneous Texture MPEG-7 visual descriptors
                                                                          extracted for this speciﬁc region.
has been consumed in designing generic, robust and efﬁcient                  Based on these features and the adjacency information
segmentation algorithms [1], still human vision perception out-           provided by the ARG, the target is to assign to each region a a
performs any state-of-the-art computer algorithm. One main                fuzzy set of labels La . In order to achieve this (for more details
reason for this is that human vision is based also in high level          please refer to previous work in [4]), we compute a matching
prior knowledge about the semantic meaning of the objects                 distance value between each one of these regions and each
that compose the image.                                                   one of the prototype instances of all concepts in the domain
                                                                          ontology. This process results to an initial fuzzy labeling of
   Knowledge assisted analysis can be deﬁned as a tightly
                                                                          the regions with concepts from the knowledge         base, i.e. for
coupled and constant interaction between low level image
                                                                          region a we have the fuzzy set La =            k ck /wk , where k
analysis and higher level knowledge. In this paper we propose
                                                                          is the cardinality of the (crisp) set of all concepts C = {ck }
an algorithm that involves simultaneously both segmentation
                                                                          in the knowledge base and wk = μa (ck ) is the degree of
and detection of simple objects. Starting from traditional
                                                                          membership of element ck in the fuzzy set La .
graph-based segmentation, the proposed technique continues
with fuzzy region labeling and semantic region merging. More
                                                                                       III. RULE -BASED I MAGE A NALYSIS
speciﬁcally the latter is based on rule-based reasoning, using
knowledge about the possible labels of the candidate for                     This section presents the integration of a rule based rea-
merging regions and of their direct neighbors, to improve the             soning system into an image segmentation algorithm. Based
initial segmentation and labeling.                                        on the foundations described in the previous section, we
   To perform rule-based reasoning, we use the expert system              introduce a novel segmentation algorithm that relies on fuzzy
NEST [2], developed at the University of Economics, Prague.               region labeling and rules to solve the problem of image
This system follows the idea of compositional approach to                 oversegmentation.
inference introduced in mid 70s by the early expert systems
MYCIN and PROSPECTOR. The rules used by NEST are in                       A. Semantic region merging
the form:                                                                    Recursive Shortest Spanning Tree, or simply RSST, is a
               IF condition THEN conclusion (w)                           bottom-up segmentation algorithm that begins from the pixel
                                                                          level and iteratively merges neighbor regions according to a
where condition is a conjunction of propositions, conclusion              distance value until certain termination criteria are satisﬁed.
is a single proposition and weight w expresses the uncertainty            This distance is calculated based on color and texture char-
of the conclusion if the condition is true. During consultation,          acteristics, which are independent of the area’s size. In every
all rules for which the condition is true with some positive              step the two regions with the least distance are merged; visual
degree (weight) are activated and their contributions are used            characteristics of the new region are extracted and all distances
to compute the weights of goals.                                          are updated accordingly.
                                                                             We introduce here a modiﬁed version of RSST, called
  P. Berka is with the Dept. of Information and Knowledge Engineering,    Semantic RSST (S-RSST) that aims to improve the usual
University of Economics, Prague.
  Th. Athanasiadis and Y. Avrithis are with Image, Video and Multimedia   oversegmentation results by incorporating region labeling in
Systems Laboratory, National Technical University of Athens.              the segmentation process [5]. In this approach the distance
between two adjacent regions a and b (vertices va and vb in         an exemplar image of a beach scene, where 0, 2, 3, 8, are
the graph) is calculated using NEST, in a fashion described         adjacent to 7. Table I illustrates the associated fuzzy set of each
later on, and this dissimilarity value is assigned as the weight    region, i.e. the degrees of membership of each concept. Correct
of the respective graph’s edge eab .                                concepts for each region, according to the ground truth, are
   Let us now examine in detail one iteration of the S-             highlighted with bold letters.
RSST algorithm. Firstly, the edge eab with the least weight
                                                                                                      TABLE I
is selected, then regions a and b are merged. Vertex vb is
                                                                                             I NITIAL FUZZY LABELING
removed completely from the ARG, whereas va is updated
appropriately. This update procedure consists of the following                                             Concepts
two actions:                                                              Regions      Sky     Sea     Cliff    Plant    Sand      Person
   1) Re-evaluation of the degrees of membership of the labels                0       0.66     0.82     0.47     0.25     0.58      0.75
       fuzzy set in a weighted average (w.r.t. the regions’ size)             2       0.74     0.77     0.44     0.34     0.69      0.62
       fashion.                                                               3       0.70     0.52     0.34     0.40     0.61      0.73
   2) Re-adjustment of the ARG edges by removing edge                         7       0.78     0.79     0.38     0.35     0.55      0.39
                                                                              8       0.66     0.79     0.64     0.27     0.59      0.37
       eab and re-evaluating the weight of the affected edges
       invoking NEST.
This procedure continues until the edge e∗ with the least             From this input, the rule base infers as dominant labels Sea
weight in the ARG is bigger than a threshold: w(e∗ ) > Tw .         and Person for 0, Sea and Sky for 2, Person and Sky for 3, Sea
This threshold is calculated in the beginning of the algorithm,     and Sky for 7, and Sea for 8. This will result in the following
based on the histogram of all weights in E.                         ordering of region pairs according to their similarity: 7-2 (most
                                                                    similar), 7-8, 7-0 (less similar), 7-3 (most dissimilar); 47 rules
B. Estimation of regions dissimilarity                              have been activated during this inference. Thus the regions 7
   Expert system NEST serves in knowledge-assisted analysis         and 2 will be merged in this iteration of the S-RSST algorithm.
as an estimator for the dissimilarity of two adjacent regions
a and b in the image. The input consists of the fuzzy sets                                    V. C ONCLUSIONS
La and Lb of the two regions, with membership functions μA             The methodology presented in this paper aimed in improv-
and μB respectively as well as the fuzzy set of their direct        ing image segmentation based on initial labeling of regions
neighbors. More speciﬁcally, we consider as direct neighbors        and not only on visual features. This hybrid algorithm involves
only the four dominant directional neighbors of regions a and       ruled-based reasoning in a region merging process, by means
b (north, south, west, east). With this input, NEST is able to      of calculating the semantic distance between two adjacent
reason about the (dis)similarity between two adjacent regions,      regions based on local knowledge. First experiments have
according to rules applied on local information of what each        shown the feasibility of our approach, nevertheless the current
region might represent given its neighborhood. The rule base        rule base must be thoroughly tested. In our future work, we
is structured into two layers. The ﬁrst layer determines the        will incorporate also some domain knowledge (e.g. sea cannot
dominant label(s) of the regions according to the initial label-    be above sky) to improve the whole process.
ing of regions and their neighbors; the conﬁdence of the initial
labeling (expressed in these rules as ”high”, ”medium”, ”low”)
                                                                                             ACKNOWLEDGMENT
is derived from the initial fuzzy labels. This layer contains 60
rules; a rule for each combination of label, of its conﬁdence, of     This research was supported by the European Commission
the region and its neighbor. The next layer increases/decreases     under contract FP6-027026 K-SPACE. Thanos Athanasiadis is
the similarity of regions a and b if they share/don’t share         funded by the Greek Secretariat of Research and Technology
the same dominant label. This layer contains 48 rules; a            (PENED Ontomedia 03 ED 475).
rule for each combination of possible dominant labels of the
two regions. All rules have been created empirically, by a                                       R EFERENCES
knowledge engineer. Example rules for the ﬁrst and second           [1] P. Salembier and F. Marques, “Region-based representations of image
layer look like follows:                                                and video - segmentation tools for multimedia services,” IEEE Trans. on
                                                                        Circuits and Systems for Video Technology, vol. 9, no. 8, December 1999.
   IF A conf sky(medium) AND A conf sea(medium) AND                 [2] P. Berka, V. Las, and V. Svatek, “Nest: re-engineering the compositional
    A conf sand(low) AND leftofA conf sky(high) AND                     approach to rule-based inference,” Neural Network World, pp. 367–379,
  rightofA conf sky(high) THEN A dominant(sky) (0.8)                    May 2004.
                                                                    [3] S. Berretti, A. D. Bimbo, and E. Vicario, “Efﬁcient matching and indexing
      IF A dominant(sky) AND B dominant(sky,sea)                        of graph models in content-based retrieval,” IEEE Trans. on Circuits and
                THEN A B similar (0.7)                                  Systems for Video Technology, vol. 11, no. 12, pp. 1089–1105, Dec. 2001.
                                                                    [4] Th. Athanasiadis, V. Tzouvaras, K. Petridis, F. Precioso, Y. Avrithis,
   Inferred weight of the (dis)similarity proposition is used           and Y. Kompatsiaris, “Using a multimedia ontology infrastructure for
to drive segmentation based on semantic information and to              semantic annotation of multimedia content,” in Proceedgins of 5th In-
resolve oversegmentation problems.                                      ternational Workshop on Knowledge Markup and Semantic Annotation
                                                                        (SemAnnot ’05), Galway, Ireland, 2005.
                 IV. F IRST E XPERIMENTS                            [5] Th. Athanasiadis, Y. Avrithis, and S. Kollias, “A semantic region growing
                                                                        approach in image segmentation and annotation,” in Proceedgins of 1st
  We will illustrate the rule-based reasoning step on a fol-            International Workshop on Semantic Web Annotations for Multimedia
lowing simple example. Let 0, 2, 3, 7, 8 be ﬁve regions of              (SWAMM ’06), Edinburgh, Scotland, 2006.

</pre>