=Paper=
{{Paper
|id=Vol-233/paper-19
|storemode=property
|title=Rule-based Reasoning for Semantic Image Segmentation and Interpretation
|pdfUrl=https://ceur-ws.org/Vol-233/p39.pdf
|volume=Vol-233
|dblpUrl=https://dblp.org/rec/conf/samt/BerkaAA06
}}
==Rule-based Reasoning for Semantic Image Segmentation and Interpretation==
Rule-based Reasoning for Semantic Image
Segmentation and Interpretation
Petr Berka, Thanos Athanasiadis and Yannis Avrithis
Abstract— In this paper, we propose the application of rule- II. I NITIAL S EGMENTATION AND R EGION L ABELING
based reasoning for knowledge assisted image segmentation and
object detection. A region merging approach is proposed based
Our intention is to operate on a higher level of information
on fuzzy labeling and not on visual descriptors, while reasoning where image regions are linked to possible labels rather than
is used in evaluation of dissimilarity between adjacent regions only to their visual features. For the representation of an image
according to rules applied on local information. we adopt the Attributed Relational Graph (ARG) [3]. An ARG
Index Terms— Knowledge-assisted analysis, rule-based reason- is a graph structure that holds a region-based representation of
ing, semantic region merging. the image, where the set of vertices V corresponds to image
regions and the set of edges E to links between adjacent
regions. The ARG is constructed based on an initial color
I. I NTRODUCTION RSST segmentation that produces a few tens of regions.
Each vertex of the graph holds the Dominant Color, Region
A UTOMATIC segmentation of images is a very challeng-
ing task in computer vision and one of the most crucial
steps toward image understanding. Although a great effort
Shape and Homogeneous Texture MPEG-7 visual descriptors
extracted for this specific region.
has been consumed in designing generic, robust and efficient Based on these features and the adjacency information
segmentation algorithms [1], still human vision perception out- provided by the ARG, the target is to assign to each region a a
performs any state-of-the-art computer algorithm. One main fuzzy set of labels La . In order to achieve this (for more details
reason for this is that human vision is based also in high level please refer to previous work in [4]), we compute a matching
prior knowledge about the semantic meaning of the objects distance value between each one of these regions and each
that compose the image. one of the prototype instances of all concepts in the domain
ontology. This process results to an initial fuzzy labeling of
Knowledge assisted analysis can be defined as a tightly
the regions with concepts from the knowledge base, i.e. for
coupled and constant interaction between low level image
region a we have the fuzzy set La = k ck /wk , where k
analysis and higher level knowledge. In this paper we propose
is the cardinality of the (crisp) set of all concepts C = {ck }
an algorithm that involves simultaneously both segmentation
in the knowledge base and wk = μa (ck ) is the degree of
and detection of simple objects. Starting from traditional
membership of element ck in the fuzzy set La .
graph-based segmentation, the proposed technique continues
with fuzzy region labeling and semantic region merging. More
III. RULE -BASED I MAGE A NALYSIS
specifically the latter is based on rule-based reasoning, using
knowledge about the possible labels of the candidate for This section presents the integration of a rule based rea-
merging regions and of their direct neighbors, to improve the soning system into an image segmentation algorithm. Based
initial segmentation and labeling. on the foundations described in the previous section, we
To perform rule-based reasoning, we use the expert system introduce a novel segmentation algorithm that relies on fuzzy
NEST [2], developed at the University of Economics, Prague. region labeling and rules to solve the problem of image
This system follows the idea of compositional approach to oversegmentation.
inference introduced in mid 70s by the early expert systems
MYCIN and PROSPECTOR. The rules used by NEST are in A. Semantic region merging
the form: Recursive Shortest Spanning Tree, or simply RSST, is a
IF condition THEN conclusion (w) bottom-up segmentation algorithm that begins from the pixel
level and iteratively merges neighbor regions according to a
where condition is a conjunction of propositions, conclusion distance value until certain termination criteria are satisfied.
is a single proposition and weight w expresses the uncertainty This distance is calculated based on color and texture char-
of the conclusion if the condition is true. During consultation, acteristics, which are independent of the area’s size. In every
all rules for which the condition is true with some positive step the two regions with the least distance are merged; visual
degree (weight) are activated and their contributions are used characteristics of the new region are extracted and all distances
to compute the weights of goals. are updated accordingly.
We introduce here a modified version of RSST, called
P. Berka is with the Dept. of Information and Knowledge Engineering, Semantic RSST (S-RSST) that aims to improve the usual
University of Economics, Prague.
Th. Athanasiadis and Y. Avrithis are with Image, Video and Multimedia oversegmentation results by incorporating region labeling in
Systems Laboratory, National Technical University of Athens. the segmentation process [5]. In this approach the distance
between two adjacent regions a and b (vertices va and vb in an exemplar image of a beach scene, where 0, 2, 3, 8, are
the graph) is calculated using NEST, in a fashion described adjacent to 7. Table I illustrates the associated fuzzy set of each
later on, and this dissimilarity value is assigned as the weight region, i.e. the degrees of membership of each concept. Correct
of the respective graph’s edge eab . concepts for each region, according to the ground truth, are
Let us now examine in detail one iteration of the S- highlighted with bold letters.
RSST algorithm. Firstly, the edge eab with the least weight
TABLE I
is selected, then regions a and b are merged. Vertex vb is
I NITIAL FUZZY LABELING
removed completely from the ARG, whereas va is updated
appropriately. This update procedure consists of the following Concepts
two actions: Regions Sky Sea Cliff Plant Sand Person
1) Re-evaluation of the degrees of membership of the labels 0 0.66 0.82 0.47 0.25 0.58 0.75
fuzzy set in a weighted average (w.r.t. the regions’ size) 2 0.74 0.77 0.44 0.34 0.69 0.62
fashion. 3 0.70 0.52 0.34 0.40 0.61 0.73
2) Re-adjustment of the ARG edges by removing edge 7 0.78 0.79 0.38 0.35 0.55 0.39
8 0.66 0.79 0.64 0.27 0.59 0.37
eab and re-evaluating the weight of the affected edges
invoking NEST.
This procedure continues until the edge e∗ with the least From this input, the rule base infers as dominant labels Sea
weight in the ARG is bigger than a threshold: w(e∗ ) > Tw . and Person for 0, Sea and Sky for 2, Person and Sky for 3, Sea
This threshold is calculated in the beginning of the algorithm, and Sky for 7, and Sea for 8. This will result in the following
based on the histogram of all weights in E. ordering of region pairs according to their similarity: 7-2 (most
similar), 7-8, 7-0 (less similar), 7-3 (most dissimilar); 47 rules
B. Estimation of regions dissimilarity have been activated during this inference. Thus the regions 7
Expert system NEST serves in knowledge-assisted analysis and 2 will be merged in this iteration of the S-RSST algorithm.
as an estimator for the dissimilarity of two adjacent regions
a and b in the image. The input consists of the fuzzy sets V. C ONCLUSIONS
La and Lb of the two regions, with membership functions μA The methodology presented in this paper aimed in improv-
and μB respectively as well as the fuzzy set of their direct ing image segmentation based on initial labeling of regions
neighbors. More specifically, we consider as direct neighbors and not only on visual features. This hybrid algorithm involves
only the four dominant directional neighbors of regions a and ruled-based reasoning in a region merging process, by means
b (north, south, west, east). With this input, NEST is able to of calculating the semantic distance between two adjacent
reason about the (dis)similarity between two adjacent regions, regions based on local knowledge. First experiments have
according to rules applied on local information of what each shown the feasibility of our approach, nevertheless the current
region might represent given its neighborhood. The rule base rule base must be thoroughly tested. In our future work, we
is structured into two layers. The first layer determines the will incorporate also some domain knowledge (e.g. sea cannot
dominant label(s) of the regions according to the initial label- be above sky) to improve the whole process.
ing of regions and their neighbors; the confidence of the initial
labeling (expressed in these rules as ”high”, ”medium”, ”low”)
ACKNOWLEDGMENT
is derived from the initial fuzzy labels. This layer contains 60
rules; a rule for each combination of label, of its confidence, of This research was supported by the European Commission
the region and its neighbor. The next layer increases/decreases under contract FP6-027026 K-SPACE. Thanos Athanasiadis is
the similarity of regions a and b if they share/don’t share funded by the Greek Secretariat of Research and Technology
the same dominant label. This layer contains 48 rules; a (PENED Ontomedia 03 ED 475).
rule for each combination of possible dominant labels of the
two regions. All rules have been created empirically, by a R EFERENCES
knowledge engineer. Example rules for the first and second [1] P. Salembier and F. Marques, “Region-based representations of image
layer look like follows: and video - segmentation tools for multimedia services,” IEEE Trans. on
Circuits and Systems for Video Technology, vol. 9, no. 8, December 1999.
IF A conf sky(medium) AND A conf sea(medium) AND [2] P. Berka, V. Las, and V. Svatek, “Nest: re-engineering the compositional
A conf sand(low) AND leftofA conf sky(high) AND approach to rule-based inference,” Neural Network World, pp. 367–379,
rightofA conf sky(high) THEN A dominant(sky) (0.8) May 2004.
[3] S. Berretti, A. D. Bimbo, and E. Vicario, “Efficient matching and indexing
IF A dominant(sky) AND B dominant(sky,sea) of graph models in content-based retrieval,” IEEE Trans. on Circuits and
THEN A B similar (0.7) Systems for Video Technology, vol. 11, no. 12, pp. 1089–1105, Dec. 2001.
[4] Th. Athanasiadis, V. Tzouvaras, K. Petridis, F. Precioso, Y. Avrithis,
Inferred weight of the (dis)similarity proposition is used and Y. Kompatsiaris, “Using a multimedia ontology infrastructure for
to drive segmentation based on semantic information and to semantic annotation of multimedia content,” in Proceedgins of 5th In-
resolve oversegmentation problems. ternational Workshop on Knowledge Markup and Semantic Annotation
(SemAnnot ’05), Galway, Ireland, 2005.
IV. F IRST E XPERIMENTS [5] Th. Athanasiadis, Y. Avrithis, and S. Kollias, “A semantic region growing
approach in image segmentation and annotation,” in Proceedgins of 1st
We will illustrate the rule-based reasoning step on a fol- International Workshop on Semantic Web Annotations for Multimedia
lowing simple example. Let 0, 2, 3, 7, 8 be five regions of (SWAMM ’06), Edinburgh, Scotland, 2006.