LABELLING IMAGE REGIONS USING SPATIAL PROTOTYPES

                               Carsten Saathoff, Marcin Grzegorzek, and Steffen Staab

                                          University of Koblenz
                              {saathoff,marcin,staab}@uni-koblenz.de


                        ABSTRACT                                   2.1. Constraint acquisition

In this paper we present an approach for introducing spatial       Spatial constraint templates constitute the background knowl-
context into image region labelling. We combine low-level          edge in our approach. We acquire these templates from so-
classification with spatial reasoning based on explicitly repre-   called spatial prototypes, which are manually labelled images.
sented spatial arrangements of labels. We formalise the prob-      We mine the prototypes using support and confidence as se-
lem using Linear Programming, and provide an evaluation on         lection criteria, and come up with a set of templates represent-
a set of 923 images.                                               ing typical spatial arrangements.
                                                                       For each label l we have to determine in what spatial re-
                                                                   lation to other labels it might be found. Therefore, for each
                   1. INTRODUCTION                                 spatial relation type t, we consider the relation set Rt↓l , which
                                                                   contains the relations of type t from images depicting l. We
Exploiting solely low-level features for automatic image re-                       l,l0
                                                                   then define Rt↓l     to be the set of relations between segments
gion labelling often leads to unsatisfactory results, and recent                                                         ∗,l          0

studies [1] show the importance of contextual and spatial in-      s, s0 depicting l and l0 , respectively, and finally Rt↓l to denote
formation. In this paper, we propose an approach based on          all relations between an arbitrary region and a region depict-
[2] that integrates a wavelet-based low-level classification [3]   ing l0 . The confidence of a label arrangement is then defined
                                                                                        l,l0                                               0
                                                                                      |Rt↓l |                                         |Rl,l |
and spatial reasoning based on Linear Programming.                 as γt (l, l0 ) =     ∗,l0     , and the support as σt (l, l0 ) = |Rtt↓l | .
                                                                                      |Rt↓l  |
     During the training phase we train the classifiers and ac-
                                                                       Finally, we define the template Tt for the spatial relation
quire our background knowledge. In the classification phase,
                                                                   type t as Tt (l, l0 ) = 1 iff σt (l, l0 ) > thσ and γt (l, l0 ) > thγ ,
each image is first segmented by an automatic segmentation
                                                                   and Tt (l, l0 ) = 0 otherwise. For absolute spatial relations we
algorithm. The low-level classification produces for each re-
                                                                   define support, confidence, and the template accordingly.
gion si and each supported label lj a probability θi (lj ). Then
relative (e.g. above-of, left-of ) and absolute (e.g. above-all)
spatial relations are extracted, and are processed by the spa-     2.2. Spatial reasoning with linear programming
tial reasoning together with the probabilities. The output is a    We will show in the following how to formalize image la-
final labelling that is optimal with respect to both our spatial   belling with spatial constraints as a linear program. We con-
background knowledge and the probabilities.                        sider Binary Integer Programs, which have the form
     We provide results of a number of experiments showing
that our approach provides comparable performance with low                                maximize    Z             = cT x
numbers of training examples. Due to length constraints, we                               subject to Ax             =b                           (1)
will not detail the low-level processing at all.                                                      x             ∈ {0, 1}

                                                                   Goal of the solving process is to find a set of assignments to
         2. SPATIAL REASONING BASED ON                             the integer variables in x with a maximum evaluation score Z
                  CONSTRAINTS                                      that satisfy all the constraints.
                                                                        In order to represent the image labelling problem as a lin-
The goal of the spatial reasoning step is to exploit background    ear program, we create a set of linear constraints from each
knowledge about the typical spatial arrangements of objects        spatial relation in the image, and determine the objective co-
in images in order to improve the labelling accuracy com-          efficients based on the hypotheses sets and the constraint tem-
pared to pure local, low-level feature-based approaches. We        plates. Let Oi ⊆ R be the set of outgoing relations for re-
will first discuss the acquisition of constraint templates from    gion si ∈ S, i.e. Oi = {r ∈ R|∃s ∈ S, s 6= si : r =
a set of spatial prototypes, and then describe the formalisation   (si , s)}, and Ei ⊆ R the set of incoming spatial relations,
of the problem as a Linear Program.                                i.e. Ei = {r ∈ R|∃s ∈ S, s 6= si : r = (s, si )}. Then,
for each possible pair of label assignments to the regions, we                                 3. EXPERIMENTS AND RESULTS
create a variable cko
                   itj , representing the possible assignment of
lk to si and lo to sj with respect to the relation r with type                     We evaluated the approach on a set of 923 images depicting
t ∈ T . Each cko                                     ko
                itj is an integer variable and citj = 1 repre-
                                                                                   outdoor scenes. We used the labels building, foliage, moun-
sents the assignments si = lk and sj = lo , while cko                              tain, person, road, sailing-boat, sand, sea, sky, snow. In our
                                                            itj = 0
means that these assignments are not made. Since every such                        evaluation we used the spatial relations above-of, below-of,
variable represents exactly one assignment of labels to the                        left-of and right-of, the absolute spatial relations above-all
involved regions, and only one label might be assigned to                          and below-all, and we used the thresholds σ = 0.001 and
a region in the final solution, we have to add this restric-                       γ = 0.2 for both relative and absolute spatial relations. We
tion as linear constraints. The constraints       are                              compared the performance of the low-level classification with
                                        P         P formalised
                                                           ko
                                                                   as
                                                                                   the spatial reasoning on different training set sizes and mea-
∀r ∈ R : r = (si , sj ) ∈ R →             lk ∈L           c
                                                     lo ∈L itj =   1.
These constraints assure that there is only one pair of labels                     sured precision (p), recall (r) and the classification rate (c).
assigned to a pair of regions per spatial relation, but it still                   Further we computed the F-Measure (f). In Table 1 the aver-
                                         k0 o0                                     age for each of these measures is given.
there could be two variables ckoitj and cit0 j 0 both being set to 1,
                                      0
which would result in both k and k assigned to si .                                                      Low-Level                  BIP
                                                                                        set size    p     r    f      c     p     r     f     c
    Since our solution requires that there is only one label
                                                                                           50      .63   .65 .57     .60   .77   .75   .73   .75
assigned to a region, we have to add constraints that “link”
                                                                                          100      .70   .67 .65     .69   .78   .77   .75   .80
the variables accordingly. This can be accomplished by link-                              150      .67   .63 .61     .66   .74   .71   .70   .75
ing pairs of relations, and start by defining the constraints for                         200      .69   .65 .63     .67   .80   .75   .76   .80
the outgoing relations. We arbitrarily take one base relation                             250      .69   .64 .60     .66   .78   .73   .72   .77
rO ∈ Oi and then create constraints for all r ∈ Oi \ rO .                                 300      .68   .63 .61     .66   .82   .77   .78   .82
                                                                                          350      .63   .68 .61     .66   .80   .75   .76   .80
Let rO = (si , sj ) with type tO , and r = (si , sj 0 ) with type                         400      .68   .63 .61     .66   .80   .75   .75   .79
t be the twoP  relations to be linked.  Then, the constraints are
                        ko               ko0
                                P
∀lk ∈ L :        lo ∈L citO j −   lo ∈L itj 0 = 0. The first sum
                                   0   c                                                  Table 1. Overall results for the two approaches.
can either take the value 0 if lk is not assigned to si by the
relation r, or one if it is assigned, and basically the same ap-                       The best overall classification rate is achieved with the bi-
plies for the second sum. Since both are subtracted and the                        nary integer programming approach on the data set with 300
whole expression has to evaluate to 0, either both equal 1 or                      training images. However, with only 100 training examples
both equal 0 and subsequently, if one of the relations assigns                     we achieve nearly the same performance, indicating that 100
lk to si , the others have to do the same. The constraints for                     training examples are a good size for training a well perform-
the incoming relations are defined accordingly, where rE is                        ing classifier using our approach.
the base relation.
     Finally we have to link the outgoing to the incoming re-                                            4. CONCLUSIONS
lations. Since the same label assignment is already enforced
                                                                                   In this paper we have introduced a novel spatial reasoning
within those two types of relations, we only have to link rO
                                                                                   approach based on an explicit model of spatial context. Our
P rE , ko
and       using the
                 P following       set of constraints: ∀lk ∈ L :
                           o0 k                                                    results show a good classification rate compared to results in
        c
   lo ∈L itO j −         c
                   lo0 ∈L j 0 tE i =  0 Absolute relations are for-
                                                                                   the literature, while requiring only a low number of training
malized and linked accordingly.
                                                                                   data.
    Eventually, let tr and ta refer to the type of the relative
relation r and the absolute relation a, respectively, then the                                           5. REFERENCES
objective function is defined as
                                                                                   [1] M. Grzegorzek and E. Izquierdo, “Statistical 3d object
                                                                                       classification and localization with context modeling,” in
      X       X X
                             min(θi (lk ), θj (lo )) ∗ Ttr (lk , lo ) ∗ cko            15th European Signal Processing Conference, 2007.
                                                                         itr j +
   r=(si ,sj ) lk ∈L lo ∈L                                                         [2] Carsten Saathoff and Steffen Staab, “Exploiting spatial
                                X X
                                             θi (lk ) ∗ Tta (lk ) ∗ ckita . (2)        context in image region labelling using fuzzy constraint
                                a=si lk ∈L                                             reasoning,” in Proc. of WIAMIS, 2008.
                                                                                   [3] M. Grzegorzek and H. Niemann, “Statistical object
                                                                                       recognition including color modeling,” in 2nd Interna-
This function rewards label assignments that satisfy the back-
                                                                                       tional Conference on Image Analysis and Recognition,
ground knowledge and that involve labels with a high confi-
                                                                                       2005.
dence score provided during the classification step.