
LABELLING IMAGE REGIONS USING SPATIAL PROTOTYPES

Carsten Saathoff, Marcin Grzegorzek, and Steffen Staab

University of Koblenz

In this paper we present an approach for introducing spatial context into image region labelling. We combine low-level classification with spatial reasoning based on explicitly represented spatial arrangements of labels. We formalise the problem using Linear Programming, and provide an evaluation on a set of 923 images.

1. INTRODUCTION

Exploiting solely low-level features for automatic image region labelling often leads to unsatisfactory results, and recent studies [1] show the importance of contextual and spatial information. In this paper, we propose an approach based on [2] that integrates a wavelet-based low-level classification [3] and spatial reasoning based on Linear Programming.

During the training phase we train the classifiers and acquire our background knowledge. In the classification phase, each image is first segmented by an automatic segmentation algorithm. For each region $s_i$ and each supported label $l_j$, the low-level classification produces a probability $\theta_i(l_j)$. Then relative (e.g. above-of, left-of) and absolute (e.g. above-all) spatial relations are extracted and passed, together with the probabilities, to the spatial reasoning. The output is a final labelling that is optimal with respect to both our spatial background knowledge and the probabilities.

We provide results of a number of experiments showing that our approach performs nearly as well with a small number of training examples as with larger training sets. Due to length constraints, we do not detail the low-level processing.

2. SPATIAL REASONING BASED ON CONSTRAINTS

The goal of the spatial reasoning step is to exploit background knowledge about the typical spatial arrangements of objects in images in order to improve the labelling accuracy compared to purely local, low-level feature-based approaches. We first discuss the acquisition of constraint templates from a set of spatial prototypes, and then describe the formalisation of the problem as a Linear Program.

2.1. Constraint acquisition

Spatial constraint templates constitute the background knowledge in our approach. We acquire these templates from so-called spatial prototypes, which are manually labelled images. We mine the prototypes using support and confidence as selection criteria, and obtain a set of templates representing typical spatial arrangements.

For each label $l$ we have to determine in what spatial relation to other labels it might be found. Therefore, for each spatial relation type $t$, we consider the relation set $R_{t\#l}$, which contains the relations of type $t$ from images depicting $l$. We then define $R^{l,l'}_{t\#l}$ to be the set of relations between segments $s, s'$ depicting $l$ and $l'$, respectively, and finally $R^{\ast,l'}_{t\#l}$ to denote all relations between an arbitrary region and a region depicting $l'$. The confidence of a label arrangement is then defined as

$$\gamma_t(l, l') = \frac{|R^{l,l'}_{t\#l}|}{|R^{\ast,l'}_{t\#l}|},$$

and the support as

$$\sigma_t(l, l') = \frac{|R^{l,l'}_{t\#l}|}{|R_{t\#l}|}.$$

Finally, we define the template $T_t$ for the spatial relation type $t$ as $T_t(l, l') = 1$ iff $\sigma_t(l, l') > \sigma_{th}$ and $\gamma_t(l, l') > \gamma_{th}$, and $T_t(l, l') = 0$ otherwise, where $\sigma_{th}$ and $\gamma_{th}$ are thresholds on support and confidence. For absolute spatial relations we define support, confidence, and the template accordingly.
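The mining step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Relation` record, the function name `mine_templates`, and the toy prototype data are hypothetical, and the labels depicted in an image are simply taken from the endpoints of its relations.

```python
from collections import namedtuple

# Hypothetical record for one relation in a manually labelled prototype image.
Relation = namedtuple("Relation", "image rel_type src_label dst_label")

def mine_templates(relations, rel_type, support_th, confidence_th):
    """Return {(l, l2): 0 or 1} for one spatial relation type."""
    rels = [r for r in relations if r.rel_type == rel_type]
    # Assumption: an image "depicts" every label that occurs in its relations.
    labels_in_image = {}
    for r in rels:
        labels_in_image.setdefault(r.image, set()).update((r.src_label, r.dst_label))
    all_labels = sorted({l for labels in labels_in_image.values() for l in labels})
    templates = {}
    for l in all_labels:
        # Relations of this type from images that depict l.
        from_images_with_l = [r for r in rels if l in labels_in_image[r.image]]
        for l2 in all_labels:
            # Relations between a region depicting l and a region depicting l2.
            pair = [r for r in from_images_with_l
                    if r.src_label == l and r.dst_label == l2]
            # Relations between an arbitrary region and a region depicting l2.
            any_to_l2 = [r for r in from_images_with_l if r.dst_label == l2]
            confidence = len(pair) / len(any_to_l2) if any_to_l2 else 0.0
            support = len(pair) / len(from_images_with_l) if from_images_with_l else 0.0
            templates[(l, l2)] = int(support > support_th and confidence > confidence_th)
    return templates

# Toy prototypes (hypothetical data).
relations = [
    Relation("img1", "above-of", "sky", "sea"),
    Relation("img1", "above-of", "sea", "sand"),
    Relation("img1", "above-of", "sky", "sand"),
    Relation("img2", "above-of", "sky", "sea"),
]
templates = mine_templates(relations, "above-of", support_th=0.001, confidence_th=0.2)
```

In this toy data the arrangement "sky above-of sea" is accepted, while "sea above-of sky" never occurs and is therefore rejected. Raising the confidence threshold prunes arrangements such as "sky above-of sand", which accounts for only half of the relations that end in a sand region.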

2.2. Spatial reasoning with linear programming

We will show in the following how to formalise image labelling with spatial constraints as a linear program. We consider Binary Integer Programs, which have the form

$$\begin{aligned} \text{maximize} \quad & Z = c^T x \\ \text{subject to} \quad & Ax = b \\ & x \in \{0, 1\} \end{aligned} \qquad (1)$$

The goal of the solving process is to find a set of assignments to the integer variables in $x$ with a maximum evaluation score $Z$ that satisfies all the constraints.
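To make the problem form concrete, the following sketch solves a tiny Binary Integer Program by exhaustive enumeration. It is only an illustration of the formulation: the function name `solve_bip` and the toy instance are hypothetical, and a real system would use a dedicated integer programming solver, since enumeration grows exponentially with the number of variables.

```python
from itertools import product

def solve_bip(c, A, b):
    """Maximise c^T x subject to A x = b with binary x, by enumeration."""
    best_x, best_z = None, None
    for x in product((0, 1), repeat=len(c)):
        # Keep only assignments that satisfy every equality constraint.
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) == b_i
            for row, b_i in zip(A, b)
        )
        if not feasible:
            continue
        z = sum(ci * xi for ci, xi in zip(c, x))
        if best_z is None or z > best_z:
            best_x, best_z = x, z
    return best_x, best_z

# Toy instance (hypothetical): exactly one of three options must be chosen.
c = [0.2, 0.7, 0.5]
A = [[1, 1, 1]]
b = [1]
x_opt, z_opt = solve_bip(c, A, b)
```

The single constraint row forces exactly one variable to 1, so the solver picks the option with the largest objective coefficient.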

In order to represent the image labelling problem as a linear program, we create a set of linear constraints from each spatial relation in the image, and determine the objective coefficients based on the hypotheses sets and the constraint templates. Let $O_i \subseteq R$ be the set of outgoing relations for region $s_i \in S$, i.e. $O_i = \{r \in R \mid \exists s \in S, s \neq s_i : r = (s_i, s)\}$, and $E_i \subseteq R$ the set of incoming spatial relations, i.e. $E_i = \{r \in R \mid \exists s \in S, s \neq s_i : r = (s, s_i)\}$. Then, for each possible pair of label assignments to the regions, we create a variable $c^{t}_{ik,oj}$, representing the possible assignment of $l_k$ to $s_i$ and $l_o$ to $s_j$ with respect to the relation $r$ with type $t \in T$. Each $c^{t}_{ik,oj}$ is an integer variable: $c^{t}_{ik,oj} = 1$ represents the assignments $s_i = l_k$ and $s_j = l_o$, while $c^{t}_{ik,oj} = 0$ means that these assignments are not made. Since every such variable represents exactly one assignment of labels to the involved regions, and only one label may be assigned to a region in the final solution, we have to add this restriction as linear constraints. The constraints are formalised as

$$\forall r \in R : r = (s_i, s_j) \rightarrow \sum_{l_k \in L} \sum_{l_o \in L} c^{t}_{ik,oj} = 1.$$

These constraints ensure that only one pair of labels is assigned to a pair of regions per spatial relation, but there could still be two variables $c^{t}_{ik,oj}$ and $c^{t'}_{ik',o'j'}$ both set to 1, which would result in both $l_k$ and $l_{k'}$ being assigned to $s_i$.

Since our solution requires that there is only one label assigned to a region, we have to add constraints that “link” the variables accordingly. This can be accomplished by linking pairs of relations, and we start by defining the constraints for the outgoing relations. We arbitrarily take one base relation $r_O \in O_i$ and then create constraints for all $r \in O_i \setminus \{r_O\}$. Let $r_O = (s_i, s_j)$ with type $t_O$, and $r = (s_i, s_{j'})$ with type $t$ be the two relations to be linked. Then, the constraints are

$$\forall l_k \in L : \sum_{l_o \in L} c^{t_O}_{ik,oj} - \sum_{l_{o'} \in L} c^{t}_{ik,o'j'} = 0.$$

The first sum takes the value 1 if $l_k$ is assigned to $s_i$ by the relation $r_O$ and 0 otherwise, and the same applies to the second sum with respect to $r$. Since the second sum is subtracted from the first and the whole expression has to evaluate to 0, either both equal 1 or both equal 0. Consequently, if one of the relations assigns $l_k$ to $s_i$, the others have to do the same. The constraints for the incoming relations are defined accordingly, where $r_E$ is the base relation.

Finally we have to link the outgoing to the incoming relations. Since the same label assignment is already enforced within those two types of relations, we only have to link $r_O$ and $r_E$, where $r_E = (s_{j'}, s_i)$ has type $t_E$, using the following set of constraints:

$$\forall l_k \in L : \sum_{l_o \in L} c^{t_O}_{ik,oj} - \sum_{l_{o'} \in L} c^{t_E}_{j'o',ki} = 0.$$

Absolute relations are formalised and linked accordingly.
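The effect of the two kinds of constraints can be checked on a small example. The sketch below is an illustration under assumptions, not the authors' code: the regions, relations and helper names are hypothetical, only relative relations are covered, and for brevity it links all relations touching a region against a single base relation instead of treating outgoing and incoming relations separately, which enforces the same agreement.

```python
from itertools import product

LABELS = ["sky", "sea"]
# Hypothetical relations: (source region, target region, relation type).
RELATIONS = [(0, 1, "above-of"), (0, 2, "left-of"), (2, 0, "below-of")]

def variables_from_labelling(labelling):
    """Set c[(r, k, o)] = 1 iff the regions of relation r carry labels k and o."""
    return {
        (r, k, o): int(labelling[si] == k and labelling[sj] == o)
        for r, (si, sj, _t) in enumerate(RELATIONS)
        for k, o in product(LABELS, LABELS)
    }

def label_sum(c, r, region, label):
    """Sum of the variables of relation r that assign `label` to `region`."""
    si, sj, _t = RELATIONS[r]
    total = 0
    for k, o in product(LABELS, LABELS):
        if (region == si and k == label) or (region == sj and o == label):
            total += c[(r, k, o)]
    return total

def satisfies_constraints(c):
    # Exactly one label pair per relation.
    for r in range(len(RELATIONS)):
        if sum(c[(r, k, o)] for k, o in product(LABELS, LABELS)) != 1:
            return False
    # Linking: every relation touching a region must agree on that region's label.
    regions = {s for si, sj, _t in RELATIONS for s in (si, sj)}
    for region in regions:
        touching = [r for r, (si, sj, _t) in enumerate(RELATIONS) if region in (si, sj)]
        base = touching[0]
        for r in touching[1:]:
            for label in LABELS:
                if label_sum(c, base, region, label) - label_sum(c, r, region, label) != 0:
                    return False
    return True

consistent = variables_from_labelling({0: "sky", 1: "sea", 2: "sea"})
# Make relation 1 claim that region 0 is "sea" while relation 0 still says "sky".
inconsistent = dict(consistent)
inconsistent[(1, "sky", "sea")] = 0
inconsistent[(1, "sea", "sea")] = 1
```

The second assignment still selects exactly one label pair per relation, so only the linking constraints reject it.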

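Finally, let $t_r$ and $t_a$ refer to the type of the relative relation $r$ and the absolute relation $a$, respectively. Then the objective function is defined as

$$\sum_{r=(s_i,s_j)} \sum_{l_k \in L} \sum_{l_o \in L} \min(\theta_i(l_k), \theta_j(l_o)) \, T_{t_r}(l_k, l_o) \, c^{t_r}_{ik,oj} + \sum_{a=s_i} \sum_{l_k \in L} \theta_i(l_k) \, T_{t_a}(l_k) \, c^{t_a}_{ik}. \qquad (2)$$

Here $\theta_i(l_k)$ is the probability of label $l_k$ for region $s_i$ produced by the low-level classification. This function rewards label assignments that satisfy the background knowledge and that involve labels with a high confidence score provided during the classification step.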
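The following sketch evaluates this objective for complete labellings of a two-region image and picks the best one by enumeration. It is a simplified illustration, assuming that each consistent labelling corresponds to exactly one feasible variable assignment, so the indicator variables can be left implicit. All probabilities, relations and templates are hypothetical values chosen for the example.

```python
from itertools import product

LABELS = ["sky", "sea"]
# Hypothetical low-level probabilities per region and label.
THETA = {0: {"sky": 0.6, "sea": 0.4}, 1: {"sky": 0.55, "sea": 0.45}}
# Hypothetical relative relation: region 0 is above-of region 1.
RELATIVE = [(0, 1, "above-of")]
# Hypothetical absolute relation: region 0 is above-all.
ABSOLUTE = [(0, "above-all")]
# Hypothetical templates; arrangements not listed have template value 0.
T_REL = {"above-of": {("sky", "sea"): 1}}
T_ABS = {"above-all": {"sky": 1}}

def objective(labelling):
    """Score a labelling: relative-relation terms plus absolute-relation terms."""
    z = 0.0
    for si, sj, t in RELATIVE:
        lk, lo = labelling[si], labelling[sj]
        z += min(THETA[si][lk], THETA[sj][lo]) * T_REL[t].get((lk, lo), 0)
    for si, t in ABSOLUTE:
        lk = labelling[si]
        z += THETA[si][lk] * T_ABS[t].get(lk, 0)
    return z

def best_labelling():
    """Enumerate all labellings and return the one with the highest score."""
    regions = sorted(THETA)
    candidates = (
        dict(zip(regions, combo))
        for combo in product(LABELS, repeat=len(regions))
    )
    return max(candidates, key=objective)

best = best_labelling()
```

In this toy setting the low-level probabilities alone would label region 1 as sky, whereas the spatial term favours sea for the region below the sky, which is the kind of correction the spatial reasoning is meant to provide.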

3. EXPERIMENTS AND RESULTS

We evaluated the approach on a set of 923 images depicting outdoor scenes. We used the labels building, foliage, mountain, person, road, sailing-boat, sand, sea, sky, and snow. In our evaluation we used the relative spatial relations above-of, below-of, left-of and right-of and the absolute spatial relations above-all and below-all, with the thresholds $\sigma_{th} = 0.001$ for support and $\gamma_{th} = 0.2$ for confidence for both relative and absolute spatial relations. We compared the performance of the low-level classification with the spatial reasoning on different training set sizes and measured precision (p), recall (r) and the classification rate (c). Further we computed the F-Measure (f). In Table 1 the average for each of these measures is given.

Table 1 (only the rows for r and f are preserved; column headings are missing)

r   .75  .77  .71  .75  .73  .77  .75  .75
f   .73  .75  .70  .76  .72  .78  .76  .75
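For reference, the evaluation measures can be computed per label as sketched below. The text does not spell out the formulas, so this sketch assumes the standard definitions: the F-Measure is taken as the harmonic mean of precision and recall, and the classification rate as the fraction of correctly labelled regions. The function names and the toy data are hypothetical.

```python
def precision_recall_f(true_labels, predicted_labels, label):
    """Precision, recall and F-measure for one label over a list of regions."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Assumed definition: harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

def classification_rate(true_labels, predicted_labels):
    """Assumed definition: fraction of regions that received the correct label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Toy data (hypothetical): four regions, one of them mislabelled.
true_labels = ["sky", "sky", "sea", "sand"]
predicted_labels = ["sky", "sea", "sea", "sand"]
p_sky, r_sky, f_sky = precision_recall_f(true_labels, predicted_labels, "sky")
rate = classification_rate(true_labels, predicted_labels)
```

Averaging the per-label values over all labels would give summary figures of the kind reported in Table 1, although whether the reported averages are computed per label or per region is not stated in the text.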

The best overall classification rate is achieved with the binary integer programming approach on the data set with 300 training images. However, with only 100 training examples we achieve nearly the same performance, indicating that 100 training examples are sufficient for training a well-performing classifier using our approach.

4. CONCLUSIONS

In this paper we have introduced a novel spatial reasoning approach based on an explicit model of spatial context. Our results show a good classification rate compared to results in the literature, while requiring only a small number of training examples.

5. REFERENCES

[1] M. Grzegorzek and E. Izquierdo, “Statistical 3D object classification and localization with context modeling,” in 15th European Signal Processing Conference, 2007.

[2] C. Saathoff and S. Staab, “Exploiting spatial context in image region labelling using fuzzy constraint reasoning,” in Proc. of WIAMIS, 2008.

[3] Grzegorzek and Niemann, “Statistical object recognition including color modeling,” in 2nd International Conference on Image Analysis and Recognition, 2005.