FCSE at Medical Tasks of ImageCLEF 2013

Ivan Kitanovski, Ivica Dimitrovski, and Suzana Loskovska

Faculty of Computer Science and Engineering, University of Ss Cyril and Methodius,
Rugjer Boshkovikj 16, 1000 Skopje, Macedonia
{ivan.kitanovski, ivica.dimitrovski, suzana.loshkovska}@finki.ukim.mk

Abstract. This paper presents the details of the participation of the FCSE (Faculty of Computer Science and Engineering) research team in the ImageCLEF 2013 medical tasks (modality classification, compound figure separation, ad-hoc image retrieval and case-based retrieval). For the modality classification task we used SIFT descriptors and tf-idf weights of the surrounding text (image caption and paper title) as features. SVMs with a χ2 kernel and the one-vs-all strategy were used as classifiers. For the ad-hoc image retrieval and case-based retrieval tasks we adopted a strategy that combines word-space and concept-space approaches. The word-space approach uses the Terrier IR search engine to index and retrieve the text associated with the images/cases. The concept-space approach uses MetaMap to map the text data into a set of UMLS (Unified Medical Language System) concepts, which are then indexed and retrieved by the Terrier IR search engine. The results from the word-space and concept-space retrieval are fused using a linear combination. For the compound figure separation task, we used an unsupervised algorithm based on a breadth-first search strategy that uses only visual information from the medical images. The selected algorithms were tuned and tested on the data from the ImageCLEF 2012 medical task, and with the selected parameters we submitted new runs for the ImageCLEF 2013 medical task. We achieved very good overall performance: our best run for modality classification ranked 2nd in the overall score, and our best run for ad-hoc image retrieval ranked 3rd.

Keywords: information retrieval, medical imaging, medical image retrieval, modality classification, compound figure separation

1 Introduction

In this paper we present the experiments performed by the Faculty of Computer Science and Engineering (FCSE) team for the medical tasks at ImageCLEF 2013. Our group participated in all medical subtasks. To acquire the optimal parameters, we evaluated our approaches on the ImageCLEF 2012 dataset and then, based on those parameters, submitted the runs for ImageCLEF 2013.

The paper is organized as follows: Section 2 describes our approach for the modality classification task, Section 3 presents the algorithm for the compound figure separation task, Section 4 presents the ad-hoc image retrieval task, and Section 5 contains the details of the case-based retrieval task.

2 Modality classification task

2.1 Introduction

The imaging modality is an important piece of information about an image for medical retrieval. In user studies, clinicians have indicated that modality is one of the most important filters by which they would like to be able to limit their search. Using the modality information, the retrieval results can often be improved significantly. The ImageCLEF 2013 medical modality classification task is a standardized benchmark for systems that automatically classify the modality of medical images from PubMed journal articles [1]. The 2013 dataset has 31 classes (the same number of classes and the same classification hierarchy as in 2012), but a larger number of compound figures are present, making the task significantly harder while corresponding much more closely to the reality of biomedical journals [1].

Our approach combines visual features with textual features extracted from the text content surrounding the images. SVMs with a χ2 kernel were used as classifiers. The algorithms are explained in detail in the remainder of this section.
2.2 Visual features

Collections of medical images can contain various images obtained using different imaging techniques. Different feature extraction techniques are able to capture different aspects of an image (e.g., texture, shapes, color distribution) [2]. Texture is especially important, because it is difficult to classify medical images using shape or gray-level information alone. An effective representation of texture is needed to distinguish between images with equal modality and layout. Local image characteristics are fundamental for image interpretation: while global features retain information on the whole image, local features capture the details. They are thus more discriminative with respect to the problem of inter- and intra-class variability [3].

The bag-of-visual-words approach is commonly used in many state-of-the-art algorithms for image classification [4]. The basic idea of this approach is to sample a set of local image patches using some method (densely, randomly or using a key-point detector) and to calculate a visual descriptor on each patch (SIFT descriptor, normalized pixel values). The resulting distribution of descriptors is then quantized against a pre-specified visual codebook, which converts it to a histogram. The main issues that need to be considered when applying this approach are: the sampling of the patches, the selection of the visual patch descriptor and the construction of the visual codebook.

We use dense sampling of the patches, which samples an image grid in a uniform fashion using a fixed pixel interval between patches. We use an interval distance of 6 pixels and sample at multiple scales (σ = 1.2 and σ = 2.0). Due to the low contrast of some of the medical images (for example, radiographs), it would be difficult to use any detector for points of interest. Also, as pointed out by Zhang et al. [4], dense sampling is always superior to any strategy based on detectors for points of interest. We calculate an opponentSIFT descriptor for each image patch [5], [6]. OpponentSIFT describes all the channels in the opponent color space using SIFT descriptors. The information in the O3 channel is equal to the intensity information, while the other channels describe the color information in the image. These other channels do contain some intensity information, but due to the normalization of the SIFT descriptor they are invariant to changes in light intensity [6].

The crucial aspect of the bag-of-visual-words approach is the codebook construction. An extensive comparison of codebook construction variables is given by van Gemert et al. [7]. We employ k-means clustering on 250K randomly chosen descriptors from the set of images available for training. k-means partitions the visual feature space by minimizing the variance between a predefined number of k clusters. Here, we set k to 500 and thus define a codebook with 500 codewords [3].

Dense sampling gives an equal weight to all key-points, irrespective of their spatial location in the image. To overcome this limitation, we follow the spatial pyramid approach [8]. We used a spatial pyramid of 1x1, 2x2 and 1x3 regions. Since every region is an image in itself, the spatial pyramid can easily be used in combination with dense sampling. The resulting vector with 4000 bins ((1x1 + 2x2 + 1x3) x 500) was obtained by concatenating the eight histograms (each histogram is L1 normalized). Fig. 1 shows an example of the histograms extracted from an image for the spatial pyramids of 1x1, 2x2 and 3x1.

Fig. 1. Three different spatial pyramids used in our experiments: a) 1x1, b) 2x2 and c) 3x1. The spatial pyramid constructs a feature vector for each specific part of the image.
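To make the pipeline concrete, the following is a minimal Python sketch of the dense-sampling bag-of-visual-words extraction with the 1x1 + 2x2 + three-band spatial pyramid. It uses OpenCV's plain SIFT and scikit-learn's k-means as stand-ins (our runs used opponentSIFT); the patch sizes, the random seed and the helper names are illustrative assumptions, not the exact implementation:

import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

STEP = 6    # dense-sampling interval in pixels (Section 2.2)
K = 500     # codebook size (Section 2.2)

def dense_keypoints(img, step=STEP, sizes=(12, 20)):
    # Keypoints on a regular grid at two patch sizes; the sizes stand in
    # for the two sampling scales (sigma = 1.2 and sigma = 2.0).
    h, w = img.shape[:2]
    return [cv2.KeyPoint(float(x), float(y), float(s))
            for s in sizes
            for y in range(step, h - step, step)
            for x in range(step, w - step, step)]

def image_descriptors(img, sift):
    # img: grayscale uint8 image
    kps, desc = sift.compute(img, dense_keypoints(img))
    return kps, desc

def build_codebook(training_images):
    sift = cv2.SIFT_create()
    pool = [image_descriptors(img, sift)[1] for img in training_images]
    descs = np.vstack([d for d in pool if d is not None])
    # cluster ~250K randomly chosen descriptors, as in Section 2.2
    idx = np.random.choice(len(descs), min(250_000, len(descs)), replace=False)
    return MiniBatchKMeans(n_clusters=K, random_state=0).fit(descs[idx])

def pyramid_histogram(img, sift, codebook):
    # 1x1 + 2x2 + three-band pyramid -> 8 L1-normalized histograms,
    # concatenated into a (1 + 4 + 3) * 500 = 4000-bin vector.
    kps, desc = image_descriptors(img, sift)
    words = codebook.predict(desc)
    xs = np.array([kp.pt[0] for kp in kps])
    ys = np.array([kp.pt[1] for kp in kps])
    h, w = img.shape[:2]
    feats = []
    for rows, cols in [(1, 1), (2, 2), (3, 1)]:  # (3, 1): three bands
        for r in range(rows):
            for c in range(cols):
                inside = (ys * rows // h == r) & (xs * cols // w == c)
                hist = np.bincount(words[inside], minlength=K).astype(float)
                feats.append(hist / max(hist.sum(), 1.0))  # L1 normalization
    return np.concatenate(feats)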
2.3 Textual features

Images in the collection belong to a medical article, so they can be indexed using the surrounding text content. The text representation adopted in this work includes information from the title of the paper and the image caption, which can be found in the XML file corresponding to each image in the dataset. With that, a text corpus for the image collection was built, and standard text processing operations were applied, including tokenization, stemming and stop-word removal using Terrier IR [9]. We calculate the weight for each term in each document using the TF-IDF weighting model. The calculated weights were adopted as textual features.

2.4 Feature fusion schemes

Approaches that combine different features (in our case visual and textual), each bringing different information about the content of the images, clearly outperform single-feature approaches [10], [3]. Following these findings, we combine the two features described above using a high-level feature fusion scheme, depicted in Fig. 2.

Fig. 2. High-level fusion scheme for the different descriptors. The high-level fusion scheme averages the predictions of the individual classifiers trained on the separate descriptors.

2.5 Classifier setup

We used the libSVM implementation of SVMs (Support Vector Machines) [11] with probabilistic output [12] as classifiers. To solve the multi-class classification problems, we employ the one-vs-all approach. Each of the SVMs was trained with a χ2 kernel. Namely, we build a binary classifier for each modality/class: the examples associated with that class are labeled positive and the remaining examples are labeled negative. This results in an imbalanced ratio of positive versus negative training examples. We resolve this issue by adjusting the weights of the positive and negative class [6]. In particular, we set the weight of the positive class to (#pos + #neg)/#pos and the weight of the negative class to (#pos + #neg)/#neg, with #pos the number of positive instances in the training set and #neg the number of negative instances. We also optimize the cost parameter C of the SVMs using an automated parameter search procedure [6]. For the parameter optimization, we used the dataset from 2012. After finding the optimal C value, the SVM is trained on the 2013 set of training images.
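The classifier setup can be sketched as follows. This is a minimal illustration using scikit-learn's SVC with a precomputed exponentiated χ2 kernel in place of libSVM, together with the class-weighting formula from above; the gamma value and C = 1.0 are placeholders (C was actually tuned on the 2012 data):

import numpy as np
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.svm import SVC

def train_one_vs_all(X_train, labels, gamma=0.5):
    # One binary chi^2-kernel SVM per modality; positive/negative class
    # weights follow Section 2.5: (#pos+#neg)/#pos and (#pos+#neg)/#neg.
    K_train = chi2_kernel(X_train, X_train, gamma=gamma)
    models = {}
    for cls in np.unique(labels):
        y = (labels == cls).astype(int)
        n_pos = int(y.sum())
        n_neg = len(y) - n_pos
        weights = {1: (n_pos + n_neg) / n_pos, 0: (n_pos + n_neg) / n_neg}
        # C = 1.0 is a placeholder; in our setup C was tuned on 2012 data
        models[cls] = SVC(kernel='precomputed', C=1.0, probability=True,
                          class_weight=weights).fit(K_train, y)
    return models

def predict_modality(models, X_train, X_test, gamma=0.5):
    # One-vs-all decision: pick the class whose binary SVM assigns the
    # highest positive-class probability (column 1 of predict_proba).
    K_test = chi2_kernel(X_test, X_train, gamma=gamma)
    classes = sorted(models)
    probs = np.column_stack([models[c].predict_proba(K_test)[:, 1]
                             for c in classes])
    return np.asarray(classes)[probs.argmax(axis=1)], probs

The high-level fusion of Section 2.4 then corresponds to averaging the probability matrices obtained for the visual and the textual features before taking the argmax.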
2.6 Results and discussion

In this section, we present and discuss the results obtained from the experimental evaluation of the proposed method. First, we evaluate the performance of the proposed method on the ImageCLEF 2012 dataset. Next, we present the results obtained on this year's ImageCLEF 2013 dataset.

The first three rows in Table 1 show the results of our method applied to the ImageCLEF 2012 dataset. These results include visual, textual and mixed runs. From the presented results, we can note the better predictive performance of the visual run compared to the textual run. The high-level feature fusion scheme helps to increase the predictive performance. Furthermore, the presented results show that our method achieves very high accuracy. Compared with the results of the groups that participated in the ImageCLEF 2012 medical task [13], our visual run is second best, while the textual and mixed runs rank first in their respective categories. The mixed run, with an accuracy of 77.0, would have ranked first overall had we submitted it to last year's modality classification task.

Table 1. Results of the runs for the modality classification task on ImageCLEF 2012 and 2013.

Dataset          Run Type   Accuracy
ImageCLEF 2012   visual     66.10
                 textual    62.90
                 mixed      77.00
ImageCLEF 2013   visual     77.14
                 textual    63.71
                 mixed      78.04

The last three rows in Table 1 show the results of our method applied to this year's modality classification task. These results also include visual, textual and mixed runs. The accuracy of 78.04 obtained with the mixed run is second best in the overall ranking. The high-level feature fusion scheme increases the predictive performance on this year's dataset as well.

3 Compound figure separation

Compound figures contain subfigures of several types; they cannot be classified into unique classes and need to be separated before a detailed classification into figure types can be performed. In this work, an unsupervised compound figure separation technique is proposed and implemented, based on a breadth-first search strategy that uses only visual information from the medical figures. All pixel values in the figure are traversed, searching for enclosed regions separated by white borders/pixels. The sensitivity of the border detection is controlled by a threshold parameter. Regions smaller than a predefined value are discarded. In some of the figures the borders separating the contained subfigures are black, so before applying our algorithm we invert the figure. On the given test dataset our algorithm correctly separated 68.59% of the figures.
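A minimal Python sketch of the separation algorithm is given below; the threshold and minimum-area values are illustrative placeholders, not the tuned parameters:

from collections import deque
import numpy as np

def separate_compound_figure(img, white_thresh=230, min_area=2500):
    # Breadth-first traversal of non-white pixels: every 4-connected
    # region enclosed by near-white border pixels becomes a candidate
    # subfigure. white_thresh controls the border sensitivity; regions
    # smaller than min_area are discarded. For figures with dark
    # separating borders, invert the image before calling this function.
    h, w = img.shape
    content = img < white_thresh          # True where the pixel is not border
    visited = np.zeros((h, w), dtype=bool)
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if not content[sy, sx] or visited[sy, sx]:
                continue
            queue = deque([(sy, sx)])
            visited[sy, sx] = True
            y0 = y1 = sy
            x0 = x1 = sx
            area = 0
            while queue:                  # BFS over the connected region
                y, x = queue.popleft()
                area += 1
                y0, y1 = min(y0, y), max(y1, y)
                x0, x1 = min(x0, x), max(x1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and content[ny, nx] and not visited[ny, nx]):
                        visited[ny, nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                boxes.append((x0, y0, x1, y1))  # subfigure bounding box
    return boxes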
4 Ad-hoc image retrieval

In this section, we give an overview of the application of our methods to ad-hoc medical image retrieval and present the results of our submitted runs. We participated only in the textual retrieval.

4.1 Proposed approach

The approach uses the image caption and the title of the medical article in which the image is referenced, i.e., the surrounding text. It seeks to combine word-space and concept-space approaches with the goal of achieving better overall retrieval performance.

The word-space component indexes and retrieves the surrounding text of the medical images in a traditional way. The surrounding text is first preprocessed by applying stop-word removal and stemming, and a standard inverted index is created. In the retrieval phase, the system preprocesses the query, applying stop-word removal and stemming to it as well. Weighting models are applied to calculate a score for the relevance of every medical article with respect to the given query. Once the scores are calculated, the documents are sorted and returned.

The concept-space component works by analyzing the text through the medical concepts present in it. The first step is to map the surrounding text of the medical images to medical concepts. The mapping can be done using a variety of toolkits, services or libraries, such as MetaMap [14], MeSH Up [15], etc. The problem in this approach arises in the way documents are indexed and then evaluated in the retrieval phase with respect to queries. Classical information retrieval models, directly or indirectly, depend on the number of terms shared by the document and the query to compute the relevance score [9]. However, the number of terms which a query and a document share in the word-space could be very different in the concept-space. For example, if a query and a document share the single term "x-ray" in word-space, they can share up to six terms in concept-space [16]. On the other hand, if they share the two-term phrase "lung x-ray" in word-space, they will share only one term in concept-space.

The results from both components are then normalized and passed to a fusion component (the process flow is depicted in Fig. 3). The fusion component can use any of the known strategies for late fusion [17]. In this study, we used a simple linear combination of the normalized results.

Fig. 3. Diagram of the process flow.

4.2 Retrieval framework

For the word-space approach, Terrier IR [9] is used as the search engine. In the preprocessing stage, the Porter stemmer [18] and stop-word removal are applied. In the retrieval phase, several weighting models were evaluated: PL2 [19], BM25 [19], BB2 [19], DFR-BM25 [19], TF-IDF [20] and DirichletLM [21]. An additional experiment was performed with query expansion on the best performing model to test its maximum output.

The concept-space approach requires a mapping mechanism to match the text data to medical concepts. In this approach, MetaMap is used as the mapping tool and the extracted medical concepts are UMLS [14] concepts. The mapping is performed only on the surrounding text of the medical images. After the concepts are extracted, new artificial text is generated containing only the UMLS concepts. The same process is repeated for the queries. Once the artificial text is constructed, it is passed to the search engine for indexing. Terrier IR indexes the artificial text with no additional preprocessing (no stemming and no stop-word removal). The retrieval is performed by passing the artificial queries to the search engine. In this phase, the same weighting models are applied as in the word-space approach. Basically, the concept-space approach can be viewed as a word-space approach with more complex preprocessing.

Before the fusion phase, the results from the word-space and the concept-space are normalized using min-max normalization [22]. The normalized results are then passed to the fusion component, which applies a linear combination. This kind of fusion provides modularity and control over the extent to which each component influences the final result.
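The normalization and fusion steps can be sketched as follows, with each run represented as a dictionary mapping document identifiers to scores; the mixing weight alpha = 0.7 is a placeholder, not a value fixed by our experiments:

def min_max_normalize(scores):
    # Min-max normalization [22] of a {doc_id: score} run to [0, 1].
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def linear_fusion(word_run, concept_run, alpha=0.7):
    # Late fusion by linear combination of the normalized word-space and
    # concept-space scores; documents missing from a run contribute 0.
    w = min_max_normalize(word_run)
    c = min_max_normalize(concept_run)
    fused = {doc: alpha * w.get(doc, 0.0) + (1 - alpha) * c.get(doc, 0.0)
             for doc in set(w) | set(c)}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

Any of the known late-fusion strategies [17] could be substituted for the linear combination in the last step.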
4.3 Evaluating on ImageCLEF 2012

The proposed framework was first evaluated on the ImageCLEF 2012 dataset. This phase was used to find the optimal weighting models and appropriate parameters. The results of the word-space assessment are presented in Table 2. They show that the BM25 model provides the best performance for the word-space retrieval. An additional experiment was performed with the best model by assigning weights to key words in the queries using the Terrier query language (for example, words such as "MRI", "CT", etc. are given a weight of 1.5). The results for the experiment with the word weights (BM25-ww) show an increase in performance.
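This word-weighting step can be illustrated with a small helper that rewrites a query using the term^weight syntax of the Terrier query language; the keyword list below is an illustrative assumption, not the exact list used in our runs:

MODALITY_TERMS = {"mri", "ct", "x-ray", "ultrasound", "pet"}  # illustrative

def weight_query(query, boost=1.5):
    # Up-weight modality keywords using Terrier's term^weight syntax,
    # e.g. "lung MRI" -> "lung MRI^1.5".
    return " ".join(
        f"{t}^{boost}" if t.lower() in MODALITY_TERMS else t
        for t in query.split())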
Table 2. Comparison of weighting models for word-space ad-hoc retrieval.

Model        MAP     P10     P20     Rprec   # of rel. docs
BB2          0.2056  0.3429  0.2714  0.2411  473
BM25         0.2266  0.3381  0.3000  0.2559  494
DFR-BM25     0.2091  0.3476  0.2738  0.2236  474
PL2          0.2055  0.3429  0.2643  0.2353  472
TF-IDF       0.2085  0.3524  0.2714  0.2194  471
DirichletLM  0.1601  0.2619  0.2024  0.1614  434
BM25-ww      0.2407  0.3619  0.2929  0.2620  490

The results of the concept-space assessment are presented in Table 3. In this case the best results are provided by the DirichletLM model.

Table 3. Comparison of weighting models for concept-space ad-hoc retrieval.

Model        MAP     P10     P20     Rprec   # of rel. docs
BB2          0.1257  0.1700  0.1025  0.1433  173
BM25         0.1230  0.1550  0.1025  0.1441  172
DFR-BM25     0.1227  0.1550  0.1000  0.1441  173
PL2          0.1065  0.1500  0.0950  0.1137  168
TF-IDF       0.1226  0.1550  0.1025  0.1402  172
DirichletLM  0.1568  0.2450  0.1475  0.1888  232

The results of the mixed assessment are presented in Table 4. The mixed assessment consists of two experiments. The first combines the best word-space and concept-space approaches. The second combines the word-space approach with word weights and the concept-space approach.

Table 4. Results for the mixed runs of the ad-hoc retrieval on ImageCLEF 2012.

Type      MAP     P10     P20     Rprec   # of rel. docs
Mixed     0.2385  0.3762  0.2738  0.2496  492
Mixed-ww  0.2528  0.3857  0.2690  0.2600  488

4.4 Results and discussion

Based on the results obtained from the experiments on the ImageCLEF 2012 dataset, the runs for the ImageCLEF 2013 ad-hoc retrieval task were submitted. We repeated the experiments on the ImageCLEF 2013 data and submitted the results only for the best performing techniques. For word-space text-based retrieval we submitted the run using the BM25 weighting model with word weights, and for concept-space text-based retrieval we submitted the run using the DirichletLM weighting model. Finally, for the mixed retrieval we submitted the linear combination of the two previous spaces. The results of our runs on ImageCLEF 2013 are presented in Table 5.

Table 5. Submitted runs for ad-hoc retrieval.

Type           MAP     GM-MAP  bpref   P10     P30
word-space     0.2435  0.0430  0.2424  0.3314  0.2248
word-space-ww  0.2507  0.0443  0.2497  0.3200  0.2238
concept-space  0.1456  0.0244  0.1480  0.2000  0.1286
mixed          0.2464  0.0508  0.2338  0.3114  0.2200
mixed-ww       0.2479  0.0515  0.2336  0.3057  0.2181

5 Case-based retrieval

In this section, we give an overview of the application of our methods to case-based retrieval and present the results of our submitted runs. We participated only in the textual retrieval of the cases.

5.1 Proposed approach

The proposed approach for this task is similar to the one for the ad-hoc retrieval task, with the difference that here the retrieval unit is a medical article, not an image. The approach combines the word-space and the concept-space, just as in the ad-hoc retrieval. For the word-space component, we index the entire text of the medical articles, which includes the title, abstract, article text and the captions of the images in the article (we refer to this as "fulltext"). The indexing and retrieval are done using Terrier IR, and several weighting models are applied to analyze their performance on this type of task. For the concept-space component, only the title and abstract of the medical article are used for the extraction of medical concepts. The tool for medical concept extraction is MetaMap and the extracted results are UMLS concepts. The rest of the process for the concept-space approach is identical to the concept-space ad-hoc retrieval. The final result is produced by the late fusion of both components using a linear combination.

5.2 Evaluating on ImageCLEF 2012

The proposed framework was again evaluated on the ImageCLEF 2012 dataset. The results of the word-space assessment are presented in Table 6. The results show that the BM25 model provides the best performance for the word-space case-based retrieval. An additional experiment was performed with the best model by adding query expansion. The results for the experiment with query expansion (BM25-qe) show that query expansion increases retrieval performance by roughly 4%.

Table 6. Comparison of weighting models for word-space case-based retrieval.

Model        MAP     P10     P20     Rprec   # of rel. docs
BB2          0.1598  0.1435  0.1326  0.1604  217
BM25         0.1818  0.1522  0.1391  0.1757  222
DFR-BM25     0.1816  0.1522  0.1413  0.1767  222
PL2          0.1780  0.1478  0.1370  0.1861  227
TF-IDF       0.1805  0.1522  0.1326  0.1662  221
DirichletLM  0.1811  0.1652  0.1283  0.1744  225
BM25-qe      0.1994  0.1957  0.1522  0.2198  232

The results of the concept-space assessment are presented in Table 7. In this case the best results are provided by the DirichletLM model. An additional experiment was performed using query expansion on the best performing model, which provides an improvement of roughly 2%.

Table 7. Comparison of weighting models for concept-space case-based retrieval.

Model           MAP     P10     P20     Rprec   # of rel. docs
BB2             0.0691  0.1000  0.0630  0.0874  132
BM25            0.0705  0.1000  0.0652  0.0815  127
DFR-BM25        0.0706  0.1000  0.0652  0.0888  127
PL2             0.0686  0.0826  0.0674  0.0835  129
TF-IDF          0.0699  0.0957  0.0609  0.0815  131
DirichletLM     0.0841  0.0957  0.0565  0.0988  134
DirichletLM-qe  0.1073  0.1261  0.0870  0.1069  158

The results of the mixed assessment are presented in Table 8. The mixed assessment consists of two experiments. The first combines the best word-space and concept-space approaches. The second combines the word-space and concept-space approaches, both with added query expansion.

Table 8. Results for the mixed runs of the case-based retrieval on ImageCLEF 2012.

Type      MAP     P10     P20     Rprec   # of rel. docs
mixed     0.1758  0.1565  0.1370  0.1915  222
mixed-qe  0.2186  0.2043  0.1804  0.2337  235

5.3 Results and discussion

Using the models and optimal parameters learned from the experiments on the ImageCLEF 2012 dataset, the experiments on the ImageCLEF 2013 dataset were performed. The best results were obtained by the mixed experiment using query expansion.

Table 9. Results of the case-based retrieval runs on the ImageCLEF 2013 dataset.

Type              MAP     P10     P20     Rprec   # of rel. docs
word-space        0.2026  0.2057  0.1743  0.2115  549
word-space-qe     0.2019  0.2314  0.1957  0.2011  596
concept-space     0.0438  0.0829  0.0671  0.0730  300
concept-space-qe  0.0632  0.0857  0.0771  0.0850  334
mixed             0.1832  0.1857  0.1586  0.2036  550
mixed-qe          0.2059  0.2229  0.1957  0.2235  604
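For reference, the MAP and P@k figures reported in Tables 2-9 follow the standard trec_eval definitions, which the following minimal sketch reproduces for a single run; it is illustrative only, as the official evaluation used the task's qrels:

def average_precision(ranked_docs, relevant):
    # AP of one query: mean of the precision values at each rank where a
    # relevant document is retrieved, over the number of relevant docs.
    hits, score = 0, 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            score += hits / rank
    return score / len(relevant) if relevant else 0.0

def precision_at(ranked_docs, relevant, k):
    # P@k: fraction of the top-k retrieved documents that are relevant.
    return sum(doc in relevant for doc in ranked_docs[:k]) / k

def mean_average_precision(runs, qrels):
    # MAP over all queries; `runs` maps query id -> ranked doc ids and
    # `qrels` maps query id -> set of relevant doc ids.
    return sum(average_precision(runs[q], qrels.get(q, set()))
               for q in runs) / len(runs)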
References

1. de Herrera, A.G.S., Kalpathy-Cramer, J., Demner-Fushman, D., Antani, S., Müller, H.: Overview of the ImageCLEF 2013 medical tasks. In: Working Notes of CLEF 2013 (2013)
2. Dimitrovski, I., Loskovska, S.: Content-based retrieval system for X-ray images. In: International Congress on Image and Signal Processing (2009) 2236-2240
3. Tommasi, T., Orabona, F., Caputo, B.: Discriminative cue integration for medical image annotation. Pattern Recognition Letters 29(15) (2008) 1996-2002
4. Zhang, J., Marszalek, M., Lazebnik, S., Schmid, C.: Local features and kernels for classification of texture and object categories: A comprehensive study. International Journal of Computer Vision 73(2) (2007) 213-238
5. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2) (2004) 91-110
6. van de Sande, K., Gevers, T., Snoek, C.: Evaluating color descriptors for object and scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 32(9) (2010) 1582-1596
7. van Gemert, J.C., Veenman, C.J., Smeulders, A.W.M., Geusebroek, J.M.: Visual word ambiguity. IEEE Transactions on Pattern Analysis and Machine Intelligence 32(7) (2010)
8. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: IEEE Conference on Computer Vision and Pattern Recognition (2006) 2169-2178
9. Ounis, I., Amati, G., Plachouras, V., He, B., Macdonald, C., Johnson, D.: Terrier information retrieval platform. In: Advances in Information Retrieval, Springer (2005) 517-519
10. Tommasi, T., Caputo, B., Welter, P., Güld, M., Deserno, T.: Overview of the CLEF 2009 medical image annotation track. In: Multilingual Information Access Evaluation II. Multimedia Experiments, LNCS 6242, Springer, Berlin/Heidelberg (2010) 85-93
11. Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines (2001). Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
12. Lin, H.T., Lin, C.J., Weng, R.C.: A note on Platt's probabilistic outputs for support vector machines. Machine Learning 68 (2007) 267-276
13. Müller, H., de Herrera, A.G.S., Kalpathy-Cramer, J., Demner-Fushman, D., Antani, S., Eggel, I.: Overview of the ImageCLEF 2012 medical image retrieval and classification tasks. In: CLEF 2012 Working Notes (2012)
14. Aronson, A.R.: Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. In: Proceedings of the AMIA Symposium, American Medical Informatics Association (2001) 17
15. Trieschnigg, D., Pezik, P., Lee, V., De Jong, F., Kraaij, W., Rebholz-Schuhmann, D.: MeSH Up: effective MeSH text classification for improved document retrieval. Bioinformatics 25(11) (2009) 1412-1418
16. Abdulahhad, K., Chevallet, J.P., Berrut, C.: MRIM at ImageCLEF2012. From words to concepts: A new counting approach. In: CLEF 2012 Working Notes (2012)
17. Müller, H., de Herrera, A.G.S., Kalpathy-Cramer, J., Demner-Fushman, D., Antani, S., Eggel, I.: Overview of the ImageCLEF 2012 medical image retrieval and classification tasks. Working Notes of CLEF 2012 (2012)
18. Macdonald, C., Plachouras, V., He, B., Lioma, C., Ounis, I.: University of Glasgow at WebCLEF 2005: Experiments in per-field normalisation and language specific stemming. In: Accessing Multilingual Information Repositories, Springer (2006) 898-907
19. Amati, G., Van Rijsbergen, C.J.: Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Transactions on Information Systems (TOIS) 20(4) (2002) 357-389
20. Hiemstra, D.: A probabilistic justification for using tf×idf term weighting in information retrieval. International Journal on Digital Libraries 3(2) (2000) 131-139
21. Zhai, C., Lafferty, J.: A study of smoothing methods for language models applied to information retrieval. ACM Transactions on Information Systems (TOIS) 22(2) (2004) 179-214
22. Jain, A., Nandakumar, K., Ross, A.: Score normalization in multimodal biometric systems. Pattern Recognition 38(12) (2005) 2270-2285