<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ITI's Participation in the ImageCLEF 2012 Medical Retrieval and Classification Tasks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Matthew S. Simpson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daekeun You</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Md Mahmudur Rahman</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dina Demner-Fushman</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sameer Antani</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>George Thoma</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Lister Hill National Center for Biomedical Communications, U. S. National Library of Medicine, NIH</institution>
          ,
          <addr-line>Bethesda, MD</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This article describes the participation of the Image and Text Integration (ITI) group in the 2012 ImageCLEF medical retrieval and classification tasks. We present our methods for each of the three tasks and discuss our submitted textual, visual, and mixed runs as well as their results. Our methods generally perform well for each task, and our best ad-hoc image retrieval submission was ranked first among all the submissions from the participating groups.</p>
      </abstract>
      <kwd-group>
        <kwd>Image Retrieval</kwd>
        <kwd>Case-based Retrieval</kwd>
        <kwd>Image Modality</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>This article describes the participation of the Image and Text Integration (ITI)
group in the ImageCLEF 2012 medical retrieval and classification tasks. Our
group is from the Communications Engineering Branch of the Lister Hill National
Center for Biomedical Communications, which is a division of the U. S. National
Library of Medicine.</p>
      <p>
        The medical track [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] of ImageCLEF 2012 consists of an image modality
classification task and two retrieval tasks. For the classification task, the goal is
to classify a given set of medical images according to thirty-one modalities (e.g.,
"Computerized Tomography," "Electron Microscopy," etc.). The modalities are
organized hierarchically into meta-classes such as "Radiology" and "Microscopy,"
which are themselves types of "Diagnostic Images." In the first retrieval task, a
set of ad-hoc information requests is given, and the goal is to retrieve the most
relevant images from a collection of biomedical articles for each topic. Finally, in
the second retrieval task, a set of case-based information requests is given, and
the goal is to retrieve the most relevant articles describing similar cases.
      </p>
      <p>
        In the following sections, we describe the textual and visual features that
comprise our image and article representations (Sections 2-3) and our methods
for the modality classification (Section 4) and medical retrieval tasks (Sections
5-6). Our textual approaches primarily utilize the Unified Medical Language
System® (UMLS®) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] synonymy to identify concepts in topic descriptions and
article text, and our visual approaches rely on computed distances between
descriptors of various low-level visual features. In developing mixed approaches,
we explore the use of clustered visual features that can be represented using text,
attribute selection, and ranked list merging strategies.
      </p>
      <p>In Section 7 we describe our submitted runs, and in Section 8 we present
our results. For the modality classification task, our best submission achieved a
classification accuracy of 63.2% and was ranked within the submissions from the
top three participating groups. Our best submission for the ad-hoc image retrieval
task was ranked first overall, achieving a mean average precision of 0.2377, which
is a statistically significant improvement over the second-ranked submission. For
the case-based article retrieval task, our best submission achieved a mean average
precision of 0.1035 and was ranked within the submissions from the top four
participating groups, but this submission is statistically indistinguishable from
our other case-based submissions.</p>
    </sec>
    <sec id="sec-2">
      <title>Image Representation for Ad-hoc Retrieval</title>
      <p>We represent the images contained in biomedical articles using a combination of
the textual and visual features described below.</p>
      <sec id="sec-2-1">
        <title>2.1 Textual Features</title>
        <p>We represent each image in the collection as a structured document of
image-related text called an enriched citation. Our representation includes the title,
abstract, and MeSH® terms (MeSH is a controlled vocabulary created by the U.S. National
Library of Medicine to index biomedical articles) of the article in which the image appears
as well as the image's caption and "mentions" (snippets of text within the body of an
article that discuss an image). These features can be indexed with a traditional
text-based information retrieval system, or they may be exposed as term vectors
and combined with the visual feature vectors described below.</p>
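        <p>As a rough illustration only (the field names below are ours for this sketch and do not reproduce our actual indexing schema), an enriched citation can be thought of as a simple record of image-related text fields:</p>
        <preformat>
# Hypothetical sketch of an "enriched citation" record for one image.
# Field names are illustrative; the actual indexing schema is not shown here.
def build_enriched_citation(article, image):
    """Collect the image-related text fields described in Section 2.1."""
    return {
        "title": article["title"],            # article title
        "abstract": article["abstract"],      # article abstract
        "mesh_terms": article["mesh_terms"],  # MeSH index terms
        "caption": image["caption"],          # figure caption
        "mentions": image["mentions"],        # in-text snippets discussing the figure
    }
</preformat>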
      </sec>
      <sec id="sec-2-2">
        <title>2.2 Visual Features</title>
        <p>In addition to the above textual features, we also represent the visual content
of images using various low-level visual descriptors. Table 1 summarizes the
descriptors we extract and their dimensionality. Due to the large number of
these features, we forego describing them in any detail. However, they are all
well-known and discussed extensively in existing literature.</p>
        <p>
          Cluster Words. To avoid the computational complexity of computing distances
between the above visual descriptors, we create a textual representation of visual
features that is easily integrated with our existing textual features. For each
visual descriptor listed in Table 1, we cluster the vectors assigned to all images
using the k-means++ [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] algorithm. We then assign each cluster a unique "cluster
word" and represent each image as a sequence of these words. We add an image's
cluster words to its enriched citation as a "global image feature" field, which can
be searched using a traditional text-based information retrieval system.
        </p>
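        <p>As a rough illustration of the cluster-word idea (using scikit-learn's k-means with k-means++ initialization as a stand-in for our implementation; the cluster count and token format are illustrative), each image's descriptor can be mapped to a textual token that is then indexed like any other term:</p>
        <preformat>
# Illustrative sketch: cluster one visual descriptor type and emit "cluster words".
import numpy as np
from sklearn.cluster import KMeans

def cluster_words(descriptors, n_clusters=500, prefix="CEDD"):
    """descriptors: (n_images, dim) array for a single descriptor type.
    Returns one token per image, e.g. "CEDD_042", usable as indexable text."""
    km = KMeans(n_clusters=n_clusters, init="k-means++", n_init=10).fit(descriptors)
    return ["%s_%03d" % (prefix, label) for label in km.labels_]

# An image's "global image feature" field would then be the concatenation of the
# tokens produced for each descriptor type listed in Table 1.
</preformat>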
        <p>
          Attribute Selection. An orthogonal approach to transforming our visual
descriptors into a computationally manageable representation is attribute selection.
By eliminating unneeded or redundant information, these techniques can also
improve our modality classification and image retrieval methods. We perform
attribute selection using the WEKA [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] data mining software. First, we group
all our visual descriptors into a single combined vector, and we then perform
attribute selection to reduce the dimensionality of this combined feature.
        </p>
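        <p>The sketch below shows the general shape of this step, using a simple scikit-learn filter-based selector as a stand-in for the WEKA attribute selection we actually used; the selector and its parameters are illustrative only:</p>
        <preformat>
# Illustrative stand-in for the attribute-selection step (the paper used WEKA).
from sklearn.feature_selection import SelectKBest, f_classif

def select_attributes(X, y, k=100):
    """X: (n_images, dim) combined visual descriptor matrix; y: modality labels.
    Returns the reduced matrix and the fitted selector (to apply to new images)."""
    selector = SelectKBest(score_func=f_classif, k=k).fit(X, y)
    return selector.transform(X), selector
</preformat>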
        <p>
          Table 1 lists the visual descriptors we extract, together with their individual and combined dimensionality:
autocorrelation; color and edge directivity (CEDD) [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]; color layout (CLD) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]; color moment; edge frequency; edge histogram (EHD) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]; fuzzy color and texture histogram (FCTH) [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]; Gabor moment; gray-level co-occurrence matrix moment (GLCM) [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]; local binary patterns (LBP1, LBP2) [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]; local color histogram (LCH); primitive length; scale-invariant feature transform (SIFT) [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]; semantic concept (SCONCEPT) [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]; shape moment; and Tamura moment [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-2a">
      <title>Article Representation for Case-based Retrieval</title>
      <p>We represent articles using the textual features of each image appearing in the
article. Thus, each article's enriched citation consists of its title, abstract, and
MeSH terms as well as the caption and mention of each contained image.</p>
    </sec>
    <sec id="sec-3">
      <title>Modality Classification Task</title>
      <p>We experimented with both flat and hierarchical modality classification methods.
Below we describe our flat classification strategy, an extension of this approach
that exploits the hierarchical structure of the classes, and a post-processing
method for improving the classification accuracy of illustrations.</p>
      <sec id="sec-3-1">
        <title>4.1 Flat Classification</title>
        <p>Figure 1a provides an overview of our basic classification approach. We utilize
multi-class support vector machines (SVMs) as our flat modality classifiers.
First, we extract our visual and textual image features from the training images
(representing the textual features as term vectors). Then, we perform attribute
selection to reduce the dimensionality of the features. We construct the
lower-dimensional vectors independently for each feature type (textual or visual) and
combine the resulting attributes into a single, compound vector. Finally, we use
the lower-dimensional feature vectors to train multi-class SVMs for producing
textual, visual, or mixed modality predictions.</p>
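        <p>A minimal sketch of this flat classification pipeline, assuming the attribute-selected feature matrices from above and using a scikit-learn multi-class SVM as a stand-in for our trained classifiers, might look like this:</p>
        <preformat>
# Illustrative flat modality classifier (multi-class SVM on reduced features).
import numpy as np
from sklearn.svm import SVC

def train_flat_classifier(X_textual, X_visual, y):
    """X_textual, X_visual: attribute-selected feature matrices for the
    training images; y: modality labels. Returns a fitted multi-class SVM."""
    X_mixed = np.hstack([X_textual, X_visual])  # single compound vector per image
    return SVC(kernel="rbf", decision_function_shape="ovr").fit(X_mixed, y)
</preformat>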
      </sec>
      <sec id="sec-3-2">
        <title>4.2 Hierarchical Classification</title>
        <p>
          Unlike the flat classification strategy described above, it is possible to exploit the
hierarchical organization of the modality classes in order to decompose the task
into several smaller classification problems that can be sequentially applied. Based
on our visual observation of the training samples and our initial experiments, we
modified the original modality hierarchy [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] proposed for the task. The hierarchy
we used for our experiments is shown in Figure 1b.
        </p>
        <p>
          We train flat multi-class SVMs, as shown in Figure 1a, for each meta-class. For
recognizing compound images, we utilize the algorithm proposed by Apostolova et
al. [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ], which detects sub-figure labels and the border of each sub-figure within a
compound image. To arrive at a final class label, an image is sequentially classified
beginning at the root of the hierarchy until a leaf class can be determined.
        </p>
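        <p>The sequential labeling procedure can be sketched as follows; this is a simplified illustration in which the hierarchy, the per-node classifiers, and the compound-image handling are stand-ins for those we actually used:</p>
        <preformat>
# Illustrative hierarchical classification: descend the class hierarchy, applying
# the flat classifier trained for each meta-class, until a leaf modality is reached.
def classify_hierarchically(features, classifiers, root="root"):
    """classifiers: dict mapping each internal (meta-)class to its trained flat SVM.
    Leaf modalities have no entry, which terminates the descent."""
    node = root
    while node in classifiers:
        node = classifiers[node].predict([features])[0]
    return node  # leaf modality label
</preformat>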
      </sec>
      <sec id="sec-3-3">
        <title>4.3 Illustration Post-processing</title>
        <p>Because our initial classification experiments resulted in only modest accuracy
for the fourteen "Illustration" classes shown in Figure 1b, we concluded that
our current textual and visual features may not be sufficient for representing
these figures. Therefore, in addition to the aforementioned machine learning
modality classification methods, we also developed several complementary
rule-based strategies for increasing the classification accuracy of "Illustration" classes.</p>
        <p>
          A majority of the training samples contained in the "Illustration" meta-class,
unlike other images in the collection, consist of line drawings or text superimposed
on a white background. For example, program listings mostly consist of text;
thus, the use of text and line detection methods may increase the classification
accuracy of Class GPLI. Similarly, polygons (e.g., rectangles, hexagons, etc.)
contained in flowcharts (GFLO), tables (GTAB), system overviews (GSYS), and
chemical structures (GCHE) are a distinctive feature of these modalities. We
utilize the methods of Jung et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and OpenCV (http://opencv.willowgarage.com/wiki/) functions to assess the
presence of text and polygons, respectively.
        </p>
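        <p>As a simplified illustration of the polygon check (not the exact rules we applied; the threshold, area cutoff, and vertex range are illustrative), an OpenCV-based sketch can count roughly polygonal contours on a near-white background:</p>
        <preformat>
# Illustrative polygon detection for "Illustration" post-processing (OpenCV).
import cv2

def count_polygons(image_path, min_area=500):
    """Counts contours that approximate simple polygons (4 to 8 vertices)."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    _, binary = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)
    # [-2] picks the contour list under both OpenCV 3.x and 4.x return conventions.
    contours = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    count = 0
    for c in contours:
        if min_area > cv2.contourArea(c):
            continue  # ignore tiny contours (noise, characters)
        approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        if len(approx) in range(4, 9):
            count += 1
    return count
</preformat>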
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Ad-Hoc Image Retrieval Task</title>
      <p>In this section we describe our textual, visual and mixed approaches to the
ad-hoc image retrieval task. Descriptions of the submitted runs that utilize these
methods are presented in Section 7.</p>
      <sec id="sec-4-1">
        <title>2 http://opencv.willowgarage.com/wiki/</title>
        <sec id="sec-4-1-1">
          <title>5.1 Textual Approaches</title>
          <p>
            To allow for efficient retrieval and to compare their relative performance, we
index our enriched citations with the Essie [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ] and Lucene/SOLR (http://lucene.apache.org/) search engines.
Essie is a search engine developed by the U.S. National Library of Medicine
and is particularly well-suited for the medical retrieval task due to its ability to
automatically expand query terms using the UMLS synonymy. Lucene/SOLR
is a popular search engine developed by the Apache Software Foundation that
employs the well-known vector space model of information retrieval. We have
extracted the UMLS synonymy from Essie and use it for term expansion when
indexing enriched citations with Lucene/SOLR.
          </p>
          <p>
            We organize each topic description into a frame-based representation (e.g., PICO,
a mnemonic used in evidence-based practice for Patient/Population/Problem, Intervention,
Comparison, and Outcome), following a method similar to that described by Demner-Fushman
and Lin [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ]. Extractors identify concepts related to problems, interventions, age,
anatomy, drugs, and modality. We also identify modifiers of the extracted
concepts and a limited number of relationships among them. We then transform the
extracted concepts into queries appropriate for either Essie or Lucene/SOLR.
          </p>
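          <p>A toy sketch of the final step, turning an extracted frame into a boolean query string, is shown below; the frame fields and query syntax are illustrative and do not reproduce the actual Essie or Lucene/SOLR query languages:</p>
          <preformat>
# Illustrative conversion of an extracted topic frame into a boolean query string.
def frame_to_query(frame):
    """frame: dict with optional keys such as "modality", "anatomy", "problem",
    each holding a list of extracted terms (with any UMLS expansions)."""
    clauses = []
    for field in ("modality", "anatomy", "problem", "intervention", "drug", "age"):
        terms = frame.get(field, [])
        if terms:
            clauses.append("(" + " OR ".join('"%s"' % t for t in terms) + ")")
    return " AND ".join(clauses)

# frame_to_query({"modality": ["CT", "computed tomography"], "anatomy": ["liver"]})
# returns '("CT" OR "computed tomography") AND ("liver")'
</preformat>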
        </sec>
        <sec id="sec-4-1-2">
          <title>5.2 Visual Approaches</title>
          <p>Our visual approaches to image retrieval are based on retrieving images that
appear visually similar to the given topic images. We compute the visual similarity
between two images as the Euclidean distance between their visual descriptors.
For the purposes of computing this distance, we represent each image as a
combined feature vector composed of a subset of the visual descriptors listed in
Table 1 after attribute selection.</p>
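          <p>A minimal sketch of this similarity search, assuming the combined, attribute-selected descriptors are already available as vectors, might look like this:</p>
          <preformat>
# Illustrative content-based retrieval: rank images by Euclidean distance
# between combined (attribute-selected) visual descriptors.
import numpy as np

def retrieve_similar(query_vec, image_ids, image_vecs, top_k=1000):
    """image_vecs: (n_images, dim) array aligned with image_ids."""
    dists = np.linalg.norm(image_vecs - query_vec, axis=1)
    order = np.argsort(dists)[:top_k]
    return [(image_ids[i], float(dists[i])) for i in order]
</preformat>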
        </sec>
        <sec id="sec-4-1-3">
          <title>5.3 Mixed Approaches</title>
          <p>We explore several methods of combining our textual and visual approaches. One
such approach involves the use of our image cluster words. For performing
multimodal retrieval using cluster words, we first extract the visual descriptors
listed in Table 1 from each example image of a given topic. We then locate the
clusters to which the extracted descriptors are nearest in order to determine their
corresponding cluster words. Finally, we combine these cluster words with words
taken from the topic description to form a multimodal query appropriate for
either Essie or Lucene/SOLR.</p>
          <p>While the use of cluster words allows us to create multimodal queries, we
can instead directly combine the independent outputs of our textual and visual
approaches. In a score merging approach, we apply a min-max normalization to
the ranked lists of scores produced by our textual and visual retrieval strategies.
We then linearly combine the normalized scores given to each image to produce
a final ranking. Similarly, a rank merging approach combines the results of our
textual and visual approaches using the ranks of the retrieved images instead of
their normalized scores. To produce the final image ranking using this strategy,
we re-score each retrieved image as the reciprocal of its rank and then repeat the
above procedure for combining scores.</p>
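          <p>The two merging strategies can be sketched as follows; the weights and list formats are illustrative:</p>
          <preformat>
# Illustrative score merging and rank merging of textual and visual result lists.
def minmax(scores):
    """Min-max normalize a dict of document scores."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def merge_scores(text_scores, visual_scores, w_text=0.7):
    """Linear combination of min-max normalized scores (dicts mapping doc to score)."""
    t, v = minmax(text_scores), minmax(visual_scores)
    docs = set(t) | set(v)
    fused = {d: w_text * t.get(d, 0.0) + (1 - w_text) * v.get(d, 0.0) for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

def merge_ranks(text_ranking, visual_ranking, w_text=0.7):
    """Re-score each image as the reciprocal of its rank, then merge as above."""
    to_scores = lambda ranking: {doc: 1.0 / r for r, doc in enumerate(ranking, start=1)}
    return merge_scores(to_scores(text_ranking), to_scores(visual_ranking), w_text)
</preformat>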
          <p>Another means of incorporating visual information with our retrieval
approaches is through the use of a modality classifier. Using our hierarchical
modality classification approach, we can first determine the most probable
modalities for a topic's example images. After retrieving a set of images using either
our textual or visual methods, we can eliminate retrieved images that are not of
the same modality as the topic images. An advantage of performing hierarchical
classification is that we can filter the retrieved results using the meta-classes
within the hierarchy (e.g., "Radiology").</p>
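          <p>In the simplified sketch below (the hierarchy lookup is a placeholder for the output of our modality classifier), such filtering keeps only retrieved images whose predicted modality falls under the same meta-class as the topic's example images:</p>
          <preformat>
# Illustrative modality filtering of a retrieved list using predicted class paths.
def filter_by_modality(results, topic_meta_class, predicted_path):
    """results: ranked list of image ids; predicted_path: dict mapping an image id
    to its path in the modality hierarchy, e.g. ["Diagnostic Images", "Radiology", "CT"]."""
    return [img for img in results
            if topic_meta_class in predicted_path.get(img, [])]
</preformat>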
        <p>Finally, we often combine the retrieval results produced by several queries into
a single ranked list of images. We perform this query combination, or padding,
by simply appending the ranked list of images retrieved by a subsequent query
to the end of the ranked list produced by the preceding query.
</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Case-Based Retrieval Task</title>
      <p>Our method for performing case-based retrieval is analogous to our textual
approaches for ad-hoc image retrieval. Here, we index the enriched citations
described in Section 3 using the Essie and Lucene/SOLR search engines (for
performance comparison). We generate textual and mixed queries appropriate
for both search engines according to the approaches described in Sections 5.1 and 5.3.</p>
      <p>
        As a form of query expansion for case-based topics, we also explore the
possibility of determining relevant disease names to correspond with signs and
symptoms found in a topic case. To determine a set of potential diseases, we first
use the Google Search API (https://developers.google.com/custom-search/v1/overview)
to search the World Wide Web using a topic case as
a query. We then process the top five documents with MetaMap [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to extract
terms having the UMLS semantic type "Disease or Syndrome." Finally, we select
the top three most frequent diseases for query expansion.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Submitted Runs</title>
      <p>In this section we describe each of our submitted runs for the modality
classification, ad-hoc image retrieval, and case-based article retrieval tasks. Each run is
identified by its submission file name or trec_eval run ID and mode (textual,
visual, or mixed). All submitted runs are automatic.</p>
      <sec id="sec-6-1">
        <title>7.1 Modality Classification Task</title>
        <p>We submitted the following nine runs for the modality classification task:
M1. Visual only Flat.txt (visual): A flat multi-class SVM classification using
selected attributes from a combined visual descriptor of 15 features (all
descriptors in Table 1 except LCH and SCONCEPT).</p>
        <p>M2. Visual only Hierarchy.txt (visual): Like Run M1 but classification is
performed hierarchically.</p>
        <p>M3. Text only Flat.txt (textual): A flat multi-class SVM classification using
selected attributes from a combined term vector created from four textual
features (article title, MeSH terms, and image caption and mention).
M4. Text only Hierarchy.txt (textual): Like Run M3 but classification is
performed hierarchically.
M5. Visual Text Flat.txt (mixed): A flat multi-class SVM classification
combining the feature representations used in Runs M1-3.</p>
        <p>M6. Visual Text Hierarchy.txt (mixed): Like Run M5 but classification is
performed hierarchically.</p>
        <p>M7. Visual Text Flat w Postprocessing 4 Illustration.txt (mixed): Like Run M5
but additional post-processing is applied for "Illustration" classes.
M8. Visual Text Hierarchy w Postprocessing 4 Illustration.txt (mixed): Like
Run M7 but classification is performed hierarchically.</p>
        <p>M9. Image Text Hierarchy Entire set.txt (mixed): Like Run M6 but applied to
all the images contained in the retrieval collection.</p>
      </sec>
      <sec id="sec-6-2">
        <title>7.2 Ad-hoc Image Retrieval Task</title>
        <p>We submitted the following ten runs for the ad-hoc image retrieval task:
A1. nlm-se (mixed): A combination of three queries using Essie. (A1.Q1) A
disjunction of modality terms extracted from the query topic must occur
within the caption or mention fields of an image's enriched citation; a
disjunction of the remaining terms is allowed to occur in any field. (A1.Q2) A
lossy expansion of the verbatim topic is allowed to occur in any field.
(A1.Q3) A disjunction of the query images' cluster words must occur
within the global image feature field.</p>
        <p>A2. nlm-se-cw-mf (mixed): A combination of Query A1.Q1 with the additional
query below using Essie. (A2.Q2) A lossy expansion of the verbatim topic is
allowed to occur in any field of an image's enriched citation, and a disjunction
of the query images' cluster words can optionally occur within the global
image feature field. Additionally, the retrieved images are filtered so that
they share a least common ancestor modality with the query images, as
determined by the modality classifier used in Run M9. Query A2.Q2 is
distinct from Queries A1.Q2-3 in that the occurrence of a lossy expansion
of the topic is not necessarily weighted more heavily than the occurrence
of image cluster words.</p>
        <p>A3. nlm-se-scw-mf (mixed): Like Run A2 but image cluster words are only
considered if the modality classifier used in Run M9 identically labels all
the example images of a topic.</p>
        <p>A4. nlm-lc (mixed): A combination of three queries using Lucene with BM25
similarity and UMLS synonymy. (A4.Q1) A fuzzy phrase-based occurrence
of the verbatim topic is allowed in any field of an image's enriched citation.
(A4.Q2) A disjunction of the topic words is allowed to occur in any field.
(A4.Q3) A disjunction of the query images' cluster words must occur within
the global image feature field.</p>
        <p>A5. nlm-lc-cw-mf (mixed): A combination of Query A4.Q1 with the additional
query below using Lucene with BM25 similarity and UMLS synonymy.
(A5.Q2) A disjunction of the topic words is allowed to occur in any field
of an image's enriched citation, and a disjunction of the query images'
cluster words can optionally occur within the global image feature field.
Additionally, the retrieved images are filtered so that they share a least
common ancestor modality with the query images, as determined by the
modality classifier used in Run M9.</p>
        <p>A6. nlm-lc-scw-mf (mixed): Like Run A5 but image cluster words are only
considered if the modality classifier used in Run M9 identically labels all
the example images of a topic.</p>
        <p>A7. Combined Selected Fileterd Merge (visual): Similarity matching using 62
min-max normalized attributes selected from a combined visual descriptor
of 15 features (all descriptors in Table 1 except LCH and SCONCEPT).
Retrieval is performed separately for each query image, and the retrieved
results are filtered, according to the modality classifier used in Run M9,
so that they share the top two modality levels with the query. Images are
scored according to the query image resulting in the maximum score.
A8. Combined LateFusion Fileterd Merge (visual): Like Run A7 but similarity
matching is performed separately for seven features (CLD, GLCM,
SCONCEPT, and the color, Gabor, shape, and Tamura moments from Table 1)
whose scores are linearly combined with predefined weights.</p>
        <p>A9. Txt Img Wighted Merge (mixed): A combination of visual Run A7 with a
textual run consisting solely of Query A1.Q2 using score merging.
A10. Merge RankToScore weighted (mixed): A combination of visual Run A8
with a textual run consisting solely of Query A1.Q2 using rank merging.</p>
      </sec>
      <sec id="sec-6-3">
        <title>7.3 Case-based Article Retrieval Task</title>
        <p>We submitted the following eight runs for the case-based article retrieval task:
C1. nlm-se-max (textual): A combination of three queries for each topic sentence
using Essie. (C1.Q1) A disjunction of modality terms extracted from the
sentence must occur within the caption or mention fields of an article's
enriched citation; a disjunction of the remaining terms is allowed to occur
in any field. (C1.Q2) A lossy expansion of the verbatim sentence is allowed
to occur in any field. (C1.Q3) A disjunction of all extracted words and
discovered diseases in the sentence is allowed to occur in any field. Articles
are scored according to the sentence resulting in the maximum score.
C2. nlm-se-sum (textual): Like Run C1 but articles are scored according to the
sum of the scores produced for each sentence.</p>
        <p>C3. nlm-se-frames-max (textual): A combination of the query below with Query
C1.Q2 for each topic sentence using Essie. (C3.Q1) An expansion of the
frame-based representation of the sentence is allowed to occur in any field of
an article's enriched citation. Articles are scored according to the sentence
resulting in the maximum score.</p>
        <p>C4. nlm-se-frames-sum (textual): Like Run C3 but articles are scored according
to the sum of the scores produced for each sentence.</p>
        <p>C5. nlm-lc-max (textual): A combination of two queries for each topic sentence
using Lucene with language model similarity, Jelinek-Mercer smoothing,
and UMLS synonymy. (C5.Q1) A fuzzy phrase-based occurrence of the
verbatim sentence is allowed in any field of an article's enriched citation.
(C5.Q2) A disjunction of all words and discovered diseases in the sentence is
allowed to occur in any field. Articles are scored according to the sentence
resulting in the maximum score.</p>
        <p>C6. nlm-lc-sum (textual): Like Run C5 but articles are scored according to the
sum of the scores produced for each sentence.</p>
        <p>
C7. nlm-lc-total-max (textual): A combination of the query below with Queries
C5.Q1-2 (as C7.Q2-3) using Lucene with language model similarity,
Jelinek-Mercer smoothing, and UMLS synonymy. (C7.Q1) A fuzzy phrase-based
occurrence of the entire verbatim topic is allowed in any field of an article's
enriched citation. Articles are scored according to the sentence resulting in
the maximum score.</p>
        <p>C8. nlm-lc-total-sum (textual): Like Run C7 but articles are scored according
to the sum of the scores produced for each sentence.
</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Results</title>
      <p>We present and discuss the results of our modality classification, ad-hoc image
retrieval, and case-based article retrieval task submissions below.</p>
      <sec id="sec-7-1">
        <title>8.1 Modality Classification Task</title>
        <p>
          Table 2 presents the classification accuracy of our submitted runs for the modality
classification task. Visual Text Hierarchy w Postprocessing 4 Illustration.txt, a
mixed approach, achieved the highest accuracy (63.2%) of our submitted runs
and was ranked fifth overall. However, it ranked within the submissions from the
top three participating groups. This result validates our post-processing method
used to improve the recognition of "Illustration" classes, and provides, with our
previous experience [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ], further evidence that hierarchical classification is a
successful strategy. Each of our hierarchical classification methods outperforms
the corresponding flat approach having the same feature representation.
        </p>
        <p>
          While our submitted runs were only judged on their ability to identify each of
the thirty-one modality classes [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ], Table 3 presents the classification accuracy
of the intermediate classifiers we used for our hierarchical approaches. For each
meta-class in the hierarchy shown in Figure 1b, Table 3 gives the number of
classes they contain; the classification accuracy associated with the textual,
visual, and mixed feature representations; and the dimensionality of the mixed
feature representation after attribute selection. These results demonstrate that
the accuracies of the intermediate classifiers generally improve as the number
of class labels decreases. Given the limited amount of training data in relation
to the total number of modalities, the smaller number of labels per classifier
is likely significant in explaining why our hierarchical classification approaches
consistently outperform their corresponding flat approaches.
(In Table 3, feature dimensionality is given for mixed-mode classifiers only.)
        </p>
      </sec>
      <sec id="sec-7-2">
        <title>8.2 Ad-hoc Image Retrieval Task</title>
        <p>
          Table 4 presents the mean average precision (MAP), binary preference (bpref),
and early precision (P@10) of our submitted runs for the ad-hoc image retrieval
task. nlm-se achieved the highest MAP (0.2377) among our submitted runs and
was ranked first overall. Merge RankToScore weighted, the run achieving our
second highest MAP (0.2166), was ranked second overall. Comparing these two
runs using Fisher's paired randomization test [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ], a recommended statistical
test for evaluating information retrieval systems, we find that nlm-se achieved
a statistically significant increase (9.7%, p = 0.0016) over the performance of
Merge RankToScore weighted.
        </p>
        <p>That the two highest-ranked runs were multimodal, as opposed to textual, is
an encouraging result, and provides evidence that our ongoing efforts at
integrating textual and visual information will be successful. In particular, the use by
nlm-se and other runs of cluster words, which are indexed and retrieved using
a traditional text-based information retrieval system, is an effective way not
only of incorporating visual information with text but also of avoiding the
computational expense common among content-based retrieval methods. Furthermore,
Merge RankToScore weighted demonstrates the value of rank merging when
combining textual and visual retrieval results. Some of our other mixed runs, in
utilizing the results of our modality classifiers, may have been weakened due to
the modest performance of our classification methods.</p>
      </sec>
      <sec id="sec-7-3">
        <title>8.3 Case-based Article Retrieval Task</title>
        <p>Table 5 presents the MAP, bpref, and P@10 of our submitted runs for the
case-based article retrieval task. nlm-lc-total-sum, a textual approach using language
model similarity, achieved the highest MAP (0.1035) among our submitted runs
and was ranked seventh overall. However, it ranked within the submissions from
the top four participating groups. Using Fisher's paired randomization test,
we find that there is no statistically significant difference (p &lt; 0.05) in MAP
among any of our submitted runs. The relatively low performance of most of the
ImageCLEF 2012 case-based submissions may be due, in part, to the existence
in the collection of only a small number of case reports, clinical trials, or other
types of documents relevant for case-based topics.
</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Conclusion</title>
      <p>This article describes the methods and results of the Image and Text Integration
(ITI) group in the ImageCLEF 2012 medical retrieval and classification tasks.
For the modality classification task, our best submission was ranked within the
submissions from the top three participating groups. Our best submission for the
ad-hoc image retrieval task was ranked first overall. Finally, for the case-based
article retrieval task, our best submission was ranked within the submissions from
the top four participating groups, though we found no statistically significant
difference between this run and our other case-based submissions. The effectiveness of our
multimodal approaches is encouraging and provides evidence that our ongoing
efforts at integrating textual and visual information will be successful.
Acknowledgments. We would like to thank Antonio Jimeno-Yepes for assisting in
expanding case-based topics with disease names, Russell Loane for providing source code
for converting frame-based topics to Essie queries, and Srinivas Phadnis for constructing
enriched citations and extracting visual features.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Apostolova</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>You</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xue</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thoma</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Image retrieval from scientific publications: Text and image content processing to separate multi-panel figures</article-title>
          .
          <source>Journal of the American Society for Information Science and Technology</source>
          (To appear)
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Aronson</surname>
            ,
            <given-names>A.R.:</given-names>
          </string-name>
          <article-title>Effective mapping of biomedical text to the UMLS Metathesaurus: The MetaMap program</article-title>
          .
          <source>In: Proc. of the Annual Symp. of the American Medical Informatics Association (AMIA)</source>
          . pp.
          <volume>17</volume>
          -
          <issue>21</issue>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Arthur</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vassilvitskii</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>k-means++: The advantages of careful seeding</article-title>
          .
          <source>In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms</source>
          . pp.
          <volume>1027</volume>
          -
          <fpage>1035</fpage>
          . SODA '
          <volume>07</volume>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <issue>4</issue>
          .
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>S.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sikora</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Puri</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Overview of the MPEG-7 standard</article-title>
          .
          <source>IEEE Transactions on Circuits and Systems for Video Technology</source>
          <volume>11</volume>
          (
          <issue>6</issue>
          ),
          <volume>688</volume>
          -
          <fpage>695</fpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5. Chatzichristofis,
          <string-name>
            <given-names>S.A.</given-names>
            ,
            <surname>Boutalis</surname>
          </string-name>
          ,
          <string-name>
            <surname>Y.S.:</surname>
          </string-name>
          <article-title>CEDD: Color and edge directivity descriptor: A compact descriptor for image indexing and retrieval</article-title>
          . In: Gasteratos,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Vincze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            ,
            <surname>Tsotsos</surname>
          </string-name>
          ,
          <string-name>
            <surname>J.K</surname>
          </string-name>
          . (eds.)
          <source>Proceedings of the 6th International Conference on Computer Vision Systems. Lecture Notes in Computer Science</source>
          , vol.
          <volume>5008</volume>
          , pp.
          <volume>312</volume>
          -
          <fpage>322</fpage>
          . Springer (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6. Chatzichristofis,
          <string-name>
            <given-names>S.A.</given-names>
            ,
            <surname>Boutalis</surname>
          </string-name>
          ,
          <string-name>
            <surname>Y.S.:</surname>
          </string-name>
          <article-title>FCTH: Fuzzy color and texture histogram: A low level feature for accurate image retrieval</article-title>
          .
          <source>In: Proceedings of the 9th International Workshop on Image Analysis for Multimedia Interactive Services</source>
          . pp.
          <volume>191</volume>
          -
          <issue>196</issue>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Answering clinical questions with knowledge-based and statistical techniques</article-title>
          .
          <source>Computational Linguistics</source>
          <volume>33</volume>
          (
          <issue>1</issue>
          ),
          <volume>63</volume>
          -103 (Mar
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Frank</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Holmes</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pfahringer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Reutemann</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Witten</surname>
            ,
            <given-names>I.H.</given-names>
          </string-name>
          :
          <article-title>The WEKA data mining software: An update</article-title>
          .
          <source>SIGKDD Explorations</source>
          <volume>11</volume>
          (
          <issue>1</issue>
          ) (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Ide</surname>
            ,
            <given-names>N.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loane</surname>
            ,
            <given-names>R.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Essie: A concept-based search engine for structured biomedical text</article-title>
          .
          <source>Journal of the American Medical Informatics Association</source>
          <volume>1</volume>
          (
          <issue>3</issue>
          ),
          <volume>253</volume>
          -
          <fpage>263</fpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Jung</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>K.I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jain</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          :
          <article-title>Text information extraction in images and video: A survey</article-title>
          .
          <source>Pattern Recognition</source>
          <volume>37</volume>
          (
          <issue>5</issue>
          ),
          <volume>977</volume>
          -
          <fpage>997</fpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Lindberg</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Humphreys</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCray</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The Unified Medical Language System</article-title>
          .
          <source>Methods of Information in Medicine</source>
          <volume>32</volume>
          (
          <issue>4</issue>
          ),
          <volume>281</volume>
          -
          <fpage>291</fpage>
          (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Lowe</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Object recognition from local scale-invariant features</article-title>
          .
          <source>In: Proceedings of the Seventh IEEE International Conference on Computer Vision</source>
          . vol.
          <volume>2</volume>
          , pp.
          <volume>1150</volume>
          -
          <issue>1157</issue>
          (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Lux</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chatzichristofis</surname>
          </string-name>
          , S.A.:
          <article-title>LIRe: Lucene image retrieval: an extensible Java CBIR library</article-title>
          .
          <source>In: Proceedings of the 16th ACM International Conference on Multimedia</source>
          . pp.
          <volume>1085</volume>
          -
          <issue>1088</issue>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14. Maenpaa,
          <string-name>
            <surname>T.</surname>
          </string-name>
          :
          <article-title>The Local Binary Pattern Approach to Texture Analysis: Extensions and Applications</article-title>
          .
          <source>Ph.D. thesis</source>
          , University of Oulu (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15. Muller, H.,
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalpathy-Cramer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Eggel</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Overview of the ImageCLEF 2012 medical image retrieval and classification tasks</article-title>
          .
          <source>In: CLEF 2012 Working Notes</source>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Rahman</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thoma</surname>
          </string-name>
          , G.:
          <article-title>A medical image retrieval framework in correlation enhanced visual concept feature space</article-title>
          .
          <source>In: Proceedings of the 22nd IEEE International Symposium on Computer-Based Medical Systems</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Simpson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rahman</surname>
            ,
            <given-names>M.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Phadnis</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Apostolova</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Antani</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thoma</surname>
          </string-name>
          , G.:
          <article-title>Text- and content-based approaches to image modality classification and retrieval for the ImageCLEF 2011 medical retrieval track (</article-title>
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Smucker</surname>
            ,
            <given-names>M.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Allan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carterette</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>A comparison of statistical significance tests for information retrieval evaluation</article-title>
          .
          <source>In: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management</source>
          . pp.
          <volume>623</volume>
          -
          <issue>632</issue>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Srinivasan</surname>
            ,
            <given-names>G.N.</given-names>
          </string-name>
          , G.,
          <string-name>
            <surname>S.</surname>
          </string-name>
          :
          <article-title>Statistical texture analysis</article-title>
          .
          <source>In: Proceedings of World Academy of Science, Engineering and Technology</source>
          . vol.
          <volume>36</volume>
          , pp.
          <volume>1264</volume>
          -
          <issue>9</issue>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Tamura</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mori</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yamawaki</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Textural features corresponding to visual perception</article-title>
          .
          <source>IEEE Transactions on Systems, Man, and Cybernetics</source>
          <volume>8</volume>
          (
          <issue>6</issue>
          ),
          <volume>460</volume>
          -
          <fpage>73</fpage>
          (
          <year>1978</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>