Word Indexing Versus Conceptual Indexing in Medical Image Retrieval

Karim Gasmi

karimgasmi@yahoo.fr

Mouna Torjmen-Khemakhem

torjmen.mouna@redcad.org

Maher Ben Jemaa

maher.benjemaa@enis.rnu.tn

Research unit on Development and Control of Distributed Applications (ReDCAD), Department of Computer Science and Applied Mathematics, National School of Engineers of Sfax, University of Sfax

This paper presents our participation in the medical image retrieval task of ImageCLEF 2012. Our aim is to study the effectiveness of conceptual indexing compared to word indexing in medical image retrieval. To this end, we used, on the one hand, the Terrier tool for textual indexing and retrieval and, on the other hand, the MetaMap tool for conceptual indexing and the vector model for conceptual retrieval. The run of the BM25 model is considered as the baseline. For textual indexing, we compared different weighting formulas. For conceptual indexing, we used the BM25 results to extract concepts and re-ranked these results using the vector model. Results show that textual indexing is more effective than conceptual indexing. However, conceptual indexing improves the results of some queries, which encourages us to continue the study of conceptual indexing and retrieval.

Keywords: medical image retrieval, information retrieval model, re-ranking, conceptual indexing, MetaMap

1 Introduction

Classical Information Retrieval (IR) models retrieve documents that share words, at least in part, with the query. But the same meaning can be expressed by different words, and the same word can express different meanings in different contexts. The assumption that matching words implies matching meaning is precisely the pitfall of traditional approaches to IR. Overcoming this limitation is the subject of several recent research projects, in particular the approaches known as "concept-based" IR. The choice of the retrieval model is also crucial, since it directly affects the results of any IR system. We therefore evaluated different retrieval models using the Terrier IR platform 1 and tried to improve them by using conceptual indexing.

1 http://terrier.org/docs/v3.5/

Our approach uses two types of indexing, words and/or concepts, to re-rank the results obtained by the BM25 model. The goals of this research are:

1. To study the influence of each retrieval model on the performance of the information retrieval system.
2. To study the influence of applying two retrieval models in series on the performance of the information retrieval system.
3. To study the effects of using concepts for indexing on the performance of the information retrieval system.

The two indexing methods are summarized in Figure 1. The paper is organized as follows: Section 2 describes the retrieval models used in the different runs, Section 3 describes the conceptual indexing of medical images, and Section 4 describes the runs and the results obtained, before the conclusion in Section 5.

2 Word Indexing of medical images

The manual textual indexing of images is usually performed by a specialized librarian called an iconographer. The iconographer categorizes and indexes images by associating them with categories and groups of words, often taken from a thesaurus, so that the images can be found quickly. Unfortunately, the choice of indexing terms is a problem for picture researchers, because a user is unlikely to choose the same keywords as the iconographer. The indexing of an image is therefore subjective, since several indexings are possible.

Despite its subjectivity, manual indexing is an effective way to associate a meaning with images. However, for a large volume of images it quickly becomes tedious or impossible, which is not the case for automatic indexing. Automatic textual indexing associates words with an image using a computer system, without human intervention. For images on the web, for example, the index terms can be taken from the words of the page title or from the most frequent or most relevant words of the page.

Every information retrieval system needs a weighting model. The term weighting process must provide a compact and informative representation of the content of the documents with respect to the query terms, together with an indicator of importance that discriminates the terms from one another. Although several approaches and techniques have been developed around this importance factor (the term weight), almost all of them use these two components:

TF (Term Frequency): the more frequently a term occurs in a document, the more important it is in that document.

IDF (Inverse Document Frequency): the rarer a term is in the collection, the more important it is in the documents that contain it.

2.1 Model BM25 [5]

BM25 is a ranking function used by search engines to rank matching documents according to their relevance to a given query. It is a probabilistic model, given by equation (1):

$$\mathrm{BM25} = \sum_{t \in q \cap d} \left( \frac{tf}{tf + k_1 \cdot nb} \cdot \log\left(\frac{N - df_t + 0.5}{df_t + 0.5}\right) \cdot qtf \right) \qquad (1)$$

with:

- $tf$: frequency of occurrences of the term $t$ in the document,
- $N$: total number of documents in the collection,
- $df_t$: number of documents containing the term $t$,
- $qtf$: frequency of occurrences of the term $t$ in the query,
- $k_1$: parameter influencing the term frequency, set to 1.2 by default,
- $nb$: normalization factor, calculated as follows:

$$nb = (1 - b) + b \cdot \frac{tl}{tl_{avg}} \qquad (2)$$

with:

- $tl$: number of terms in the document (document length),
- $tl_{avg}$: average number of terms in a document.
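As an illustration only, equations (1) and (2) can be sketched in a few lines of Python. This is not the Terrier implementation: the function name `bm25_score`, the dictionary-based document frequencies and the natural logarithm are assumptions of this sketch, and the default `b = 0.75` is taken from the name of our BM25 run.

```python
import math
from collections import Counter


def bm25_score(query_terms, doc_terms, doc_freq, n_docs, avg_len, k1=1.2, b=0.75):
    """Score one document for one query with the BM25 form of equation (1).

    doc_freq maps a term to df_t; the log base (natural) is an assumption.
    """
    tf_doc = Counter(doc_terms)
    tf_query = Counter(query_terms)
    # Length normalization factor nb of equation (2).
    nb = (1 - b) + b * len(doc_terms) / avg_len
    score = 0.0
    for term, qtf in tf_query.items():
        tf = tf_doc.get(term, 0)
        if tf == 0:
            continue  # the sum only runs over terms shared by query and document
        df = doc_freq[term]
        idf = math.log((n_docs - df + 0.5) / (df + 0.5))
        score += tf / (tf + k1 * nb) * idf * qtf
    return score
```

With this form, a document that is longer than average has `nb > 1`, so the same term frequency contributes less to its score.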

2.2 Model TF_IDF (Term Frequency Inverse Document Frequency)

This model weights a term by its relative frequency in a specific document against the inverse proportion of documents in the whole corpus that contain it, as given by equation (3) 2:

$$TF\_IDF = Robertson\_tf \cdot idf \cdot K_f \qquad (3)$$

with:

- $idf = \log\left(\frac{nd}{df} + 1\right)$,
- $Robertson\_tf = k_1 \cdot \frac{tf}{tf + k_1 \cdot \left(1 - b + b \cdot \frac{dl}{tl_{avg}}\right)}$,
- $tf$: the term frequency of the term in the document,
- $dl$: the document's length,
- $tl_{avg}$: the average document length in the collection,
- $df$: the document frequency of the term,
- $K_f$: the term frequency in the query,
- $nd$: the number of documents.

2.3 Model In_expB2

Inverse expected document frequency model for randomness, the ratio of two Bernoulli processes for first normalisation, and Normalisation 2 for term frequency normalisation (equation (4)). The symbols of equation (4) are defined after equation (5).

$$\omega(t, d) = \frac{F + 1}{n_t \cdot (tfn + 1)} \left( tfn \cdot \log_2 \frac{N + 1}{n_e + 0.5} \right) \qquad (4)$$

2.4 Model BB2 [1][4]

Bose-Einstein model for randomness, the ratio of two Bernoulli processes for first normalisation, and Normalisation 2 for term frequency normalisation (equation (5)).

$$\omega(t, d) = \frac{F + 1}{n_t \cdot (tfn + 1)} \left( -\log_2(N - 1) - \log_2(e) + f(N + F - 1,\; N + F - tfn - 2) - f(F,\; F - tfn) \right) \qquad (5)$$

In equations (4) and (5):

- $\omega(t, d)$ is the within-document term weight of the term $t$ in the document $d$,
- $tf$ is the within-document frequency of the term $t$ in the document $d$,
- $F$ is the term frequency of the term $t$ in the whole collection,
- $N$ is the number of documents in the collection,
- $n_t$ is the document frequency of the term $t$,
- $\lambda$ is given by $F / N$,
- $n_e = N \cdot \left(1 - \left(1 - \frac{n_t}{N}\right)^{F}\right)$,
- $f(n, m) = (m + 0.5) \cdot \log_2\left(\frac{n}{m}\right) + (n - m) \cdot \log_2 n$,
- $tfn = tf \cdot \log_2\left(1 + c \cdot \frac{avg\_l}{l}\right)$, where $c$ is a parameter, and $l$ and $avg\_l$ are the document length of the document $d$ and the average document length in the collection, respectively.

2 http://terrier.org/docs/v3.5/
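The two DFR weights of equations (4) and (5) can be transcribed directly, which makes it easier to check how the shared normalisation $tfn$ enters both models. The following Python sketch is only a transcription of the formulas as written above, not Terrier code; the function names, the argument order and the default `c = 1.0` are assumptions of this sketch. The BB2 weight is only defined when $F > tfn$ and $N > 1$.

```python
import math


def tfn(tf, doc_len, avg_len, c=1.0):
    """Normalisation 2: term frequency rescaled by document length."""
    return tf * math.log2(1 + c * avg_len / doc_len)


def in_expb2(tf, doc_len, avg_len, F, n_t, N, c=1.0):
    """Within-document term weight of equation (4)."""
    t = tfn(tf, doc_len, avg_len, c)
    n_e = N * (1 - (1 - n_t / N) ** F)
    return (F + 1) / (n_t * (t + 1)) * (t * math.log2((N + 1) / (n_e + 0.5)))


def f(n, m):
    """Helper f(n, m) used by the Bose-Einstein model."""
    return (m + 0.5) * math.log2(n / m) + (n - m) * math.log2(n)


def bb2(tf, doc_len, avg_len, F, n_t, N, c=1.0):
    """Within-document term weight of equation (5); requires F > tfn and N > 1."""
    t = tfn(tf, doc_len, avg_len, c)
    return (F + 1) / (n_t * (t + 1)) * (
        -math.log2(N - 1)
        - math.log2(math.e)
        + f(N + F - 1, N + F - t - 2)
        - f(F, F - t)
    )
```

The score of a document for a query is then obtained by summing the chosen weight over the query terms that occur in the document.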

3 Conceptual Indexing of medical images

To extract concepts from the "caption + title" of each image, we chose MetaMap 3, which performs the following steps [3]:

1. Parse the text into noun phrases.
2. Generate the variants of each noun phrase, where a variant consists of one or more words of the phrase together with all their spelling variants, abbreviations, acronyms, synonyms, inflectional and derivational variants, and meaningful combinations of these.
3. Retrieve the candidate set of all Metathesaurus strings containing at least one of the variants found in step 2.
4. For each candidate found in step 3, compute the mapping from the noun phrase and calculate the strength of the mapping using an evaluation function.
5. Combine candidates involved with disjoint parts of the noun phrase, recompute the match strength based on the combined candidates, and select those having the highest score to form a set of best Metathesaurus mappings for the original noun phrase.

The evaluation function used to calculate the strength of the mapping is based on four components: centrality, variation, coverage, and cohesiveness. A normalized value between 0 (the weakest match) and 1 (the strongest match) is computed for each of these components.

4 Evaluation

The collection used in the medical retrieval task has a size of 2.5 GB and consists of 306,528 documents. There are 22 queries [2].

3 http://metamap.nlm.nih.gov/

For the word indexing runs, we used the Terrier IR platform 4, an open source search engine written in Java and developed at the School of Computing, University of Glasgow.

For the conceptual indexing runs, we used MetaMap: a document (respectively a query) is represented as a set of weighted concepts extracted using the UMLS 5.
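To make the concept-based representation concrete, the sketch below compares a query and a document that are each stored as a mapping from concept identifier to weight. It assumes cosine similarity as the matching function of the vector model, which is the usual choice but is an assumption of this sketch, as are the function name `cosine` and the placeholder identifiers in the usage example.

```python
import math


def cosine(query_vec, doc_vec):
    """Cosine similarity between two sparse concept vectors (dict: concept -> weight)."""
    dot = sum(w * doc_vec.get(concept, 0.0) for concept, w in query_vec.items())
    norm_q = math.sqrt(sum(w * w for w in query_vec.values()))
    norm_d = math.sqrt(sum(w * w for w in doc_vec.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_q * norm_d)


# Hypothetical concept identifiers and weights, for illustration only.
query = {"concept_a": 1.0, "concept_b": 0.5}
document = {"concept_a": 0.8, "concept_c": 0.3}
similarity = cosine(query, document)
```

Only concepts shared by the query and the document contribute to the numerator, so a document with no extracted concept in common with the query receives a similarity of zero.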

4 http://terrier.org/docs/v3.5/

5 http://www.nlm.nih.gov/research/umls/

For all runs, the title, caption and abstract were used to represent images. Our runs are described as follows.

4.1 Evaluation of the use of textual indexing

- Run1-Terrier CapTitAbs BM25b0.75 (BM25): our official run, which uses BM25 as the retrieval model and Terrier as the tool for textual indexing.
- Run2-Terrier CapTitAbs BB2 (BB2): this run uses the BB2 model, described in Section 2.4, to compare its results with those obtained by BM25.
- Run3-Terrier CapTitAbs In_expB2 (In_expB2): this run uses the In_expB2 model, described in Section 2.3, to compare its results with those obtained by BM25.
- Run4-Terrier CapTitAbs TF_IDF (TF_IDF): this run uses TF_IDF both for term weighting and as the retrieval model, to compare its results with those obtained by BM25.

We used MAP (Mean Average Precision) as the evaluation measure. The results in Table 1 show the difference between the four retrieval models: the In_expB2 model has the highest MAP, and the TF_IDF model has the lowest.
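For reference, the evaluation measures can be computed as in the following sketch, which follows the standard definitions of precision at rank k and of (mean) average precision. The function names and the data layout (a ranked list of document identifiers per query, and a set of relevant identifiers per query) are assumptions of this sketch; the official figures come from the evaluation of the task, not from this code.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at the rank of each relevant document retrieved."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    # Relevant documents that are never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(runs, qrels):
    """Average of the per-query average precision over all judged queries."""
    return sum(average_precision(runs.get(q, []), qrels[q]) for q in qrels) / len(qrels)
```

For example, for the ranking `["d1", "d2", "d3", "d4"]` with relevant set `{"d1", "d3"}`, the average precision is (1/1 + 2/3) / 2.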

- Run5-Terrier CapTitAbs BM25-DFR_BM25: for this run, we used the DFR_BM25 model to re-rank the results obtained by Run1.
- Run6-Terrier CapTitAbs BM25-In_expB2: same principle as Run5, but with the In_expB2 model for re-ranking.
- Run7-Terrier CapTitAbs BM25-TF_IDF: for this run, we used the TF_IDF model to re-rank the results obtained by Run1.

To improve the results obtained by the BM25 model, we re-sorted them with another model. Table 2 shows the results obtained. The MAP values confirm that using another model to re-rank the baseline results improves on the results obtained by BM25 alone.

However, the BM25 model gives the best results for P@5. The model can therefore be chosen according to the needs of the information retrieval system.
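The re-ranking principle of these runs, where a first model selects the documents and a second model only changes their order, can be sketched as follows. The function name `rerank`, the `depth` parameter and the callable `second_scorer` are hypothetical names introduced for this illustration; the sketch does not reproduce the Terrier configuration used for the runs.

```python
def rerank(first_stage, second_scorer, depth=1000):
    """Re-order the top results of a first model with the scores of a second model.

    first_stage: list of (doc_id, score) pairs, sorted by decreasing first-model score.
    second_scorer: callable returning the second-model score of a doc_id.
    depth: number of first-stage results that are re-scored (hypothetical default).
    """
    head = first_stage[:depth]
    rescored = [(doc_id, second_scorer(doc_id)) for doc_id, _ in head]
    # sorted() is stable, so ties keep the order given by the first model.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Usage with made-up scores: the second model promotes "b" and "c" over "a".
baseline = [("a", 3.0), ("b", 2.0), ("c", 1.0)]
second_scores = {"a": 0.1, "b": 0.9, "c": 0.9}
reranked = rerank(baseline, second_scores.get)
```

Because the second model never sees documents outside the first-stage list, the set of retrieved documents is unchanged and only rank-sensitive measures such as MAP and P@5 can differ between the two stages.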

With the BM25 model, as shown in Table 3, we obtain results that are clearly better than those achieved with concepts. However, analyzing the results query by query, we found that conceptual indexing improves the results for some queries. The results obtained with concepts might also improve if concepts were used from the beginning, without textual indexing as a first step: since these are two different types of indexing, the results obtained by BM25 may negatively affect the results achieved with concepts.

The results obtained with concepts could also be improved by using another model instead of the vector model.

5 Conclusion and future Work

In this paper, we have compared the use of word indexing and conceptual indexing.

Results show that word indexing performs better than conceptual indexing. However, we note that conceptual indexing significantly improves some queries. This finding encourages us to pursue our work on conceptual indexing and on conceptual retrieval. In future work, we plan to continue studying conceptual indexing and to propose a conceptual retrieval model for medical image retrieval. We also plan to propose a mixed approach that combines the visual appearance of an image with its conceptual description.

References

1. Ben He and Iadh Ounis. A query-based pre-retrieval model selection approach to information retrieval. In RIAO, pages 706-719, 2004.

2. Jayashree Kalpathy-Cramer, Dina Demner-Fushman, Sameer Antani, Ivan Eggel, Henning Müller, and Alba Garcia Seco de Herrera. Overview of the ImageCLEF 2012 medical image retrieval and classification tasks. CLEF 2012 working notes, Rome, Italy, 2012.

3. Quanzhi Li and Yi-fang Brook Wu. Identifying important concepts from medical documents. pages 668-679, 2006.

4. Sobhana N.V. Enhancing retrieval of geological text using named entity disambiguation. International Journal of Emerging Technology and Advanced Engineering, 2(1):2250-2459, 2012.

5. Stephen Robertson, Steve Walker, Micheline Hancock-Beaulieu, Aarron Gull, and Marianna Lau. Okapi at TREC. In Text REtrieval Conference, pages 21-30, 1992.