<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>UESTC at ImageCLEF 2010 medical retrieval task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hong Wu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Changjun Hu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sikun Chen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science and Engineering University of Electronic Science and Technology of China Chengdu 611731</institution>
          ,
          <country country="CN">P. R. China</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents the UESTC contribution to the ImageCLEF 2010 medical retrieval task. For ad-hoc retrieval and case-based retrieval, we use only text information and propose a phrase-based approach. Phrases, subphrases and individual words are used as indexing terms with the vector space model (VSM) for ranking. Phrases and subphrases are extracted with the help of MetaMap, so all extracted phrasal terms correspond to concepts in UMLS. Two term weighting methods are proposed: one weights terms by their idf, and the other assigns lower weights to phrasal terms. We also propose a query expansion method that extracts more phrases for the query by relaxing the restrictions on phrase extraction. For modality classification, we use three global texture features with SVM and AdaBoost.MH, respectively.</p>
      </abstract>
      <kwd-group>
        <kwd>text retrieval</kwd>
        <kwd>image retrieval</kwd>
        <kwd>medical retrieval</kwd>
        <kwd>modality classification</kwd>
        <kwd>phrase extraction</kwd>
        <kwd>MetaMap</kwd>
        <kwd>UMLS</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>This paper describes the first participation of the School of Computer Science and
Engineering at the University of Electronic Science and Technology of China (UESTC)
in the ImageCLEF 2010 medical retrieval task.</p>
      <p>
        ImageCLEFmed'10 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] includes three types of tasks: ad-hoc retrieval, case-based
retrieval and modality classification. For the retrieval tasks, a dataset similar to
those of 2008 and 2009 is used, but with a larger number of images. The dataset contains all
images (&gt;77,000) from articles published in Radiology and Radiographics, including
the text of the captions and a link to the HTML of the full-text articles. In the ad-hoc
retrieval task, a set of textual queries, each with several sample images, is
given, and the goal is to retrieve the images most relevant to each topic. In the
case-based task, a set of case-based information requests is given, and the goal is to
retrieve the articles most relevant to each topic case. In the modality classification
task, training and test medical images are given for classification based on their
modality, such as CT, MR, XR, etc.
      </p>
      <p>
        In this paper, we describe our phrase-based approach to the two retrieval tasks and
our classification algorithm for modality classification. For the retrieval tasks, only the text
of the title and caption is used. Phrases, subphrases and individual words
are used as indexing terms with the vector space model (VSM). Phrases and subphrases
are extracted with the help of MetaMap<sup>1</sup>, so that all phrasal terms
correspond to concepts in UMLS<sup>2</sup>. Since the text available for ad-hoc retrieval is very
short, traditional term weighting methods need to be adapted. We first propose to
weight terms by their idf and to measure similarity with the dot product. With this
scheme, however, phrasal terms are over-rewarded. We therefore give a second weighting
method, which assigns lower weights to phrasal terms. We also propose a query
expansion method that extracts more phrases for the query by relaxing the
restrictions on phrase extraction. For modality classification, we use three global
texture features with SVM [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and AdaBoost.MH [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] respectively.
      </p>
      <p>The remainder of this paper is organized as follows. The phrase-based retrieval
approach and the modality classification algorithm are described in Sections 2 and 3,
respectively. Our submitted runs and results are presented in Section 4, followed
by conclusions and future work in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Phrase-based Medical Retrieval</title>
      <sec id="sec-2-1">
        <title>2.1 Using phrase as indexing term</title>
        <p>
          The selection of appropriate indexing terms is critical to information retrieval.
Traditional retrieval systems use words or word stems as indexing terms. These
representations of content are usually inadequate, since single words are rarely specific
enough for accurate discrimination. A better method is to identify groups of words
that form meaningful phrases, especially phrases that denote important concepts in
the domain. This corresponds to using phrases or concepts as indexing
terms. In past years, concept-based approaches have been investigated in
ImageCLEFmed [
          <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
          ], but to the best of our knowledge, no work in this campaign has yet used
phrases as indexing terms. This year, we investigate
phrase-based medical retrieval.
        </p>
        <p>
          In the past, various types of phrases, such as sequential n-grams [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], head-modifier
pairs extracted from syntactic structures [
          <xref ref-type="bibr" rid="ref10 ref11 ref8 ref9">8, 9, 10, 11</xref>
          ], proximity-based phrases [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ],
were examined with conventional retrieval models (e.g. the vector space model). In our
approach, we consider phrases that correspond to medical concepts. The
phrases are extracted with the help of MetaMap, a highly configurable
program that maps biomedical text to the UMLS Metathesaurus. MetaMap maps the
longest possible phrase to a concept, so that it discovers the most specific concept
possible. If the detected concepts (CUIs) are used directly as indexing terms, this can cause
many mismatches between query terms and document terms, because a general
concept and a specific concept may both be relevant to a user’s need, and the meaning
of a concept can be expressed by a single phrase or by several words or phrases that co-occur in the
context. [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] gives an example of this:
          <fn id="fn1"><label>1</label><p>http://mmtx.nlm.nih.gov/</p></fn>
          <fn id="fn2"><label>2</label><p>http://www.nlm.nih.gov/research/umls</p></fn>
        </p>
        <p>The 24th query of ImageCLEF 2005 is “Show me images of right middle lobe
pneumonia”, and the best mapping of MetaMap gives these concepts:
“C0150627” (Images)
“C0578577” (Right middle lobe pneumonia)</p>
        <p>However, relevant documents that contain the concepts “C0032285” (pneumonia) or
“C0796494” (lobe) will not match the query concepts, and will therefore receive an
unfavorable ranking.</p>
        <p>
          One way to tackle this problem is to expand the query or document with
concepts related to the mapped concepts, e.g. hypernyms or hyponyms [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
Following this idea, we derive our approach as follows.
        </p>
        <p>When mapping a phrase to a concept, MetaMap also generates a set of candidates.
Each candidate consists of one or more constituent words of the phrase, or their variants,
and corresponds to a concept in UMLS. Some of these concepts are related to
the most specific mapped concept and can be used to expand the query or document.
However, when candidates are generated, a phrase can be mapped to several concepts (more
frequently for subphrases), and much noise is introduced if all corresponding
concepts are added. We therefore use phrases (and subphrases) instead of concepts
(CUIs) to represent documents, and phrases, subphrases and individual words are all
used as indexing terms. The subphrases of a noun phrase capture part of the meaning
of the noun phrase, and can be regarded as a weak representation of its meaning.
Using both a phrase and its subphrases increases the chance of a match
between a query and a document that express similar meanings in different linguistic forms.</p>
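        <p>The mismatch described above can be illustrated with a small sketch. The CUIs are those quoted in the example; the lists of phrasal and word terms are hand-written assumptions for illustration and are not actual MetaMap output.</p>
        <preformat><![CDATA[
```python
# Illustrative only: the term lists below are hand-written assumptions,
# not real MetaMap output.

def overlap(query_terms, doc_terms):
    """Return the sorted list of indexing terms shared by query and document."""
    return sorted(set(query_terms) & set(doc_terms))

# Concept-only indexing, using the CUIs quoted in the text.
query_cuis = ["C0150627", "C0578577"]   # Images; Right middle lobe pneumonia
doc_cuis = ["C0032285"]                 # pneumonia

# Phrase + subphrase + word indexing for the same query and document.
query_terms = [
    "right middle lobe pneumonia",            # phrase
    "middle lobe",                            # subphrase (assumed)
    "right", "middle", "lobe", "pneumonia",   # individual words
]
doc_terms = ["pneumonia"]

print(overlap(query_cuis, doc_cuis))    # []
print(overlap(query_terms, doc_terms))  # ['pneumonia']
```
]]></preformat>
        <p>With CUIs alone the query and document share no indexing term, whereas the word-level term still produces a match.</p>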
      </sec>
      <sec id="sec-2-2">
        <title>2.2 Phrase Extraction</title>
        <p>In our experiments, we do not develop our own phrase extraction algorithm. All phrases
and their subphrases are extracted by MetaMap.</p>
        <p>
          MetaMap [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] performs the following steps to map text to concepts for each textual
utterance:
          <list list-type="order">
            <list-item><p>Parse the text into noun phrases and perform the remaining steps for each phrase;</p></list-item>
            <list-item><p>Generate the variants for the noun phrase, where a variant essentially consists of one or more noun phrase words together with all of its spelling variants, abbreviations, acronyms, synonyms, inflectional and derivational variants, and meaningful combinations of these;</p></list-item>
            <list-item><p>Form the candidate set of all Metathesaurus strings containing one of the variants;</p></list-item>
            <list-item><p>For each candidate, compute the mapping from the noun phrase and calculate the strength of the mapping using an evaluation function. Order the candidates by mapping strength; and</p></list-item>
            <list-item><p>Combine candidates involved with disjoint parts of the noun phrase, recompute the match strength based on the combined candidates, and select those having the highest score to form a set of best Metathesaurus mappings for the original noun phrase.</p></list-item>
          </list>
        </p>
        <p>The best candidate corresponds to the longest phrase, and the other candidates
correspond to its subphrases or constituent words. The phrases and multi-word
subphrases are added to the query and document before indexing and retrieval.
MetaMap is designed for mapping the longest possible phrase to a concept, not for
phrase and subphrase extraction, and we find that it is not easy to control for phrase
extraction. For example, when processing the 8th query, “microscopic images of
streptococcus pneumonia”, MetaMap generates the candidate “streptococcus
pneumoniae” with LexVariation=0.5 due to an inflectional variation. For “chest
x-ray” in the 17th query, MetaMap generates the unwanted candidate “breast x-ray” with
LexVariation=2. There is no simple rule for selecting candidates whose meaning is closely
related to the best candidate, so we use a strict rule.</p>
        <p>In our experiments, we use the 0910 Strict Model dataset for MetaMap. When calling
MetaMap for phrase extraction, we disallow derivational variants by setting the
parameter ‘-d’, because derivational variants always involve a significant change in
meaning. From the output files of MetaMap, only candidates with ‘LexVariation’=0
and ‘MatchedWords Count’&gt;1 are selected to form the phrasal terms.
‘LexVariation’=0 means that no lexical variation is permitted in phrase extraction,
and ‘MatchedWords Count’&gt;1 means that only multi-word phrases are selected.</p>
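        <p>The strict selection rule can be sketched as a simple filter. The record layout and field names (text, lex_variation, matched_words) are hypothetical and do not mirror MetaMap's real output format; the LexVariation values for “streptococcus pneumoniae” and “breast x-ray” are those quoted above, while the remaining records are assumed for illustration.</p>
        <preformat><![CDATA[
```python
def select_phrasal_terms(candidates):
    """Keep candidates with no lexical variation that match more than one word."""
    return [
        c["text"]
        for c in candidates
        if c["lex_variation"] == 0 and c["matched_words"] > 1
    ]

# Hypothetical, hand-made candidate records (field names are assumptions).
candidates = [
    {"text": "chest x-ray", "lex_variation": 0, "matched_words": 2},
    {"text": "breast x-ray", "lex_variation": 2, "matched_words": 2},
    {"text": "streptococcus pneumoniae", "lex_variation": 0.5, "matched_words": 2},
    {"text": "chest", "lex_variation": 0, "matched_words": 1},
]

print(select_phrasal_terms(candidates))  # ['chest x-ray']
```
]]></preformat>
        <p>Single-word candidates are dropped here only because individual words are indexed separately as ordinary terms.</p>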
      </sec>
      <sec id="sec-2-3">
        <title>2.3 Term weighting</title>
        <p>We use phrasal terms and single-word terms with the VSM, and propose two term
weighting methods. For the ad-hoc retrieval task, the text available for each image (title and
caption) is much shorter than a document in traditional IR. We therefore expect
term frequency (tf) to be of little importance in this case, and use a simple term weighting
method in which only the idf of each indexing term is used as its weight. This term
weighting is also used for case-based retrieval. The similarity between query
and document is measured by the dot product of the query vector and the document vector.</p>
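        <p>A minimal sketch of this first weighting scheme follows. It rests on two assumptions not fixed by the text: idf is computed as log(N/df), and both the query vector and the document vector carry idf weights, so each shared term contributes its squared idf to the dot product. The toy documents are invented for illustration.</p>
        <preformat><![CDATA[
```python
import math

def idf_table(documents):
    """Compute idf = log(N / df) for every term in a list of term lists."""
    n_docs = len(documents)
    df = {}
    for terms in documents:
        for term in set(terms):
            df[term] = df.get(term, 0) + 1
    return {term: math.log(n_docs / count) for term, count in df.items()}

def score(query_terms, doc_terms, idf):
    """Dot product of idf-weighted binary vectors (assumed on both sides)."""
    shared = set(query_terms) & set(doc_terms)
    return sum(idf[t] * idf[t] for t in shared if t in idf)

# Toy collection; each document is its list of indexing terms.
documents = {
    "d1": ["lobe", "pneumonia"],
    "d2": ["pneumonia", "chest x-ray", "chest"],
    "d3": ["fracture"],
}
idf = idf_table(list(documents.values()))
query = ["lobe", "pneumonia"]
ranking = sorted(documents, key=lambda d: score(query, documents[d], idf),
                 reverse=True)
print(ranking)  # ['d1', 'd2', 'd3']
```
]]></preformat>
        <p>Term frequency plays no role: a term that occurs several times in a caption contributes the same weight as one that occurs once.</p>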
        <p>When the VSM is used to combine the weights of phrases, subphrases and single-word
terms, phrasal terms are over-rewarded, since the occurrence of a phrase in a document
also implies the occurrence of its subphrases and constituent words. To solve this
problem, we propose a second term weighting method that assigns lower weights
to phrasal terms. For convenience, we introduce two notions to
describe the relationship between phrases. A phrase or single word A is an
offspring component of a phrase B if and only if it is a subphrase or constituent word
of B. A phrase or single word A is a son component of a phrase B if
and only if A is an offspring component of B and no other offspring component of B
has A as its own offspring component. In the second term weighting method,
the weight of a phrasal term is its idf minus the maximum idf of its son
components.</p>
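        <p>The second weighting method can be sketched as follows. As a simplifying assumption, a term is treated as an offspring component of a phrase when its words form a proper contiguous subsequence of the phrase and the term is present in the indexing vocabulary; the idf values are invented for illustration.</p>
        <preformat><![CDATA[
```python
def is_offspring(a, b):
    """True if term a is a proper contiguous word subsequence of phrase b."""
    wa, wb = a.split(), b.split()
    if len(wa) >= len(wb):
        return False
    return any(wb[i:i + len(wa)] == wa for i in range(len(wb) - len(wa) + 1))

def son_components(phrase, vocabulary):
    """Offspring of phrase that are not offspring of another offspring."""
    offspring = [t for t in vocabulary if is_offspring(t, phrase)]
    return [a for a in offspring
            if not any(is_offspring(a, c) for c in offspring)]

def adjusted_weight(term, idf, vocabulary):
    """idf for single words; idf minus the largest son idf for phrases."""
    sons = son_components(term, vocabulary)
    if not sons:
        return idf[term]
    return idf[term] - max(idf[s] for s in sons)

# Hypothetical idf values, chosen only to make the arithmetic easy to follow.
idf = {
    "right middle lobe pneumonia": 6.0,
    "middle lobe": 4.0,
    "lobe": 3.0,
    "pneumonia": 2.5,
    "right": 1.0,
    "middle": 1.5,
}
vocabulary = list(idf)

print(sorted(son_components("right middle lobe pneumonia", vocabulary)))
# ['middle lobe', 'pneumonia', 'right']
print(adjusted_weight("right middle lobe pneumonia", idf, vocabulary))  # 2.0
print(adjusted_weight("middle lobe", idf, vocabulary))                  # 1.0
print(adjusted_weight("pneumonia", idf, vocabulary))                    # 2.5
```
]]></preformat>
        <p>In this toy case “lobe” and “middle” are offspring but not sons of the full phrase, because they are already covered by the son “middle lobe”.</p>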
      </sec>
      <sec id="sec-2-4">
        <title>2.4 Query Expansion</title>
        <p>Our query expansion algorithm simply relaxes the restrictions on phrase extraction.
For the query, all candidates with ‘MatchedWords Count’&gt;1 (multi-word candidates) are selected to form
phrasal terms, regardless of their ‘LexVariation’ value, so more phrasal terms are extracted than with the setting of Section 2.2.</p>
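        <p>As a rough sketch, query expansion amounts to dropping the lexical-variation condition when selecting candidates for the query. The record fields and the require_exact flag are hypothetical names introduced here; the LexVariation value of 0.5 for “streptococcus pneumoniae” is the one quoted in Section 2.2.</p>
        <preformat><![CDATA[
```python
def select_phrasal_terms(candidates, require_exact=True):
    """Select multi-word candidates; optionally require zero lexical variation."""
    selected = []
    for c in candidates:
        if c["matched_words"] <= 1:
            continue  # single words are indexed separately
        if require_exact and c["lex_variation"] != 0:
            continue  # strict setting: reject any lexical variation
        selected.append(c["text"])
    return selected

# Hypothetical candidate records for the 8th query.
candidates = [
    {"text": "streptococcus pneumoniae", "lex_variation": 0.5, "matched_words": 2},
    {"text": "streptococcus", "lex_variation": 0, "matched_words": 1},
]

print(select_phrasal_terms(candidates))                       # []
print(select_phrasal_terms(candidates, require_exact=False))  # ['streptococcus pneumoniae']
```
]]></preformat>
        <p>The relaxed setting admits more phrasal terms for the query, at the cost of possibly admitting unwanted variants.</p>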
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Modality Classification</title>
      <p>For modality classification, we use three global texture features: LBP texture feature,
Gabor texture feature and Tamura texture feature.</p>
      <p>
        LBP: Local Binary Pattern (LBP) [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] features have performed very well in various
applications, including texture classification and segmentation, image retrieval and
surface inspection. In our experiments, the LBP operator with 8 neighbors on a circle of
radius 4 is applied to each pixel, and the results are accumulated into a
256-dim LBP histogram.
      </p>
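      <p>A minimal pure-Python sketch of such an LBP histogram is given below. It makes simplifying assumptions that the text does not specify: sampling points on the circle are rounded to the nearest pixel instead of being interpolated, border pixels closer than the radius to the image edge are skipped, and a neighbor sets its bit when it is greater than or equal to the center.</p>
      <preformat><![CDATA[
```python
import math

def lbp_histogram(image, radius=4, neighbors=8):
    """256-bin LBP histogram of a 2-D list of gray levels (borders skipped)."""
    height, width = len(image), len(image[0])
    # Offsets of the sampling points on the circle, rounded to the pixel grid.
    offsets = [
        (round(-radius * math.sin(2 * math.pi * k / neighbors)),
         round(radius * math.cos(2 * math.pi * k / neighbors)))
        for k in range(neighbors)
    ]
    hist = [0] * (2 ** neighbors)
    for y in range(radius, height - radius):
        for x in range(radius, width - radius):
            center = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if image[y + dy][x + dx] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist

# A constant 10x10 image: every neighbor equals the center, so all bits are set.
flat = [[5] * 10 for _ in range(10)]
hist = lbp_histogram(flat)
print(len(hist), hist[255])  # 256 4
```
]]></preformat>
      <p>With 8 neighbors there are 2^8 = 256 possible codes, which is where the 256 dimensions come from.</p>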
      <p>
        Tamura Texture Feature: Based on research into textural features corresponding
to human visual perception, Tamura et al. [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] proposed six basic textural features,
namely coarseness, contrast, directionality, line-likeness, regularity, and roughness.
In our experiments, the coarseness, contrast and directionality features are computed on a
per-pixel basis, and the values are quantized into a three-dimensional histogram
(8×8×8 = 512 bins) to form one 512-dim vector.
      </p>
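      <p>The quantization step can be sketched as follows, assuming the per-pixel coarseness, contrast and directionality values have already been computed and normalized to the range [0, 1]. The normalization and the bin ordering are assumptions made for this illustration; the computation of the Tamura measures themselves is not shown.</p>
      <preformat><![CDATA[
```python
def joint_histogram(coarseness, contrast, directionality, levels=8):
    """Quantize three per-pixel value lists into a levels^3-bin joint histogram."""
    hist = [0] * levels ** 3
    for c, k, d in zip(coarseness, contrast, directionality):
        # Map each value in [0, 1] to an integer level in 0 .. levels-1.
        qc, qk, qd = (min(int(v * levels), levels - 1) for v in (c, k, d))
        hist[(qc * levels + qk) * levels + qd] += 1
    return hist

# Three pixels with made-up, already normalized feature values.
hist = joint_histogram([0.0, 1.0, 0.5], [0.0, 1.0, 0.0], [0.0, 1.0, 0.99])
print(len(hist))                        # 512
print(hist[0], hist[511], hist[263])    # 1 1 1
```
]]></preformat>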
      <p>
        Gabor Texture Feature: Gabor-filter-based approaches are popular for texture
feature extraction. Following the work of Manjunath et al. [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], Gabor filters with 3
scales and 4 orientations are used to filter the image, and the values in each of the 12 filtered images
are quantized into 10 bins to form a 120-dim histogram feature.
      </p>
      <p>For feature combination, the three features are simply concatenated to form an 888-dim feature
vector (256 + 512 + 120).</p>
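      <p>The dimension bookkeeping can be checked with a short sketch. The function name combine and the concatenation order are assumptions; only the three dimensions are taken from the text.</p>
      <preformat><![CDATA[
```python
LBP_DIM = 2 ** 8           # 8 neighbors give 256 binary patterns
TAMURA_DIM = 8 * 8 * 8     # 3-D histogram with 8 levels per axis = 512
GABOR_DIM = 3 * 4 * 10     # 3 scales x 4 orientations x 10 bins = 120

def combine(lbp, tamura, gabor):
    """Concatenate the three global texture features into one vector."""
    if (len(lbp), len(tamura), len(gabor)) != (LBP_DIM, TAMURA_DIM, GABOR_DIM):
        raise ValueError("unexpected feature dimension")
    return list(lbp) + list(tamura) + list(gabor)

feature = combine([0] * LBP_DIM, [0] * TAMURA_DIM, [0] * GABOR_DIM)
print(len(feature))  # 888
```
]]></preformat>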
      <p>
        We use two algorithms for classification. One is SVM [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] with an RBF kernel, where the
one-vs-one strategy is used for multi-class classification. LibSVM [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] is used in our
experiments, and the parameters are tuned by cross-validation on the training data. The
other is AdaBoost.MH [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], a multi-class boosting algorithm. An implementation
named MultiBoost [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] is used in our experiments.
      </p>
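      <p>The one-vs-one strategy can be illustrated with a small voting sketch. The pairwise classifiers below are toy threshold rules on a single number and merely stand in for trained binary RBF-kernel SVMs; this is not the LibSVM API, and the tie-breaking rule (first class in the list wins) is an assumption.</p>
      <preformat><![CDATA[
```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(sample, classes, pairwise):
    """Majority vote over all class pairs.

    pairwise[(a, b)] is a callable that returns either a or b for the sample.
    Ties are broken in favor of the class listed first (an assumption).
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](sample)] += 1
    return max(classes, key=lambda c: votes[c])

classes = ["CT", "MR", "XR"]
# Toy stand-ins for trained binary classifiers.
pairwise = {
    ("CT", "MR"): lambda x: "CT" if x < 1 else "MR",
    ("CT", "XR"): lambda x: "CT" if x < 2 else "XR",
    ("MR", "XR"): lambda x: "MR" if x < 3 else "XR",
}

print(one_vs_one_predict(0.5, classes, pairwise))  # CT
print(one_vs_one_predict(2.5, classes, pairwise))  # MR
print(one_vs_one_predict(5.0, classes, pairwise))  # XR
```
]]></preformat>
      <p>For k modality classes this scheme needs k(k-1)/2 binary classifiers, one per pair of classes.</p>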
    </sec>
    <sec id="sec-4">
      <title>4. Submitted Runs and Results</title>
      <p>
        For ad-hoc retrieval, a collection containing only titles and captions is used, since this was shown
to be effective and obtained the best results in ImageCLEFmed 2008 [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. After
the phrasal terms are added, the collection is indexed with the Lemur IR toolkit<sup>3</sup>. We also
extend the stop word list with common query terms that are not
relevant to the medical domain, such as ‘image’, ‘photo’, and ‘figure’. For convenience,
the same procedure is applied to case-based retrieval without modification. However, the
use of titles and captions only may lose important information for case-based retrieval
and result in poor performance.
      </p>
      <sec id="sec-4-1">
        <title>4.1 Ad-hoc Retrieval</title>
        <p>
          We have submitted the following 3 textual runs for the 16 ad-hoc topics [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ]:
          <fn id="fn3"><label>3</label><p>http://www.lemurproject.org/</p></fn>
          <list list-type="order">
            <list-item><p>UESTC_image_pBasic: Phrasal terms are extracted by the approach described in Section 2.2, and terms are weighted by their idf, as in the first method of Section 2.3. Similarity is measured by the dot product of the query vector and the document vector.</p></list-item>
            <list-item><p>UESTC_image_pNw: This run is similar to the basic run (UESTC_image_pBasic), but the term weighting method is changed to the second method of Section 2.3.</p></list-item>
            <list-item><p>UESTC_image_pQE: This run is similar to the basic run (UESTC_image_pBasic), but the query expansion method (Section 2.4) is used to obtain more phrasal terms for the query.</p></list-item>
          </list>
        </p>
        <p>To evaluate the effectiveness of phrasal terms, we conducted an additional run,
Image_word_idf, when preparing this report. Image_word_idf uses word stems as
indexing terms, and its term weighting is the same as that of UESTC_image_pBasic.
Table 1 gives the results of our three submitted runs and the additional run for
ad-hoc retrieval. The performances of the three submitted runs are very similar. The
performance of UESTC_image_pNw is better than that of UESTC_image_pBasic, but the
improvement is slight. UESTC_image_pQE achieves the best MAP (0.2789) of our
submitted runs, and is ranked 3rd among the best official runs of each group for
automatic textual retrieval. However, the P10 of UESTC_image_pQE is lower than that of the other
two runs. The MAPs and bPrefs of the three phrase-based runs are
higher than those of the word-stem-based run Image_word_idf, which suggests that
phrases are useful in medical retrieval. Table 2 presents the performance of the best official run of
each group for automatic textual retrieval. The third run, shown in bold, is our best
official textual run.
[Table residue: run NMFText_k2_11, MAP 0.3235]</p>
        <p>With our original method, the 16th ad-hoc topic, “images of dermatofibroma”,
matches no document. For the submitted runs, we therefore modified this query by inserting a space character, giving
“images of dermato fibroma”. From the raw
results, we calculated corrected results, which correspond to runs without the modification
of the 16th topic. The corrected MAPs of UESTC_image_pQE, UESTC_image_pNw and
UESTC_image_pBasic are 0.2777, 0.2739, and 0.2701, respectively. The corrected
bPrefs are 0.2969, 0.3027, and 0.2962, respectively. The P10s
remain unchanged. The differences between the corrected and original results are
slight, and do not affect the conclusions drawn from the submitted runs.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2 Case-based Retrieval</title>
        <p>
          The methods used in ad-hoc retrieval are directly used for case-based retrieval, and 3
textual runs are submitted for 14 case-based topics [
          <xref ref-type="bibr" rid="ref1">1</xref>
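          ].
          <list list-type="order">
            <list-item><p>UESTC_case_pBasic: This run uses the same method as UESTC_image_pBasic.</p></list-item>
            <list-item><p>UESTC_case_pNw: This run uses the same method as UESTC_image_pNw.</p></list-item>
            <list-item><p>UESTC_case_pQE: This run uses the same method as UESTC_image_pQE.</p></list-item>
          </list>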
        </p>
        <p>For evaluation, we also conducted an additional run, Case_word_idf, which uses the
same method as Image_word_idf. All four runs are automatic
textual runs.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3 Modality Classification</title>
        <p>We use the LBP, Gabor and Tamura texture features for
modality classification, and submit two visual runs with different classifiers.
          <list list-type="order">
            <list-item><p>UESTC_modality_boosting: This run uses AdaBoost.MH with the three global texture features for modality classification.</p></list-item>
            <list-item><p>UESTC_modality_svm: This run uses SVM with the three global texture features for modality classification.</p></list-item>
          </list>
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions and Future Work</title>
      <p>This paper describes our contribution to the ImageCLEF 2010 medical retrieval task.
For ad-hoc retrieval, we submitted 3 runs with our phrase-based approaches.
With the same methods, 3 runs were submitted for case-based retrieval. For
modality classification, 2 runs were submitted, using global texture features with
two different classifiers. The runs submitted to ad-hoc retrieval and
modality classification performed well, ranking 3rd in automatic textual
retrieval and 2nd in modality classification.</p>
      <p>Our research on medical retrieval is still preliminary, in both phrase extraction and
term weighting, and we have not yet made an extensive comparison of different methods. In the
future, we will develop and compare different phrase extraction algorithms and term
weighting schemes, and use more text features for case-based retrieval. For
modality classification, we plan to test other visual features and more advanced
classification algorithms.</p>
      <p>Acknowledgments. This research is partly supported by the National Science
Foundation of China under Grant 60873185 and by the Key Program of the Youth
Science Foundation of UESTC under Grant JX0745.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Henning</given-names>
            <surname>Müller</surname>
          </string-name>
          , Jayashree Kalpathy-Cramer, Ivan Eggel, Steven Bedrick,
          <string-name>
            <given-names>Charles E. Kahn</given-names>
            <surname>Jr.</surname>
          </string-name>
          , and
          <string-name>
            <given-names>William</given-names>
            <surname>Hersh</surname>
          </string-name>
          .
          <article-title>Overview of the CLEF 2010 medical image retrieval track</article-title>
          .
          <source>In the Working Notes of CLEF</source>
          <year>2010</year>
          , Padova, Italy, (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Vapnik</surname>
            ,
            <given-names>V.N.:</given-names>
          </string-name>
          <article-title>The nature of statistical learning theory</article-title>
          . Springer, Heidelberg (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Schapire</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          &amp;
          <string-name>
            <surname>Singer</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          , '
          <article-title>Improved boosting algorithms using confidence-rated prediction'</article-title>
          ,
          <source>Machine Learning 37(3)</source>
          ,
          <fpage>297</fpage>
          -
          <lpage>336</lpage>
          , (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Lacoste</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chevallet</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lim</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wei</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Raccoceanu</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>T.H.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Teodorescu</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vuillenemot</surname>
          </string-name>
          , N.:
          <article-title>Ipal knowledge-based medical image retrieval in imageclefmed 2006</article-title>
          .
          <source>In: Working Notes for the CLEF 2006 Workshop</source>
          , Alicante, Spain,
          <source>September</source>
          <volume>20</volume>
          -
          <fpage>22</fpage>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chevallet</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lim</surname>
            ,
            <given-names>J.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>T.H.D.</given-names>
          </string-name>
          :
          <article-title>Domain knowledge conceptual inter-media indexing, application to multilingual multimedia medical reports</article-title>
          .
          <source>In: ACM Sixteenth Conference on Information and Knowledge Management (CIKM</source>
          <year>2007</year>
          ), November 6-
          <issue>9</issue>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Maisonnasse</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaussier</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chevallet</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          :
          <article-title>Multiplying concept sources for graph modeling</article-title>
          . In: Peters,
          <string-name>
            <given-names>C.</given-names>
            ,
            <surname>Jijkoun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Mandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            ,
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            ,
            <surname>Oard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.W.</given-names>
            ,
            <surname>Peñas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Petras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            ,
            <surname>Santos</surname>
          </string-name>
          ,
          <string-name>
            <surname>D</surname>
          </string-name>
          . (eds.)
          <article-title>CLEF 2007</article-title>
          .
          <article-title>LNCS</article-title>
          , vol.
          <volume>5152</volume>
          . Springer, Heidelberg (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Mitra</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buckley</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Singhal</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Cardie</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <article-title>An analysis of statistical and syntactic phrases</article-title>
          .
          <source>In Proceedings of RIAO '97</source>
          , pages
          <fpage>200</fpage>
          -
          <lpage>214</lpage>
          , (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lewis</surname>
            ,
            <given-names>D.D.</given-names>
          </string-name>
          , and Croft.,
          <string-name>
            <surname>W.B.</surname>
          </string-name>
          ,
          <article-title>Term clustering of syntactic phrases</article-title>
          .
          <source>In Proceedings of SIGIR '90</source>
          , pages
          <fpage>385</fpage>
          -
          <lpage>404</lpage>
          , (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Zhai</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <article-title>Fast statistical parsing of noun phrases for document indexing</article-title>
          .
          <source>In Proceedings of ANLP '97</source>
          , pages
          <fpage>312</fpage>
          -
          <lpage>319</lpage>
          , (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Dillon</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Gray</surname>
            ,
            <given-names>A.S.</given-names>
          </string-name>
          <string-name>
            <surname>Fasit</surname>
          </string-name>
          :
          <article-title>A fully automatic syntactically based indexing system</article-title>
          .
          <source>Journal of the American Society for Information Science</source>
          ,
          <volume>34</volume>
          (
          <issue>2</issue>
          ):
          <fpage>99</fpage>
          -
          <lpage>108</lpage>
          , (
          <year>1983</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Strzalkowski</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perez-Carballo</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Marinescu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <article-title>Natural language information retrieval: TREC-3 report</article-title>
          .
          <source>In Proceedings of TREC-3</source>
          , pages
          <fpage>39</fpage>
          -
          <lpage>54</lpage>
          , (
          <year>1994</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Turpin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Moffat</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <article-title>Statistical phrases for vector-space information retrieval</article-title>
          .
          <source>In Proceedings of SIGIR '99</source>
          , pages
          <fpage>309</fpage>
          -
          <lpage>310</lpage>
          , (
          <year>1999</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Le</surname>
            ,
            <given-names>T.H.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chevallet</surname>
            ,
            <given-names>J.-P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>T.B.T.</given-names>
          </string-name>
          ,
          <article-title>Thesaurus-based query and document expansion in conceptual indexing with UMLS: Application in medical information retrieval</article-title>
          ,
          <source>IEEE International Conference on Research, Innovation and Vision for the Future</source>
          , pp.
          <fpage>242</fpage>
          -
          <lpage>246</lpage>
          , (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Aronson</surname>
            ,
            <given-names>A. R.</given-names>
          </string-name>
          ,
          <article-title>MetaMap: Mapping Text to the UMLS Metathesaurus</article-title>
          , http://skr.nlm.nih.gov/papers/references/metamap06.pdf, July (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Ojala</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pietikäinen</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Mäenpää</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <article-title>Multiresolution gray-scale and rotation invariant texture classification with local binary patterns</article-title>
          .
          <source>IEEE Trans. Pattern Analysis and Machine Intelligence</source>
          , vol.
          <volume>24</volume>
          , pp.
          <fpage>971</fpage>
          -
          <lpage>987</lpage>
          , July (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Tamura</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mori</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Yamawaki</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <article-title>Texture features corresponding to visual perception</article-title>
          .
          <source>IEEE Trans. on Systems, Man, and Cybernetics</source>
          ,
          <volume>8</volume>
          (
          <issue>6</issue>
          ), (
          <year>1978</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Manjunath</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <article-title>Texture features for browsing and retrieval of image data</article-title>
          .
          <source>IEEE Trans. on Pattern Analysis and Machine Intelligence</source>
          ,
          <volume>18</volume>
          (
          <issue>8</issue>
          ):
          <fpage>837</fpage>
          -
          <lpage>842</lpage>
          , (
          <year>1996</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          ,
          <article-title>LIBSVM: a library for support vector machines</article-title>
          , (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Fekete</surname>
            ,
            <given-names>R.B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casagrande</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kegl</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <article-title>MultiBoost</article-title>
          , http://mloss.org/software/view/246/
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>García-Cumbreras</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Díaz-Galiano</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martín-Valdivia</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ureña López</surname>
            ,
            <given-names>L.A.</given-names>
          </string-name>
          :
          <article-title>SINAI at ImageCLEFphoto 2008</article-title>
          .
          <source>In: On-line Working Notes, CLEF 2008</source>
          (
          <year>2008</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>