<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>FHDO Biomedical Computer Science Group at Medical Classification Task of ImageCLEF 2015</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Obioma Pelka</string-name>
          <email>obioma.pelka@googlemail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph M. Friedrich</string-name>
          <email>christoph.friedrich@fh-dortmund.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science University of Applied Sciences and Arts Dortmund (FHDO) Emil-Figge-Strasse 42</institution>
          ,
          <addr-line>44227 Dortmund</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper presents the modelling approaches performed by the FHDO Biomedical Computer Science Group for the compound figure detection and subfigure classification tasks of the ImageCLEF 2015 medical classification lab. This is the first participation of the group at an accepted lab of the Cross Language Evaluation Forum. For visual image representation, various state-of-the-art visual features, such as Bag-of-Keypoints computed with dense SIFT descriptors and the new Border Profile feature presented in this work, were adopted. Textual representation was obtained by vector quantisation on a Bag-of-Words codebook generated using attribute importance derived from the χ²-test, together with the Characteristic Delimiters feature presented in this paper. To reduce feature dimension and noise, principal component analysis was computed separately for all features. Various multiple-feature fusions were adopted to supplement visual image information with the corresponding textual information. Random forest models with 100 to 500 deep trees grown by resampling, a multiclass linear-kernel SVM with C = 0.05 and a late fusion of the two classifiers were used for classification prediction. Six and eight runs of the submission categories Visual, Textual and Mixed were submitted for the compound figure detection task and the subfigure classification task, respectively.</p>
      </abstract>
      <kwd-group>
        <kwd>bag-of-keypoints</kwd>
        <kwd>bag-of-words</kwd>
        <kwd>compound figure detection</kwd>
        <kwd>modality classification</kwd>
        <kwd>medical imaging</kwd>
        <kwd>image border profile</kwd>
        <kwd>principal component analysis</kwd>
        <kwd>random forest</kwd>
        <kwd>support vector machine</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>This paper describes the modelling methods and experiments performed by the FHDO Biomedical Computer Science Group (BCSG) at the ImageCLEF 2015 medical classification task. This is the first participation of the BCSG, a research group from the University of Applied Sciences and Arts Dortmund, at the cross-language image retrieval track ImageCLEF [28] of the Cross Language Evaluation Forum (CLEF)1.</p>
    </sec>
    <sec id="sec-2">
      <title>1 http://www.clef-initiative.eu/</title>
      <p>The ImageCLEF 2015 medical classification task consists of four subtasks: compound figure detection, multi-label classification, figure separation and subfigure classification, of which the BCSG participated in two [14]. The remainder of this paper is organised as follows: Section 2 presents, for the compound figure detection subtask, the various image representations extracted and describes the classifier model setup as well as the submitted runs and their corresponding results. The modelling approach, submitted runs and results for the subfigure classification task are elaborated in section 3. Finally, conclusions are drawn in section 4.</p>
      <sec id="sec-2-1">
        <title>Compound Figure Detection</title>
        <sec id="sec-2-1-1">
          <title>Task Definition</title>
          <p>Several figures found in biomedical literature consist of several subfigures. To obtain efficient image retrieval on a given search, it is necessary that these figures are separated and not treated as single figures. The first step in achieving this goal is to detect these compound figures. The detailed task definition is presented in [14].</p>
        </sec>
        <sec id="sec-2-1-2">
          <title>Visual Features</title>
          <p>For the visual image representation, a combination of high-level and low-level features was pursued. This is an important step in order to have both a 'whole-image' and a 'detail' representation of an image. The Bag-of-Keypoints feature and the new Border Profile feature, specifically adapted for this task, were used for the visual image representation. The feature definitions and extraction procedures are described in the following subsections.</p>
          <p>Border Profile: A highly distinguishing feature characterising a compound figure is the existence of a separating border. These borders are usually of white or black color. Hence the first visual feature computed detects the presence of such horizontal and vertical black and white color profiles for all images. A white or black border is present when all pixels of a row or column have RGB value = [255, 255, 255] or RGB value = [0, 0, 0], respectively. To detect this presence, the functions listed in Table 1 were implemented and their respective results were concatenated to obtain the complete feature vector. Fig. 1 depicts a flowchart containing the steps computed for the detection of white horizontal borders.</p>
          <p>To visually demonstrate the outcomes of the functions in Table 1, compound figures separated with white as well as black borders were selected. The compound figure in Fig. 2 displays the central nervous system and skeletal involvement by breast cancer in a rat and was adapted from [25].</p>
          <p>The horizontal and vertical bars adjoining the resized [256 x 256] figure show the number of white pixels present in the rows and columns, respectively. Considering that not all existing borders actually separate the existing subfigures, the next step is to detect and eliminate such frame borders. The cut-off thresholds used were [1:50] and [206:256], i.e. only borders located in the rows and columns [51:205] are treated as separating borders. The light blue bars in Fig. 2 and 3 show frame borders, while the dark blue bars display detected separating borders.</p>
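The row/column test and the frame-border cut-off described above can be sketched as follows. This is a minimal Python illustration (the paper's own implementation, the functions of Table 1, is not reproduced); the function name and the numpy image representation are assumptions:

```python
import numpy as np

def border_profile(img, lo=51, hi=205):
    """Return [horizontal, vertical] border flags for a 256 x 256 RGB image.

    A row (column) is a border when ALL of its pixels are pure white (255)
    or pure black (0); rows/columns outside lo..hi are frame borders and
    are ignored. Function name and numpy representation are illustrative.
    """
    rows = np.all(img == 255, axis=(1, 2)) | np.all(img == 0, axis=(1, 2))
    cols = np.all(img == 255, axis=(0, 2)) | np.all(img == 0, axis=(0, 2))
    keep = np.zeros(256, dtype=bool)
    keep[lo - 1:hi] = True                 # only rows/cols [51:205] may separate
    return [int(np.any(rows & keep)), int(np.any(cols & keep))]

# synthetic compound figure: two gray panels split by one white column
img = np.full((256, 256, 3), 120, dtype=np.uint8)
img[:, 128, :] = 255
print(border_profile(img))  # [0, 1]: a vertical separating border was found
```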
          <p>
            Compound figures can also be separated using borders with colors other than white. Figure 3 displays the detection of horizontal and vertical black separating borders. The compound figure, adapted from [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ], shows a planning CT image and its corresponding follow-up CT image acquired at week 6 of a patient's combined radiochemotherapy. The same cut-off threshold outlined above was used.
          </p>
          <p>
            Bag-of-Keypoints: For whole-image classification tasks, the bag-of-features approach has achieved high accuracy results [29],[18]. The motivation for this idea comes from the bag-of-words approach used for text categorisation. The limitations of invariance present in [19] were eliminated in the comprehensively evaluated approach presented in [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ], which has now become a common state-of-the-art approach for image classification. The authors proposed a method called Bag-of-Keypoints (BoK), which is based on vector quantisation of affine invariant descriptors of image patches. Apart from the invariance to affine transformations, another advantage of this method is its simplicity.
          </p>
          <p>
            Since the task to tackle here is a whole-image classification task, the Bag-of-Keypoints approach was adopted as a visual image representation. The functions used for this approach are from the VLFEAT library [27]. As visual descriptors, dense SIFT descriptors applied at several resolutions were uniformly extracted with an interval grid of 4 pixels using the vl-phow function. To speed up computation, k-means clustering with approximated nearest neighbours (ANN) [15] was computed on randomly chosen descriptors using the vl-kmeans function, to partition the observations into k clusters so that the within-cluster sum of squares is minimised [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ].
          </p>
          <p>A maximum of 20 iterations was defined to allow the k-means algorithm to converge. The cluster centres were initialised using random data points. With k = 12000, a codebook containing 12,000 keypoints was generated; it was further optimised by building a kd-tree with the L2 metric for quick nearest-neighbour lookup using the vl-kdtreebuild function.</p>
        </sec>
        <sec id="sec-2-1-3">
          <title>Textual Features</title>
          <p>Textual representations for all images were derived from their figure captions. All figures in the ImageCLEF collection originate from biomedical literature published in PubMed Central2. The original figure caption and journal title were extracted from the XML files provided for this task.</p>
          <p>Bag-of-Words: The Bag-of-Words (BoW) approach [24] is one of the common methods used for text classification. The basic concept is to extract features by counting the frequency or presence of words in the text to be classified. These words first have to be defined in a dictionary or codebook. To generate the
needed dictionary, all words from the captions of all images in the distributed collection were extracted.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2 http://www.ncbi.nlm.nih.gov/pmc/</title>
      <p>Several text processing procedures, such as removal of stop-words and stemming with the PorterStemmer [23], were applied to improve computational time. The occurrence (%) of all words in both classes was computed, and words with less than 85% difference between the two classes were eliminated to further reduce the dictionary size. For the BoW representation, two dictionaries were created:
- Dictionary1 (D1): 455 words obtained with Porter stemming, removal of stop-words and word occurrence.
- Dictionary2 (D2): 3906 words obtained with removal of stop-words and word occurrence.</p>
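The dictionary construction by word-occurrence difference can be sketched as follows; a toy Python illustration with hypothetical captions and stop-words (the paper additionally applies the PorterStemmer, omitted here):

```python
STOP = {"the", "of", "a", "and", "in"}      # illustrative stop-word list

def occurrence(captions, word):
    """Fraction of captions (given as token sets) containing the word."""
    return sum(word in c for c in captions) / len(captions)

def build_dictionary(compound, single, min_diff=0.85):
    """Keep words whose occurrence differs by at least min_diff between the
    two classes; all other words are dropped from the dictionary."""
    vocabulary = {w for caption in compound + single for w in caption} - STOP
    return sorted(w for w in vocabulary
                  if abs(occurrence(compound, w) - occurrence(single, w)) >= min_diff)

# hypothetical token sets for compound-figure and single-figure captions
compound = [{"panel", "arrows", "left"}, {"panel", "right"}]
single = [{"histology", "stain"}, {"stain", "section"}]
print(build_dictionary(compound, single))  # ['panel', 'stain']
```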
      <p>The benefit of the χ²-test and Information Gain was investigated, but these were not further used since no relevant advantage was detected during feature selection.
Characteristic Delimiters: When captions of compound figures are written, it is most likely that the existing subfigures are addressed using some delimiter. Since a figure can only be called a 'compound figure' when it contains at least two subfigures, the presence of two delimiters was determined.</p>
      <p>To achieve this, a set of possible double delimiters characterising compound figures was compiled. This step was done manually by analysing the captions of compound figures from the training set and selecting words with very high occurrence. Such words that appear often and hence significantly characterise the presence of subfigures are referred to in this work as 'Characteristic Delimiters'. A sub-collection of the delimiters used is listed in Table 2.</p>
      <p>
        If the existence of a delimiter pair is detected in the caption of an image, the figure is textually represented by assigning the value [1, 1] and otherwise [0, 0] to the feature vector. A fusion of all textual and visual representations would result in a feature vector with 15910 columns. To model an efficient and effective classifier, feature dimension and noise are reduced using principal component analysis [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Principal component analysis is computed separately on each feature vector group as shown in Fig. 4. Subsequently, the best number of principal components needed to describe each feature was estimated by model selection.
      </p>
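The delimiter-pair check can be sketched as follows; the delimiter pairs below are hypothetical stand-ins for the entries of Table 2:

```python
# hypothetical delimiter pairs standing in for the entries of Table 2
PAIRS = (("(a)", "(b)"), ("A:", "B:"))

def delimiter_feature(caption, pairs=PAIRS):
    """Return [1, 1] if any characteristic delimiter pair occurs in the caption,
    otherwise [0, 0], mirroring the two-column feature described in the text."""
    text = caption.lower()
    for first, second in pairs:
        if first.lower() in text and second.lower() in text:
            return [1, 1]
    return [0, 0]

print(delimiter_feature("(a) planning CT; (b) follow-up CT at week 6"))  # [1, 1]
print(delimiter_feature("Histological section of rat brain"))            # [0, 0]
```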
      <p>The Border Profile and Characteristic Delimiter feature vectors both have two columns and hence do not need any dimension reduction. Different combinations of the derived principal components are concatenated to obtain the final feature vector used for training the classifier. These combinations constitute the various runs submitted for evaluation. Table 3 lists the effects on prediction accuracy when certain features are left out during the feature fusion stage. In this ex-post analysis, the contribution (%) of each feature was computed by applying the classifier model of Run4 to the evaluation set and to 10 sampled learning and validation sets. It can be seen that all features contribute positively.</p>
      <p>
        The distributed collection was split into 10 different learning and validation sets using the bootstrap algorithm [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. For category prediction, a random forest (RF) classifier [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] was modelled using the fitensemble function from the MATLAB software package [22]. The list below is an excerpt of several parameters used to tune the classifier model.
      </p>
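The bootstrap splitting into learning and validation sets can be sketched as follows; treating the out-of-bag items as the validation set is an assumed interpretation, as the paper does not detail the split:

```python
import numpy as np

def bootstrap_split(n, rng):
    """One bootstrap learning/validation split: n indices drawn with replacement
    form the learning set; out-of-bag indices form the validation set."""
    learn = rng.integers(0, n, size=n)
    valid = np.setdiff1d(np.arange(n), learn)
    return learn, valid

rng = np.random.default_rng(42)
splits = [bootstrap_split(1000, rng) for _ in range(10)]  # 10 sets, as in the paper
learn, valid = splits[0]
print(len(learn), round(len(valid) / 1000, 2))  # validation holds roughly 37% of items
```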
      <p>{ Number of Trees = 200
{ Number of Leaf Size = [0.04, 0.06, 0.3]
{ Split Criterion = Exact
{ Ensemble grown = By resampling
2.5</p>
      <sec id="sec-3-1">
        <title>Submitted Runs</title>
        <p>In this section, the six compound figure detection runs submitted by the Biomedical Computer Science Group for evaluation are presented.</p>
        <p>- task1 run1 mixed stemDict: A combination of BoW textual features with Dictionary1 and BoK visual features was used to train the classifier.
- task1 run2 mixed sparse1: The visual feature Border Profile and the Characteristic Delimiter feature combined with textual features derived from BoW Dictionary1.
- task1 run3 mixed sparse2: Same as run2, but without the BoW textual representation.
- task1 run4 mixed bestComb: Fusion of all features described; BoW features extracted using Dictionary2.
- task1 run5 visual sparseSift: This random forest classifier is trained only with the visual features Bag-of-Keypoints and Border Profile.
- task1 run6 text sparseDict: The model was trained only with the textual features BoW with Dictionary1 and Characteristic Delimiter.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Results</title>
        <p>Six runs (four Mixed, one Visual and one Textual) were submitted for evaluation. The results table displays the official evaluation accuracy and retrieval type for each run. The fourth column displays the mean accuracy and standard deviation achieved on 10 sampled learning and validation sets derived using the bootstrap algorithm.</p>
        <sec id="sec-3-2-1">
          <title>Subfigure Classification</title>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>Task Definition</title>
        <p>In several user studies, clinicians have emphasised the importance of the modality of an image. The usage of modality information significantly increases retrieval efficiency; thus, image modality has become an essential and relevant factor in medical information retrieval [13]. The subfigure classification subtask aims to evaluate approaches that automatically predict the modality of medical images from biomedical journals. For the detailed task definition, refer to [14].</p>
        <p>Some image categories were represented by only a few annotated examples; thus, an expansion of the original collection was pursued in order to counteract the imbalanced dataset. The additional datasets created are described below:</p>
      </sec>
      <sec id="sec-3-4">
        <title>Visual Features</title>
        <p>
          Over the years, various techniques for medical imaging have been developed, each with not only its own advantages and disadvantages but also a different acquisition technique. Hence, various feature extraction methods are needed to capture the possible characteristics of medical images [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. In addition, images have to be completely represented, i.e. with 'whole-image' and 'detail' representations, which can be acquired by extracting global and local features. The features BAF, Gabor, JCD, Tamura and PHOG were extracted using functions from the LIRE: Lucene Image Retrieval library [21].
        </p>
        <p>
          - Bag-of-Keypoints: Visual image representation using the Bag-of-Keypoints approach described in subsection 2.2, with the distinction that three different datasets were used to create the corresponding codebooks.
- BAF: The global features (brightness, clipping, contrast, hueCount, saturation, complexity, skew and energy) represented as an 8-dimensional vector.
- CEDD: The low-level CEDD (Color and Edge Directivity Descriptor) feature [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], incorporating color and texture information, was extracted and represented as a 144-dimensional vector.
- FCH: The Fuzzy Color Histogram considers, through a fuzzy-set membership function, the similarity of each pixel's color to all histogram bins and is represented as a 10-dimensional vector using the fuzzy linking method [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ],[17].
- Gabor: A 60-dimensional vector was used to represent texture features based on Gabor functions.
- JCD: The Joint Composite Descriptor (JCD) is a combination of two Compact Composite Descriptors: the Color and Edge Directivity Descriptor (CEDD) and the Fuzzy Color Texture Histogram (FCTH) [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. The feature, made up of the merged texture areas of CEDD and FCTH, was represented as a 168-dimensional vector.
- Tamura: The Tamura features, consisting of the six basic textural features coarseness, contrast, directionality, line-likeness, regularity and roughness, were represented as an 18-dimensional vector [26].
- PHOG: The Pyramid of Histograms of Oriented Gradients (PHOG) feature proposed in [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] represents an image by its local shape and the spatial layout of the shape. A 630-dimensional vector was used for feature representation.
        </p>
      </sec>
      <sec id="sec-3-5">
        <title>Textual Features</title>
        <p>Similar to the compound figure detection task, textual representation for the figures was derived from their corresponding captions.</p>
        <p>Bag-of-Words: The process of textual representation executed here is analogous to the process for the compound figure detection task described in subsection 2.3, with an adjustment in the dictionary generation and word selection method. The figures distributed for the subfigure classification task are subfigures extracted from compound figures; hence their corresponding captions actually describe the compound figure and not the single subfigures. Considering that multipane figures consist of subfigures not only from the same category but also from multiple categories, using the original captions to represent the subfigures would not lead to a valuable characterisation.</p>
        <p>
          To overcome this limitation, the dictionary was built using the DataSet4. The figures in this dataset do not originate from multipane figures and thus have characteristic captions that can be mapped to the 30 subfigure categories. All words from all captions were retrieved; removal of stop-words and stemming were done in the text preprocessing stage. To develop a dictionary containing relevant words for each category, vector quantisation on all figures was performed and the χ²-test [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] was computed on the derived matrix. With this step, attribute importance for all words was determined.
        </p>
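The χ²-based attribute importance can be sketched for a single word as follows; a simplified stand-in using a presence/absence contingency table over classes:

```python
import numpy as np

def chi2_word(present, totals):
    """Chi-squared score of one word from its presence/absence counts per class.

    present[c]: captions of class c containing the word; totals[c]: captions of
    class c. Higher scores mean more class-specific words. A simplified stand-in
    for the attribute-importance step; the real matrix covers all 30 categories.
    """
    present = np.asarray(present, dtype=float)
    absent = np.asarray(totals, dtype=float) - present
    observed = np.stack([present, absent])       # 2 x n_classes contingency table
    expected = (observed.sum(axis=1, keepdims=True)
                * observed.sum(axis=0, keepdims=True) / observed.sum())
    return float(((observed - expected) ** 2 / expected).sum())

# a class-specific word scores far higher than an evenly spread one
print(chi2_word([90, 5], [100, 100]) > chi2_word([50, 48], [100, 100]))  # True
```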
        <p>A dictionary with 438 words was finally obtained by selecting words with attribute importance over a fixed cut-off threshold. The captions of the subfigures were trimmed to the relevant part using the characteristic delimiters presented in subsection 2.3 before vector quantisation against the generated dictionary was performed.</p>
      </sec>
      <sec id="sec-3-6">
        <title>Classifier Setup</title>
        <p>
          Contrary to the compound figure detection task, not only a random forest classifier model was used. A multiclass linear-kernel SVM from the libSVM library [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] was modelled to compare prediction accuracies between the two classifier models, as this has been a popular approach in former ImageCLEF medical challenges [13]. The cost parameter used was C = 0.05. The random forest model was tuned with the same parameters mentioned in subsection 2.4. Ten samples of learning and validation sets were obtained using the bootstrap algorithm [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ].
        </p>
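One of the submitted runs fuses the LibSVM and random forest predictions. Since the paper does not state its fusion rule, the sketch below shows one plausible scheme, averaging the two classifiers' class posteriors; the modality codes and probabilities are illustrative:

```python
import numpy as np

def late_fusion(rf_probs, svm_probs, classes):
    """Fuse two classifiers by averaging their class posteriors and taking the
    argmax. Averaging is an assumed scheme, shown only for illustration."""
    fused = (rf_probs + svm_probs) / 2.0
    return [classes[i] for i in fused.argmax(axis=1)]

classes = ["DRXR", "DRMR", "GFIG"]                  # illustrative modality codes
rf = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])   # per-image class posteriors
svm = np.array([[0.4, 0.5, 0.1], [0.1, 0.2, 0.7]])
print(late_fusion(rf, svm, classes))  # ['DRXR', 'GFIG']
```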
        <p>To reduce computational time, feature dimension and noise reduction was achieved using principal component analysis. All features besides the BAF features were reduced with this method. Table 5 presents the original and truncated vector sizes after computing the principal component analysis on each feature. The contribution of a feature to prediction performance is an important attribute that assists efficient feature selection. To obtain each feature's contribution, the difference between the accuracy when all features are combined and the accuracy when that feature is omitted was calculated; it is displayed in the fourth column of Table 5. The feature contribution analysis was done ex-post. The prediction accuracy used for this analysis was computed by applying the classifier model of Run1 to the original evaluation set.</p>
        <p>As Table 5 shows, omitting most of the extracted features has a negative effect on prediction performance. The representations BoK, BoW and BAF contribute the most. In contrast, the omission of the PHOG feature has a positive effect on prediction performance and increases the evaluation accuracy by 0.27%. The principal components computed from the Gabor image representation did not improve the prediction accuracy, and this feature was omitted from the final fused feature vector used for classification.</p>
        <p>Table 5 (descriptor: original vector size / size after PCA / contribution in %):
Bag-of-Keypoints: 12000 / 25 / -2.99
Bag-of-Words: 438 / 40 / -6.42
BAF: 8 / 8 / -4.06
CEDD: 144 / 5 / -0.49
FCH: 10 / 3 / -0.67
Gabor: 60 / 0 / 0.00
JCD: 168 / 5 / -0.43
Tamura: 18 / 2 / -0.76
PHOG: 630 / 2 / +0.27</p>
        <p>Table 6 (run ID: submission type / classifier model / training dataset):
Run 1, combination: Mixed / Random Forest / DS1
Run 2, visual: Visual / Random Forest / DS1
Run 3, textual: Textual / Random Forest / DS1
Run 4, clean rf: Mixed / Random Forest / DS1
Run 5, train 20152013: Mixed / Random Forest / DS3
Run 6, clean libnorm: Mixed / LibSVM / DS1
Run 7, clean comb librf: Mixed / LibSVM and Random Forest / DS1
Run 8, clean short rf: Mixed / Random Forest / DS1</p>
        <p>The BCSG submitted eight runs (six Mixed, one Textual and one Visual) for evaluation. The fusion approaches defining the submitted runs are displayed in Table 6. In addition, for each run the prediction performance obtained on 10 sampled learning and validation sets using the same modelling approach is listed. The BCSG submitted runs in all submission categories: Visual, Textual and Mixed. Most of the submitted runs belong to the submission category 'Mixed', a combination of textual and visual representations. This decision was made not only because better accuracies were obtained during development, but also because evaluation results presented by other ImageCLEF participant groups in previous years' tasks have proven to be better when the 'Mixed' submission category is used [13],[16]. Figure 5 depicts the achieved performance of all submitted runs for the subfigure classification task. Runs belonging to the Biomedical Computer Science Group are represented as colored bars, and the gray bars represent submissions of other participants.</p>
        <p>The prediction confusion obtained by applying the modelling setup of Run5 to the official evaluation set is shown in Fig. 6. Applying the same model setup to a sampled validation set results in the prediction confusion displayed in Fig. 7 (confusion matrix for run5 on a sampled validation set). The prediction performance achieved for this task is not comparable to that of the ImageCLEF 2013 Modality Classification subtask: the two tasks have a similar modality hierarchy, but 37.74% of the ImageCLEF 2013 training set represents the additional 'Compound or Multipane images (COMP)' class.</p>
        <sec id="sec-3-6-1">
          <title>Conclusions</title>
          <p>Various classification prediction approaches based on multiple feature fusion and combinations of classifier models were explored for the ImageCLEF 2015 medical classification task. Negative differences in prediction performance were observed when the Bag-of-Keypoints representation was computed using SIFT [20] instead of dense SIFT descriptors, when feature vectors were not normalised, and when single-precision instead of double-precision format was used to represent floating-point numbers. The discrepancy between prediction performance on the evaluation set and on the sampled learning and validation sets is assumed to be an overfitting problem. Supplementing visual image representation with corresponding textual representation proved to be a beneficial strategy regarding classification accuracy. Omitting any of the described features apart from the PHOG feature results in a decrease of the official evaluation accuracy. The proposed Border Profile image representation could be further enhanced by implementing additional functions to detect border profiles of colors other than black and white.
</p>
          <p>13. García Seco de Herrera, A., Kalpathy-Cramer, J., Demner-Fushman, D., Antani, S., Müller, H.: Overview of the ImageCLEF 2013 medical tasks. In: Working Notes of CLEF 2013 (Cross Language Evaluation Forum) (2013)
14. García Seco de Herrera, A., Müller, H., Bromuri, S.: Overview of the ImageCLEF 2015 medical classification task. In: Working Notes of CLEF 2015 (Cross Language Evaluation Forum) (2015)
15. Indyk, P., Motwani, R.: Approximate nearest neighbors: Towards removing the curse of dimensionality. In: Proceedings of the 30th Annual ACM Symposium on Theory of Computing. pp. 604-613. STOC '98, ACM, New York, NY, USA (1998)
16. Kalpathy-Cramer, J., García Seco de Herrera, A., Demner-Fushman, D., Antani, S., Bedrick, S., Müller, H.: Evaluating performance of biomedical image retrieval systems: an overview of the medical image retrieval task at ImageCLEF 2004-2014. Computerized Medical Imaging and Graphics (2014)
17. Konstantinidis, K., Gasteratos, A., Andreadis, I.: Image retrieval based on fuzzy color histogram processing. Optics Communications 248(4-6), 375-386 (2005)
18. Lazebnik, S., Schmid, C., Ponce, J.: Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2. pp. 2169-2178. CVPR '06 (2006)
19. Li, S.Z., Zhu, L., Zhang, Z., Blake, A., Zhang, H., Shum, H.: Statistical learning of multi-view face detection. In: Proceedings of the 7th European Conference on Computer Vision. pp. 67-81 (2002)
20. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60, 91-110 (2004)
21. Lux, M., Chatzichristofis, S.A.: LIRE: Lucene Image Retrieval, an extensible Java CBIR library. In: El-Saddik, A., Vuong, S., Griwodz, C., Bimbo, A.D., Candan, K.S., Jaimes, A. (eds.) ACM Multimedia. pp. 1085-1088. ACM (2008)
22. MATLAB: version 8.5.0.197613 (R2015a). The MathWorks Inc., Natick, Massachusetts (2015)
23. Porter, M.: An algorithm for suffix stripping. Program: Electronic Library and Information Systems 14, 130-137 (1980)
24. Salton, G., McGill, M.J.: Introduction to modern information retrieval. McGraw-Hill computer science series, McGraw-Hill, New York (1983)
25. Song, H.T., Jordan, E.K., Lewis, B.K., Liu, W., Ganjei, J., Klaunberg, B., Despres, D., Palmieri, D., Frank, J.A.: Rat model of metastatic breast cancer monitored by MRI at 3 tesla and bioluminescence imaging with histological correlation. Journal of Translational Medicine 7 (2009)
26. Tamura, H., Mori, S., Yamawaki, T.: Texture features corresponding to visual perception. IEEE Transactions on Systems, Man and Cybernetics 6 (1978)
27. Vedaldi, A., Fulkerson, B.: VLFeat: An open and portable library of computer vision algorithms. In: Proceedings of the International Conference on Multimedia. pp. 1469-1472. MM '10, ACM (2010)
28. Villegas, M., Müller, H., Gilbert, A., Piras, L., Wang, J., Mikolajczyk, K., García Seco de Herrera, A., Bromuri, S., Amin, M.A., Mohammed, M.K., Acar, B., Uskudarli, S., Marvasti, N.B., Aldana, J.F., del Mar Roldán García, M.: General Overview of ImageCLEF at the CLEF 2015 Labs. Lecture Notes in Computer Science, Springer International Publishing (2015)
29. Zhang, H., Berg, A.C., Maire, M., Malik, J.: SVM-KNN: Discriminative nearest neighbor classification for visual category recognition. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2. pp. 2126-2136. CVPR '06 (2006)</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bosch</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zisserman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Munoz</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Representing shape with a spatial pyramid kernel</article-title>
          .
          <source>In: Proceedings of the 6th ACM International Conference on Image and Video Retrieval</source>
          . pp.
          <volume>401</volume>
          -
          <fpage>408</fpage>
          . CIVR '07,
          <string-name>
            <surname>ACM</surname>
          </string-name>
          , New York, NY, USA (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Breiman</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>Random forests</article-title>
          .
          <source>Mach. Learn</source>
          .
          <volume>45</volume>
          (
          <issue>1</issue>
          ),
          <volume>5</volume>
          -
          <fpage>32</fpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          :
          <article-title>LIBSVM: A library for support vector machines</article-title>
          .
          <source>ACM Transactions on Intelligent Systems and Technology</source>
          <volume>2</volume>
          ,
          <fpage>27:1</fpage>
          -
          <lpage>27:27</lpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Chatzichristofis</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boutalis</surname>
            ,
            <given-names>Y.S.</given-names>
          </string-name>
          :
          <article-title>Compact Composite Descriptors for Content Based Image Retrieval: Basics, Concepts, Tools</article-title>
          . VDM Verlag, Saarbrücken, Germany (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>C.H.</given-names>
          </string-name>
          :
          <article-title>Computer vision in medical imaging</article-title>
          . World Scientific (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Cochran</surname>
            ,
            <given-names>W.G.</given-names>
          </string-name>
          :
          <article-title>The χ2 test of goodness of fit</article-title>
          .
          <source>Ann. Math. Statist.</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ),
          <fpage>315</fpage>
          -
          <lpage>345</lpage>
          (
          <year>1952</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Csurka</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dance</surname>
            ,
            <given-names>C.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fan</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Willamowski</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bray</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Visual categorization with bags of keypoints</article-title>
          . In: Workshop on Statistical Learning in
          <source>Computer Vision</source>
          , ECCV. pp.
          <fpage>1</fpage>
          -
          <lpage>22</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Dunteman</surname>
            ,
            <given-names>G.H.</given-names>
          </string-name>
          :
          <article-title>Principal Components Analysis</article-title>
          . Sage University Paper.
          <source>Quantitative Applications in the Social Sciences</source>
          , Sage Publications, Newbury Park, London, New Delhi (
          <year>1989</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Efron</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tibshirani</surname>
            ,
            <given-names>R.J.</given-names>
          </string-name>
          :
          <article-title>An Introduction to the Bootstrap</article-title>
          . Chapman &amp; Hall, New York (
          <year>1993</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Guckenberger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Baier</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Richter</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wilbert</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Flentje</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Evolution of surface-based deformable image registration for adaptive radiotherapy of non-small cell lung cancer (NSCLC)</article-title>
          .
          <source>Radiation Oncology</source>
          <volume>4</volume>
          (
          <issue>68</issue>
          ),
          <fpage>2169</fpage>
          -
          <lpage>2178</lpage>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Han</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ma</surname>
            ,
            <given-names>K.K.</given-names>
          </string-name>
          :
          <article-title>Fuzzy color histogram and its use in color image retrieval</article-title>
          .
          <source>IEEE Transactions on Image Processing</source>
          <volume>11</volume>
          (
          <issue>8</issue>
          ),
          <fpage>944</fpage>
          -
          <lpage>952</lpage>
          (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Hartigan</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wong</surname>
            ,
            <given-names>M.A.</given-names>
          </string-name>
          :
          <article-title>A k-means clustering algorithm</article-title>
          .
          <source>Applied Statistics</source>
          <volume>28</volume>
          (
          <issue>1</issue>
          ),
          <fpage>100</fpage>
          -
          <lpage>108</lpage>
          (
          <year>1979</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>