<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>AI600 Lab at ImageCLEF 2019 Concept Detection Task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xinyi Wang</string-name>
          <email>iwangxinyi@163.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ningning Liu</string-name>
          <email>ningning.liu@uibe.edu.cn</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Information Technology and Management, University of International Business and Economics</institution>
          ,
          <addr-line>Beijing 100029</addr-line>
          ,
          <country country="CN">P.R.China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of International Trade and Economics</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper we describe the participation of the AI600 Lab in the ImageCLEF 2019 Concept Detection task. We adopted an approach based on the bag-of-visual-words model and logistic regression, using different SIFT descriptors as visual features. The classifiers were trained on each feature separately, and weighted combinations of their results were submitted. Our best result ranked 26th among 58 runs and 7th out of 11 participating teams.</p>
      </abstract>
      <kwd-group>
        <kwd>Concept Detection</kwd>
        <kwd>Bag of Visual Words</kwd>
        <kwd>Logistic Regression</kwd>
        <kwd>ImageCLEF</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Many notable approaches have been proposed in previous ImageCLEF medical
tasks. While traditional methods and features were used [
        <xref ref-type="bibr" rid="ref1 ref2 ref3">1-3</xref>
        ], methods based on
deep learning were also introduced [
        <xref ref-type="bibr" rid="ref3 ref4">3-4</xref>
        ]. This year, the ImageCLEF 2019 [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] Concept
Detection task [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] aims at automatically interpreting and summarizing the insights of radiology
images. For this task, we focused on multi-label classification with
traditional visual features.
      </p>
      <p>The remainder of this paper is organized as follows: Section 2 describes our
experimental procedure in detail. Section 3 summarizes all of our submissions.
Finally, Section 4 gives a brief conclusion.</p>
    </sec>
    <sec id="sec-2">
      <title>Experiments</title>
      <sec id="sec-2-1">
        <title>Data description</title>
        <p>
          This task used a subset of the Radiology Objects in COntext (ROCO) dataset [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
Three image datasets were provided. The training, validation and test datasets
contained 56,629, 14,157 and 10,000 radiology images, respectively. The training and validation sets
were accompanied by UMLS concepts extracted from the original image captions. No
external data were used in our participation.
        </p>
        <p>The training and validation sets were labeled with a total of 5,528 distinct
concepts. We computed the frequency distribution of all concepts, which is
shown in Table 1. Most concepts appeared only rarely in the dataset: only 58 of the
5,528 concepts were assigned more than 1,000 times. A few major concepts
appeared frequently in the image set, while most concepts were difficult to detect.</p>
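        <p>A frequency distribution of this kind can be obtained with a simple counting pass over the annotations. The following is a minimal sketch, not the code used in our experiments; the function names, the toy image IDs and the placeholder concept names are assumptions made for illustration.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def concept_frequencies(image_concepts):
    """Count how many images each concept is attached to."""
    counts = Counter()
    for concepts in image_concepts.values():
        counts.update(set(concepts))  # count each concept once per image
    return counts


def frequent_concepts(counts, min_count):
    """Return the concepts that occur more than min_count times, sorted."""
    return sorted(c for c, n in counts.items() if n > min_count)


# Toy annotations (hypothetical image IDs and placeholder concept names).
toy = {
    "img1": ["A", "B"],
    "img2": ["A"],
    "img3": ["A", "C"],
}
counts = concept_frequencies(toy)
major = frequent_concepts(counts, min_count=1)
print(counts["A"], major)
```
]]></preformat>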
        <p>
          In addition, we noticed that many labels are correlated. For instance,
images labeled with Concept B in Table 2 were always labeled with Concept A as well. Among
the concepts annotated more than 100 times in the training set,
there were 157 pairs of concepts with a strict inclusion relation. This relation was used
to detect some minor concepts.
        </p>
        <p>
          We employed four kinds of SIFT descriptors as visual features: SIFT [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], C-SIFT [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ],
HSV-SIFT [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] and RGB-SIFT. For each kind of descriptor, a set of key points was
extracted from each image. To build a bag-of-visual-words (BoVW) model, 2 million
key points were randomly selected from the training set as the samples for building the
visual codebooks. To overcome memory limitations, we computed the visual
codebooks using mini-batch k-means [11], a variant of the k-means algorithm. Compared to
standard k-means, mini-batch k-means reduces the amount of computation and
runs faster. We tried various codebook sizes (that is, numbers of cluster centroids) and
eventually used two: k = 10,000 and k = 20,000.
        </p>
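        <p>The codebook construction can be sketched with the MiniBatchKMeans class of scikit-learn. This is an illustrative example under stated assumptions: the helper name, the batch size, the random seed and the toy data are not taken from our experiments, and the sizes are far smaller than the 2 million sampled key points and the codebooks of 10,000 or 20,000 words described above.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def build_codebook(descriptors, n_words, batch_size=1024, seed=0):
    """Cluster sampled descriptors; the centroids act as the visual words."""
    km = MiniBatchKMeans(
        n_clusters=n_words,
        batch_size=batch_size,  # assumed value, tune to available memory
        n_init=3,
        random_state=seed,
    )
    km.fit(descriptors)
    return km.cluster_centers_


# Toy stand-in for sampled SIFT descriptors (128 dimensions each).
rng = np.random.RandomState(0)
sample = rng.rand(2000, 128)
codebook = build_codebook(sample, n_words=50)
print(codebook.shape)
```
]]></preformat>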
        <p>For all images, feature histograms were calculated with the different codebooks.
Each key point extracted from an image was assigned to its closest cluster in the
codebook by calculating the Euclidean distance to the cluster centroids. The
frequency of each cluster was then used as the representation of the image.</p>
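        <p>The assignment step can be written compactly with NumPy. The sketch below is illustrative only: the function name and the two-dimensional toy data are assumptions, and real SIFT descriptors have 128 or more dimensions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def bovw_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest centroid and count the hits."""
    # Squared Euclidean distances via ||x||^2 - 2 x.c + ||c||^2.
    d2 = (
        (descriptors ** 2).sum(axis=1)[:, None]
        - 2.0 * descriptors.dot(codebook.T)
        + (codebook ** 2).sum(axis=1)[None, :]
    )
    nearest = d2.argmin(axis=1)
    return np.bincount(nearest, minlength=codebook.shape[0])


# Three visual words and four key points of one image (toy values).
codebook = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
descriptors = np.array([[0.1, 0.2], [9.0, 9.5], [0.5, -0.3], [1.0, 9.0]])
hist = bovw_histogram(descriptors, codebook)
print(hist)
```
]]></preformat>
        <p>Ranking by squared distance gives the same nearest centroid as ranking by the Euclidean distance itself, so the square root can be skipped.</p>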
        <p>Finally, the Term Frequency – Inverse Document Frequency (tf-idf) weights of the
visual word frequency matrices were calculated and normalized by the L1 norm.</p>
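        <p>One possible form of this weighting is sketched below. The exact tf-idf variant is not specified in this paper, so the smoothed idf used here is an assumption, as are the function name and the toy counts.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def tfidf_l1(counts):
    """Weight visual-word counts by idf and L1-normalize each image row."""
    counts = np.asarray(counts, dtype=float)
    n_images = counts.shape[0]
    df = (counts > 0).sum(axis=0)  # number of images containing each word
    # Smoothed idf: an assumed variant, not confirmed by the paper.
    idf = np.log((1.0 + n_images) / (1.0 + df)) + 1.0
    weighted = counts * idf
    row_sums = weighted.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # leave empty histograms as all zeros
    return weighted / row_sums


# Rows are images, columns are visual words (toy counts).
counts = np.array([[2, 0, 1], [0, 3, 1], [0, 0, 0]])
X = tfidf_l1(counts)
print(X)
```
]]></preformat>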
      </sec>
      <sec id="sec-2-2">
        <title>Classification</title>
        <p>We employed a two-round classification. As the distribution of concepts was
imbalanced, we dropped most of the concepts and considered only the major concepts that
appeared in the training set more often than a frequency threshold F, which ranged from 800 to
1,500. After the first round of classification, the feature matrices fed into the model were
augmented with the ground-truth or predicted occurrences of the major labels.
Minor concepts that were subsets of the predicted concepts and appeared
more than 100 times were then predicted in the second round. This improved the performance of the model
slightly.</p>
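        <p>The two ingredients of the second round, finding strict inclusion pairs and appending major-label indicators to the features, can be sketched as follows. This is a simplified illustration rather than our implementation; the function names, the toy annotations and the use of plain Python lists are assumptions.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def inclusion_pairs(image_concepts, min_count):
    """Find (A, B) where every image labeled B is also labeled A.

    Only concepts seen in more than min_count images are considered,
    and B's image set must be a proper subset of A's image set.
    """
    images_of = {}
    for image_id, concepts in image_concepts.items():
        for c in set(concepts):
            images_of.setdefault(c, set()).add(image_id)
    frequent = [c for c, imgs in images_of.items() if len(imgs) > min_count]
    pairs = []
    for b in frequent:
        for a in frequent:
            if a != b and images_of[b] < images_of[a]:
                pairs.append((a, b))
    return sorted(pairs)


def augment_features(features, major_labels):
    """Append major-label indicators (0/1) to each feature row."""
    return [list(f) + list(m) for f, m in zip(features, major_labels)]


# Toy annotations with placeholder concept names.
toy = {
    "img1": ["A", "B"],
    "img2": ["A", "B"],
    "img3": ["A"],
    "img4": ["C"],
}
pairs = inclusion_pairs(toy, min_count=1)
augmented = augment_features([[0.5, 0.5]], [[1, 0]])
print(pairs, augmented)
```
]]></preformat>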
        <p>We applied logistic regression because we considered it a competitive and faster
classification method compared to support vector machines or k-nearest neighbors. For
this multi-label classification task, we trained a separate classifier for each concept.
Each classifier used only one feature for training and prediction. The final submissions
were generated from the predicted probabilities.</p>
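        <p>Training one binary classifier per concept can be sketched with the LogisticRegression class of scikit-learn. The helper names, the max_iter setting and the toy data below are assumptions for illustration; the hyperparameters used in our runs are not reported here.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def train_per_concept(X, Y):
    """Fit one binary logistic regression per concept (column of Y)."""
    models = []
    for j in range(Y.shape[1]):
        clf = LogisticRegression(max_iter=1000)  # assumed setting
        clf.fit(X, Y[:, j])
        models.append(clf)
    return models


def predict_probabilities(models, X):
    """Return an (images x concepts) matrix of positive-class probabilities."""
    return np.column_stack([m.predict_proba(X)[:, 1] for m in models])


# Toy data: each concept is present exactly when its feature equals 1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 5, dtype=float)
Y = X.astype(int)
models = train_per_concept(X, Y)
P = predict_probabilities(models, X)
print(P.shape)
```
]]></preformat>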
      </sec>
      <sec id="sec-2-3">
        <title>Experimental environment</title>
        <p>Our experiments were conducted on the Ubuntu 18.04 operating system with Python
2.7.15. The mini-batch k-means clustering and logistic regression algorithms were
implemented using the scikit-learn library [12]. Other libraries, such as
NumPy, Pandas and SciPy, were also used. All SIFT visual features were extracted
with the ColorDescriptor software (version 4.0) [13].</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <sec id="sec-3-1">
        <title>The submitted runs</title>
        <p>We submitted 7 runs to the ImageCLEF 2019 concept detection task: 1 run of a single-feature
model and 6 runs of ensemble models. For the ensemble models, we weighted
the results of the single-feature models. The weights of [SIFT, C-SIFT, HSV-SIFT,
RGB-SIFT] were [0.3, 0.2, 0.2, 0.3]. For the probability threshold p, we proposed a method
for optimal threshold selection: the threshold was chosen so that the concept
distribution of the results on the test set was similar to the concept distribution of the best
predictions on the validation set, that is, those with higher F1-scores [14]. We tried a few thresholds
in a small range.</p>
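        <p>The weighted fusion and the thresholding step can be sketched as follows. The weights are those stated above. The threshold selection is only one possible reading of our method: it matches the average number of predicted concepts per image to a target value, which is an assumption, as are the function names, the candidate thresholds and the toy probabilities.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

WEIGHTS = {"SIFT": 0.3, "C-SIFT": 0.2, "HSV-SIFT": 0.2, "RGB-SIFT": 0.3}


def ensemble(probabilities, weights=WEIGHTS):
    """Weighted sum of the per-feature probability matrices."""
    return sum(weights[n] * np.asarray(probabilities[n]) for n in sorted(weights))


def pick_threshold(fused, target_mean_labels, candidates):
    """Pick the candidate whose mean label count per image is closest
    to the target (hypothetical stand-in for distribution matching)."""
    def gap(t):
        return abs((fused >= t).sum(axis=1).mean() - target_mean_labels)
    return min(candidates, key=gap)


# One image, two concepts, one probability matrix per feature (toy values).
probs = {
    "SIFT": [[0.8, 0.1]],
    "C-SIFT": [[0.6, 0.2]],
    "HSV-SIFT": [[0.4, 0.2]],
    "RGB-SIFT": [[0.9, 0.1]],
}
fused = ensemble(probs)
p = pick_threshold(fused, target_mean_labels=1.0, candidates=[0.1, 0.5])
predicted = fused >= p
print(fused, p, predicted)
```
]]></preformat>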
        <p>The details of the submitted runs are as follows.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Results</title>
        <p>The results obtained by our 7 runs are given in Table 3. All 7 runs were graded
successfully. Our best run achieved an F1-score of 0.1656, which ranked 26th
out of 58 runs and 7th out of 11 teams.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In this paper we have presented the methods we used in the ImageCLEF 2019
Concept Detection task. We applied multi-label classification based on the
bag-of-visual-words model with color descriptors and logistic regression. From our experimental
results we can conclude the following: (i) while the RGB-SIFT descriptor performed
best among the color descriptors, the weighted ensemble model improved the performance
greatly; (ii) by using the semantic relations among the concepts, the two-stage
classification is able to detect some infrequent concepts, and on the validation
set it improves the F1-score by about 1%; (iii) with the approach we proposed, it is
still challenging to predict concepts with a very limited number of image samples.
      </p>
      <p>11. Sculley, D.: Web-scale k-means clustering. Proceedings of the 19th International Conference on World Wide Web (WWW ’10), pp. 1177–1178, ACM, New York, NY, USA, 2010.</p>
      <p>12. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.</p>
      <p>13. van de Sande, K.E.A., Gevers, T., Snoek, C.G.M.: Evaluating Color Descriptors for Object and Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1582–1596, 2010.</p>
      <p>14. Liu, N., Dellandréa, E., Chen, L., Trus, A., Zhu, C., Zhang, Y., Bichot, C., Bres, S., Tellez, B.: LIRIS-Imagine at ImageCLEF 2012 Photo Annotation Task. CLEF working notes, CEUR, 2012.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Valavanis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stathopoulos</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>IPL at ImageCLEF 2017 Concept Detection Task</article-title>
          .
          <source>CLEF working notes, CEUR</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Valavanis</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kalamboukis</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>IPL at ImageCLEF 2018: A kNN-based Concept Detection Approach</article-title>
          .
          <source>CLEF working notes, CEUR</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Pinho</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Costa</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Feature Learning with Adversarial Networks for Concept Detection in Medical Images: UA.PT Bioinformatics at ImageCLEF 2018</article-title>
          .
          <source>CLEF working notes, CEUR</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guo</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>ImageSem at ImageCLEF 2018 Caption Task: Image Retrieval and Transfer Learning</article-title>
          , CLEF working notes, CEUR,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ionescu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Péteri</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cid</surname>
            ,
            <given-names>Y.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liauchuk</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kovalev</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klimuk</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tarasau</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ben Abacha</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hasan</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Datla</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Demner-Fushman</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dang-Nguyen</surname>
            ,
            <given-names>D.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piras</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riegler</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tran</surname>
            ,
            <given-names>M.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lux</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gurrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pelka</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedrich</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Garcia</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kavallieratou</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>del Blanco</surname>
            ,
            <given-names>C.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cuevas</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vasillopoulos</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karampidis</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chamberlain</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Campello</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>ImageCLEF 2019: Multimedia Retrieval in Medicine, Lifelogging, Security and Nature</article-title>
          . In:
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Tenth International Conference of the CLEF Association (CLEF 2019)</source>
          , Lugano, Switzerland,
          <series>LNCS Lecture Notes in Computer Science</series>
          , Springer,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Pelka</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedrich</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Herrera</surname>
            ,
            <given-names>A.G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Müller</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Overview of the ImageCLEFmed 2019 Concept Detection Task</article-title>
          , CLEF working notes, CEUR,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Pelka</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Koitka</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rückert</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nensa</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Friedrich</surname>
            ,
            <given-names>C.M.</given-names>
          </string-name>
          :
          <article-title>Radiology Objects in COntext (ROCO): A Multimodal Image Dataset</article-title>
          ,
          <source>Proceedings of the MICCAI Workshop on Large-scale Annotation of Biomedical data and Expert Label Synthesis (MICCAI LABELS 2018), Lecture Notes in Computer Science (LNCS)</source>
          Volume
          <volume>11043</volume>
          , pp.
          <fpage>180</fpage>
          -
          <lpage>189</lpage>
          , Springer Verlag,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Lowe</surname>
            ,
            <given-names>D.G.</given-names>
          </string-name>
          :
          <article-title>Distinctive image features from scale-invariant keypoints</article-title>
          .
          <source>International Journal of Computer Vision</source>
          , vol.
          <volume>60</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>91</fpage>
          -
          <lpage>110</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Burghouts</surname>
            ,
            <given-names>G.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Geusebroek</surname>
            ,
            <given-names>J.M.</given-names>
          </string-name>
          :
          <article-title>Performance evaluation of local color invariants</article-title>
          ,
          <source>Computer Vision and Image Understanding</source>
          , vol.
          <volume>113</volume>
          , pp.
          <fpage>48</fpage>
          -
          <lpage>62</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Bosch</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zisserman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muñoz</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Scene classification using a hybrid generative/discriminative approach</article-title>
          ,
          <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>
          , vol.
          <volume>30</volume>
          , no.
          <issue>04</issue>
          , pp.
          <fpage>712</fpage>
          -
          <lpage>727</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>