<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CIS UDEL Working Notes on ImageCLEF 2015: Compound figure detection task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xiaolong Wang</string-name>
          <email>xiaolong@udel.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiangying Jiang</string-name>
          <email>jiangxy@udel.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Abhishek Kolagunda</string-name>
          <email>abhi@udel.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hagit Shatkay</string-name>
          <email>shatkay@udel.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chandra Kambhamettu</string-name>
          <email>chandrak@udel.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer and Information Sciences, University of Delaware</institution>
          ,
          <addr-line>Newark, DE</addr-line>
          <country country="US">US</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Figures included in biomedical publications play an important role in understanding essential aspects of a paper. Much work over the past few years has focused on figure analysis and classification in biomedical documents. As many of the figures appearing in biomedical documents comprise multiple panels (subfigures), the first step in the analysis requires identification of compound figures and their segmentation into subfigures. There is a wide variety of ways to detect compound figures. In this paper, we utilize only visual information to distinguish compound from non-compound figures. We have tested the proposed approach on the ImageCLEF 2015 benchmark of 10,434 images; our approach achieved an accuracy of 82.82%, the best performance among systems that use only visual information for the compound figure detection task.</p>
      </abstract>
      <kwd-group>
        <kwd>Compound figure detection</kwd>
        <kwd>visual information</kwd>
        <kwd>biomedical image analysis</kwd>
        <kwd>image classification</kwd>
        <kwd>ImageCLEF 2015</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>These authors contributed equally to this work.</p>
      <p>
        Prior work can be categorized into two main schemes. The first is based on
peak region detection within the image; the peak region is then used as a
reference to find separating lines for segmentation [
        <xref ref-type="bibr" rid="ref2 ref8">2, 8</xref>
        ]. The main drawback of
this scheme is that it is susceptible to noise and may lead to over-segmentation
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. This issue is especially prevalent in irregular compound figures, where the
separators between different subfigures do not cut across a complete row or
column. Moreover, setting the threshold value for segmentation with respect to
a peak region is not straightforward: different thresholds usually lead to
different results. For instance, Chhatkuli et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] set the threshold at 0.97 times the
maximum value in a given figure. This threshold value is based on manual tests
over the training data. Another factor is the occurrence of text within figures.
As text is irregular, it can be an obstacle to obtaining the segmentation lines
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Removing text from a compound figure usually plays an important role in
the final result.
      </p>
      <p>
        The second scheme is based on connected component analysis [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], as was
done in earlier work [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The general idea is to evaluate the connectivity among
different subfigures within a compound figure using visual information.
Connected component analysis groups pixels into components according to the
similarity of their values, so that the pixels in each resulting component share
similar values. Once the connected regions are formed, the boundaries between
them can be used as segmentation lines separating the components. In
this work, connected component analysis is applied to the given figure first,
followed by several post-processing steps that help improve compound figure
detection. We then integrate the two schemes. The experimental results
demonstrate that the fusion scheme improves performance compared to each of
the individual schemes applied alone.
      </p>
      <p>The rest of the paper is organized as follows. In Section 2, we provide an
overview of the dataset. The proposed compound figure detection approach is
discussed in Section 3: the scheme based on connected component analysis is
discussed first, followed by the peak region detection scheme
and the fusion scheme. Section 4 presents the experimental results submitted to
ImageCLEF 2015. Finally, conclusions are given in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Dataset</title>
      <p>
        In our experiments, we use the dataset provided in the ImageCLEFmed 2015
benchmark. For more details, we refer the reader to the respective task
descriptions [
        <xref ref-type="bibr" rid="ref1 ref3 ref4 ref7">7, 3, 1, 4</xref>
        ]. In this report, we focus on the medical image classification task
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and specifically on compound figure detection. Notably, we use only visual
information for addressing this task.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Approach</title>
      <p>In this section, we describe the proposed compound figure detection scheme
and the details of our detection method, which utilizes only visual
information. As shown by the ImageCLEF 2015 comparison with the results
obtained by other systems, our approach achieves the highest level of
performance among schemes that use only visual information, while its accuracy is
only 2.57% lower than that of the top performing scheme, which combines
visual and textual information.</p>
      <sec id="sec-3-1">
        <title>Connected Component Analysis Based Scheme</title>
        <p>
          The first part of our compound figure detection scheme is based on the analysis
of component connectivity of subfigures in an image [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. This scheme is based
on graph traversal theory. The general idea is to determine the connectivity of
the current pixel to neighboring pixels based on pixel intensity; the method is
both effective and simple to implement.
        </p>
        <p>
          The general scheme of connected component analysis used in our work is
as follows. First, RGB images are converted into grayscale images. We then
rescale the pixel intensities in the whole image into values in the range [0, 1].
The underlying assumption is that the boundary between subfigures typically
consists of white pixels. The white area is defined as the region where pixel values
are consistently greater than 0.9; other regions are defined as black regions. By
comparing the image intensities to this threshold value, we obtain the mask
image M, a binary image as indicated in Fig. 1. In this work, white
represents the foreground and black indicates the background region. The
connected components are extracted from the binary mask image M.
        </p>
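        <p>The thresholding step can be sketched as follows. This is a minimal illustration rather than the implementation used in our experiments: it assumes 8-bit RGB input, converts to grayscale by plain channel averaging, and rescales by dividing by 255; the function names are hypothetical.</p>
        <preformat><![CDATA[
```python
def to_gray01(rgb_image):
    """Convert rows of 8-bit (r, g, b) tuples to grayscale values in [0, 1].

    Assumption: plain channel averaging and division by 255; the text does not
    specify the exact grayscale conversion or rescaling.
    """
    return [[(r + g + b) / (3.0 * 255.0) for (r, g, b) in row] for row in rgb_image]


def white_mask(gray01, threshold=0.9):
    """Binary mask M: 1 where the intensity exceeds the threshold, else 0."""
    return [[1 if v > threshold else 0 for v in row] for row in gray01]


# Tiny 2 x 3 example: a white top row and a white middle column.
rgb = [
    [(255, 255, 255), (255, 255, 255), (255, 255, 255)],
    [(10, 20, 30), (255, 255, 255), (200, 40, 40)],
]
M = white_mask(to_gray01(rgb))
```
]]></preformat>
        <p>In this toy example only the pure white pixels exceed the 0.9 threshold, so M marks the top row and the middle column.</p>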
        <p>After that, we scan the resulting simplified image pixel by pixel (top to
bottom and left to right). Connected regions in which adjacent pixels share a
similar range of intensity values [v<sub>0</sub>, v<sub>1</sub>] are identified. The connected component
labeling operator scans the whole image by moving along each row until it reaches
a pixel p that has not been previously labeled. For such a pixel, we examine its
two predecessor neighbors: the pixel directly above (denoted p<sub>u</sub>) and the
pixel to the left (p<sub>l</sub>). The label value assigned to pixel p is based
on the comparison with these two neighboring pixels.</p>
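        <p>A raster-scan labeling of this kind is commonly implemented as a two-pass algorithm with an equivalence table. The sketch below is one such variant under simplifying assumptions: it works on a binary mask (1 for foreground) instead of an intensity range, uses 4-connectivity through the upper and left neighbors, and resolves label equivalences with a small union-find structure. The function name label_components is hypothetical.</p>
        <preformat><![CDATA[
```python
def label_components(mask):
    """Label 4-connected foreground regions (value 1) of a binary mask.

    Returns a grid of the same shape: 0 for background, 1..K for components.
    """
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] is the union-find parent of provisional label i

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # First pass: assign provisional labels from the upper and left neighbors.
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))  # create a new provisional label
                labels[r][c] = len(parent) - 1
            elif up and left:
                root_up, root_left = find(up), find(left)
                labels[r][c] = min(root_up, root_left)
                # Record that the two labels belong to the same component.
                parent[max(root_up, root_left)] = min(root_up, root_left)
            else:
                labels[r][c] = up or left

    # Second pass: replace provisional labels by compact final labels.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels


u_shape = [[1, 0, 1],
           [1, 0, 1],
           [1, 1, 1]]
u_labels = label_components(u_shape)
```
]]></preformat>
        <p>The U-shaped example starts with two provisional labels for its two arms; they are merged when the scan reaches the bottom-right pixel, whose upper and left neighbors carry different labels.</p>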
        <p>
          After scanning the whole image, each detected component in the figure is
labeled with a different value. An important issue for compound figure detection
is to minimize the influence of false positive areas, where non-compound regions
are misclassified as compound regions. Most of these false positive areas are
caused by connected text. To address this issue, rather than directly removing
text from the images [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], we apply criteria based on ratios between
region areas. Two different ratio criteria are used in this work. The first ratio
value T<sub>r1</sub> is defined as the ratio between the area of the detected subfigure and
that of the whole figure. If T<sub>r1</sub> is smaller than 0.1, the region is classified as a false
positive. The second ratio value T<sub>r2</sub> is the ratio between the area of
the detected component and that of the largest component. If T<sub>r2</sub> is
smaller than 0.15, the detected region is classified as a false positive. This setting
has proven effective in our experiments, as illustrated in Fig. 1.
        </p>
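        <p>The two ratio criteria can be written compactly. The following sketch assumes that the area of every detected component (in pixels) is already known; the function and parameter names are hypothetical, while the thresholds 0.1 and 0.15 are the values stated above.</p>
        <preformat><![CDATA[
```python
def filter_components(areas, figure_area, min_fig_ratio=0.1, min_max_ratio=0.15):
    """Return the component areas that pass both area-ratio tests."""
    if not areas:
        return []
    largest = max(areas)
    kept = []
    for area in areas:
        tr1 = area / figure_area  # share of the whole figure
        tr2 = area / largest      # size relative to the largest component
        if tr1 >= min_fig_ratio and tr2 >= min_max_ratio:
            kept.append(area)
    return kept


def is_compound(areas, figure_area):
    """A figure is compound when more than one component survives filtering."""
    return len(filter_components(areas, figure_area)) > 1


# Two panels plus a small text blob in a 100 x 100 figure.
kept = filter_components([4000, 3500, 120], figure_area=10000)
```
]]></preformat>
        <p>Here the 120-pixel blob covers only 1.2% of the figure and is discarded, so two components remain and the figure would be classified as compound.</p>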
        <p>An illustration of the whole scheme is presented in Fig. 4. If more than one
subfigure is detected, the given figure is classified as a compound figure; otherwise
it is classified as non-compound. To handle compound figures separated by black rather
than white regions, we invert all images and perform subfigure detection using the same
procedure.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Peak Region Detection Based Scheme</title>
        <p>
          Besides the connected component analysis method described above, we also test
the performance of directly using pixel intensity to segment the figure, as done in
[
          <xref ref-type="bibr" rid="ref2 ref8">2, 8</xref>
          ]. The idea of this method is to find white margins based on pixel
intensity. As indicated in Fig. 2, the images are scanned in two directions, namely
along the x-axis and along the y-axis. Both scanning processes are conducted
iteratively until no more white margins are detected.
        </p>
        <p>Consider an image I represented as a matrix I(x, y), where x is the row
index and y is the column index; let W and H be the total number of rows
and columns, respectively. Assuming that the subfigures are separated by white
margins, the first step is a pixel projection operation along the x-axis and along
the y-axis, as indicated in Fig. 2. Formally:</p>
        <p>
          <disp-formula id="eq1">
            <label>(1)</label>
            <tex-math>I_x = \min_{y \in [1, \ldots, H]} I(x, y), \qquad I_y = \min_{x \in [1, \ldots, W]} I(x, y),</tex-math>
          </disp-formula>
that is, I<sub>x</sub> identifies candidate separating rows and I<sub>y</sub> identifies candidate separating columns.
The next step is to find peak regions within I<sub>x</sub> and I<sub>y</sub>. A peak region is
a contiguous region whose values are all greater than
a predefined threshold. In this work, considering noise and other influential
factors, the threshold is set to 0.85 times the maximum pixel intensity in
the whole image. By comparing with the threshold, we find the peak regions
along the vectors I<sub>x</sub> and I<sub>y</sub>. From this, we obtain the index and the width of each region.
These peak regions are regarded as the margins between subfigures. Based on
these detected margins, the subfigure regions are then calculated.</p>
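        <p>The projection and peak-region search can be sketched as follows. This is an illustrative single pass under stated assumptions: the image is a list of rows with intensities in [0, 1], and the helper names projections and peak_regions are hypothetical. The iterative rescanning of the detected subregions is omitted.</p>
        <preformat><![CDATA[
```python
def projections(image):
    """Return (I_x, I_y): the minimum intensity of every row and every column."""
    i_x = [min(row) for row in image]        # one value per row index x
    i_y = [min(col) for col in zip(*image)]  # one value per column index y
    return i_x, i_y


def peak_regions(profile, threshold):
    """Return (start_index, width) for each maximal run with value > threshold."""
    regions, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i                         # a run begins
        elif value <= threshold and start is not None:
            regions.append((start, i - start))  # the run ended at i - 1
            start = None
    if start is not None:                     # run reaching the end of the profile
        regions.append((start, len(profile) - start))
    return regions


# 4 x 4 toy image with one white row (index 2) and one white column (index 2).
image = [
    [0.2, 0.3, 1.0, 0.4],
    [0.1, 0.5, 1.0, 0.6],
    [1.0, 1.0, 1.0, 1.0],
    [0.3, 0.2, 1.0, 0.1],
]
threshold = 0.85 * max(max(row) for row in image)
i_x, i_y = projections(image)
row_margins = peak_regions(i_x, threshold)
col_margins = peak_regions(i_y, threshold)
```
]]></preformat>
        <p>Taking the minimum rather than the mean means that a single dark pixel is enough to rule out a row or column as a margin, which is why text crossing a margin can hide it.</p>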
        <p>For a specific test image, to eliminate false positives and minimize the
influence of text regions, two different post-processing steps are applied. First,
we set a threshold on the minimum area of a detected peak region. Second,
we measure the ratio between the area of the current segment
and that of the largest segment detected. If the ratio is smaller than 0.3, the
detected segment is classified as a false positive. If more than one
subregion is detected, the input figure is classified as compound; otherwise it is
classified as non-compound. As before, this method assumes that the separation between
subfigures consists of white pixels. To also consider black separators, if the figure is
not classified as compound, we invert the image and repeat the processing steps discussed
above.</p>
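        <p>These post-processing steps can be illustrated along a single axis. In the sketch below the margins are (start, width) pairs, the minimum margin width min_margin_width is a hypothetical parameter (its value is not stated in the text), and 0.3 is the segment ratio given above. All function names are assumptions made for this example.</p>
        <preformat><![CDATA[
```python
def segments_from_margins(length, margins, min_margin_width=1):
    """Split range(length) into (start, width) segments lying between margins.

    Margins narrower than min_margin_width are ignored (hypothetical setting).
    """
    segments, cursor = [], 0
    for start, width in sorted(margins):
        if width < min_margin_width:
            continue
        if start > cursor:
            segments.append((cursor, start - cursor))
        cursor = start + width
    if cursor < length:
        segments.append((cursor, length - cursor))
    return segments


def keep_large_segments(sizes, min_ratio=0.3):
    """Drop segments smaller than min_ratio times the largest segment."""
    if not sizes:
        return []
    largest = max(sizes)
    return [s for s in sizes if s / largest >= min_ratio]


def invert(image):
    """Invert a [0, 1] image so that black separators become white."""
    return [[1.0 - v for v in row] for row in image]


# A 10-pixel axis with a thin spurious margin at 2 and a real margin at 4-5.
segments = segments_from_margins(10, [(2, 1), (4, 2)], min_margin_width=2)
```
]]></preformat>
        <p>With a minimum width of 2, the one-pixel margin is ignored and the axis is split into two segments of width 4 on either side of the real margin.</p>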
      </sec>
      <sec id="sec-3-3">
        <title>Fusion Scheme</title>
        <p>In our work, we also fuse the two schemes described above: connected component
analysis is used as the first step and, if no compound figure is detected, peak
region detection is applied as the second step.</p>
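        <p>The fusion is a simple cascade, sketched below. The two detectors are passed in as callables that return True when a figure is judged compound; their names, and the stubs used in the example, are hypothetical stand-ins for the two schemes.</p>
        <preformat><![CDATA[
```python
def fused_is_compound(image, detect_by_components, detect_by_peaks):
    """Cascade: try connected components first, fall back to peak regions."""
    if detect_by_components(image):
        return True
    return detect_by_peaks(image)


# Stub detectors that record the order in which they are called.
calls = []


def cc_stub(image):
    calls.append("cc")
    return False  # pretend connected component analysis found nothing


def peak_stub(image):
    calls.append("peak")
    return True   # pretend peak region detection found margins


result = fused_is_compound(None, cc_stub, peak_stub)
```
]]></preformat>
        <p>Because the second detector runs only when the first one reports no compound figure, the cascade can only add positive detections to those of connected component analysis.</p>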
        <p>An illustration of the proposed scheme is shown in Fig. 3. Our results showed
that connected component analysis is good at removing false positives caused
by text regions. However, it is not as effective for detecting compound figures
consisting of graph images (e.g. line graphs or diagrams). These types of compound
figures can be detected using the peak region detection approach. We also evaluated
each of the proposed schemes on its own to compare their
respective performance.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Results</title>
      <p>We evaluated the proposed approach on the ImageCLEF 2015 benchmark, which
includes 10,434 different figures. Several illustrations of the experimental results
are provided in Fig. 4. The overall accuracy is calculated as
accuracy = (C<sub>g</sub> / C) × 100%, where C<sub>g</sub> represents the number of correctly
classified figures and C is the
total number of samples in the set. In addition, we also consider the well-known
recall and precision measures, as shown in Table 1. The latter two measures are
calculated as:</p>
      <p>
        <disp-formula id="eq2">
          <label>(2)</label>
          <tex-math>\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN},</tex-math>
        </disp-formula>
where TP is the number of true positives (compound figures) detected by the
proposed scheme, FP is the number of false positives (figures that are
non-compound, but labeled as compound by our scheme), and FN is the number
of false negatives (figures that are compound, but not detected as such by the
proposed approach).</p>
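      <p>For concreteness, the three measures can be computed from lists of predicted and true labels as in the sketch below, where True stands for compound. The function name and the toy labels are illustrative only and are not taken from the benchmark.</p>
      <preformat><![CDATA[
```python
def evaluate(predicted, actual):
    """Return accuracy (in percent), precision and recall for boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # compound, detected
    fp = sum(1 for p, a in pairs if p and not a)      # non-compound, flagged
    fn = sum(1 for p, a in pairs if not p and a)      # compound, missed
    tn = sum(1 for p, a in pairs if not p and not a)  # non-compound, passed
    return {
        "accuracy": 100.0 * (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy example with five figures.
scores = evaluate(
    predicted=[True, True, False, False, True],
    actual=[True, False, False, True, True],
)
```
]]></preformat>
      <p>In this toy example three of the five figures are classified correctly, giving an accuracy of 60%, while precision and recall are both 2/3.</p>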
      <p>As listed in Table 1, the connected component analysis based scheme
performs better than the peak region detection based scheme. By combining the two
schemes, we obtained an accuracy of 82.82% on the test dataset.</p>
      <p>
        For the sake of completeness, we also show several cases, in
Fig. 5, in which our system fails to detect or to correctly segment a compound
figure. As illustrated by Fig. 5(a), when the boundaries between subfigures are
thin, our algorithm can correctly classify the given compound figure, but
the subfigure segmentation does not work well. Moreover, segmenting diagrams
remains a challenge, as indicated in Fig. 5(b). Over-segmentation is still a
common problem for this kind of non-compound figure [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Future Work</title>
      <p>In this work, we have studied the problem of compound figure detection. Two
different schemes, as well as an integration of the two, were evaluated. Our
integrated scheme outperforms the other participating systems that use only
visual information by more than 10%. The only
system in this challenge that outperformed ours (by 2.57%) used a combination of textual and
visual information.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgment</title>
      <p>This work was supported by NIH Award 1R56LM011354-01A1.</p>
      <p>Fig. 3. Illustration of the proposed fusion scheme for compound figure detection.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Amin</surname>
          </string-name>
          and
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Mohammed</surname>
          </string-name>
          .
          <article-title>Overview of the ImageCLEF 2015 medical clustering task</article-title>
          .
          <source>In CLEF2015 Working Notes, CEUR Workshop Proceedings</source>
          , Toulouse, France, September 8-11
          <year>2015</year>
          .
          <publisher-name>CEUR-WS.org</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>A.</given-names>
            <surname>Chhatkuli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Foncubierta-Rodríguez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Markonis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Meriaudeau</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.</given-names>
            <surname>Müller</surname>
          </string-name>
          .
          <article-title>Separating compound figures in journal articles to allow for subfigure classification</article-title>
          .
          <source>In SPIE medical imaging</source>
          ,
          <source>pages 86740J–86740J. International Society for Optics and Photonics</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>A.</given-names>
            <surname>García Seco de Herrera</surname>
          </string-name>
          , H. Müller, and
          <string-name>
            <given-names>S.</given-names>
            <surname>Bromuri</surname>
          </string-name>
          .
          <article-title>Overview of the ImageCLEF 2015 medical classification task</article-title>
          .
          <source>In Working Notes of CLEF 2015 (Cross Language Evaluation Forum), CEUR Workshop Proceedings</source>
          . CEUR-WS.org, September
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>A.</given-names>
            <surname>Gilbert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Piras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Yan</surname>
          </string-name>
          , E. Dellandrea,
          <string-name>
            <given-names>R.</given-names>
            <surname>Gaizauskas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Villegas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Mikolajczyk</surname>
          </string-name>
          .
          <article-title>Overview of the ImageCLEF 2015 Scalable Image Annotation, Localization and Sentence Generation task</article-title>
          .
          <source>In CLEF2015 Working Notes, CEUR Workshop Proceedings</source>
          , Toulouse, France, September 8-11
          <year>2015</year>
          .
          <publisher-name>CEUR-WS.org</publisher-name>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>R.</given-names>
            <surname>Gonzalez</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Richard</surname>
          </string-name>
          .
          <source>Digital Image Processing</source>
          . Prentice-Hall,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>H.</given-names>
            <surname>Shatkay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Blostein</surname>
          </string-name>
          .
          <article-title>Integrating image data into biomedical text categorization</article-title>
          .
          <source>Bioinformatics</source>
          ,
          <volume>22</volume>
          (
          <issue>14</issue>
          ):
          <fpage>e446</fpage>
          –
          <lpage>e453</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>M.</given-names>
            <surname>Villegas</surname>
          </string-name>
          , H. Muller,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gilbert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Piras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mikolajczyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bromuri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Amin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Mohammed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Acar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Uskudarli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. B.</given-names>
            <surname>Marvasti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. F.</given-names>
            <surname>Aldana</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>del Mar Roldán García</surname>
          </string-name>
          .
          <article-title>General Overview of ImageCLEF at the CLEF 2015 Labs</article-title>
          .
          <source>Lecture Notes in Computer Science</source>
          . Springer International Publishing,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>X.</given-names>
            <surname>Yuan</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Ang</surname>
          </string-name>
          .
          <article-title>A novel figure panel classification and extraction method for document image understanding</article-title>
          .
          <source>International journal of data mining and bioinformatics</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ):
          <fpage>22</fpage>
          –
          <lpage>36</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>