<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Attention U-Net Based Adversarial Architectures for Chest X-ray Lung Segmentation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Gusztáv Gaál</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Balázs Maga</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>X-ray is by far the most common among medical imaging modalities, being faster, more accessible, and more cost-effective compared to other radiographic methods. Chest X-ray (CXR) is the most commonly requested test due to its contribution to the early detection of lung cancer. The most important biomarkers in detecting lung cancer are nodules, and in finding those, lung segmentation of chest X-rays is essential. Another goal is interpretability, helping radiologists integrate computer-aided detection methods into their diagnostic pipeline and greatly reducing their workload. For this reason, a robust algorithm to perform this otherwise arduous segmentation task is much desired in the field of medical imaging. In this work, we present a novel deep learning approach that uses state-of-the-art fully convolutional neural networks in conjunction with an adversarial critic model. Our network generalized well to CXR images of unseen datasets with different patient profiles, achieving a final DSC of 97.5% on the JSRT CXR dataset.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        X-ray is the most commonly performed radiographic examination,
being significantly easier to access, cheaper and faster to carry out
than computed tomography (CT), diagnostic ultrasound and
magnetic resonance imaging (MRI), as well as delivering a lower dose of
radiation than a CT scan. According to the publicly
available, official data of the National Health Service ([
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]), in the period
from February 2017 to February 2018, the count of imaging activity
was about 41 million in England, out of which almost 22 million were
plain X-rays. Many of these imaging tests might contribute to early
diagnosis of cancer, amongst which chest X-ray is the most commonly
requested one by general practitioners. In order to identify lung
nodules, lung segmentation of chest X-rays is essential, and this step
is vital in other diagnostic pipelines as well, such as calculating the
cardiothoracic ratio, which is the primary indicator of cardiomegaly.
For this reason, a robust algorithm to perform this otherwise arduous
segmentation task is much desired in the field of medical imaging.
      </p>
      <p>
        Semantic segmentation aims to solve the challenging problem of
assigning a pre-defined class to each pixel of the image. This task
requires a high level of visual understanding, in which
state-of-the-art performance is attained by methods utilizing Fully Convolutional
Networks (FCN) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. In [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], adversarial training is used to enhance
segmentation of colored images. This idea was incorporated into [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
in order to segment chest X-rays with a fully convolutional,
residual neural network. Recently, Mask R-CNN [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] has been utilized to perform
instance segmentation on chest X-rays and obtained state-of-the-art
results [
        <xref ref-type="bibr" rid="ref12 ref5">12, 5</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>DEEP LEARNING APPROACH</title>
    </sec>
    <sec id="sec-3">
      <title>Network Architecture</title>
      <p>Our goal is to produce accurate organ segmentation masks on chest
X-rays, meaning that for each input image we want dense pixel-wise
predictions of whether a given pixel is part of the left lung, the
right lung, the heart, or none of the above.</p>
      <p>
        For this purpose Fully Convolutional Networks (FCNs) are known to
significantly outperform other widely used registration-based
methods. Specifically, we applied a U-Net architecture, thus enabling us
to efficiently compute the segmentation mask at the same resolution
as the input images. The fully convolutional architecture also enables
the use of images of different resolutions, since unlike standard
convolutional networks, FCNs don’t contain input-size dependent layers.
In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] it has been shown that for medical image analysis tasks the
integration of the proposed Attention Gates (AGs) improved the
accuracy of the segmentation models, while preserving computational
efficiency. The architecture of the proposed Attention U-Net is
described by Figure 1. Without the use of AGs, it’s common practice
to use cascade CNNs, selecting a Region Of Interest (ROI) with
another CNN where the target organ is likely contained. With the use of
AGs we eliminate the need for such a preselecting network; instead,
the Attention U-Net learns to focus on the most important local features
and dulls down the less relevant ones. We note that the dulling of less
relevant local features also results in decreased false positive rates.
In order to enhance the performance of Attention U-Net, we
further experimented with adversarial techniques, motivated by [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In
that work, the authors first designed a Fully Convolutional Network
(FCN) for the lung segmentation task, and noted that in certain cases
the network tends to segment abnormal and incorrect organ shapes.
For example, the apex of the ribcage might be mistaken for an
internal rib bone, resulting in the mask “bleeding out” to the background,
which has similar intensity as the lung field. To address this issue,
they developed an adversarial scheme, leading to a model which they
call Structure Correcting Adversarial Network (SCAN). This
architecture is based on the idea of Generative Adversarial Networks [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
They use the pretrained Fully Convolutional Network as the generator
of a Generative Adversarial Network, and they also train a critic
network which is fed the ground truth mask, the predicted mask and
optionally the original image. The critic network has roughly the same
architecture, resulting in similar capacity. This approach forces the
generator to segment more realistic masks, eventually removing
obviously wrong shapes.
      </p>
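      <p>To make the role of the AGs concrete, the following is a minimal sketch of an additive attention gate in the spirit of [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], written with the Keras-TensorFlow libraries used elsewhere in this work; the function name, the number of intermediate filters, and the assumption that the skip connection has exactly twice the spatial resolution of the gating signal are illustrative choices, not a description of our exact implementation.</p>
      <preformat>
# Minimal sketch of an additive attention gate (in the spirit of [9]).
# Assumes Keras/TensorFlow; names and filter counts are illustrative.
from tensorflow.keras import layers

def attention_gate(x, g, inter_channels):
    """x: skip-connection features, g: coarser gating signal (half the spatial size of x)."""
    theta_x = layers.Conv2D(inter_channels, 1, strides=2, padding="same")(x)   # bring x to g's resolution
    phi_g = layers.Conv2D(inter_channels, 1, padding="same")(g)
    f = layers.Activation("relu")(layers.add([theta_x, phi_g]))                # additive attention
    psi = layers.Conv2D(1, 1, padding="same", activation="sigmoid")(f)         # attention coefficients
    alpha = layers.UpSampling2D(size=(2, 2), interpolation="bilinear")(psi)    # back to x's resolution
    return layers.multiply([x, alpha])                                         # gate the skip connection
      </preformat>
      <p>The gated skip features are then concatenated with the upsampled decoder features exactly as in a standard U-Net skip connection.</p>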
      <p>
        In our work, besides the standard Attention U-Net, we also created
a network of analogous structure, in which the FCN used in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] is
replaced by the Attention U-Net. We did not introduce any
modification in the critic model design; such experiments are left to future
work.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Tversky Loss</title>
      <p>In the field of medical imaging, the Dice Score Coefficient (DSC) is
probably the most widespread and simple way to measure the overlap
ratio of the masks and the ground truth, and hence to compare and
evaluate segmentations. Given two sets of pixels $X, Y$, their DSC is
$$\mathrm{DSC}(X, Y) = \frac{2|X \cap Y|}{|X| + |Y|}.$$
If $Y$ is in fact the result of a test about which pixels are in $X$, we can
rewrite it with the usual notation of true/false positive (TP/FP) and false
negative (FN) as
$$\mathrm{DSC}(X, Y) = \frac{2TP}{2TP + FN + FP}.$$
We would like to use this concept in our setup. The class $c$ we would
like to segment corresponds to a set, but it is more appropriate to
consider its indicator function $g$, that is, $g_{i,c} \in \{0, 1\}$ equals 1 if and
only if the $i$th pixel belongs to the object. On the other hand, our
prediction is a probability for each pixel, denoted by $p_{i,c} \in [0, 1]$. Then
the Dice Score of the prediction, in the spirit of the above description, is
$$\mathrm{DSC} = \frac{\sum_{i=1}^{N} p_{i,c} g_{i,c} + \varepsilon}{\sum_{i=1}^{N} (p_{i,c} + g_{i,c}) + \varepsilon},$$
where $N$ is the total number of pixels, and $\varepsilon$ is introduced for the
sake of numerical stability and to avoid division by 0. The linear Dice
Loss (DL) of the multiclass prediction is then
$$DL = \sum_{c} (1 - \mathrm{DSC}_c).$$
A deficiency of Dice Loss is that it penalizes false negative and
false positive predictions equally, which results in high precision but
low recall. For example, practice shows that if the regions of interest
(ROI) are small, false negative pixels need to have a higher weight
than false positive ones. Mathematically this obstacle is easily
overcome by introducing weights $\alpha, \beta$ as tuneable parameters, resulting
in the definition of the Tversky similarity index [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]:
$$TI_c = \frac{\sum_{i=1}^{N} p_{i,c} g_{i,c} + \varepsilon}{\sum_{i=1}^{N} p_{i,c} g_{i,c} + \alpha \sum_{i=1}^{N} p_{i,c} \bar{g}_{i,c} + \beta \sum_{i=1}^{N} \bar{p}_{i,c} g_{i,c} + \varepsilon},$$
where $\bar{p}_{i,c} = 1 - p_{i,c}$ and $\bar{g}_{i,c} = 1 - g_{i,c}$, that is, the overline simply
stands for the complement of the class.</p>
      <p>Tversky Loss is obtained from the Tversky index as Dice Loss was
obtained from the Dice Score Coefficient:
$$TL = \sum_{c} (1 - TI_c).$$
Another issue with the Dice Loss is that it struggles to segment small
ROIs, as they do not contribute to the loss significantly. This difficulty
was addressed in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], where the authors introduced the Focal
Tversky Loss in order to improve the performance of their lesion
segmentation model:
$$FTL = \sum_{c} (1 - TI_c)^{1/\gamma},$$
where $\gamma \in [1, 3]$. In practice, if a pixel is misclassified with a
high Tversky index, the Focal Tversky Loss is unaffected. However,
if the Tversky index is small and the pixel is misclassified, the Focal
Tversky Loss will decrease significantly.</p>
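      <p>As an illustration, the Tversky index and the Focal Tversky Loss above translate into a few lines of Keras-TensorFlow code; the sketch below is an assumption-laden example rather than our exact implementation, and the parameter values for α, β, γ are placeholders chosen so that false negatives are weighted more heavily, in line with the discussion above.</p>
      <preformat>
# Sketch of the Focal Tversky Loss in Keras/TensorFlow.
# alpha, beta, gamma are illustrative values, not our tuned settings;
# beta larger than alpha weights false negatives more heavily than false positives.
import tensorflow.keras.backend as K

def focal_tversky_loss(y_true, y_pred, alpha=0.3, beta=0.7, gamma=4.0 / 3.0, eps=1e-7):
    num_classes = K.int_shape(y_pred)[-1]
    p = K.reshape(y_pred, (-1, num_classes))                 # p_{i,c}
    g = K.reshape(y_true, (-1, num_classes))                 # g_{i,c}
    tp = K.sum(p * g, axis=0)                                # sum_i p_{i,c} g_{i,c}
    fp = K.sum(p * (1.0 - g), axis=0)                        # sum_i p_{i,c} (1 - g_{i,c})
    fn = K.sum((1.0 - p) * g, axis=0)                        # sum_i (1 - p_{i,c}) g_{i,c}
    ti = (tp + eps) / (tp + alpha * fp + beta * fn + eps)    # Tversky index per class
    return K.sum(K.pow(1.0 - ti, 1.0 / gamma))               # FTL = sum_c (1 - TI_c)^(1/gamma)
      </preformat>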
    </sec>
    <sec id="sec-5">
      <title>Training</title>
      <p>
        The training of our structure correcting network requires a somewhat
longer explanation; we directly follow the footsteps of [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
Let S, D be the segmentation network and the critic network,
respectively. The data consist of the input images $x_i$ and the
associated mask labels $y_i$, where $x_i$ is of shape $[H, W, 1]$ for a
single-channel gray-scale image with height $H$ and width $W$, and $y_i$ is
of shape $[H, W, C]$, where $C$ is the number of classes including the
background. Note that for each pixel location $(j, k)$, $y_{ijkc} = 1$ for
the labeled class channel $c$, while the rest of the channels are zero
($y_{ijkc'} = 0$ for $c' \neq c$). We use $S(x) \in [0, 1]^{[H, W, C]}$ to denote the
class probabilities predicted by S at each pixel location such that the
class probabilities normalize to 1 at each pixel. Let $D(x_i, y)$ be the
scalar probability estimate of $y$ coming from the training data. They
defined the optimization problem as
$$\min_S \max_D \; J(S, D) := \sum_{i=1}^{N} \Big\{ J_s(S(x_i), y_i) - \lambda \big[ J_d(D(x_i, y_i), 1) + J_d(D(x_i, S(x_i)), 0) \big] \Big\}, \qquad (1)$$
where
$$J_s(\hat{y}, y) := -\frac{1}{HW} \sum_{j,k} \sum_{c=1}^{C} y_{jkc} \ln \hat{y}_{jkc}$$
is the multiclass cross-entropy loss for predicted mask $\hat{y}$ averaged
over all pixels, and
$$J_d(\hat{t}, t) := -\big[ t \ln \hat{t} + (1 - t) \ln(1 - \hat{t}) \big]$$
is the binary logistic loss for the critic’s prediction. Here $\lambda$ is a tuning
parameter balancing the pixel-wise loss and the adversarial loss. We can
solve equation (1) by alternating between optimizing S and
optimizing D using their respective loss functions. This is a point where
we introduced a modification: instead of using the multiclass
cross-entropy loss $J_s(\hat{y}, y)$ in the first term, we applied the Focal Tversky
Loss $FTL(\hat{y}, y)$.
      </p>
      <p>Now, since the first term in equation (1) does not depend on D, we
can train our critic network by minimizing the following objective
with respect to D for a fixed S:
$$\sum_{i=1}^{N} J_d(D(x_i, y_i), 1) + J_d(D(x_i, S(x_i)), 0).$$
Moreover, given a fixed D, we train the segmentation network by
minimizing the following objective with respect to S:
$$\sum_{i=1}^{N} FTL(S(x_i), y_i) + \lambda \, J_d(D(x_i, S(x_i)), 1).$$
Following the recommendation in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we use $J_d(D(x_i, S(x_i)), 1)$
in place of $-J_d(D(x_i, S(x_i)), 0)$, as it leads to stronger gradient
signals. After tests on the value of $\lambda$, we decided to use $\lambda = 0.1$.
      </p>
      <p>
        Concerning the training schedule, we found that after pretraining
the generator for 50 epochs, we can train the adversarial network for
50 epochs, in which we perform one optimization step on the critic
network after every 5 optimization steps on the generator. This choice
of balance is also borrowed from [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]; however, we note that the
training of our network is much faster.
      </p>
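      <p>For concreteness, the alternating scheme with the 5:1 generator/critic step ratio and λ = 0.1 can be sketched as below. This is a schematic TensorFlow 2 version under assumed model interfaces (a segmentation model S and a critic D taking the image and a mask as inputs), not a verbatim excerpt of our code, and it reuses the focal_tversky_loss sketch given above.</p>
      <preformat>
# Schematic alternating adversarial training step (after [13]).
# Assumes TensorFlow 2.x eager execution; S, D, the optimizers and the
# two-input critic interface are illustrative assumptions.
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()   # binary logistic loss J_d
lam = 0.1                                    # lambda, weight of the adversarial term

def train_step(S, D, opt_S, opt_D, x, y, step):
    # Segmentation update: minimize FTL(S(x), y) + lambda * J_d(D(x, S(x)), 1)
    with tf.GradientTape() as tape:
        pred = S(x, training=True)
        d_fake = D([x, pred], training=False)
        loss_S = focal_tversky_loss(y, pred) + lam * bce(tf.ones_like(d_fake), d_fake)
    opt_S.apply_gradients(zip(tape.gradient(loss_S, S.trainable_variables),
                              S.trainable_variables))

    # Critic update once per 5 generator steps:
    # minimize J_d(D(x, y), 1) + J_d(D(x, S(x)), 0) for fixed S
    if step % 5 == 0:
        with tf.GradientTape() as tape:
            d_real = D([x, y], training=True)
            d_fake = D([x, S(x, training=False)], training=True)
            loss_D = (bce(tf.ones_like(d_real), d_real) +
                      bce(tf.zeros_like(d_fake), d_fake))
        opt_D.apply_gradients(zip(tape.gradient(loss_D, D.trainable_variables),
                                  D.trainable_variables))
      </preformat>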
    </sec>
    <sec id="sec-6">
      <title>DATASETS</title>
      <p>
        For training and validation data, we used the Japanese Society
of Radiological Technology (JSRT) dataset [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], as well as the
Montgomery and Shenzhen datasets [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], all of which are public
datasets of chest X-rays with available organ segmentation masks
reviewed by expert radiologists.
      </p>
      <p>The JSRT dataset contains a total of 247 images, of which 154
contain lung nodules. The X-rays are all of 2048 × 2048 resolution
and have 12-bit grayscale levels. Both lung and heart segmentation
masks are available for this dataset.</p>
      <p>The Montgomery dataset contains 138 chest X-rays, of which 80
X-rays are from healthy patients, and 58 are from patients with
tuberculosis. The X-rays have a resolution of either 4020 × 4892 or
4892 × 4020, and have 12-bit grayscale levels as well. In the case of
this dataset, only lung segmentation masks are publicly available.</p>
      <p>The Shenzhen dataset contains a total of 662 chest X-rays, of which
326 are of healthy patients, and in a similar fashion, 336 are of
patients with tuberculosis. The images vary in size, but all are of
high resolution, with 8-bit grayscale levels. Only lung segmentation
masks are publicly available for this dataset.</p>
    </sec>
    <sec id="sec-7">
      <title>Preprocessing Data</title>
      <p>X-rays are grayscale images with typically low contrast, which
makes their analysis a difficult task. This obstacle might be
overcome by using some sort of histogram equalization technique. The
idea of standard histogram equalization is spreading out the most
frequent intensity values to a higher range of the intensity domain
[0, 255] by modifying the intensities so that their cumulative
distribution function (CDF) on the complete modified image is as close
to the CDF of the uniform distribution as possible. Improvements
might be made by using adaptive histogram equalization, in which
the above method is not utilized globally, but separately on pieces of
the image, in order to enhance local contrasts. However, this
technique might overamplify noise in near-constant regions, hence our
choice was to use Contrast Limited Adaptive Histogram Equalization
(CLAHE), which counteracts this effect by clipping the histogram at
a predefined value before calculating the CDF, and redistributing this
clipped part of the histogram equally among all the histogram bins.
Applying CLAHE to an X-ray image has visually appealing results,
as displayed in Figure 3. As our experiments showed, it does not
merely help human vision, but also neural networks.
The images were then resized to 512 × 512 resolution and mapped
to [−1, 1] before being fed to our network.</p>
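      <p>A minimal version of this preprocessing pipeline could look as follows in Python with OpenCV; the clip limit and tile grid size are assumed illustrative values rather than our exact settings, and the input is assumed to be an 8-bit grayscale image.</p>
      <preformat>
# Sketch of the preprocessing: CLAHE, resize to 512 x 512, rescale to [-1, 1].
# clipLimit and tileGridSize are illustrative assumptions.
import cv2
import numpy as np

def preprocess(img_u8):
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    eq = clahe.apply(img_u8)                                  # contrast-limited adaptive equalization
    eq = cv2.resize(eq, (512, 512), interpolation=cv2.INTER_AREA)
    x = eq.astype(np.float32) / 127.5 - 1.0                   # map [0, 255] to [-1, 1]
    return x[..., np.newaxis]                                 # add a channel dimension
      </preformat>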
    </sec>
    <sec id="sec-8">
      <title>EXPERIMENTS AND RESULTS</title>
      <p>The aforementioned Attention U-Net architecture was implemented
using the Keras-TensorFlow Python libraries; we fed it our
dataset and trained for 40 epochs with 8 X-ray scans in each batch.
Our optimizer of choice was stochastic gradient descent, having
found that Adam failed to converge in many cases. As the loss function,
we applied the Focal Tversky Loss.</p>
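      <p>In code, this training configuration amounts to something along the following lines, assuming a Keras model attention_unet and the focal_tversky_loss sketch from above; the SGD learning rate and momentum are illustrative assumptions rather than our exact values.</p>
      <preformat>
# Sketch of the training configuration described above.
# attention_unet, train_images, train_masks, val_images, val_masks are
# assumed to be defined elsewhere; SGD hyperparameters are illustrative.
from tensorflow.keras.optimizers import SGD

attention_unet.compile(optimizer=SGD(learning_rate=0.01, momentum=0.9),
                       loss=focal_tversky_loss)
attention_unet.fit(train_images, train_masks,
                   batch_size=8, epochs=40,
                   validation_data=(val_images, val_masks))
      </preformat>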
      <p>We found that applying various data augmentation
techniques, such as flipping, rotating, and shearing the image, as well as
increasing or decreasing the brightness of the image, was of no help
and just resulted in slower convergence.</p>
      <p>
        Using the Attention U-Net infrastructure, we managed to reach
a dice score of 0.9628 for the lungs. Unlike in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], where no
major preprocessing was done, with our preprocessing method the
network performed very well even if the test and validation
sets were from different datasets. This is extremely important for
real-world applications, as X-ray images from different machines are
significantly different, largely dependent on the specific calibration
of each machine; thus it is no trivial task to accurately evaluate
X-rays from machines from which no images were in the
training set.</p>
      <p>[Table: Dice scores of the plain ATTN U-Net and of our adversarial variant (Ours, Adv. ATTN); the recoverable entry reads 97.3 ± 0.8%.]</p>
      <p>We note that even though introducing the adversarial scheme in
our setting increased the dice scores, the improvement was not as
drastic as in the case of the FCN and SCAN. By checking the masks
generated by the vanilla Attention U-Net, we found that this
phenomenon can be attributed to the fact that while the FCN
occasionally produces abnormally shaped masks, thanks to our preprocessing
steps the Attention U-Net does not commit this mistake.
Consequently, the adversarial scheme is responsible only for subtle shape
improvements, which is reflected less spectacularly in the Dice Score.</p>
    </sec>
    <sec id="sec-9">
      <title>FUTURE WORK</title>
      <p>
        So far we have not experimented with the architecture of the critic
network, as we found the performance of the architecture in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]
completely satisfactory. However, it would be desirable to carry out further
tests in this direction in order to achieve a better understanding of the
role of the adversarial scheme.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Nabila</given-names>
            <surname>Abraham</surname>
          </string-name>
          and Naimul Mefraz Khan,
          <article-title>'A novel focal Tversky loss function with improved attention U-Net for lesion segmentation'</article-title>
          ,
          <source>in 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI</source>
          <year>2019</year>
          ), pp.
          <fpage>683</fpage>
          -
          <lpage>687</lpage>
          . IEEE, (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>NHS England and NHS Improvement</string-name>
          ,
          <article-title>'Diagnostic imaging dataset statistical release'</article-title>
          , (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Ian</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          , Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio, '
          <article-title>Generative adversarial nets'</article-title>
          ,
          <source>in Advances in neural information processing systems</source>
          , pp.
          <fpage>2672</fpage>
          -
          <lpage>2680</lpage>
          , (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Kaiming</given-names>
            <surname>He</surname>
          </string-name>
          , Georgia Gkioxari, Piotr Dollár, and Ross Girshick,
          <article-title>'Mask R-CNN'</article-title>
          ,
          <source>in Proceedings of the IEEE international conference on computer vision</source>
          , pp.
          <fpage>2961</fpage>
          -
          <lpage>2969</lpage>
          , (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Qinhua</given-names>
            <surname>Hu</surname>
          </string-name>
          , Luís Fabrício de F Souza, Gabriel Bandeira Holanda, Shara SA Alves, Francisco Hércules dos S Silva, Tao Han, and Pedro P Rebouças Filho,
          <article-title>'An effective approach for CT lung segmentation using mask region-based convolutional neural networks'</article-title>
          ,
          <source>Artificial Intelligence in Medicine</source>
          ,
          <volume>101792</volume>
          , (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Stefan</given-names>
            <surname>Jaeger</surname>
          </string-name>
          , Sema Candemir, Sameer Antani, Yì-Xiáng J Wáng, Pu-Xuan Lu, and George Thoma,
          <article-title>'Two public chest X-ray datasets for computer-aided screening of pulmonary diseases'</article-title>
          ,
          <source>Quantitative imaging in medicine and surgery</source>
          ,
          <volume>4</volume>
          (
          <issue>6</issue>
          ),
          <fpage>475</fpage>
          , (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Jonathan</given-names>
            <surname>Long</surname>
          </string-name>
          , Evan Shelhamer, and Trevor Darrell, '
          <article-title>Fully convolutional networks for semantic segmentation'</article-title>
          ,
          <source>in Proceedings of the IEEE conference on computer vision and pattern recognition</source>
          , pp.
          <fpage>3431</fpage>
          -
          <lpage>3440</lpage>
          , (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Pauline</given-names>
            <surname>Luc</surname>
          </string-name>
          , Camille Couprie, Soumith Chintala, and Jakob Verbeek, '
          <article-title>Semantic segmentation using adversarial networks'</article-title>
          ,
          <source>arXiv preprint arXiv:1611.08408</source>
          , (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Ozan</given-names>
            <surname>Oktay</surname>
          </string-name>
          , Jo Schlemper, Loic Le Folgoc,
          <string-name>
            <given-names>Matthew</given-names>
            <surname>Lee</surname>
          </string-name>
          , Mattias Heinrich, Kazunari Misawa, Kensaku Mori,
          <string-name>
            <given-names>Steven</given-names>
            <surname>McDonagh</surname>
          </string-name>
          , Nils Y Hammerla,
          <string-name>
            <given-names>Bernhard</given-names>
            <surname>Kainz</surname>
          </string-name>
          , et al.,
          <article-title>'Attention U-Net: Learning where to look for the pancreas'</article-title>
          ,
          <source>arXiv preprint arXiv:1804.03999</source>
          , (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Junji</given-names>
            <surname>Shiraishi</surname>
          </string-name>
          , Shigehiko Katsuragawa, Junpei Ikezoe, Tsuneo Matsumoto, Takeshi Kobayashi, Ken-ichi
          <string-name>
            <surname>Komatsu</surname>
          </string-name>
          , Mitate Matsui, Hiroshi Fujita, Yoshie Kodera, and Kunio Doi, '
          <article-title>Development of a digital image database for chest radiographs with and without a lung nodule: receiver operating characteristic analysis of radiologists' detection of pulmonary nodules'</article-title>
          ,
          <source>American Journal of Roentgenology</source>
          ,
          <volume>174</volume>
          (
          <issue>1</issue>
          ),
          <fpage>71</fpage>
          -
          <lpage>74</lpage>
          , (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Amos</given-names>
            <surname>Tversky</surname>
          </string-name>
          , 'Features of similarity.', Psychological review,
          <volume>84</volume>
          (
          <issue>4</issue>
          ),
          <fpage>327</fpage>
          , (
          <year>1977</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Jie</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Zhigang</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Rui</given-names>
            <surname>Jiang</surname>
          </string-name>
          , and Zhen Xie,
          <article-title>'Instance segmentation of anatomical structures in chest radiographs'</article-title>
          ,
          <source>in 2019 IEEE 32nd International Symposium on Computer-Based Medical Systems (CBMS)</source>
          , pp.
          <fpage>441</fpage>
          -
          <lpage>446</lpage>
          . IEEE, (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Wei</given-names>
            <surname>Dai</surname>
          </string-name>
          , Nanqing Dong, Zeya Wang, Xiaodan Liang, Hao Zhang, and Eric P Xing,
          <article-title>'SCAN: Structure correcting adversarial network for organ segmentation in chest X-rays'</article-title>
          ,
          <source>in Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop</source>
          , DLMIA 2018,
          <article-title>and</article-title>
          8th International Workshop, ML-CDS
          <year>2018</year>
          ,
          <article-title>Held in Conjunction with MICCAI 2018, Granada</article-title>
          , Spain,
          <year>September 20</year>
          ,
          <year>2018</year>
          , Proceedings, volume
          <volume>11045</volume>
          , p.
          <fpage>263</fpage>
          . Springer, (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>