<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Evaluation of Expectation Maximization for the Segmentation of Cervical Cell Nuclei</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alexander Ihlow</string-name>
          <email>alexander.ihlow@tu-ilmenau.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christian Held</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Christoph Rothaug</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Claudia Dach</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas Wittenberg</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dirk Steckhan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Fraunhofer Institute for Integrated Circuits IIS</institution>
          ,
          <addr-line>Erlangen</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Ilmenau University of Technology</institution>
        </aff>
      </contrib-group>
      <fpage>139</fpage>
      <lpage>143</lpage>
      <abstract>
        <p>As cervical cancer is one of the most common cancers worldwide, screening programs have been established. For this task, stained slides of cervical cells are visually assessed under a microscope for dysplastic or malignant cells. To support this challenging task, image processing methods offer advantages for objective classification. As the cell nuclei carry a high extent of visual information, all depicted cell nuclei need to be delineated. Within this work, the expectation maximization (EM) algorithm is evaluated as a yet unused method for this task. The EM was trained on 33 micrographs in which nuclei were manually annotated as reference. The EM was evaluated with varying numbers of classes and with four different color spaces (RGB, Lab, HSV, polar HSV). Segmentation results for all images and parameters were compared to the ground truth, yielding the average accuracy and standard deviation over all cells. The best color spaces were RGB and Lab. The best number of classes to be used in the color space was found to be K = 3. It can be concluded that the EM is an appropriate and useful approach for cell nuclei segmentation, but needs some image post-processing for the elimination of false positives.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Since cervical cancer is one of the most common cancers amongst women
worldwide, screening programs have been established and are carried out in many
countries. For the screening, cervical cells are obtained from the portio vaginalis
using a brush during routine examination. The cells are then prepared either directly
on a slide or by means of a monolayer preparation (Fig. 1a). To make
the cells visible for microscopic examinations, the slide is stained by the method
of Papanicolaou, also known as PAP-stain [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>For visual assessment the stained slides are put under a microscope and
screened for dysplastic or malignant cells by a trained specialist. Since this
work is tedious and tiring, the viewing and screening process depends on the
professional competence as well as the personal and subjective comprehension of
the specialist. This comprehension may change during a working day depending
on stress, fatigue, and personal issues and may also differ between two people.</p>
      <p>Under these conditions, methods of digital image processing can offer advantages
for a more objective classification of these highly complex images. Under the
assumption of standardized staining and smearing techniques, machines tend to
be neutral and immune to inter- and intra-observer changes and influences.</p>
      <p>As the nuclei of cervical cells carry a high extent of morphological and
textural information, which can be used to diagnose pre-cancerous stages (CIN I-III)
as well as cancer itself, in a primary step all depicted cell nuclei need to be detected
and delineated against the surrounding cell plasma and image background.</p>
    </sec>
    <sec id="sec-2">
      <title>State of the Art</title>
      <p>
        Various approaches have been suggested to solve the problem of cervical cell
nuclei detection and segmentation within automated micrograph analysis.
Morphological image processing [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] is well understood and very suitable for objects
with well-defined size and form [
        <xref ref-type="bibr" rid="ref3 ref4">3, 4</xref>
        ], and nuclei can be detected using a
nucleus-shaped structuring element. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] have suggested the application of a linear
tophat-operator [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] make use of thresholding and morphology, where in a
postprocessing step candidates for cell nuclei are selected or rejected using a fuzzy
clustering approach. An alternative is the approximation of the nuclei borders
by a LoG edge detector [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which are post-processed using morphological
operators for the detection of cell nuclei. Another approach is the circular Hough
transform [
        <xref ref-type="bibr" rid="ref4 ref8">4, 8</xref>
        ], which is applied to detect circular structures in gradient images.
A technique intended to provide further information about cell nuclei is the use
of multi-modal image pairs [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], where an additional fluorescence (FL) staining
is applied that highlights only the cell nuclei. On the FL image, a thresholding
operation can be applied to detect the cell nuclei. A disadvantage is that the FL
image has to be registered to the PAP brightfield image, which is a difficult and
expensive task.
      </p>
      <p>
        Within this work the well-known expectation maximization (EM) approach
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is evaluated as a yet unused method for automated detection and
segmentation of cervical cells.
      </p>
      <sec id="sec-2-1">
        <title>Materials and Methods</title>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Methods</title>
      <p>
        When the EM algorithm is used, a Gaussian mixture model (GMM) is usually applied to
describe the data at hand. The distribution of the annotated color samples x is
modeled by a mixture of K classes,
        <disp-formula><tex-math>f(x) = \sum_{k=1}^{K} \alpha_k \, f(x \mid \theta_k) \quad \text{with} \quad \sum_{k=1}^{K} \alpha_k = 1 \quad \text{and} \quad \int_x f(x \mid \theta_k) \, \mathrm{d}x = 1,</tex-math></disp-formula>
where <inline-formula><tex-math>\alpha_k</tex-math></inline-formula> is the a-priori probability of class k and <inline-formula><tex-math>\theta_k</tex-math></inline-formula>
denotes the set of parameters that describes the distribution of class k. The
multivariate Gaussian is chosen to describe the distribution of the mixture
components due to its convenience. Its equal-probability surfaces describe
(hyper)ellipsoids in the d-dimensional space. Here, d = 3 corresponds to the
tristimulus color spaces RGB, Lab, HSV, and polar HSV. The model parameters
<inline-formula><tex-math>\theta_k</tex-math></inline-formula> consist of the mean vector <inline-formula><tex-math>\mu_k \in \mathbb{R}^{d \times 1}</tex-math></inline-formula>, describing the center of the ellipsoid,
and the covariance matrix <inline-formula><tex-math>\Sigma_k \in \mathbb{R}^{d \times d}</tex-math></inline-formula>, determining its shape and orientation.
This yields a GMM with the components
        <disp-formula id="eq1"><label>(1)</label><tex-math>f(x \mid \theta_k = \{\mu_k, \Sigma_k\}) = \frac{1}{\sqrt{(2\pi)^d \, |\Sigma_k|}} \exp\left( -\frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) \right).</tex-math></disp-formula>
      </p>
      <p>
        The EM algorithm [<xref ref-type="bibr" rid="ref10">10</xref>] is an iterative technique for finding the maximum
likelihood parameter estimates when fitting a distribution to a given data set.
During the iterations, the probability p of each of the N data samples <inline-formula><tex-math>x_n</tex-math></inline-formula>
belonging to class k is calculated by Bayes' theorem, known as the expectation step:
        <disp-formula><tex-math>p(k \mid x_n) = \frac{\alpha_k \, f(x_n \mid \theta_k)}{\sum_{j=1}^{K} \alpha_j \, f(x_n \mid \theta_j)}.</tex-math></disp-formula>
In the subsequent maximization step, updated prior probabilities, mean vectors,
and covariance matrices for each class are calculated using
        <disp-formula><tex-math>\alpha_k^{\mathrm{new}} = \frac{X}{N}, \qquad \mu_k^{\mathrm{new}} = \frac{1}{X} \sum_{n=1}^{N} Y \, x_n, \qquad \Sigma_k^{\mathrm{new}} = \frac{1}{X} \sum_{n=1}^{N} Y \, (x_n - \mu_k^{\mathrm{new}})(x_n - \mu_k^{\mathrm{new}})^T</tex-math></disp-formula>
with <inline-formula><tex-math>X = \sum_{n=1}^{N} p(k \mid x_n; \theta_k)</tex-math></inline-formula> and <inline-formula><tex-math>Y = p(k \mid x_n; \theta_k)</tex-math></inline-formula>.
      </p>
      <p>
        For the training and evaluation of the proposed method, an image data set of
33 cervical micrographs with a spatial resolution of 1000 × 700 pixels was
used, in which all nuclei were manually annotated by an expert as reference
(ground truth). The cells in these samples are typically stained in colors ranging
from basophilic (blue) to eosinophilic (red). The images range from healthy
to a dysplastic CIN III state (nearly tumorous) and thus cover the complete
range of cervical cells. Fig. 1a shows a typical example of a micrograph with
cervical cells, while in Fig. 1b some representative regions of the classes nuclei
and rest (including cell plasma and background) are manually marked. Fig. 1c
depicts the ground truth of cell nuclei used for later performance evaluation.
      </p>
      <p>
        The EM approach was evaluated with various numbers of classes (K = 3, 4, 5, 6),
which describe the number of clouds in the color space, as well as with different
color spaces, namely RGB, Lab, HSV, and polar HSV. For all experiments, the
EM algorithm was terminated after four iterations. Fig. 2 shows the segmentation
results for the example image from Fig. 1a with the parameters described above.
      </p>
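      <p>The expectation and maximization steps described above can be sketched in a few lines of Python with NumPy. This is only a minimal illustration under stated assumptions, not the implementation used in this work: the function names gaussian_pdf and em_gmm are hypothetical, the initial parameters are supplied by the caller, and no safeguards against singular covariance matrices or numerical underflow are included. The default of four iterations mirrors the termination criterion stated above.</p>
      <preformat>
```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Multivariate Gaussian density for each row of x (shape N x d)."""
    d = x.shape[1]
    diff = x - mean
    inv_cov = np.linalg.inv(cov)
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    # Quadratic form (x - mean)^T cov^-1 (x - mean), one value per sample.
    quad = np.einsum("ni,ij,nj->n", diff, inv_cov, diff)
    return norm * np.exp(-0.5 * quad)


def em_gmm(x, means, covs, priors, n_iter=4):
    """Run n_iter EM iterations for a K-class Gaussian mixture model.

    Hypothetical helper: the initialization is an assumption and must be
    provided by the caller (K means, K covariance matrices, K priors).
    """
    x = np.asarray(x, dtype=float)
    means = np.array(means, dtype=float)
    covs = np.array(covs, dtype=float)
    priors = np.array(priors, dtype=float)
    n_samples = x.shape[0]
    n_classes = len(priors)
    for _ in range(n_iter):
        # Expectation step: posterior p(k | x_n) by Bayes' theorem.
        weighted = np.stack(
            [priors[k] * gaussian_pdf(x, means[k], covs[k]) for k in range(n_classes)],
            axis=1,
        )
        post = weighted / weighted.sum(axis=1, keepdims=True)
        # Maximization step: update prior, mean and covariance of each class.
        for k in range(n_classes):
            resp = post[:, k]
            total = resp.sum()  # corresponds to X in the text
            priors[k] = total / n_samples
            means[k] = (resp[:, None] * x).sum(axis=0) / total
            diff = x - means[k]
            outer = np.einsum("ni,nj->nij", diff, diff)
            covs[k] = (resp[:, None, None] * outer).sum(axis=0) / total
    return priors, means, covs, post
```
      </preformat>
      <p>The returned posteriors have one row per sample and one column per class, so a row-wise argmax would give a hard class assignment per pixel.</p>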
      <sec id="sec-3-4">
        <p>[Fig. 2: Segmentation results for 3, 4, 5, and 6 classes in the RGB, Lab, HSV, and polar HSV color spaces.]</p>
        <p>It can be seen that, depending on the two parameters investigated,
namely the number of classes K and the color space, the resulting images show
more or less over- and under-segmentation artifacts.</p>
        <sec id="sec-3-4-1">
          <title>Results</title>
          <p>For evaluation, each image was subdivided into Voronoi regions based on the
ground truth nuclei (Fig. 2). The segmentation accuracy A was determined for
each Voronoi region by
          <disp-formula><tex-math>A = \frac{N_{TP} + N_{TN}}{N_{TP} + N_{FP} + N_{FN} + N_{TN}},</tex-math></disp-formula>
where <inline-formula><tex-math>N_{TP}</tex-math></inline-formula>, <inline-formula><tex-math>N_{FP}</tex-math></inline-formula>, <inline-formula><tex-math>N_{TN}</tex-math></inline-formula>, and <inline-formula><tex-math>N_{FN}</tex-math></inline-formula> denote the numbers of true positive, false positive,
true negative, and false negative pixels, respectively. Based on these measurements, the
average accuracy and standard deviation over all cells were evaluated, as depicted
in Fig. 3. The RGB and Lab color spaces outperform the HSV
and polar HSV color spaces for all K. Furthermore, results obtained by EM
with K = 3 in the RGB color space show the best average accuracy values while
exhibiting the lowest standard deviation. This is consistent with the example
segmentations shown in Fig. 2. Therefore, this parameter set seems to be best
suited for the presented task.</p>
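          <p>The accuracy measure described above can be illustrated with a short NumPy sketch. It is a hedged example, not the evaluation code used in this work: the function names are hypothetical, the Voronoi regions are assumed to be given as an integer label image (region_labels), and the population standard deviation is used, which is an assumption.</p>
          <preformat>
```python
import numpy as np


def segmentation_accuracy(predicted, reference):
    """Pixel accuracy A = (TP + TN) / (TP + FP + FN + TN) of two binary masks."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    n_tp = np.count_nonzero(np.logical_and(predicted, reference))
    n_tn = np.count_nonzero(np.logical_and(~predicted, ~reference))
    n_fp = np.count_nonzero(np.logical_and(predicted, ~reference))
    n_fn = np.count_nonzero(np.logical_and(~predicted, reference))
    return (n_tp + n_tn) / (n_tp + n_fp + n_fn + n_tn)


def accuracy_over_regions(predicted, reference, region_labels):
    """Mean and standard deviation of the accuracy over all labeled regions.

    region_labels is assumed to hold one integer label per pixel, e.g. the
    index of the Voronoi region the pixel belongs to.
    """
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    region_labels = np.asarray(region_labels)
    values = []
    for label in np.unique(region_labels):
        mask = region_labels == label
        values.append(segmentation_accuracy(predicted[mask], reference[mask]))
    values = np.array(values)
    # Population standard deviation (ddof=0); this choice is an assumption.
    return values.mean(), values.std()
```
          </preformat>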
        </sec>
        <sec id="sec-3-4-2">
          <title>Discussion</title>
          <p>We have presented and evaluated an EM approach using different color spaces
(RGB, Lab, HSV, polar HSV) for the segmentation of cervical cell nuclei. Our
results indicate that the RGB and Lab color spaces are most suitable for this
task. Using these color spaces, the EM is able to perform a segmentation with
high sensitivity. A drawback of this method is its strong dependency
on the initialization. Furthermore, some post-processing is needed to eliminate
false positive pixels. Future research could therefore focus on improving the
specificity of the proposed method by appropriate post-processing steps.</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Papanicolaou</surname>
            <given-names>G.</given-names>
          </string-name>
          <article-title>A new procedure for staining vaginal smears</article-title>
          .
          <source>Science</source>
          .
          <year>1942</year>
          ;
          <volume>95</volume>
          :
          <fpage>438</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Soille</surname>
            <given-names>P</given-names>
          </string-name>
          .
          <source>Morphological Image Analysis: Principles &amp; Applications</source>
          . Springer;
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Anoraganingrum</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kröner</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gottfried</surname>
            <given-names>B</given-names>
          </string-name>
          .
          <article-title>Cell segmentation with adaptive region growing</article-title>
          .
          <source>Proc Int Conf Image Anal Process</source>
          .
          <year>1999</year>
          ; p.
          <fpage>27</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Köhler</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wittenberg</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Paulus</surname>
            <given-names>D</given-names>
          </string-name>
          .
          <article-title>Detection and segmentation of cervical cell nuclei</article-title>
          .
          <source>Proc ICMP &amp; BMT, Biomed Tech</source>
          .
          <year>2005</year>
          ;
          <volume>50</volume>
          (
          <issue>1</issue>
          ):
          <fpage>288</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Abmayer</surname>
            <given-names>W</given-names>
          </string-name>
          .
          <source>Einführung in die digitale Bildverarbeitung</source>
          . Teubner;
          <year>1994</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Plissiti</surname>
            <given-names>M</given-names>
          </string-name>
          , et al.
          <article-title>Automated detection of cell nuclei in PAP stained cervical smear images using fuzzy clustering</article-title>
          .
          In:
          <source>Proc EMBEC</source>
          ;
          <year>2008</year>
          . p.
          <fpage>637</fpage>
          -
          <lpage>641</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Loukas</surname>
            <given-names>C</given-names>
          </string-name>
          , et al.
          <article-title>An image analysis-based approach for automated counting of cancer cell nuclei in tissue sections</article-title>
          .
          <source>Cytometry A</source>
          .
          <year>2003</year>
          ;
          <volume>55A</volume>
          :
          <fpage>30</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Thomas</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Davies</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Luxmoore</surname>
            <given-names>A</given-names>
          </string-name>
          .
          <article-title>The Hough transform for locating cell nuclei</article-title>
          .
          In:
          <source>IEE Colloq Appl Image Proc in Mass Health Screening</source>
          ;
          <year>1992</year>
          . p.
          <fpage>8/1</fpage>
          -
          <lpage>8/4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Lindblad</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengtsson</surname>
            <given-names>E</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wählby</surname>
            <given-names>C</given-names>
          </string-name>
          .
          <article-title>Robust cell image segmentation</article-title>
          .
          <source>Pattern Recogn Image Anal</source>
          .
          <year>2004</year>
          ;
          <volume>13</volume>
          (
          <issue>2</issue>
          ):
          <fpage>157</fpage>
          -
          <lpage>167</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Dempster</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Laird</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rubin</surname>
            <given-names>D</given-names>
          </string-name>
          .
          <article-title>Maximum Likelihood from Incomplete Data via the EM algorithm</article-title>
          .
          <source>J Royal Stat Soc B</source>
          .
          <year>1977</year>
          ;
          <volume>39</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>