<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Multi-Level Approach for the Discriminative Generalized Hough Transform</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>H. Ruppertshofen</string-name>
          <email>heike.ruppertshofen@fh-kiel.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>D. Künne</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>C. Lorenz</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Schmidt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>P. Beyerlein</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Z. Salah</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>G. Rose</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>H. Schramm</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Otto-von-Guericke University, Institute of Electronics, Signal Processing and Communication Technology</institution>
          ,
          <addr-line>Magdeburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Philips Research Laboratories</institution>
          ,
          <addr-line>Hamburg</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Applied Sciences Kiel, Institute of Applied Computer Science</institution>
          ,
          <addr-line>Kiel</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Applied Sciences Wildau, Department of Engineering</institution>
          ,
          <addr-line>Wildau</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <fpage>67</fpage>
      <lpage>70</lpage>
      <abstract>
        <p>The Discriminative Generalized Hough Transform (DGHT) is a method for object localization, which combines the standard Generalized Hough Transform (GHT) with a discriminative training technique. In this setup the aim of the discriminative training is to equip the models used in the GHT with individual model point weights such that the localization error in the GHT becomes minimal. In this paper we introduce an extension of the DGHT using a multi-level approach to improve localization accuracy and to reduce processing time. The approach searches for the target object on multiple resolution levels and combines this information for better and faster results. The advantage of the approach is demonstrated on whole-body MR images, which are intended for PET attenuation correction.</p>
      </abstract>
      <kwd-group>
        <kwd>Object Localization</kwd>
        <kwd>Generalized Hough Transform</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Discriminative Training</kwd>
        <kwd>Multi-level Approach</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Problem</title>
    </sec>
    <sec id="sec-2">
      <title>Methods and Material</title>
<p>The GHT [3] is a standard method for object localization, which employs a point model to represent and search for a target object in an image. The model is moved across the edge image corresponding to the original image, and the coincidences of model and edge points are counted in a voting process and accumulated in the Hough space. The Hough cell that receives the highest vote is assumed to represent the true target location.</p>
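      <p>As an illustration of the voting process, the following minimal sketch implements a translation-only GHT in Python. It is a simplified stand-in, without gradient directions, rotation or scale handling, and not the implementation used in this work: each pair of an edge point and a model point casts one vote for a candidate reference-point position, and the Hough cell with the highest count is returned.</p>

```python
import numpy as np

def ght_localize(edge_points, model_points, shape):
    """Translation-only GHT: every (edge point, model point) pair casts one
    vote for a candidate reference-point position in the Hough space."""
    hough = np.zeros(shape)
    for ex, ey in edge_points:
        for mx, my in model_points:
            # If this model point explains this edge point, the object
            # reference point would lie at (ex - mx, ey - my).
            cx, cy = ex - mx, ey - my
            if 0 <= cx < shape[0] and 0 <= cy < shape[1]:
                hough[cx, cy] += 1.0
    # The cell with the highest vote count is the localization result.
    return tuple(int(v) for v in np.unravel_index(np.argmax(hough), shape))

# Toy example: a 3-point model placed at reference point (5, 4) in a 12x12 image.
model = [(0, 0), (1, 0), (0, 2)]
edges = [(5 + mx, 4 + my) for mx, my in model]
print(ght_localize(edges, model, (12, 12)))  # -> (5, 4)
```

      <p>In the DGHT, the vote increment would be the trained weight of the model point instead of the constant 1.0 used here.</p>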
<p>In the DGHT the models are furthermore equipped with individual model point weights. These weights are trained with a discriminative training algorithm [6], based on the information available in the Hough space, i.e. which model point has voted for which Hough cell, with the aim of obtaining a low localization error in the GHT.</p>
<p>In order to obtain a meaningful model and to capture the variability contained in a training dataset, the model for the GHT is generated directly from the image data by taking the edge points from a given volume of interest (VOI) around the target point in a number of images; it is then refined in an iterative procedure. The procedure starts on a small set of training images, on which preliminary model point weights are trained. The current model is then evaluated on a larger development dataset. Images on which the model performs poorly are added to the training dataset, further model points are created from these images, and another iteration is performed until the error on all development images is below a certain threshold or no further improvement is achieved. For more detailed information on the iterative training technique and the DGHT, we refer the reader to [2].</p>
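      <p>The iterative model-building loop described above can be sketched as follows. The callables <italic>extract</italic>, <italic>train</italic> and <italic>evaluate</italic> are hypothetical stand-ins for the actual model point generation, discriminative weight training and DGHT evaluation, which are not specified here; the toy example at the bottom only exercises the control flow.</p>

```python
def build_model(train_imgs, dev_imgs, extract, train, evaluate, thresh, max_iters=10):
    """Sketch of the iterative model building: grow the training set with
    poorly localized development images until the error on all development
    images is below `thresh` or no further improvement is achieved."""
    points = extract(train_imgs)
    weights = train(points, train_imgs)
    prev_worst = float("inf")
    for _ in range(max_iters):
        errors = {img: evaluate(points, weights, img) for img in dev_imgs}
        worst = max(errors.values())
        if worst < thresh or worst >= prev_worst:
            break  # error goal reached or no further improvement
        prev_worst = worst
        # Add the poorly localized images and their edge points to the model.
        hard = [img for img, e in errors.items() if e >= thresh]
        train_imgs = train_imgs + hard
        points = points + extract(hard)
        weights = train(points, train_imgs)
    return points, weights

# Toy stand-ins: "images" are integers, the model simply memorizes them, and
# localization fails exactly on images the model has not yet seen.
extract = lambda imgs: list(imgs)
train = lambda pts, imgs: len(pts)                        # dummy "weights"
evaluate = lambda pts, w, img: 0.0 if img in pts else 1.0
points, weights = build_model([1], [1, 2, 3], extract, train, evaluate, thresh=0.5)
print(sorted(points))  # -> [1, 2, 3]
```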
      <p>In this paper, we introduce the combination of the DGHT with a multi-level approach. To this end, a Gaussian pyramid
of the image is created and the localization is performed on each level. To speed up the procedure the localization is first
executed on the lowest resolution level, where only little detail is visible and the localization is fast, due to the small
image size. For the next higher resolution level it is assumed that the previous localization result is near the target point
such that the search can be constrained to a smaller region. In the following experiments an extract with half the side
lengths of the previously considered image extract is cut out around the localized point, such that the number of pixels
remains almost constant, while the resolution of the image increases. Thus more and more detail is taken into account on
each level while zooming into the target object. The idea of the approach is illustrated in Fig. 1.</p>
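      <p>The coarse-to-fine strategy can be sketched as follows, for the 2-D case and under the assumption of a factor of 2 between pyramid levels and a user-supplied <italic>localize</italic> function (in this work, the DGHT on the level-specific model). The extract keeps a constant pixel count, which corresponds to halving the physical side lengths while the resolution doubles, as described above.</p>

```python
import numpy as np

def multilevel_localize(pyramid, localize):
    """Coarse-to-fine localization sketch. `pyramid` lists 2-D images from
    coarsest to finest (factor 2 between levels); `localize` returns the
    target position within a given image extract."""
    extract_shape = np.array(pyramid[0].shape)   # pixel size of every extract
    origin = np.array([0, 0])                    # extract offset in level coords
    pos = np.asarray(localize(pyramid[0]))       # full search on coarsest level
    for img in pyramid[1:]:
        center = 2 * (origin + pos)              # previous result on this level
        # Cut out a constant-size extract around the previous localization.
        origin = np.clip(center - extract_shape // 2,
                         0, np.array(img.shape) - extract_shape)
        extract = img[origin[0]:origin[0] + extract_shape[0],
                      origin[1]:origin[1] + extract_shape[1]]
        pos = np.asarray(localize(extract))
    return tuple(int(v) for v in origin + pos)   # position at finest resolution

# Toy pyramid with a single bright target pixel on each level.
coarse = np.zeros((8, 8)); coarse[5, 3] = 1.0
mid = np.zeros((16, 16)); mid[10, 6] = 1.0
fine = np.zeros((32, 32)); fine[20, 12] = 1.0
peak = lambda im: np.unravel_index(np.argmax(im), im.shape)
print(multilevel_localize([coarse, mid, fine], peak))  # -> (20, 12)
```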
<p>For each level of the pyramid an individual model is created using edge points from a VOI as stated above. While the VOI needs to be given explicitly for the standard approach, here it is chosen to be centered at the target point with a side length of 75 % of the current image extract. Only part of the extract is used for model generation in order to reduce the model size and to prevent the algorithm from learning the exact fields of view, which might differ on test images.</p>
<p>Fig. 1: Illustration of the steps of the multi-level approach. The localization procedure starts on the low-resolution image on the left. In the subsequent steps (left to right), the procedure zooms into the target by performing the localization on regions of decreasing size and increasing resolution around the previously localized point (white cross hairs). The right image shows an overlay of the different image extracts used for the localization.</p>
      <p>The method is tested on 22 whole-body MR images, which were acquired on a Philips Achieva 3T X-Series MRI system using a whole-body protocol suitable for attenuation correction. As mentioned above, the images are not intended for diagnostic purposes but for the attenuation correction of PET images; therefore a sequence with fast acquisition is applied, which results in images with a rather low resolution of approximately 1.875 mm in plane and a slice thickness of 6 mm. Example images are displayed in Fig. 2.</p>
      <p>The given task for these images is to localize the femur for a subsequent segmentation. To this end the center of the
femoral head of the right leg was chosen as target point, which is marked in the left images in Fig. 2.</p>
<p>From the dataset, 10 images are chosen randomly as development dataset, while the remaining images are used for testing purposes. The image pyramid is created with 4 levels, and since the voxel spacing in z-direction is much larger than the in-plane resolution, the image is downsampled only in x- and y-direction on the first levels of the pyramid to obtain a roughly isotropic resolution.</p>
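      <p>The anisotropic downsampling scheme can be sketched as follows; this is our illustrative interpretation, in which a level is downsampled in z only once the slice spacing no longer exceeds the in-plane spacing, and plain 2x subsampling stands in for the smoothing and decimation of a proper Gaussian pyramid. The spacing values are those of the MR data stated above.</p>

```python
import numpy as np

def anisotropic_pyramid(volume, spacing, levels):
    """Pyramid for anisotropic volumes: downsample z only once the slice
    spacing no longer exceeds the in-plane spacing, so the coarser levels
    become roughly isotropic. Plain 2x subsampling stands in for the
    smoothing and decimation of a proper Gaussian pyramid."""
    pyramid = [(volume, spacing)]
    for _ in range(levels - 1):
        vol, (sx, sy, sz) = pyramid[-1]
        if sz > sx:  # z still coarser than in-plane: downsample x and y only
            vol, spacing = vol[::2, ::2, :], (2 * sx, 2 * sy, sz)
        else:        # roughly isotropic: downsample all three directions
            vol, spacing = vol[::2, ::2, ::2], (2 * sx, 2 * sy, 2 * sz)
        pyramid.append((vol, spacing))
    return pyramid  # list of (volume, spacing in mm), finest level first

# 1.875 mm in-plane and 6 mm slices, as for the whole-body MR data above.
pyr = anisotropic_pyramid(np.zeros((16, 16, 8)), (1.875, 1.875, 6.0), levels=4)
print([v.shape for v, _ in pyr])  # -> [(16, 16, 8), (8, 8, 8), (4, 4, 8), (2, 2, 4)]
```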
<p>To be able to compare the new multi-level approach with the former approach, a second experiment is performed using the same parameter settings. However, instead of utilizing an image pyramid, only one resolution level is employed. To reduce the processing time and memory requirements of the GHT, the image is downsampled once in-slice. For the model creation, a VOI around the femoral head is defined, which can be seen in Fig. 2 (right).</p>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
<p>The results of the two experiments are stated in Table 1. For the training images the results differ only slightly, with good mean localization errors of 3.1 mm and 2.2 mm for the standard and multi-level approach, respectively, which is not surprising since the models were trained on these images. On the unseen test data, however, the standard approach was substantially outperformed by the multi-level approach. The latter achieves a much better localization error, with a mean distance of 3.8 mm between the localized and the annotated point, while the former reaches 6.7 mm and even fails on one of the test images.</p>
      <table-wrap id="tbl1">
        <label>Table 1</label>
        <caption>
          <p>Localization rate, mean localization error and processing time of the standard and the multi-level approach on the training and test images.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th />
              <th colspan="2">Training</th>
              <th colspan="2">Test</th>
              <th rowspan="2">Proc. time</th>
            </tr>
            <tr>
              <th />
              <th>Local. rate</th>
              <th>Mean error</th>
              <th>Local. rate</th>
              <th>Mean error</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Standard approach</td>
              <td>100 %</td>
              <td>3.1 mm</td>
              <td>91.6 %</td>
              <td>6.7 mm</td>
              <td>28 s</td>
            </tr>
            <tr>
              <td>Multi-level approach</td>
              <td>100 %</td>
              <td>2.2 mm</td>
              <td>100 %</td>
              <td>3.8 mm</td>
              <td>3 s</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <p>The advantage of the multi-level approach becomes even more obvious when considering the processing time. Due to the much smaller image extracts (only 1-3 % of the pixels of the original image) on which the localization is performed on each resolution level, the multi-level approach requires only about 10 % of the processing time of the standard approach.</p>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
<p>The multi-level approach has proven to be significantly more accurate and faster than the standard approach for the given task. The main reason for the smaller localization error is the higher resolution used in the multi-level approach. While the standard approach, for computational reasons, performs the localization only on the second resolution level of the image pyramid, the multi-level approach employs the original resolution in its final stage. The obtained localization error of 3.8 mm is remarkable, considering the low resolution of the images and especially the slice thickness of 6 mm. Another advantage of the newly proposed approach, which zooms into the target object, is that it takes the neighborhood of the target into account on a larger scale and thereby makes it easier to localize objects with low contrast or high variability, or objects that can easily be confused with other structures visible in the image. One of the test images has a smaller field of view compared to the rest of the dataset. This image covers the body only from the head to the upper part of the femur, still showing the femoral head but not the remainder of the femoral bone. The standard approach, which relies on the whole bone being visible, fails to deal with this truncation, while the multi-level approach, which orients itself on the larger-scale context, is not affected by the limited field of view.</p>
<p>In the presented example the algorithm succeeded in localizing the target with the necessary accuracy on all resolution levels. However, if necessary, it would be conceivable to keep several candidate points on each level, which could be discarded on higher resolution levels once they are identified as false positives.</p>
<p>Besides facilitating the object localization and making it more robust, the multi-level approach has another large advantage, which lies in the shorter processing time. Since the image extracts used for the localization are much smaller than the original image, only a fraction of the run time is needed, depending on the size of the image and the number of zoom levels used. In the presented example, the processing time was reduced to about 10 % of that of the standard approach. With a runtime of 3 s, the application of the algorithm in 3D becomes feasible. Furthermore, the procedure is not yet optimized for speed, so a further reduction of the processing time is to be expected.</p>
<p>In future work, the usage of the demonstrated localization procedure in combination with the segmentation for the attenuation correction will be examined. Since only little anatomical detail is visible in the images, a precise positioning of the segmentation models is needed, a requirement we are confident the presented approach can fulfill.</p>
      <p>The authors would like to thank the Department of Radiology, Mt. Sinai School of Medicine, New York, the Department of Imaging Sciences and Medical Informatics, Geneva University Hospital, and the Business Unit NM/CT, Philips Healthcare, for providing the data used in this study. This work is partly funded by the Innovation Foundation Schleswig-Holstein under the grant 2008-40 H.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation><string-name><given-names>T.</given-names> <surname>Heimann</surname></string-name>, <string-name><given-names>B.</given-names> <surname>van Ginneken</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Styner</surname></string-name> et al., <article-title>Comparison and Evaluation of Methods for Liver Segmentation from CT Datasets</article-title>, <source>IEEE Transactions on Medical Imaging</source> <volume>28</volume>(<issue>8</issue>), <year>2009</year></mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation><string-name><given-names>H.</given-names> <surname>Ruppertshofen</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Lorenz</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Schmidt</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Beyerlein</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Salah</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Rose</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Schramm</surname></string-name>, <article-title>Discriminative Generalized Hough Transform for Localization of Joints in the Lower Extremities</article-title>, <source>Computer Science - Research &amp; Development 26</source>, Springer, <year>2011</year></mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation><string-name><given-names>D. H.</given-names> <surname>Ballard</surname></string-name>, <article-title>Generalizing the Hough Transform to Detect Arbitrary Shapes</article-title>, <source>Pattern Recognition</source> <volume>13</volume>(<issue>2</issue>), <year>1981</year></mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation><string-name><given-names>H.</given-names> <surname>Schramm</surname></string-name>, <string-name><given-names>O.</given-names> <surname>Ecabert</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Peters</surname></string-name>, <string-name><given-names>V.</given-names> <surname>Philomin</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Weese</surname></string-name>, <article-title>Towards Fully Automatic Object Detection and Segmentation</article-title>, <source>Proceedings of SPIE Medical Imaging</source>, <year>2006</year></mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation><string-name><given-names>H.</given-names> <surname>Ruppertshofen</surname></string-name>, <string-name><given-names>C.</given-names> <surname>Lorenz</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Schmidt</surname></string-name>, <string-name><given-names>P.</given-names> <surname>Beyerlein</surname></string-name>, <string-name><given-names>Z.</given-names> <surname>Salah</surname></string-name>, <string-name><given-names>G.</given-names> <surname>Rose</surname></string-name>, <string-name><given-names>H.</given-names> <surname>Schramm</surname></string-name>, <article-title>Lokalisierung der Leber mittels einer Diskriminativen Generalisierten Hough Transformation</article-title>, <source>Proceedings of CURAC</source>, <year>2010</year></mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation><string-name><given-names>Z.</given-names> <surname>Hu</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Ojha</surname></string-name>, <string-name><given-names>S.</given-names> <surname>Renisch</surname></string-name> et al., <article-title>MR-based Attenuation Correction for a Whole-body Sequential PET/MR System</article-title>, <source>Proceedings of IEEE Nuclear Science Symposium</source>, <year>2009</year></mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation><string-name><given-names>P.</given-names> <surname>Beyerlein</surname></string-name>, <article-title>Discriminative Model Combination</article-title>, <source>Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing</source>, <year>1998</year></mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>