<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>DAI at the MediaEval 2013 Visual Privacy Task: Representing People with Foreground Edges</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dominique Maniry</string-name>
          <email>dmaniry@cs.tu-berlin.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Esra Acar</string-name>
          <email>esra.acar@tu-berlin.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sahin Albayrak</string-name>
          <email>sahin.albayrak@tu-berlin.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>DAI Laboratory, Technische Universität Berlin</institution>
          <addr-line>Ernst-Reuter-Platz 7, TEL 14, 10587 Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2013</year>
      </pub-date>
      <fpage>18</fpage>
      <lpage>19</lpage>
      <abstract>
        <p>In this paper, we present a method for removing identity-related information from image sequences for the privacy protection of individuals. The face, despite being an important feature to identify a person, is not the only body part that needs to be obscured. Therefore, we propose to replace the whole body of individuals by their silhouette defined by moving edges.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        The MediaEval 2013 Visual Privacy Task [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] addresses the
problem of privacy protection in video surveillance, which is
gaining importance due to concerns raised
about the privacy of monitored individuals. A detailed
description of the task, the dataset and the evaluation
methodologies is given in the paper by Badii et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. As part
of the MediaEval 2013 Visual Privacy Task, our privacy
filter is evaluated using the Privacy Evaluation Video Dataset
(PEViD) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        In order to prevent the misuse of video surveillance
systems, visual privacy filters are being developed to remove
identity-related information from a video stream. A human
operator or an automatic analysis system needs to be able to
track persons and their actions in order to detect anomalies.
Any other information such as identity, skin color, ethnicity
and gender can be misused (e.g., for abuse or discrimination) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
In this context, our privacy filter aims not only at obscuring
facial identity, but also at protecting other identity-revealing
features such as accessories and clothing. The goal of our
approach is to prevent possible abuse and discrimination by
overlaying a background image with silhouettes.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. PROPOSED METHOD</title>
      <p>
        The proposed privacy filter is an adaptation of the
foreground privacy filter with stored background proposed by
O'Gorman [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. The approach presented in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] is based on
two observations [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]: 1) motion edge detection is more
robust to lighting changes than intensity-based segmentation
methods, and 2) video privatization can often be
accomplished by obscuring edge regions only. Instead of using a
stored background, we initialize the background using the
first frame and update it using every new frame. Using the
annotation of the dataset, only pixels that are not labeled as
person are updated. With this scheme, we can display scene
changes (e.g., moving cars) that do not correspond to
individuals and are therefore not subject to privacy protection.
The drawback of this approach is that the first frame of a
video might already contain an individual. Instead of
further filtering as in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], we directly use the foreground edges
as silhouettes. The rationale behind this is to achieve better
intelligibility. Every pixel that is considered a foreground
edge and lies within a person's bounding box is set to a
particular color, namely green. The whole-body annotation of
individuals provided in the dataset helps to restrict
interfering false-positive edges to those around individuals.
      </p>
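      <p>The steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the implementation evaluated in this paper: the function names, the (top, left, bottom, right) box format, the exponential-moving-average form of the background update, and the use of the Sobel response of the frame/background difference as the foreground edge detector are all assumptions made for the example.</p>
      <preformat>
```python
import numpy as np

GREEN = np.array([0, 255, 0], dtype=np.uint8)  # RGB color used for silhouettes


def sobel_magnitude(gray):
    """Sum of absolute horizontal and vertical 3x3 Sobel responses."""
    g = np.pad(gray.astype(np.float64), 1, mode="edge")
    # Horizontal gradient: right column minus left column, weights 1-2-1.
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]
          - g[:-2, :-2] - 2 * g[1:-1, :-2] - g[2:, :-2])
    # Vertical gradient: bottom row minus top row, weights 1-2-1.
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]
          - g[:-2, :-2] - 2 * g[:-2, 1:-1] - g[:-2, 2:])
    return np.abs(gx) + np.abs(gy)


def update_background(background, gray, person_mask, alpha=0.5):
    """Blend the new frame into the background, skipping person pixels.

    The moving-average blend is an assumption; the text only states that
    pixels labeled as person are not updated.
    """
    blended = (1.0 - alpha) * background + alpha * gray
    return np.where(person_mask, background, blended)


def filter_frame(background, gray, boxes, threshold=60.0):
    """Return the background image with green foreground edges inside boxes."""
    person_mask = np.zeros(gray.shape, dtype=bool)
    for top, left, bottom, right in boxes:
        person_mask[top:bottom, left:right] = True
    # Assumed foreground edge test: strong Sobel response of the difference.
    fg_edges = sobel_magnitude(gray - background) > threshold
    out = np.repeat(background[..., None], 3, axis=2)
    out = out.clip(0, 255).astype(np.uint8)
    out[np.logical_and(fg_edges, person_mask)] = GREEN
    return out, person_mask


# Toy usage: a bright 4x4 "person" appears on a flat gray scene.
first = np.full((12, 12), 50.0)
frame = first.copy()
frame[4:8, 4:8] = 200.0   # the person
frame[0, 0] = 150.0       # a scene change outside any person box
background = first.copy() # background initialized from the first frame
out, mask = filter_frame(background, frame, boxes=[(2, 2, 10, 10)])
background = update_background(background, frame, mask)
```
      </preformat>
      <p>In this toy scene, the outline of the bright block inside the box is drawn in green on top of the background, while the change at pixel (0, 0), which lies outside every box, is never colored and is instead blended into the background by the update.</p>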
    </sec>
    <sec id="sec-3">
      <title>3. EVALUATION RESULTS</title>
      <p>
        The paper by O'Gorman [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] uses various parameters for
the foreground detection, in particular a threshold on the Sobel
horizontal and vertical gradient results (T<sub>1</sub>) and the
exponential moving average constant, which
controls the change of a pixel's classification from foreground to
background. We followed this paper and
set them to 60 and 0.5, respectively (for details, the reader is
referred to [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]). The privacy filter has been evaluated using
objective and subjective measures [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The objective and
subjective evaluation results and their comparison to the
average score of all 9 teams participating in the MediaEval
2013 Visual Privacy Task are given in Table 1 and Table 2,
respectively.
      </p>
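      <p>To give a feeling for these two values, the following sketch assumes that the background is updated as an exponential moving average and that foreground edges are found by thresholding the Sobel response of the frame/background difference. Under these assumptions, which are made for illustration and may differ from the exact formulation in [<xref ref-type="bibr" rid="ref3">3</xref>], the response of a change that stays static shrinks by the factor (1 - 0.5) with every update. The function names and the initial response of 450 are hypothetical.</p>
      <preformat>
```python
def residual_after(n_updates, initial_response, alpha=0.5):
    """Edge response left after n moving-average updates of a static change."""
    return initial_response * (1.0 - alpha) ** n_updates


def updates_until_background(initial_response, alpha=0.5, threshold=60.0):
    """Number of updates until the response no longer exceeds the threshold."""
    if not (alpha > 0.0 and 1.0 >= alpha):
        raise ValueError("alpha must lie in (0, 1]")
    n = 0
    while residual_after(n, initial_response, alpha) > threshold:
        n += 1
    return n


# Hypothetical initial Sobel response of 450 for a newly appeared static edge.
fast = updates_until_background(450.0, alpha=0.5)  # constant used in this paper
slow = updates_until_background(450.0, alpha=0.1)  # a smaller constant, for contrast
```
      </preformat>
      <p>With a constant of 0.5 the hypothetical edge falls below the threshold after a few updates, whereas a smaller constant keeps it classified as foreground for considerably longer.</p>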
      <p>The objective intelligibility score (in Table 1) is far below
average. This is an expected result, as the objective
intelligibility score is measured using an automatic human detector,
which classifies our silhouette representation as non-human.
A privacy filter could provide tracking information on a
side channel to compensate for this problem.</p>
      <p>Our objective privacy score is above average for the same
reason. The score is based on a face detection algorithm
which detects natural faces. Due to the silhouette
representation of individuals, the face detection algorithm is
expected to find no faces in filtered image sequences.</p>
      <p>In the subjective evaluation (Table 2), the intelligibility
score is above average. This shows that users were able to
track individuals and their actions by seeing only the
silhouette. The below-average privacy score in the user study
might suggest that the silhouettes still contain information
related to the identity of individuals. In some cases,
accessories, clothing and/or hair style can still be recognized by
the users. Although our method has a better appropriateness
score in the subjective evaluation, the appropriateness scores
are still below average in both objective and subjective
evaluations. This is likely due to the visual artifacts produced by
the imperfect foreground edge segmentation. Edges that
belong to the background are colored green when they are
close to the bounding box of a person (see Figure 2). This
reduces the appropriateness score of our method.</p>
      <p>The objective and subjective evaluations for our method
and the average results of all 9 teams participating in the
MediaEval 2013 Visual Privacy Task are summarized in
Figure 3 and Figure 4, respectively.</p>
      <p>[Figure: scores of our method and the average of all teams for intelligibility, privacy and appropriateness.]</p>
    </sec>
    <sec id="sec-4">
      <title>4. CONCLUSIONS AND FUTURE WORK</title>
      <p>In this paper, we propose a privacy filter that replaces the
whole body by a silhouette. The user study shows that this
filter is able to provide privacy while maintaining
intelligibility. Future work should improve the foreground
segmentation and thus reduce the artifacts that its
imperfections produce.</p>
    </sec>
    <sec id="sec-5">
      <title>5. ACKNOWLEDGMENTS</title>
      <p>The research leading to these results has received funding
from the European Community's FP7 under grant
agreement number 261743 (NoE VideoSense).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Badii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Einig</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Piatrik</surname>
          </string-name>
          .
          <article-title>Overview of the MediaEval 2013 Visual Privacy Task</article-title>
          .
          <source>In MediaEval 2013 Workshop</source>
          , Barcelona, Spain, October 18-19,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Korshunov</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Ebrahimi</surname>
          </string-name>
          .
          <article-title>PEViD: privacy evaluation video dataset</article-title>
          .
          <source>In Applications of Digital Image Processing XXXVI</source>
          , San Diego, CA, August 25-29,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>O'Gorman</surname>
          </string-name>
          .
          <article-title>Video privacy filters with tolerance to segmentation errors for video conferencing and surveillance</article-title>
          .
          <source>In Pattern Recognition (ICPR)</source>
          ,
          <source>Int. Conf. on</source>
          , pages
          <fpage>1835</fpage>
          -
          <lpage>1838</lpage>
          . IEEE,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Piatrik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Izquierdo</surname>
          </string-name>
          .
          <article-title>The privacy challenges of in-depth video analytics</article-title>
          .
          <source>In Multimedia Signal Processing (MMSP)</source>
          ,
          <source>2012 IEEE 14th International Workshop on</source>
          , pages
          <fpage>383</fpage>
          -
          <lpage>386</lpage>
          . IEEE,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>