<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Multi-Level Cartooning for Context-Aware Privacy Protection in Visual Sensor Networks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ádám Erdélyi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thomas Winkler</string-name>
          <email>thomas.winkler@aau.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bernhard Rinner</string-name>
          <email>bernhard.rinner@aau.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Networked and Embedded Systems, Alpen-Adria-Universität Klagenfurt and Lakeside Labs, Lakeside Park B02b</institution>
          ,
          <addr-line>9020 Klagenfurt</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2014</year>
      </pub-date>
      <fpage>16</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>Our solution to the MediaEval 2014 Visual Privacy Task [4] is a privacy-preserving video filter that maintains a high intelligibility level in surveillance systems while providing a reasonable privacy protection level for monitored people and a pleasant view for observers. This paper describes our context-aware method, which is based on cartooning and pixelation effects. Subjective evaluation results are also presented to demonstrate the performance of our algorithm.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Surveillance cameras play various roles in our everyday
lives, and their increasing number has attracted attention to
privacy issues. The goal of the MediaEval 2014 Visual
Privacy Task [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] is to find a method that protects privacy while
the original purpose of surveillance is maintained.
Together with annotations of sensitive regions such as faces,
people and carried items, the PEViD data-set [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is provided
to evaluate solutions submitted by task participants.
Desired privacy levels ([H]igh, [M]edium, or [L]ow) are also
included in the annotations for each region so that various
filters can be combined and adjusted accordingly.
      </p>
      <p>
        Traditional CCTV cameras are continually being replaced
by more modern smart cameras, which are usually part of
Visual Sensor Networks (VSNs). Other widespread
video-capable devices such as smart phones, tablets or web-cams
also pose privacy threats due to their frequent use in
public spaces. The processing capabilities of these devices allow the
integration of privacy protection methods directly into the
camera. Our aim is to create such an integrated filter. In
order to simulate the limited computational power of the
above-mentioned embedded devices, we ran our privacy-preserving
algorithm on the Jetson TK1 [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] development board to
process the provided videos.
      </p>
      <p>Our method is based on a cartooning effect which is
applied both globally and locally. In sensitive regions the filter
intensity is adjusted according to the annotation. Faces are
further protected with an extra pixelation effect.</p>
    </sec>
    <sec id="sec-2">
      <title>2. IMPLEMENTATION</title>
      <p>
        A prototype of our filter is implemented in C++ by using
OpenCV [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] for video processing and pugixml [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to parse
the annotation files. Figure 1 depicts the processing pipeline
of the proposed algorithm. A detailed description of our
privacy protection filter is provided in Sections 2.1 to 2.3.
The submitted videos have been generated on the Jetson
TK1 platform in the following software environment: Linux
for Tegra R19 (Kernel version 3.10.24) and OpenCV 2.4.9
with GPU support via CUDA 6.0.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Global Cartooning</title>
      <p>
        First, a medium-intensity cartooning effect is applied to
the whole video frame. This always ensures a default level of
privacy, thereby preparing the filter for real-world use where
privacy loss may occur at sensitive regions due to
inaccurate feature extractors. Additionally, implicit privacy
channels [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] are also protected. The cartooning effect
(represented by the box labelled "Cartooning" in Figure 1) is the
result of the following main steps:
1. Preliminary blurring with a k×k size kernel is applied
in order to reduce noise. Edges are detected by the
Sobel edge detector for later use.
2. Then the blurred video frame goes through a Mean
Shift [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] filter with a spatial window radius of sp and
a colour window radius of sr. This makes the frame
smoother and replaces fine details with solid colour
patches, as if it were drawn like a cartoon.
3. Finally, edges are recovered along object contours by
performing a bitwise weighted copy from the original
input frame. This makes the final output less blurry
and more similar to hand-drawn cartoons, where object
contours are usually emphasized.
      </p>
      <p>The parameters used in the steps above depend on the
desired privacy levels taken from the annotation files. k=17,
sp=30, sr=60 are used for high-level; k=9, sp=20, sr=40 for
medium-level; and k=3, sp=10, sr=20 for low-level privacy.
For global cartooning we used the medium level.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Local Cartooning</title>
      <p>
        After global cartooning, the protection levels of sensitive
regions are adjusted locally according to Table 1 in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. More
sensitive image regions such as faces are further protected
with high-intensity cartooning, while less sensitive ones are
downgraded to a lower privacy level in order to increase
intelligibility. The same cartooning effect that was
described in Section 2.1 is used locally, except that the parameters
are changed according to the annotations.
      </p>
    </sec>
    <sec id="sec-5">
      <title>2.3 Pixelation</title>
      <p>In the final step of our processing pipeline an extra
pixelation effect is applied to faces in order to further obscure
the identity of people. The region of pixelation is the
maximum inscribed ellipse of the face's bounding box, and the
pixel size is one-fifteenth of its larger dimension.</p>
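      <p>A minimal NumPy sketch of this pixelation step (the function name and mask construction are ours; the prototype is implemented in C++ with OpenCV):</p>

```python
import numpy as np

def pixelate_face(frame, x, y, w, h):
    """Pixelate the maximum inscribed ellipse of a face bounding box.

    The block ("pixel") size is one-fifteenth of the box's larger dimension.
    """
    block = max(1, max(w, h) // 15)
    roi = frame[y:y + h, x:x + w].copy()
    # Block-average the ROI to obtain the pixelation effect.
    for by in range(0, h, block):
        for bx in range(0, w, block):
            patch = roi[by:by + block, bx:bx + block]
            patch[...] = patch.mean(axis=(0, 1)).astype(frame.dtype)
    # Keep the pixelation only inside the inscribed ellipse of the box.
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r2 = ((xx - cx) / (w / 2.0)) ** 2 + ((yy - cy) / (h / 2.0)) ** 2
    inside = ~(r2 > 1.0)
    out = frame.copy()
    region = out[y:y + h, x:x + w]
    region[inside] = roi[inside]
    return out
```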
      <p>Figure 2: (a) Original. (b) Filtered. (c) Original. (d) Filtered.</p>
    </sec>
    <sec id="sec-6">
      <title>3. EVALUATION RESULTS</title>
      <p>Two pairs of video frames (original and filtered) in
Figure 2 demonstrate the visual effect of our privacy filter.</p>
      <p>
        In terms of processing speed, the Jetson TK1 board is
capable of 5 fps for 320×180, 2 fps for 640×360, 1 fps
for 800×450, 0.8 fps for 1024×576, and 0.2 fps for the
provided full-HD resolution videos. This shows that privacy
protection can indeed be performed directly inside the camera.
Despite running the Mean Shift filter on the GPU instead of
the CPU, it remains the bottleneck of our algorithm. Thus,
the current version of our prototype cannot filter full-HD
videos in real time, although acceptable frame-rates can
be achieved at lower resolutions. A more detailed discussion
of achievable frame-rates on embedded devices can be
found in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], where we show a scenario-adaptive version of the
cartooning filter. An alternative implementation of
cartooning is presented in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], proving that acceptable frame-rates
are possible on even more resource-constrained devices.
      </p>
      <p>Figure 3 shows the subjective evaluation results provided
by the Visual Privacy Task organizers. These numbers are
calculated from the outcome of a 12-question survey that
was conducted in three different groups. The first
group consists of 230 regular people whose questionnaires
were filled out in the frame of a crowd-sourcing campaign. The
second group consists of 65 participants from Thales,
France. And the third is a focus group with 59
participants from all over the world. The questions of the survey are
organized around the following three criteria: intelligibility,
privacy, and pleasantness.</p>
      <p>After analysing Figure 3, it is clear that the performance
of our method is always better than the median performance
among the 8 participants in terms of intelligibility and
pleasantness. We also achieved competitive results for privacy,
although we slightly underperform the median.</p>
    </sec>
    <sec id="sec-7">
      <title>4. CONCLUSION AND FUTURE WORK</title>
      <p>By introducing a global component to our privacy
protection filter we cover implicit privacy channels and ensure a
default level of privacy even if inaccurate real-world feature
detectors are being used. Our method provides a pleasant
view and high intelligibility while reasonably protecting
privacy. It works reasonably well on the Jetson TK1 board for
lower resolution videos, although further improvements are
necessary to reach acceptable frame-rates for full-HD videos.</p>
    </sec>
    <sec id="sec-8">
      <title>5. ACKNOWLEDGEMENT</title>
      <p>This work was partly funded by the European Regional
Development Fund and the Carinthian Economic Promotion
Fund (KWF) under grant KWF-3520/23312/35521.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] NVIDIA Jetson TK1 Development Kit. https://developer.nvidia.com/jetson-tk1 (last visited: Sept. <year>2014</year>).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2] OpenCV: Open Source Computer Vision. http://opencv.org (last visited: Sept. <year>2014</year>).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3] pugixml: Light-weight, simple and fast XML parser for C++ with XPath support. http://pugixml.org (last visited: Sept. <year>2014</year>).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4] A. Badii, T. Ebrahimi, C. Fedorczak, P. Korshunov, T. Piatrik, V. Eiselein, and A. Al-Obaidi. <article-title>Overview of the MediaEval 2014 Visual Privacy Task</article-title>. <source>In Proceedings of the MediaEval Workshop</source>, Barcelona, Spain, <year>2014</year>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5] Yizong Cheng. <article-title>Mean Shift, Mode Seeking, and Clustering</article-title>. <source>IEEE Transactions on Pattern Analysis and Machine Intelligence</source>, <volume>17</volume>(<issue>8</issue>):790-799, <year>1995</year>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] Adam Erdelyi, Tibor Barat, Patrick Valet, Thomas Winkler, and Bernhard Rinner. <article-title>Adaptive Cartooning for Privacy Protection in Camera Networks</article-title>. <source>In Proceedings of the Int. Conf. on Advanced Video and Signal Based Surveillance</source>, <year>2014</year>.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] P. Korshunov and T. Ebrahimi. <article-title>PEViD: Privacy Evaluation Video Dataset</article-title>. <source>In Proceedings of SPIE Applications of Digital Image Processing XXXVI</source>, volume <volume>8856</volume>, <year>2013</year>.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8] M. Saini, P. K. Atrey, S. Mehrotra, and M. S. Kankanhalli. <article-title>Considering Implicit Channels in Privacy Analysis of Video Data</article-title>. <source>IEEE Communications Society E-Letters</source>, <volume>6</volume>(<issue>11</issue>):27-30, <year>2011</year>.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9] Thomas Winkler, Adam Erdelyi, and Bernhard Rinner. <article-title>TrustEYE.M4: Protecting the Sensor, not the Camera</article-title>. <source>In Proceedings of the Int. Conf. on Advanced Video and Signal Based Surveillance</source>, <year>2014</year>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>