<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Imcube @ MediaEval 2015 DroneProtect Task: Reversible Masking using Steganography</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sebastian Schmiedeke</string-name>
          <email>schmiedeke@imcube.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pascal Kelm</string-name>
          <email>kelm@imcube.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lutz Goldmann</string-name>
          <email>goldmann@imcube.de</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Imcube Labs GmbH Berlin</institution>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>14</fpage>
      <lpage>15</lpage>
      <abstract>
<p>This paper describes Imcube's participation in the DroneProtect Task of MediaEval 2015, which aims to obscure privacy-sensitive image regions in video sequences captured with drones. As a result, persons and vehicles should be unrecognisable, but the semantic meaning of the scene should remain understandable to the viewer. We use an approach which replaces the privacy-sensitive region with an automatically computed composite of inpainted background and foreground contour. Before obfuscation, the image region to be hidden is extracted and steganographically embedded into the processed frame, leading to a reversible solution. The evaluation shows that the developed solution achieves good privacy protection while preserving the intelligibility and aesthetic pleasantness of the original video.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
<p>Since drones have become affordable, these devices are
increasingly used for security applications. Due to their flexibility,
videos captured by a drone may contain highly sensitive
personal data. Consequently, individuals are increasingly
concerned about the "invasiveness" of such ubiquitous
surveillance and fear that their privacy is at risk. The demands of
stakeholders to prevent criminal activities are often seen to
be in conflict with the privacy requirements of individuals.</p>
      <p>
        The DroneProtect Task of MediaEval 2015 deals with
the problem of privacy protection in dynamic surveillance
videos [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        A common way to protect privacy in images and videos is
to apply techniques such as blurring or masking, as shown
in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Since these techniques are irreversible,
steganography [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] can be used to preserve this information. A
typical steganography algorithm is located in the processing chain
between the quantisation of the DCT coefficients and the Huffman
coding. The information to be hidden is embedded within
the least significant bits of the non-zero AC coefficients of each
DCT block.
      </p>
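      <p>The embedding principle described above can be sketched as follows. This is a minimal illustration only: it assumes a uniform quantisation step of 16 and omits the coefficient permutation and matrix encoding that a full F5 implementation adds.</p>
      <preformat>
```python
import numpy as np
from scipy.fftpack import dct

QUANT = 16  # hypothetical uniform quantisation step


def dct2(block):
    """2-D DCT-II of a pixel block."""
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")


def embed_bits(block, bits):
    """Embed bits into the LSBs of the non-zero quantised AC coefficients."""
    coeffs = np.round(dct2(block) / QUANT).astype(int)
    flat = coeffs.flatten()
    it = iter(bits)
    for i in range(1, flat.size):       # index 0 is the DC coefficient
        if flat[i] != 0:
            try:
                b = next(it)
            except StopIteration:
                break
            # force the parity (least significant bit) to b; note that a
            # coefficient of 1 carrying a 0 becomes 0 ("shrinkage"), which
            # F5 handles explicitly and this sketch does not
            flat[i] = flat[i] - (flat[i] % 2) + b
    return flat.reshape(coeffs.shape)


def extract_bits(coeffs, n):
    """Read back the LSBs of the first n non-zero AC coefficients."""
    flat = coeffs.flatten()
    out = []
    for i in range(1, flat.size):
        if flat[i] != 0 and len(out) != n:
            out.append(flat[i] % 2)
    return out


# a smooth gradient block has several non-zero AC coefficients to write into
block = np.outer(np.linspace(0, 255, 8), np.ones(8))
stego = embed_bits(block, [1, 1, 1])
recovered = extract_bits(stego, 3)   # [1, 1, 1]
```
      </preformat>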
    </sec>
    <sec id="sec-2">
      <title>2. APPROACH</title>
      <p>Our approach combines both masking and steganography
to obtain a visually appealing obfuscated video while retaining
the possibility to recover the original frame. An example is
shown in Figure 1, which illustrates the combination of
inpainting and edge detection for masking with DCT-based
steganography.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Masking</title>
      <p>
        Since the videos captured by the mini-drones [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] are highly
dynamic compared to footage from static surveillance cameras, traditional
background subtraction techniques [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] cannot be applied for
foreground object extraction. Instead, we use edge
detection to extract the outline of the foreground objects
within the region of interest. Each frame of the
sequence is converted to grey scale and then smoothed by
applying a 5 × 5 Gaussian kernel. Edges are detected by
adaptive thresholding: the area around
each pixel is cross-correlated with a Gaussian window of a
sufficiently large kernel width, and pixels exceeding the weighted
sum of the cross-correlation become edge pixels, which are
subsequently used to form the outline. The binary outline is
enhanced by applying morphological operations.
      </p>
      <p>
In order to remove the original foreground object, the
region of interest needs to be filled with plausible
background information. We rely on the inpainting algorithm by
Telea [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] which is able to rapidly reconstruct missing image
parts and works as follows. Starting from the boundaries, the
colour information is propagated to the inside of the region
by smoothly interpolating along pixel intensity lines. For
this purpose, the "image smoothness information" [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which
is estimated by a weighted sum of Laplacians of the known
neighbourhood, is propagated along these intensity lines.
The direction of the intensity lines, also called "isophotes",
is estimated from discretised intensity gradient vectors. Based
on the assumption that isophotes change least
along their direction and most perpendicular
to it, the direction is estimated by finding the largest
gradient orthogonal to the isophote belonging to that
pixel. To improve the temporal stability of the inpainting
result, the inpainted areas are temporally filtered, i.e. for
each pixel the median over its temporal neighbours
is computed.
      </p>
      <p>The obfuscated region of interest is obtained by
blending the extracted foreground contour with the reconstructed
background texture. Depending on the application, the
contour may be emphasized with different colours.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Steganography</title>
      <p>
Since the masking algorithm itself is irreversible, the
original image data contained within the region of interest
must be embedded into the obfuscated image. To this end, we
make use of a steganography library, which is based on the
F5 algorithm [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. This algorithm embeds binary information
into the DCT coefficients of a JPEG image. The least
significant bits of non-zero AC coefficients are replaced by the bits
to be embedded in such a way that the statistical
distribution of the coefficients remains unchanged. Since only non-zero
coefficients can carry steganographic values, and these
coefficients occur less frequently than zero-valued coefficients, only
a limited amount of data can be embedded. Depending on
the number and size of the regions of interest that shall be
hidden, the capacity is often insufficient. Therefore, each region
is treated as a rectangular region which is JPEG compressed
with an adjustable compression parameter. All compressed
regions together with their bounding boxes are then
concatenated, encrypted and embedded into the obfuscated JPEG
encoded image. Since the embedded information may be
destroyed if the image sequence is transcoded, the individual
frames are simply combined into a Motion JPEG video.
      </p>
    </sec>
    <sec id="sec-5">
      <title>3. EXPERIMENTS &amp; RESULTS</title>
      <p>
        The video sequences of the DroneProtect dataset [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] are
obscured by replacing foreground objects with their outlines.
Since individuals can be identified not only by
their face but also by their clothes or accessories, the colour
of objects within each region is replaced by an estimated
background. As the regions of interest are provided with
the dataset, we apply our masking algorithm only to the
provided areas.
      </p>
      <p>The evaluation of the obscured videos was performed using
subjective procedures. Two groups of subjects with different
levels of experience in surveillance applications were asked to watch
the videos and respond to questions concerning the content.
Based on the answers to these questions, three different
metrics (privacy, pleasantness, intelligibility) and the deviation
between them were computed. The average scores of the different
subject groups (experts, novices, overall) for all 38 videos
are summarized in Table 1. In the following, we
analyse the results for the different metrics and subject
groups.</p>
      <p>The privacy metric measures how well the identity of
persons and vehicles is protected through the obfuscation,
or, in other words, how difficult the obfuscation makes the
identification of a person or vehicle by hiding relevant
visual information. The proposed method achieves a medium
overall score (0.50), which suggests that even though only the
contour of the object of interest is preserved, in some cases
it can still be identified. A deeper analysis of these cases is
needed to identify potential improvements.</p>
      <p>Intelligibility stands for the ability to classify objects
and actions within a video sequence and evaluates how well
the activities within a scene are preserved even if the object
of interest is obfuscated to prevent its identification. The
proposed method achieves a good overall score (0.61), which
shows that contour information alone is enough to
understand most of the semantics of a scene from a surveillance
perspective. A deeper analysis of the individual videos is
needed to understand what additional information is required
to improve the intelligibility further.</p>
      <p>Pleasantness evaluates the influence of the obfuscation
method on the visual quality of the video, i.e. by how much the
quality of the video is degraded by distortions and artefacts
within the region of interest. The subjective score is based
on the level of user acceptance. Here, the proposed method
achieves a good overall score (0.67), since the black
foreground contours composed over an inpainted background
blend well with the original content outside the region of
interest. This score may be further improved by using a
more sophisticated inpainting algorithm which reconstructs
a more plausible background texture.</p>
      <p>Since the three metrics mentioned above evaluate rather
contrary requirements, the deviation measures the difference
between these metrics by computing their standard deviation.
As can be expected from the similar scores for the different
metrics, the proposed approach has a very good deviation
score (0.09). This shows that it strikes a good balance
between the different criteria (privacy, intelligibility and
pleasantness).</p>
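      <p>Under the assumption that the deviation score is the sample standard deviation of the three overall metric scores (the exact definition used by the benchmark may differ), the reported value can be reproduced:</p>
      <preformat>
```python
import statistics

# overall scores for the three metrics, as discussed above
scores = {"privacy": 0.50, "intelligibility": 0.61, "pleasantness": 0.67}
deviation = statistics.stdev(scores.values())  # sample standard deviation
print(round(deviation, 2))  # 0.09
```
      </preformat>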
      <p>Comparing the results between the different subject groups
shows that the scores of the novices are more uniform across
the different metrics, while the experts rate the
intelligibility and pleasantness higher and the privacy lower. This
follows the intuition that experts are able to recognize
actions and identities better than novices. The higher
pleasantness score suggests that experts value the content of a
video more than its quality.</p>
    </sec>
    <sec id="sec-6">
      <title>4. CONCLUSION</title>
      <p>A reversible approach for privacy protection based on
masking and steganographic embedding of the region of
interest has been proposed and evaluated on the
DroneProtect dataset. The results show that the approach strikes
a good balance between privacy, intelligibility and
pleasantness. For developing potential improvements, a more detailed
analysis of the scores for the individual videos is needed.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Badii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Korshunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Oudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ebrahimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Piatrik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Eiselein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ruchaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fedorczak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Dugelay</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. F.</given-names>
            <surname>Vazquez</surname>
          </string-name>
          .
          <article-title>Overview of the MediaEval 2015 Drone Protect Task</article-title>
          . In MediaEval 2015 Workshop, Wurzen, Germany, September 14-15
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bertalmio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sapiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Caselles</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Ballester</surname>
          </string-name>
          .
          <article-title>Image inpainting</article-title>
          .
          <source>In Annual Conference on Computer Graphics</source>
          , pages
          <fpage>417</fpage>–<lpage>424</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bonetto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Korshunov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Ramponi</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Ebrahimi</surname>
          </string-name>
          .
          <article-title>Privacy in mini-drone based video surveillance</article-title>
          . In <source>Workshop on De-identification for privacy protection in multimedia</source>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Morkel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H. P.</given-names>
            <surname>Eloff</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. S.</given-names>
            <surname>Olivier</surname>
          </string-name>
          .
          <article-title>An Overview of Image Steganography</article-title>
          . In H. S. Venter,
          <string-name>
            <given-names>J. H. P.</given-names>
            <surname>Eloff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Labuschagne</surname>
          </string-name>
          , and M. M. Eloff, editors,
          <source>Proceedings of the Fifth Annual Information Security South Africa Conference (ISSA2005)</source>
          , Sandton, South Africa,
          <year>2005</year>.
          Published electronically
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Schmiedeke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kelm</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Goldmann</surname>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Sikora</surname>
          </string-name>
          . TUB @
          <article-title>MediaEval 2014 Visual Privacy Task: Reversible Scrambling on Foreground Masks</article-title>
          .
          <source>In Proceedings of the MediaEval 2014 Multimedia Benchmark Workshop</source>
          , pages
          <fpage>73</fpage>–<lpage>74</lpage>
          .
          CEUR-WS
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Telea</surname>
          </string-name>
          .
          <article-title>An image inpainting technique based on the fast marching method</article-title>
          .
          <source>Journal of graphics tools</source>
          ,
          <volume>9</volume>
          (
          <issue>1</issue>
          ):
          <fpage>23</fpage>–<lpage>34</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Westfeld</surname>
          </string-name>
          <article-title>F5 - a steganographic algorithm</article-title>
          . In I. S. Moskowitz, editor,
          <source>Information Hiding</source>
          , volume
          <volume>2137</volume>
          of Lecture Notes in Computer Science, pages
          <fpage>289</fpage>–<lpage>302</lpage>
          . Springer Berlin Heidelberg,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>