<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Adversarial Photo Frame: Concealing Sensitive Scene Information of Social Images in a User-Acceptable Manner</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Zhuoran Liu</string-name>
          <email>z.liu@cs.ru.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zhengyu Zhao</string-name>
          <email>z.zhao@cs.ru.nl</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Radboud University</institution>
          ,
          <country country="NL">Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>27</fpage>
      <lpage>29</lpage>
      <abstract>
        <p>Personal privacy protection has become increasingly crucial in the era of big multimedia data and artificial intelligence. This paper presents our submission to the Pixel Privacy task, in which we fool a deep visual classification model that recognizes sensitive scenes by adding an adversarial frame to the image. Experimental results indicate that our method can achieve strong adversarial effects while maintaining the visual appeal and social function of the transformed images.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Scene recognition is a hallmark topic in computer vision, and it
provides global semantic information that facilitates different tasks.
Leveraging large-scale scene datasets, deep learning-based
algorithms have made great progress in scene recognition [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. At the same time, these algorithms have raised concerns about
social multimedia privacy [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Previous privacy
protection approaches have mainly relied on adversarial
machine learning, image style transfer [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and image
enhancement [
        <xref ref-type="bibr" rid="ref1 ref3">1, 3</xref>
        ]. Conventional adversarial algorithms [
        <xref ref-type="bibr" rid="ref2 ref5">2, 5</xref>
        ] are effective
at inducing misclassification but are not designed to
increase image appeal. In most cases, the resulting distortions even
degrade image quality. Conversely, image enhancement and
style transfer methods that can increase the visual appeal of images
do not consider the adversarial function during the transformation
process.
      </p>
      <p>
        In the 2019 Pixel Privacy Task [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], we propose a baseline
approach called adversarial photo frame (APFrame). The approach
strives for a balance between privacy protection and the visual appeal
of images. APFrame protects privacy against a
network-based scene classifier using adversarial machine learning
techniques, while restricting the transformations to the edge of the
image, i.e., the photo frame, in order to maintain the visual appeal.
Experiments are conducted with different photo frame settings
of APFrame. The algorithm details and experimental results are
discussed in Section 2 and Section 3.
      </p>
    </sec>
    <sec id="sec-2">
      <title>ADVERSARIAL PHOTO FRAME</title>
      <p>The workflow of APFrame is shown in Figure 1. Given
an image, a photo frame is generated randomly. We start from a
white additive frame with all components equal to 1. The image with this frame
is fed into the classifier, and the gradients of the cross-entropy loss
with respect to the original label are calculated by backpropagation
and used to update the frame so as to increase the model loss.
This process is repeated until the classifier
outputs a different prediction.</p>
      <p>[Figure 1, labels recovered from the diagram: deep learning-based scene classifier; use backpropagation to update the adversarial frame.]</p>
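      <p>The iterative procedure described above can be illustrated with a minimal sketch. It is not the submitted implementation: the two-class linear classifier, the 4x4 grayscale image, the plain gradient step (instead of an adaptive optimizer) and the names <monospace>attack_frame</monospace>, <monospace>weights</monospace> and <monospace>biases</monospace> are all assumptions made for illustration. The sketch also assumes that frame values overwrite the border pixels, which corresponds to an additive frame that is zero outside the border.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_of(image, weights, biases):
    return [sum(w * p for w, p in zip(row, image)) + b
            for row, b in zip(weights, biases)]


def attack_frame(image, mask, weights, biases, label, step=1.0, max_iter=100):
    """Raise the cross-entropy loss of `label` by changing frame pixels only.

    Returns (adversarial_image, iterations_used, final_prediction).
    """
    # Start from a white frame: every frame pixel is set to 1.
    adv = [1.0 if m else p for p, m in zip(image, mask)]
    for it in range(max_iter + 1):
        logits = logits_of(adv, weights, biases)
        pred = max(range(len(logits)), key=logits.__getitem__)
        if pred != label or it == max_iter:
            return adv, it, pred
        probs = softmax(logits)
        # d(loss)/d(pixel_i) for a linear model: sum_k (p_k - y_k) * W[k][i].
        for i, m in enumerate(mask):
            if not m:
                continue  # pixels outside the frame are never modified
            grad = sum((probs[k] - (1.0 if k == label else 0.0)) * weights[k][i]
                       for k in range(len(weights)))
            adv[i] = min(1.0, max(0.0, adv[i] + step * grad))  # ascend the loss


# Toy 4x4 grayscale image, flattened row by row, with a one-pixel frame.
side = 4
mask = [1 if (y in (0, side - 1) or x in (0, side - 1)) else 0
        for y in range(side) for x in range(side)]
image = [1.0] * (side * side)
# Hypothetical linear two-class "classifier": class 0 responds to brightness,
# class 1 is a constant bias.
weights = [[0.25 if m else 1.0 for m in mask], [0.0] * (side * side)]
biases = [0.0, 5.0]

adv, iterations, prediction = attack_frame(image, mask, weights, biases, label=0)
```
]]></preformat>
      <p>In this toy setting the loop stops as soon as the predicted class is no longer the original label, and the pixels outside the frame are left untouched.</p>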
      <p>Specifically, we consider two alternative constraints for
generating adversarial photo frames, namely the full-flexibility frame
and the weight-constrained frame. In the full-flexibility frame, all
values are admissible for pixels in the frame, and the resulting frame
gives the impression of random couples (e.g., top row in Figure 2).
In the weight-constrained frame, all the components in each of the
three RGB color channels are required to change uniformly, i.e.,
only three parameters are learned to increase the classification
loss.</p>
      <p>This setting forces the adversarial frame to have a single
color for a more natural look. The width of the frame can be
predefined to balance the protection effect and visual appeal. A narrow
frame has less influence on the image content, but normally
yields a weaker protection effect.</p>
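      <p>One way to realize the weight-constrained setting, given here only as an assumption and not as the submitted code, is to collapse the per-pixel gradient into a single value per color channel. The gradient with respect to a shared channel value is the sum of the per-pixel gradients over the frame; the sketch uses the mean, which differs only by a constant factor. The function name <monospace>channel_gradient</monospace> and the nested-list layout are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def channel_gradient(grad, mask):
    """Reduce a per-pixel gradient to one value per RGB channel.

    grad: nested list grad[c][y][x] with 3 channels (hypothetical layout).
    mask: nested list mask[y][x], 1 on frame pixels and 0 elsewhere.
    Returns three floats: the mean gradient over frame pixels per channel.
    """
    n_frame = sum(sum(row) for row in mask)
    if n_frame == 0:
        raise ValueError("mask contains no frame pixels")
    reduced = []
    for channel in grad:
        total = sum(
            g * m
            for g_row, m_row in zip(channel, mask)
            for g, m in zip(g_row, m_row)
        )
        reduced.append(total / n_frame)
    return reduced


# 3x3 image with a one-pixel frame: only the centre pixel is excluded.
mask = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
grad = [
    [[1.0, 1.0, 1.0], [1.0, 9.0, 1.0], [1.0, 1.0, 1.0]],  # R
    [[2.0] * 3, [2.0, 0.0, 2.0], [2.0] * 3],              # G
    [[-1.0] * 3 for _ in range(3)],                       # B
]
per_channel = channel_gradient(grad, mask)
```
]]></preformat>
      <p>Applying the same update to every frame pixel of a channel keeps the frame in a single color, at the cost of far fewer degrees of freedom.</p>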
      <p>The algorithm is summarized in Equation 1.</p>
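      <p>
        <disp-formula id="eq-1">
          <label>(1)</label>
          <tex-math><![CDATA[\underset{\delta}{\text{minimize}} \; -L\big(f(x+\delta),\, c_0\big) \quad \text{s.t.} \quad c_0 \neq \underset{c}{\arg\max}\, f(x+\delta),]]></tex-math>
        </disp-formula>
      </p>
      <p>where x is the input image, which is correctly classified as class
c<sub>0</sub>, and δ is the frame, which is nonzero only at the edge of the
image within a pre-defined width.</p>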
      <p>Here f represents the classifier, and L represents the cross-entropy
loss function. The objective function is minimized until the
predicted label of the classifier differs from the ground-truth label.
A narrow frame with the weight-constrained setting
is less effective because the search space of possible
transformations is limited.</p>
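      <p>To make the size of the search space concrete, the following sketch builds a border mask and counts the free parameters under each setting. The 224x224 input size and the helper names <monospace>frame_mask</monospace> and <monospace>free_parameters</monospace> are assumptions for illustration; the source does not state the image resolution.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def frame_mask(height, width, frame_width):
    """Return mask[y][x] = 1 on the border band of the given width, else 0."""
    if frame_width < 0 or 2 * frame_width > min(height, width):
        raise ValueError("frame_width must fit inside the image")
    return [
        [
            1 if (y < frame_width or y >= height - frame_width
                  or x < frame_width or x >= width - frame_width) else 0
            for x in range(width)
        ]
        for y in range(height)
    ]


def free_parameters(height, width, frame_width, constrained):
    """Number of values the attack may tune for an RGB image."""
    if constrained:
        return 3  # one shared value per colour channel
    frame_pixels = sum(map(sum, frame_mask(height, width, frame_width)))
    return 3 * frame_pixels


# Hypothetical 224x224 input with a 5-pixel frame.
full = free_parameters(224, 224, 5, constrained=False)
shared = free_parameters(224, 224, 5, constrained=True)
```
]]></preformat>
      <p>Under these assumptions, even the narrowest full-flexibility frame has thousands of tunable values, whereas the weight-constrained frame always has three.</p>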
      <p>Figure 2 shows some image examples achieved by APFrame.</p>
    </sec>
    <sec id="sec-3">
      <title>EXPERIMENTS AND EVALUATION</title>
      <p>
        We submitted five runs of APFrame to the Pixel Privacy task for official evaluation.
Three of the runs use the full-flexibility
constraint with frame widths of 5, 10 and 15, denoted as
FfW5, FfW10 and FfW15. The other two runs use the
weight-constrained setting with frame widths of 15 and 20, denoted
as WcW15 and WcW20.
      </p>
      <p>
        We use the Adam optimizer [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] with a learning rate of 1 to perform
the gradient descent in Equation 1 on a Tesla P100 GPU.
      </p>
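      <p>For readers unfamiliar with Adam, a single scalar update can be sketched as follows. The moment coefficients and epsilon are the commonly used defaults and are an assumption here, since only the learning rate of 1 is stated; the function name <monospace>adam_step</monospace> is hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import math


def adam_step(param, grad, state, lr=1.0, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    state is a dict holding the step count `t` and moment estimates `m`, `v`.
    """
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1.0 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1.0 - beta2) * grad * grad
    # Bias-corrected moment estimates.
    m_hat = state["m"] / (1.0 - beta1 ** state["t"])
    v_hat = state["v"] / (1.0 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


state = {"t": 0, "m": 0.0, "v": 0.0}
# Even a small gradient yields a first step of roughly lr in magnitude.
new_value = adam_step(0.5, grad=0.001, state=state)
```
]]></preformat>
      <p>Because of the bias correction, the first Adam step has a magnitude close to the learning rate whenever the gradient is much larger than epsilon, so a learning rate of 1 moves a parameter by roughly 1 in its first update.</p>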
    </sec>
    <sec id="sec-4">
      <title>DISCUSSION AND OUTLOOK</title>
      <p>We proposed APFrame, which achieves adversarial effects against
deep scene recognition networks for privacy protection while
maintaining image appeal. Compared with other adversarial machine
learning-based techniques, APFrame maintains the visual appeal
of images by avoiding modifications in the central part of images,
which is arguably the part of the image most important to human
viewers. In short, the social function and visual appeal of the images
can be maintained to a large degree.</p>
      <p>However, APFrame still has some disadvantages. For instance, it is not robust to
preprocessing methods such as center (or random) cropping and resizing.
The adversarial photo frame can be erased directly by
center cropping, which is a common preprocessing step in deep
learning-based models. To resolve this issue, developing techniques that
consider the semantics of the image content is a direction for future
research. APFrame can also be extended to any shape covering a
different number of pixels in the image, or implemented as a QR code.</p>
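      <p>The vulnerability to center cropping can be checked with simple arithmetic. The sketch below assumes a hypothetical preprocessing pipeline in which a 256-pixel image is center-cropped to 224 pixels; these sizes and the function names are illustrative and are not taken from the evaluation setup.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def center_crop_box(size, crop):
    """Return (start, stop) of a centred crop of length `crop` along one axis."""
    if crop > size:
        raise ValueError("crop must not exceed the image size")
    start = (size - crop) // 2
    return start, start + crop


def surviving_frame_width(size, crop, frame_width):
    """Frame pixels still visible on the top/left side after a centre crop."""
    start, _ = center_crop_box(size, crop)
    return max(0, frame_width - start)


# Hypothetical preprocessing: a 256-pixel image centre-cropped to 224.
remaining = {w: surviving_frame_width(256, 224, w) for w in (5, 10, 15, 20)}
```
]]></preformat>
      <p>Under these assumed sizes the crop discards a 16-pixel margin, so frames of width 5, 10 and 15 are removed entirely and only a thin strip of a 20-pixel frame survives.</p>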
      <p>
        We also point out that, because NIMA [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] is itself a neural network,
the generated adversarial frame patterns may have a transferable
impact on the evaluation score [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. This encourages us to adopt a
non-neural evaluation scheme that better addresses human perception
in the future.
      </p>
    </sec>
    <sec id="sec-5">
      <title>ACKNOWLEDGMENTS</title>
      <p>This work was carried out on the Dutch national e-infrastructure
with the support of SURF Cooperative.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Simon</given-names>
            <surname>Brugman</surname>
          </string-name>
          , Maciej Wysokinski, and
          <string-name>
            <given-names>Martha</given-names>
            <surname>Larson</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>MediaEval 2018 Pixel Privacy Task: Views on image enhancement</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2018 Workshop.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Nicholas</given-names>
            <surname>Carlini</surname>
          </string-name>
          and
          <string-name>
            <given-names>David</given-names>
            <surname>Wagner</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Towards evaluating the robustness of neural networks</article-title>
          .
          <source>In 2017 IEEE Symposium on Security and Privacy (SP)</source>
          . IEEE,
          <fpage>39</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Jaeyoung</given-names>
            <surname>Choi</surname>
          </string-name>
          , Martha Larson,
          <string-name>
            <given-names>Xinchao</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kevin</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Gerald</given-names>
            <surname>Friedland</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Alan</given-names>
            <surname>Hanjalic</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>The Geo-Privacy Bonus of Popular Photo Enhancements</article-title>
          .
          <source>In ACM International Conference on Multimedia Retrieval (ICMR)</source>
          . ACM,
          <fpage>84</fpage>
          -
          <lpage>92</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Diederik P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jimmy</given-names>
            <surname>Ba</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Adam: A method for stochastic optimization</article-title>
          .
          <source>In International Conference on Learning Representations (ICLR).</source>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Alexey</given-names>
            <surname>Kurakin</surname>
          </string-name>
          , Ian Goodfellow, and
          <string-name>
            <given-names>Samy</given-names>
            <surname>Bengio</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Adversarial examples in the physical world</article-title>
          .
          <source>In International Conference on Learning Representations (ICLR).</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Martha</given-names>
            <surname>Larson</surname>
          </string-name>
          , Zhuoran Liu, Simon Brugman, and
          <string-name>
            <given-names>Zhengyu</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Pixel Privacy: Increasing Image Appeal while Blocking Automatic Inference of Sensitive Scene Information</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2018 Workshop.</source>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Yanpei</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Xinyun</given-names>
            <surname>Chen</surname>
          </string-name>
          , Chang Liu, and
          <string-name>
            <given-names>Dawn</given-names>
            <surname>Song</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Delving into transferable adversarial examples and black-box attacks</article-title>
          .
          <source>In International Conference on Learning Representations (ICLR).</source>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Zhuoran</given-names>
            <surname>Liu</surname>
          </string-name>
          and
          <string-name>
            <given-names>Zhengyu</given-names>
            <surname>Zhao</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>First Steps in Pixel Privacy: Exploring Deep Learning-based Image Enhancement against Largescale Image Inference</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2018 Workshop.</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Zhuoran</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Zhengyu</given-names>
            <surname>Zhao</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Martha</given-names>
            <surname>Larson</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Pixel Privacy 2019: Protecting Sensitive Scene Information in Images</article-title>
          .
          <source>In Working Notes Proceedings of the MediaEval 2019 Workshop.</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Hossein</given-names>
            <surname>Talebi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Peyman</given-names>
            <surname>Milanfar</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>Nima: Neural image assessment</article-title>
          .
          <source>IEEE Transactions on Image Processing 27</source>
          ,
          <issue>8</issue>
          (
          <year>2018</year>
          ),
          <fpage>3998</fpage>
          -
          <lpage>4011</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Bolei</given-names>
            <surname>Zhou</surname>
          </string-name>
          , Agata Lapedriza, Aditya Khosla, Aude Oliva, and Antonio Torralba.
          <year>2017</year>
          .
          <article-title>Places: A 10 million image database for scene recognition</article-title>
          .
          <source>IEEE transactions on pattern analysis and machine intelligence 40</source>
          ,
          <issue>6</issue>
          (
          <year>2017</year>
          ),
          <fpage>1452</fpage>
          -
          <lpage>1464</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>