=Paper=
{{Paper
|id=Vol-1263/paper86
|storemode=property
|title=MediaEval 2014 Visual Privacy Task: Context-Aware Visual Privacy Protection
|pdfUrl=https://ceur-ws.org/Vol-1263/mediaeval2014_submission_86.pdf
|volume=Vol-1263
|dblpUrl=https://dblp.org/rec/conf/mediaeval/BadiiA14
}}
==MediaEval 2014 Visual Privacy Task: Context-Aware Visual Privacy Protection==
<pdf width="1500px">https://ceur-ws.org/Vol-1263/mediaeval2014_submission_86.pdf</pdf>
<pre>
MediaEval 2014 Visual Privacy Task: Context-Aware Visual
                   Privacy Protection

                                     Atta Badii                              Ahmed Al-Obaidi
                                 ISR Laboratory                                ISR Laboratory
                            University of Reading, UK                     University of Reading, UK
                           atta.badii@reading.ac.uk                    a.al-obaidi@reading.ac.uk

Copyright is held by the author/owner(s).
MediaEval 2014 Workshop, October 16-17, 2014, Barcelona, Spain

ABSTRACT
In this paper, we describe a privacy filter proposed for VPT 2014 in an
attempt to provide a context-aware solution. The proposed solution comprises
three different techniques applied to the face, skin, and body regions
separately. The proposed combination of filtering techniques aimed to produce
an adaptive solution and provide an example of a context-aware-like privacy
filtering capability. The results demonstrated the effectiveness of the
proposed techniques in maintaining a high level of Intelligibility and
retaining the appeal of the video, i.e. Pleasantness. However, the still
identifiable gender and race of certain individuals contributed to the
perception of lower levels of privacy of the person in the case of some of
the video frames.

Categories and Subject Descriptors
I.2.10 [Artificial Intelligence]: Vision and Scene Understanding - video
analysis, representations, data structure, and transforms

1. INTRODUCTION
The Visual Privacy Task at MediaEval 2014 [1] acknowledged that the perceived
privacy of a citizen cannot be divorced from the contexts in which the privacy
is valued by the citizen and thus worth protecting. A citizen may have a
variety of roles, responsibilities and relationships, each associated with a
particular persona which may be activated in a given context within their
everyday life-style, e.g. husband, father, employee, boss etc. Each such
persona of the citizen is commensurate with a particular privacy boundary
linked to a certain context. This could guide the levels and scopes of privacy
filtering according to the situated (context-dependent) scenario. The VPT task
evaluation methodology has responded to the need for a more inclusive,
holistic and high-resolution assessment of privacy filtering requirements as
well as the evaluation of the efficacy and impacts of the resulting privacy
filtering solutions based on the UI-REF methodology [2]. The PEViD dataset [3]
was updated with a privacy ranking system for the subject body parts for
context-aware impact assessment of privacy protection solutions.

2. THE PROPOSED FILTER
We proposed a privacy filter which primarily aims to achieve a balance in the
well-addressed Privacy-Intelligibility trade-off. In addition, the
Pleasantness and the appropriate filter application criteria are also
considered alongside the real-time applicability. Accordingly, our three
different filtering techniques for face, skin, and body regions of the subject
featured in the video were applied as follows:

Figure 1: Outputs of the proposed filter

Face filter (H): The face, being a highly identity-revealing region, has been
ranked high (H) for privacy protection compared to other body parts.
Accordingly, we sought to ensure the anonymity of the visible face. First, an
empirical threshold was set for the minimum face size at which the face could
still be identified (50 pixels in our case; rightmost snapshot in Figure 1).
Below the said threshold, a simple median blur filter proved sufficient to
protect the person’s identity. Once the face size exceeded the threshold, a
key point detector was applied on the face region followed by adaptive colour
quantisation and circle texturing to produce the effect as shown in Figure 1.

Skin filter (M): The exposed skin regions could provide sufficient information
to enable the detection of the ethnicity of the subject. Skin provides a focus
of attention. Exposed skin regions, e.g. hands, are also important in activity
recognition and detection of weapons. Therefore, morphological changes or
non-homogeneous colour changes are not suitable in this case. To manipulate
the skin region in a unified fashion, we reduced the colour saturation and
luminance values in the R, G, and B colour channels separately within the
oriented boxes enclosing the skin regions. The filtered skin regions were
still recognisable, but only two skin colours remained identifiable:
light-like and dark-like. The skin texture, which is responsible for the
attractiveness of the skin, was also eliminated.

Person filter (L): The last stage of privacy protection is applied to the
bounding box enclosing the subject region. This region is ranked low (L) in
the provided annotation. An edge-based analysis is implemented on the
foreground region(s) within the subject bounding box. The procedure begins
with morphological operations that enhance the foreground mask by reducing
the background noise and minimising the holes in the foreground region. A
Canny edge detector is subsequently applied and its output further refined to
eventually draw the subject’s contour. The final effect, as depicted in
Figure 1, is the result of a distance transformation which calculates the
distance of each pixel of the
resulting binary contour map to the closest zero pixel in the image. The
OpenCV implementation based on [4] was used, which approximates the Euclidean
distance by the length of the shortest path to the nearest zero pixel, where
the path consists of basic shifts: horizontal, vertical, diagonal, or knight’s
move. A mask of size 5 x 5 was used for the best results.

3. EVALUATION RESULTS
A subjective evaluation consisting of three (3) streams was conducted, and the
performance of the proposed privacy protection solution was examined in terms
of the defined criteria, namely Privacy, Intelligibility, and Pleasantness, as
described in [1].

Figures 2-4 illustrate the performance of the proposed solution in the three
evaluation streams respectively, and Figure 5 integrates the scores for all
three streams. A noticeable trend can be generalised from the three sets of
results with only marginal variations.

Figure 2: Scores from Stream 1

Figure 3: Scores from Stream 2

Figure 4: Scores from Stream 3

The proposed solution scored above the average for the Intelligibility
criterion and above 70% in all three streams, which indicates that the
processed video will still serve the main purpose of the CCTV security
objective.

However, the Privacy scores were slightly below the median, which in this
competition is in general below 50%. One possible explanation of the overall
low Privacy score is the fact that this criterion has been measured based on
the ability to identify the gender and the ethnicity of the person, which
could be hard to conceal without significantly manipulating the person’s
appearance. On the other hand, the proposed solution would clearly prevent the
viewer from being able to identify the featured subject in the normal cases by
successfully hiding most of the face details. Regarding the Pleasantness
criterion, the obtained scores were comparable to the median values for the
three streams, which fell within the range of 50-70% for streams 2 and 3 and
were noticeably lower, slightly above 20%, for stream 1. Table 1 summarises
the numerical values of the obtained scores for the evaluated streams.

              Intelligibility     Privacy     Pleasantness
Stream_1          75.10%          42.80%         23.50%
Stream_2          84.07%          35.55%         69.27%
Stream_3          73.27%          39.01%         56.61%
              Table 1: Scores for evaluated streams

Figure 5: Integrating the scores for all three streams

4. CONCLUSION
In this paper we have proposed a video privacy filter using a combination of
filtering techniques to simulate a context-aware solution. The filter aims to
achieve the highest privacy with minimum content distortion and viewer
distraction. The obtained scores were comparable to the average values of the
scores for all the privacy filtering solutions proposed for the Visual Privacy
Task 2014. One possible direction for future work would be to include a person
re-identification test in the evaluation of the Privacy criterion, as an
important aspect.

5. ACKNOWLEDGMENTS
We would like to thank Lucas Teixeira and Kevin Lelu for their valuable
inputs. This work was supported by the European Commission under contract
FP7-261743 (VideoSense project).

6. REFERENCES
[1] Badii, Atta, Ebrahimi, Touradj, Fedorczak, Christian, Korshunov, Pavel,
    Piatrik, Tomas, Eiselein, Volker, and Al-Obaidi, Ahmed. “Overview of
    MediaEval 2014 Visual Privacy Protection Task”, Proceedings of the
    MediaEval 2014 Workshop, Barcelona, Spain, 16-17 October 2014.
[2] Badii, Atta. “User-Intimate Requirements Hierarchy Resolution Framework
    (UI-REF): Methodology for Capturing Ambient Assisted Living Needs”,
    Proceedings of the Research Workshop, Int. Ambient Intelligence Systems
    Conference (AmI’08), Nuremberg, Germany, November 2008.
[3] Korshunov, Pavel, and Ebrahimi, Touradj. “PEViD: privacy evaluation video
    dataset”, Applications of Digital Image Processing XXXVI, San Diego,
    California, USA, August 25-29, 2013.
[4] Borgefors, Gunilla. “Distance transformations in digital images”,
    Computer vision, graphics, and image processing 34.3 (1986): 344-371.
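
Illustrative sketch for the face filter (Section 2). The face filter switches
on face size: below the empirical 50-pixel threshold a median blur is applied.
The pure-Python sketch below shows that decision and a small median blur on a
greyscale patch. It is not the authors' implementation: it assumes that "size"
means the shorter side of the face bounding box, the function names and the
3 x 3 window are hypothetical, and the key point detection, colour
quantisation and circle texturing stage is only represented by a placeholder
label.

```python
from statistics import median

FACE_SIZE_THRESHOLD = 50  # pixels; the empirical value reported in Section 2


def choose_face_filter(face_width, face_height, threshold=FACE_SIZE_THRESHOLD):
    """Pick a filter name from the face box size (assumption: shorter side)."""
    # The paper does not say what happens at exactly the threshold;
    # this sketch treats "not below" as "large face".
    if min(face_width, face_height) < threshold:
        return "median_blur"
    return "keypoint_stylisation"  # placeholder label, stage not implemented


def median_blur(gray, radius=1):
    """Median filter over a (2*radius+1)^2 window, clamping at the borders."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                gray[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            out[y][x] = median(window)
    return out
```

A larger radius removes more facial detail at the cost of more computation per
pixel.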

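Illustrative sketch for the skin filter (Section 2). The skin filter reduces
colour saturation and luminance uniformly inside the boxes enclosing the skin
regions. The sketch below is one possible reading, not the authors' code: it
converts each pixel to HLS with Python's standard colorsys module, scales
lightness and saturation, and converts back. The scale factors, the function
names, and the use of an axis-aligned box (the paper uses oriented boxes) are
assumptions made for brevity.

```python
import colorsys


def attenuate_skin_pixel(rgb, sat_scale=0.5, lum_scale=0.7):
    """Scale saturation and lightness of one 8-bit RGB pixel.

    The scale factors are illustrative and expected to lie in [0, 1].
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    r2, g2, b2 = colorsys.hls_to_rgb(h, l * lum_scale, s * sat_scale)
    return tuple(round(c * 255) for c in (r2, g2, b2))


def filter_skin_region(image, box, sat_scale=0.5, lum_scale=0.7):
    """Attenuate pixels inside an axis-aligned box (x0, y0, x1, y1).

    The box is end-exclusive; pixels outside it are copied unchanged.
    """
    x0, y0, x1, y1 = box
    out = [row[:] for row in image]
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = attenuate_skin_pixel(image[y][x], sat_scale, lum_scale)
    return out
```

Because every pixel in the box is scaled by the same factors, the change is
homogeneous, so hand shapes and held objects stay visible.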
</pre>
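<pre>
Illustrative sketch for the person filter (Section 2). The final effect of the
person filter comes from a distance transformation over the binary contour
map, computed with a 5 x 5 mask in the spirit of [4]. The pure-Python sketch
below shows a two-pass chamfer distance transform with horizontal, vertical,
diagonal and knight's-move steps. It is not the OpenCV routine the authors
used: the integer weights 5, 7 and 11 are an assumed approximation of 1,
sqrt(2) and sqrt(5) scaled by 5, and the function name is hypothetical.

```python
INF = float("inf")

# Forward half of a 5x5 chamfer mask as (dy, dx, weight).
# Weights 5 / 7 / 11 are an assumed integer approximation of
# 1, sqrt(2), sqrt(5) multiplied by 5.
FORWARD_MASK = [
    (0, -1, 5), (-1, 0, 5),        # horizontal, vertical
    (-1, -1, 7), (-1, 1, 7),       # diagonal
    (-1, -2, 11), (-1, 2, 11),     # knight's moves
    (-2, -1, 11), (-2, 1, 11),
]


def chamfer_distance(binary):
    """Two-pass chamfer distance to the nearest zero pixel (units: 1/5 pixel)."""
    h, w = len(binary), len(binary[0])
    dist = [[0 if binary[y][x] == 0 else INF for x in range(w)]
            for y in range(h)]

    def sweep(rows, cols, sign):
        # sign=+1 looks at already-visited pixels above/left,
        # sign=-1 mirrors the mask to look below/right.
        for y in rows:
            for x in cols:
                for dy, dx, weight in FORWARD_MASK:
                    ny, nx = y + sign * dy, x + sign * dx
                    if 0 <= ny < h and 0 <= nx < w:
                        candidate = dist[ny][nx] + weight
                        if candidate < dist[y][x]:
                            dist[y][x] = candidate

    sweep(range(h), range(w), 1)                             # forward pass
    sweep(range(h - 1, -1, -1), range(w - 1, -1, -1), -1)    # backward pass
    return dist
```

Dividing the result by 5 gives approximate distances in pixels. Which pixels
of the contour map are set to zero before the transform is not stated in the
paper, so the sketch simply treats every zero-valued pixel as a target.
</pre>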