=Paper= {{Paper |id=Vol-1263/paper83 |storemode=property |title=UNIZA @ Mediaeval 2014 Visual Privacy Task: Object Transparency Approach |pdfUrl=https://ceur-ws.org/Vol-1263/mediaeval2014_submission_83.pdf |volume=Vol-1263 |dblpUrl=https://dblp.org/rec/conf/mediaeval/ParalicJ14 }} ==UNIZA @ Mediaeval 2014 Visual Privacy Task: Object Transparency Approach== https://ceur-ws.org/Vol-1263/mediaeval2014_submission_83.pdf
        UNIZA@Mediaeval 2014 Visual Privacy Task: Object
                   Transparency Approach

                           Martin Paralič                                          Roman Jarina
                       University of Žilina,                                      University of Žilina,
               Faculty of Electrical Engineering,                         Faculty of Electrical Engineering,
             Department of Telecommunications and                       Department of Telecommunications and
                           Multimedia                                                 Multimedia
              Univerzitná 1, 01026 Žilina, Slovakia                      Univerzitná 1, 01026 Žilina, Slovakia
                 martin.paralic@fel.uniza.sk                                roman.jarina@fel.uniza.sk


Copyright is held by the author/owner(s).
MediaEval 2014 Workshop, October 16-17, 2014, Barcelona, Spain

ABSTRACT
This paper describes our approach to the Visual Privacy Task (VPT) of MediaEval 2014. We propose video privacy filtering based on making privacy-sensitive objects transparent. The background (hidden behind the object) is estimated by median filtering over the time sequence of pixel values. We focus only on the areas labeled as highly privacy sensitive (i.e. the face). Low and medium privacy areas were left untouched to keep as much information as possible about the person's activities. Despite its simplicity, the proposed method gives promising performance: it is at or slightly above the average among the VPT participants.

1. INTRODUCTION
The problem of privacy protection in video surveillance is again addressed in this year's MediaEval Visual Privacy Task (VPT) [2]. The PEViD dataset [7] is used for the impact assessment of alternative solutions. Recently, a variety of image processing methods have been developed to protect privacy in multimedia content. A common approach is based on replacing the sensitive information with color boxes or distorting the pixels. More sophisticated methods utilize person silhouette detection followed by blurring of the whole person [1]. Other methods are based on encrypting sensitive regions, where the process is reversible only for authorized persons who know the encryption key [4].

The disadvantage of covering private information is that the person's activities, the detection of which is crucial for surveillance purposes, are often also hidden or altered. The aim of this research is to develop a visual filtering method that keeps as much information as possible about the person's activities in the video while keeping the person's privacy intact.

We propose filtering that makes the privacy-sensitive objects transparent. The estimation of the background (hidden behind the object) is based on computing the median over the time sequence of pixel values for each pixel. We have focused only on the areas labeled as highly privacy sensitive (i.e. the face). Low and medium privacy areas were left untouched to keep as much information as possible about the person's activities. This method aims to minimize the discomfort of watching filtered video. Thus we utilized only the face position labels from the provided XML metadata.

2. OBJECT SEGMENTATION
One of the requirements for privacy filtering is object segmentation. In this task, two kinds of segmentation were at our disposal. The first one, automatic segmentation as described in [1], was utilized for background estimation. The second one is the manual annotation of the video stream in XML form [7]. We extracted the face bounding box information from the provided XML metadata and used it for filtering. A masking ellipse was fitted to the box.

3. THE PROPOSED METHOD
We examined a simple and straightforward method based on filtering that replaces highly privacy-sensitive pixel areas with background pixels, as depicted in Figure 1. The key part of the proposed algorithm is proper estimation of the whole background scene, with the aim of uncovering the background parts hidden behind highly privacy-sensitive areas (e.g. the face). However, in some scenes the background can be partially invisible, as depicted in Figure 2.

The procedure of background estimation is as follows. The time sequence of RGB values of each pixel is transformed into a time sequence of grayscale values for sorting purposes. Then the median over each time sequence is computed. In addition, the foreground objects, detected by automatic segmentation [1], are obscured by black pixels [R, G, B] = [0, 0, 0]. This step is applied to avoid inclusion of the foreground objects in the background pixel estimation. For sufficient background estimation it is crucial that each background pixel is visible in at least one video frame. The background image is built from the RGB values from the middle of the sorted time sequences of the pixel values.

4. THE EVALUATION FRAMEWORK
The video sequences were evaluated with respect to the UI-REF privacy protection requirements. Overall results of the crowd evaluation for submitted entries were assessed in terms of the following criteria [3], [6], [5]:

     • The Privacy Protection Level – an average level of privacy protection across all clips.

     • Level of Intelligibility – the amount of useful information that is retained after filtering.
     • The Appropriateness – aesthetic perceptual appeal to human viewers.

The evaluation scenario is composed of the following three streams:

Stream 1 For the crowdsourcing evaluation, about 290 workers answered several privacy, intelligibility, and pleasantness related questions for 6 pre-selected videos from the test videos submitted by participants.

Stream 2 A focus group comprising 65 participants (15 females) from Thales, France took part in this evaluation. The majority of the participants were staff from the R&D departments, while the rest were from Management, Security, and other departments.

Stream 3 A focus group comprising 59 participants (22 females) from sectors including R&D, data protection, and law enforcement, from around the world, took part in this study.

5. EVALUATION RESULTS
The evaluation results of the proposed filter performance were obtained in the 3 streams according to the VPT 2014 evaluation scenario. The reported results of the test among the VPT 2014 participants, which were evaluated in terms of the defined criteria, are presented as the median score over ten teams. This median serves as a baseline against which we compare our results. The performance of the proposed approach compared to the median results is depicted in Figure 3. Despite its simplicity, the proposed method gives surprisingly promising performance. The obtained score is at or slightly above the average in terms of all three criteria.

The challenging problem is detection and estimation of the partially invisible background, as shown in Figure 2. In our future work, we will focus on automatic detection of the face position and on more precise filtering tight around the person's face contours, as well as on the use of more sophisticated translucency techniques.

Figure 1: Filtered person's face replaced by the background.

Figure 2: Partially invisible background.

Figure 3: Experiment results of 3 streams compared to the median among other participants. (Bar chart; vertical axis from 0.00% to 90.00%; series: intelligibility score, privacy score, pleasantness score; groups: UNIZA 1, Median 1, UNIZA 2, Median 2, UNIZA 3, Median 3.)

6. REFERENCES
[1] A. Badii, A. Al-Obaidi, and M. Einig. MediaEval 2013 visual privacy task: Holistic evaluation framework for privacy by co-design impact assessment. In MediaEval’2013, pages 1–1. CEUR-WS.org, 2013.
[2] A. Badii, T. Ebrahimi, C. Fedorczak, P. Korshunov, T. Piatrik, V. Eiselein, and A. Al-Obaidi. Overview of the MediaEval 2014 visual privacy task. In MediaEval 2014, 2014.
[3] A. Badii, M. Einig, M. Tiemann, D. Thiemert, and C. Lallah. Visual context identification for privacy-respecting video analytics. In 14th IEEE MMSP International Workshop on Multimedia Signal Processing, pages 366–371, September 2012.
[4] T. E. Boult. PICO: Privacy through invertible cryptographic obscuration. In Computer Vision for Interactive and Intelligent Environments, pages 27–38, November 2005.
[5] H. Fradi, V. Eiselein, I. Keller, J.-L. Dugelay, and T. Sikora. Crowd context-dependent privacy protection filters. In 18th International Conference on Digital Signal Processing, pages 1–6, July 2013.
[6] P. Korshunov, S. Cai, and T. Ebrahimi. Crowdsourcing approach for evaluation of privacy filters in video surveillance. In 2013 18th International Conference on Digital Signal Processing (DSP), pages 1–6. ACM, 2012.
[7] P. Korshunov and T. Ebrahimi. PEViD: Privacy evaluation video dataset. In Applications of Digital Image Processing XXXVI. SPIE International Society for Optics and Photonics, August 2013.
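
As an illustration of the procedure described in Sections 2 and 3, the following Python sketch estimates the background by a per-pixel temporal median and then replaces an elliptical face region with it. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the frame representation (nested lists of RGB tuples), the box format (x0, y0, x1, y1) and the BT.601 grayscale weights are all hypothetical choices made for illustration, since the paper does not specify them.

```python
def luma(rgb):
    """Grayscale value used only as a sort key (BT.601 weights, an assumption)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def estimate_background(frames, fg_masks):
    """Per-pixel temporal median of a video.

    frames:   list of H x W grids of (R, G, B) tuples (assumed representation).
    fg_masks: same shape, True where the segmentation marks foreground.
    """
    height, width = len(frames[0]), len(frames[0][0])
    background = []
    for y in range(height):
        row = []
        for x in range(width):
            # Foreground samples are replaced by black, as described in Section 3.
            samples = [
                (0, 0, 0) if mask[y][x] else frame[y][x]
                for frame, mask in zip(frames, fg_masks)
            ]
            # Sort by grayscale value, then keep the RGB triple in the middle.
            samples.sort(key=luma)
            row.append(samples[len(samples) // 2])
        background.append(row)
    return background


def make_face_transparent(frame, background, box):
    """Replace the ellipse inscribed in the face box with background pixels.

    box = (x0, y0, x1, y1) with exclusive upper bounds (hypothetical format).
    Returns a new frame; the input frame is left unchanged.
    """
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1 - 1) / 2, (y0 + y1 - 1) / 2  # ellipse centre
    rx, ry = (x1 - x0) / 2, (y1 - y0) / 2          # ellipse semi-axes
    out = [list(row) for row in frame]
    for y in range(y0, y1):
        for x in range(x0, x1):
            if ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0:
                out[y][x] = background[y][x]
    return out
```

Because the blacked-out foreground samples sort to the dark end of each sequence, the median in this sketch lands on a background sample only when the pixel is unobstructed in at least half of the frames. A practical implementation would likely use an array library rather than nested lists; plain Python is used here only to keep the sketch self-contained.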