              A HYBRID APPROACH FOR MULTIMEDIA USE
                          VERIFICATION

       Quoc-Tin Phan1 , Alessandro Budroni2 , Cecilia Pasquini1 , Francesco G. B. De Natale3
                  Department of Information Engineering and Computer Science - University of Trento, Italy
             {quoctin.phan, cecilia.pasquini}@unitn.it1 ; alessandro.budroni@studenti.unitn.it2 ; denatale@ing.unitn.it3




ABSTRACT
Social networks enable multimedia sharing among worldwide users; however, no automatic mechanism is in place to verify how multimedia is used. This is known to be a highly challenging problem due to the variety of media types and the huge amount of information they convey. As a participating team of MediaEval 2016, we propose a hybrid approach for detecting misused multimedia on Twitter, the goal of the Verifying Multimedia Use task. Specifically, we designed a verification system that estimates how likely an associated multimedia file is to be fake, based on multiple forensic features and textual features acquired by performing online text search and reverse image search. Next, effective post-based and user-based features are used to validate the credibility of tweet posts. Finally, based on the assumption that a tweet sharing fake images or videos is itself likely to be fake, the credibility scores of tweet posts and associated multimedia are fused to detect misused multimedia.

1. INTRODUCTION
Online Social Network (OSN) services offer a medium for users to connect and share daily information. For specific events, part of this information is not trustworthy, and its dissemination has several negative consequences for the community. Previous attempts have addressed the problem of image manipulation in online news [9] and the impact of image manipulations on users' perceptions [6].

In the MediaEval Verifying Multimedia Use task [3, 5], given tweet content features, user features, and some effective forensic features, innovative methods are sought to verify whether multimedia items (images and videos) are correctly used on Twitter. Due to the variety of languages used and the fact that many reposted tweets do not contain meaningful textual information, purely linguistic approaches like [8, 10] are believed not to be effective enough for this task. Moreover, almost every tweet post is accompanied by at least one image or video, and the image or video itself reflects the credibility of the tweet. To the best of our knowledge, only [4] took multimedia forensic features into account in the Multimedia Use Verification task.

Although associated multimedia files play a significant role in assessing the credibility of tweets, forensic algorithms are very sensitive to subsequent image modifications and multiple lossy compressions. In this work we propose a novel approach that assesses the credibility of associated images or videos using not only forensic features but also textual features, acquired by performing online text search and reverse image search. The results obtained on the development and test sets confirm the effectiveness of the proposed method.

Copyright is held by the author/owner(s).
MediaEval 2016 Workshop, Oct. 20-21, 2016, Hilversum, Netherlands.

2. THE PROPOSED METHOD
We propose a verification system composed of two classification tiers, as depicted in Figure 1. The first classification tier takes as input the event and the associated image or video, and answers the question: how likely does this image or video reflect the event? We consider the occurrence context of associated images or videos on the Internet as strong evidence for assessing their trustworthiness. Once we have some confidence in the credibility of the associated images or videos, the second classification tier validates the credibility of tweets based on Twitter-based features. Finally, the scores returned by the two classifiers are fused to give the final decision.

2.1 Multimedia assessment
In the first step, we conduct an online text search using relevant keywords associated with the event and select the top returned websites, from which we extract the most relevant terms based on the statistical measure TF-IDF (Term Frequency - Inverse Document Frequency). In parallel, the associated image is searched on Google Images, and only the top returned websites are selected to check the frequency of the most relevant terms from the event text search. For YouTube videos, only users' comments are extracted, while videos from other sites are left unprocessed. Through this step, the system is expected to recognize images or videos that do not belong to the current event. In the second step, we check the occurrences of positive, negative, and "fake"-related words in the whole text retrieved by image or video reverse search, assuming that a fake multimedia item should receive negative assessments from readers.

Forensic operations can be applied to multimedia files to verify whether a file has been tampered with, and even which regions are most likely to have been modified. We adopt non-aligned double JPEG compression [2], block artifact grid [7], and Error Level Analysis [1] as forensic features. Finally, we integrate textual features and forensic features in the first classification tier.

2.2 Tweet credibility assessment
After having the output from the first classification tier
[Figure 1: Schema of the proposed method. Multimedia and event inputs feed forensic feature extraction and search-based textual feature extraction (search by image/video and by keywords); the forensic and textual features are concatenated for Classifier 1. Post-based and user-based features are concatenated for Classifier 2. The two classifier scores are fused to produce the final decision.]

reflecting the trustworthiness of the associated multimedia, the second classification tier is designed to assess how the multimedia is used on Twitter. Tweet credibility assessment is made feasible by post-based features (e.g., whether the tweet contains question mark or exclamation mark characters, the number of negative sentiment words the tweet contains) together with user-based features (e.g., the number of followers the user has, whether the user is verified by Twitter).

2.3 Score fusion
We approach the problem by experimenting with Logistic Regression (LR) and Random Forest (RF) classifiers. As shown in Table 1, LR performs worse than RF on the development set. This can be explained by the fact that RF copes well with non-linearly separable and uneven data; for instance, some Twitter posts are not associated with any meaningful text, and forensic features are not available for videos (all are set to zero). For this reason, we select RF as our classifier and make the final decision by score-level fusion. Under the assumption that a tweet sharing fake images or videos is itself likely to be fake, a higher weight is assigned to the output of the first tier and a lower weight to the second tier. To validate our method, we conduct experiments using only the scores of classification tier 2 (with the post-based and user-based features provided by the task) and experiments using a 0.8 : 0.2 weighting strategy. The statistics in Table 1 confirm the effectiveness of our multimedia assessment tier and score fusion strategy.

Table 1: Verification results on the development set in terms of F1-score. 100 real and 100 fake samples selected from {Hurricane Sandy, Boston Marathon Blast, Nepal Earthquake} are used for training, and 300 real and 300 fake samples from other events for testing.

                 LR     RF
Tier 2 scores    0.44   0.54
Fused scores     0.81   0.88

3. RESULTS AND DISCUSSION
In this section, we report results on the sub-task, based on our multimedia assessment approach, and on the main task, based on two-tier classification. In the sub-task, we submitted two RUNs: i) RUN 1 (required): only the forensic features described in Section 2.1; ii) RUN 2: both the textual features and the forensic features described in Section 2.1. In particular, for RUN 2 we trained the classifier on all multimedia available in the development set of the main task. The results in Table 2 reveal that our method gains recall when the textual features acquired from online text search and reverse image search are taken into account. This means the false negative rate is effectively reduced and more fake samples are detected.

Table 2: Verification results on the test set of the sub-task

          Recall   Precision   F1-score
RUN 1     0.50     0.48        0.49
RUN 2     0.93     0.49        0.64

Next, results for the main task are reported for three RUNs: i) RUN 1 (required): only the second classification tier; ii) RUN 2: two-tier classification with the 0.8 : 0.2 fusion strategy, answering UNKNOWN in cases where the output of classification tier 1 is not available due to online search errors; iii) RUN 3: two-tier classification with the 0.8 : 0.2 fusion strategy, using only the output of classification tier 2 in cases where the output of classification tier 1 is not available due to online search errors.

Table 3: Verification results on the test set of the main task

          Recall   Precision   F1-score
RUN 1     0.55     0.71        0.62
RUN 2     0.94     0.81        0.87
RUN 3     0.94     0.74        0.83

The results in Table 3, especially RUN 2, again confirm the effectiveness of our multimedia assessment and fusion strategy. Our method, however, is subject to online search errors, which occur for videos not hosted by YouTube.
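The weighted score-level fusion used in RUNs 2 and 3 of the main task can be illustrated with a minimal sketch. This assumes each tier outputs a probability that the sample is fake in [0, 1]; the function name, threshold, and fallback behavior for missing tier-1 scores are illustrative, not taken from the authors' implementation.

```python
def fuse_scores(tier1_score, tier2_score, w1=0.8, w2=0.2, threshold=0.5):
    """Fuse the two tiers' fake-probabilities into a label.

    tier1_score: multimedia-assessment score, or None if online
                 search failed (e.g., a video not hosted by YouTube).
    tier2_score: tweet-credibility score from post/user features.
    """
    if tier1_score is None:
        # RUN 3 behavior: fall back to the second tier alone.
        fused = tier2_score
    else:
        # RUN 2 behavior: the multimedia tier dominates (0.8 : 0.2).
        fused = w1 * tier1_score + w2 * tier2_score
    return "fake" if fused >= threshold else "real"

# A tweet whose image looks fake (tier 1) is flagged even if the
# account features (tier 2) look credible: 0.8*0.9 + 0.2*0.2 = 0.76.
print(fuse_scores(0.9, 0.2))   # -> fake
print(fuse_scores(None, 0.3))  # -> real (tier-2 fallback)
```

The asymmetric weights encode the paper's assumption that a tweet sharing fake multimedia is itself likely to be fake, so the first tier's judgment should rarely be overturned by the second.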
4. REFERENCES
[1] Error level analysis tutorial. http://fotoforensics.com/tutorial-ela.php. Accessed: 28/08/2016.
[2] T. Bianchi and A. Piva. Image Forgery Localization via Block-Grained Analysis of JPEG Artifacts. IEEE Transactions on Information Forensics and Security, 7(3):1003–1017, June 2012.
[3] C. Boididou, K. Andreadou, S. Papadopoulos, D.-T. Dang-Nguyen, G. Boato, M. Riegler, and Y. Kompatsiaris. Verifying Multimedia Use at MediaEval 2015. In MediaEval 2015 Workshop, Wurzen, Germany, 2015.
[4] C. Boididou, S. Papadopoulos, D.-T. Dang-Nguyen, G. Boato, and Y. Kompatsiaris. The CERTH-UNITN Participation @ Verifying Multimedia Use 2015. In Proceedings of the MediaEval 2015 Workshop, pages 6–8, 2015.
[5] C. Boididou, S. Papadopoulos, D.-T. Dang-Nguyen, G. Boato, M. Riegler, S. E. Middleton, A. Petlund, and Y. Kompatsiaris. Verifying Multimedia Use at MediaEval 2016. In Proceedings of the MediaEval 2016 Workshop, Hilversum, Netherlands, Oct. 20–21 2016.
[6] V. Conotter, D.-T. Dang-Nguyen, G. Boato, M. Menéndez, and M. Larson. Assessing the impact of image manipulation on users' perceptions of deception. In Proceedings of SPIE - Human Vision and Electronic Imaging XIX, volume 9014, 2014.
[7] W. Li, Y. Yuan, and N. Yu. Passive Detection of Doctored JPEG Image via Block Artifact Grid Extraction. Signal Processing, 89(9):1821–1829, Sept. 2009.
[8] S. E. Middleton. Extracting Attributed Verification and Debunking Reports from Social Media: MediaEval-2015 Trust and Credibility Analysis of Image and Video. In Proceedings of the MediaEval 2015 Workshop, 2015.
[9] C. Pasquini, C. Brunetta, A. F. Vinci, V. Conotter, and G. Boato. Towards the verification of image integrity in online news. In Proceedings of Multimedia & Expo Workshops (ICMEW), pages 1–6, June 2015.
[10] Z. Zhao, P. Resnick, and Q. Mei. Enquiring Minds: Early Detection of Rumors in Social Media from Enquiry Posts. In Proceedings of the 24th International Conference on World Wide Web, pages 1395–1405, 2015.