<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Christina Boididou</string-name>
          <email>boididou@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Katerina Andreadou</string-name>
          <email>kandreadou@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Symeon Papadopoulos</string-name>
          <email>papadop@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Duc-Tien Dang-Nguyen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giulia Boato</string-name>
          <email>boato@disi.unitn.it</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Riegler</string-name>
          <email>michael@simula.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yiannis Kompatsiaris</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Information Technologies Institute</institution>
          ,
          <addr-line>CERTH</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Simula Research Laboratory</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>14</fpage>
      <lpage>15</lpage>
      <abstract>
        <p>This paper provides an overview of the Verifying Multimedia Use task that takes place as part of the 2015 MediaEval Benchmark. The task deals with the automatic detection of manipulation and misuse of Web multimedia content. Its aim is to lay the basis for a future generation of tools that could assist media professionals in the process of verification. Examples of manipulation include maliciously tampering with images and videos, e.g., splicing or the removal/addition of elements, while other kinds of misuse include the reposting of previously captured multimedia content in a different context (e.g., a new event) claiming that it was captured there. For the 2015 edition of the task, we have generated and made available a large corpus of real-world cases of images that were distributed through tweets, along with manually assigned labels regarding their use, i.e., misleading (fake) versus appropriate (real).</p>
      </abstract>
      <kwd-group>
        <kwd>Manipulation</kwd>
        <kwd>Reposting</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>Modern Online Social Networks (OSN), such as Twitter,
Instagram and Facebook, are nowadays the primary sources
of information and news for millions of users and the major
means of publishing user-generated content. With the
growing number of people participating and contributing to these
communities, analyzing and verifying the massive amounts
of such content has emerged as a major challenge. Veracity
is a crucial aspect of media content, especially in cases of
breaking news stories and incidents related to public safety,
ranging from natural disasters and plane crashes to terrorist
attacks. Popular stories have such profound impact on the
public attention that content gets immediately
retransmitted by millions of users, and often it is found to be
misleading, resulting in misinformation of the public audience and
even of the authorities.</p>
      <p>
        In this setting, there is an increasing need for automated,
real-time verification and cross-checking tools. Work has been done in
this field, and techniques for evaluating tweets have been proposed.
Gupta et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] used the Hurricane Sandy natural disaster to highlight the
role of Twitter in spreading fake content during the event, and
proposed classification models to distinguish between fake and real
tweets.
Ito et al. [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] proposed a method to assess tweet credibility with LDA features.
      </p>
    </sec>
    <sec id="sec-1a">
      <title>2. TASK DESCRIPTION</title>
      <p>The definition of the task is the following: "Given a tweet
and the accompanying multimedia item (image or video) from an event
that has the profile to be of interest in the international news,
return a binary decision representing verification of whether the
multimedia item reflects the reality of the event in the way purported
by the tweet." In practice, participants received a list of tweets
that include images and were required to automatically predict, for
each tweet, whether it is trustworthy or deceptive (real or fake,
respectively). In addition to fully automated approaches, the task
also considered human-assisted approaches, provided that they are
practical (i.e., fast enough) in real-world settings. The following
considerations apply in addition to the above definition:</p>
      <p>A tweet is considered fake when it shares multimedia
content that does not represent the event that it refers
to. Figure 1 presents examples of such content.</p>
      <p>A tweet is considered real when it shares multimedia
that legitimately represents the event it refers to.</p>
      <p>A tweet that shares multimedia content that does not
represent the event it refers to, but reports the false information or
refers to it with a sense of humour, is considered neither fake nor
real (and hence is not included in the datasets released by the task).</p>
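      <p>The rules above can be condensed into a small decision function. The sketch below is purely illustrative: the boolean flags are hypothetical stand-ins for the manual annotation performed by the task organizers, not fields released with the datasets.</p>

```python
# Sketch of the task's labelling rules. All three flags would come from
# manual annotation of a tweet; none of them is something the task
# provides as machine-readable input.

def label_tweet(represents_event, declares_false=False, humorous=False):
    """Return the dataset label implied by the task's rules."""
    if represents_event:
        return "real"        # multimedia legitimately depicts the event
    if declares_false or humorous:
        return "excluded"    # neither fake nor real: left out of the datasets
    return "fake"            # misrepresenting content shared as genuine

print(label_tweet(False))                   # → fake
print(label_tweet(False, humorous=True))    # → excluded
```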
      <p>The task also asked participants to optionally return an
explanation (which can be a text string, or URLs pointing
to resources online) that supports the verification decision.
The explanation was not used for quantitative evaluation,
but rather for gaining qualitative insights into the results.</p>
    </sec>
    <sec id="sec-2">
      <title>3. VERIFICATION CORPUS</title>
      <p>Development dataset (devset): This was provided
together with ground truth and used by participants to
develop their approach. It contains tweets related to the 11
events of Table 1, comprising in total 176 cases of real and
185 cases of misused images, associated with 5,008 real and
7,032 fake tweets posted by 4,756 and 6,769 unique users
respectively. Note that several of the events, e.g., Columbian
Chemicals, Passport Hoax and Rock Elephant, were actually
hoaxes, hence all multimedia content associated with them
was misused. For several real events (e.g., MA flight 370) no
real images (and hence no real tweets) were included in the
dataset, since none came up as a result of the data collection
process that is described below.</p>
      <p>Test dataset (testset): This was used for evaluation. It
comprises 17 cases of real images, 33 of misused images and
2 cases of misused videos, in total associated with 1,217 real
and 2,564 fake tweets that were posted by 1,139 and 2,447
unique users respectively.</p>
      <p>
        The tweet IDs and image URLs for both datasets are
publicly available1. Both consist of tweets collected around a
number of widely known events or news stories. The tweets
contain fake and real multimedia content that has been
manually verified by cross-checking online sources (articles and
blogs). The data were retrieved with the help of the Topsy and
Twitter APIs, using keywords and hashtags around these specific
events. Having defined a set of keywords K for each event of Table 1,
we collected a set of tweets T. Afterwards, with the help of online
resources, we identified a set of unique fake and real pictures around
these events, and created the fake and real image sets
I<sub>F</sub> and I<sub>R</sub> respectively. We then used the image
sets as seeds to create our reference verification corpus
T<sub>C</sub> ⊆ T. This corpus includes only those tweets that
contain at least one image of the predefined sets I<sub>F</sub> and
I<sub>R</sub>. However, in order not to restrict the tweets
to only those that point to the exact seed image URLs, we
also employed a scalable visual near-duplicate search
strategy as described in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. More specifically, we used the sets of fake and real images
as visual queries, and for each query we checked whether each image
tweet from the set T contains the seed image itself or a
near-duplicate of an image in I<sub>F</sub> or I<sub>R</sub>. To
detect near-duplicates, we empirically set a minimum similarity
threshold tuned for high precision. However, a small number of the
images exceeding the threshold turned out to be irrelevant to those in
the seed set. To remove them, we conducted a manual verification step
on the extended set of images.
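</p>
      <p>A minimal sketch of this seed-based expansion is given below. The task itself used the scalable VLAD-based near-duplicate search of [6]; here a simple difference hash (dHash) over a grayscale pixel grid stands in for that component, and the 0.9 similarity threshold is purely illustrative.</p>

```python
# Sketch of the seed-based corpus expansion: each candidate tweet image
# is compared against the fake/real seed sets; tweets whose image is a
# near-duplicate of a seed are kept and inherit the seed's label.
# dHash is a stand-in for the VLAD-based search actually used [6].

def dhash(pixels):
    """Difference hash: compare horizontally adjacent pixels of a
    (rows x cols) grayscale grid and collect the booleans as bits."""
    bits = []
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left < right else 0)
    return bits

def similarity(h1, h2):
    """Fraction of matching bits between two equal-length hashes."""
    return sum(1 for a, b in zip(h1, h2) if a == b) / len(h1)

def expand_corpus(candidates, seeds, threshold=0.9):
    """candidates: {tweet_id: pixel grid}; seeds: {label: [pixel grids]}.
    Returns {tweet_id: label} for near-duplicates of any seed image."""
    labelled = {}
    for tweet_id, pixels in candidates.items():
        h = dhash(pixels)
        for label, images in seeds.items():
            if any(similarity(h, dhash(s)) >= threshold for s in images):
                labelled[tweet_id] = label
                break
    return labelled

# Toy example: a 4x4 "image", a near-duplicate, and an unrelated one.
seed = [[10, 20, 30, 40]] * 4
near = [[10, 21, 30, 41]] * 4   # same gradient structure as the seed
other = [[40, 30, 20, 10]] * 4  # reversed gradient
print(expand_corpus({"t1": near, "t2": other}, {"fake": [seed]}))
# → {'t1': 'fake'}
```

      <p>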
For every item of the aforementioned datasets, we
extracted and made available three types of features:
Tweet-based features extracted from the tweet itself, for instance
the number of terms, URLs, hashtags and mentions, etc. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        User-based features which are based on the Twitter
user profile, for instance the number of friends and
followers, the number of times the user is included in
a Twitter list, whether the user is verified, etc. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
Forensic features extracted from the visual content of
the tweet image, for instance the probability map of
the aligned double JPEG compression, the potential
primary quantization steps for the first six DCT
coefficients of the non-aligned JPEG compression, and the
PRNU (Photo-Response Non-Uniformity) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
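      <p>The first two feature types can be illustrated with a short sketch. The field names below (text, friends_count, followers_count, listed_count, verified) mirror common Twitter API fields but are assumptions of this illustration; the feature set released with the task is considerably richer [1], and the forensic features are not sketched here.</p>

```python
import re

def tweet_features(tweet):
    """Sketch of tweet-based features in the spirit of [1]: counts of
    terms, URLs, hashtags and mentions in the tweet text."""
    text = tweet["text"]
    return {
        "num_terms": len(text.split()),
        "num_urls": len(re.findall(r"https?://\S+", text)),
        "num_hashtags": len(re.findall(r"#\w+", text)),
        "num_mentions": len(re.findall(r"@\w+", text)),
    }

def user_features(user):
    """Sketch of user-based features: friends/followers counts,
    list memberships and the verified flag, as described in the text."""
    return {
        "num_friends": user.get("friends_count", 0),
        "num_followers": user.get("followers_count", 0),
        "times_listed": user.get("listed_count", 0),
        "is_verified": user.get("verified", False),
    }

example = {"text": "Shark on the highway! #Sandy @nyc http://example.com/img"}
print(tweet_features(example))
# → {'num_terms': 7, 'num_urls': 1, 'num_hashtags': 1, 'num_mentions': 1}
```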
    </sec>
    <sec id="sec-3">
      <title>4. EVALUATION</title>
      <p>Overall, the task is interested in the accuracy with which an
automatic method can distinguish between uses of multimedia in tweets
that faithfully reflect reality and uses that spread false
impressions. Hence, given a set of labelled instances (tweet + image +
label) and a set of predicted labels (included in the submitted runs)
for these instances, the classic IR measures (i.e., Precision P,
Recall R, and F-score) were used to quantify the classification
performance, where the target class is the class of fake tweets. Since
the two classes (fake/real) are represented in a relatively balanced
way in the testset, the classic IR measures are good proxies of the
classifier accuracy. Note that task participants were allowed to
classify a tweet as unknown. Obviously, if a system produces many
unknown outputs, its precision is likely to benefit, assuming that the
selection of unknown was done wisely, i.e., successfully avoiding
erroneous classifications. However, the recall of such a system would
suffer if the tweets labelled as unknown turned out to be fake (the
target class).</p>
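      <p>This scoring scheme can be sketched as follows. The function below is a hypothetical helper written for illustration, not the official scoring script; it treats the fake class as the target and an "unknown" prediction as a non-retrieval, which leaves precision untouched but lowers recall whenever the gold label was fake.</p>

```python
def evaluate(gold, predicted, target="fake"):
    """Precision/Recall/F-score for the target (fake) class. A tweet
    predicted 'unknown' is simply not retrieved: it cannot hurt
    precision, but a fake tweet left unknown counts as a miss."""
    tp = sum(1 for g, p in zip(gold, predicted) if p == target and g == target)
    fp = sum(1 for g, p in zip(gold, predicted) if p == target and g != target)
    fn = sum(1 for g, p in zip(gold, predicted) if g == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

gold = ["fake", "fake", "real", "fake", "real"]
pred = ["fake", "unknown", "real", "fake", "fake"]
p, r, f = evaluate(gold, pred)
print(round(p, 2), round(r, 2), round(f, 2))  # → 0.67 0.67 0.67
```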
    </sec>
    <sec id="sec-4">
      <title>5. ACKNOWLEDGEMENTS</title>
      <p>We would like to thank Martha Larson for her valuable
feedback in shaping the task and writing the overview paper.
This work is supported by the REVEAL project, partially
funded by the European Commission (FP7-610928).
1 https://github.com/MKLab-ITI/image-verification-corpus/tree/master/mediaeval2015</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name><given-names>C.</given-names> <surname>Boididou</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Papadopoulos</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>Kompatsiaris</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Schifferes</surname></string-name>, and
          <string-name><given-names>N.</given-names> <surname>Newman</surname></string-name>.
          <article-title>Challenges of computational verification in social multimedia</article-title>.
          <source>In Proceedings of the Companion Publication of the 23rd International Conference on World Wide Web Companion</source>,
          pages <fpage>743</fpage>–<lpage>748</lpage>, <year>2014</year>.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><given-names>V.</given-names> <surname>Conotter</surname></string-name>,
          <string-name><given-names>D.-T.</given-names> <surname>Dang-Nguyen</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Riegler</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Boato</surname></string-name>, and
          <string-name><given-names>M.</given-names> <surname>Larson</surname></string-name>.
          <article-title>A crowdsourced data set of edited images online</article-title>.
          <source>In Proceedings of the 2014 International ACM Workshop on Crowdsourcing for Multimedia</source>,
          <source>CrowdMM '14</source>, pages <fpage>49</fpage>–<lpage>52</lpage>,
          New York, NY, USA, <year>2014</year>. ACM.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name><given-names>N.</given-names> <surname>Diakopoulos</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>De Choudhury</surname></string-name>, and
          <string-name><given-names>M.</given-names> <surname>Naaman</surname></string-name>.
          <article-title>Finding and assessing social media information sources in the context of journalism</article-title>.
          <source>In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '12</source>,
          pages <fpage>2451</fpage>–<lpage>2460</lpage>, New York, NY, USA, <year>2012</year>. ACM.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name><given-names>A.</given-names> <surname>Gupta</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Lamba</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Kumaraguru</surname></string-name>, and
          <string-name><given-names>A.</given-names> <surname>Joshi</surname></string-name>.
          <article-title>Faking Sandy: characterizing and identifying fake images on Twitter during Hurricane Sandy</article-title>.
          <source>In Proceedings of the 22nd International Conference on World Wide Web Companion</source>,
          pages <fpage>729</fpage>–<lpage>736</lpage>, <year>2013</year>.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name><given-names>J.</given-names> <surname>Ito</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Song</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Toda</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>Koike</surname></string-name>, and
          <string-name><given-names>S.</given-names> <surname>Oyama</surname></string-name>.
          <article-title>Assessment of tweet credibility with LDA features</article-title>.
          <source>In Proceedings of the 24th International Conference on World Wide Web Companion</source>,
          pages <fpage>953</fpage>–<lpage>958</lpage>, <year>2015</year>.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name><given-names>E.</given-names> <surname>Spyromitros-Xioufis</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Papadopoulos</surname></string-name>,
          <string-name><given-names>I.</given-names> <surname>Kompatsiaris</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Tsoumakas</surname></string-name>, and
          <string-name><given-names>I.</given-names> <surname>Vlahavas</surname></string-name>.
          <article-title>A comprehensive study over VLAD and Product Quantization in large-scale image retrieval</article-title>.
          <source>IEEE Transactions on Multimedia</source>,
          <volume>16</volume>(<issue>6</issue>):<fpage>1713</fpage>–<lpage>1728</lpage>, <year>2014</year>.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>