<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Social Event Detection at MediaEval 2014: Challenges, Datasets, and Evaluation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Georgios Petkos</string-name>
          <email>gpetkos@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Symeon Papadopoulos</string-name>
          <email>papadop@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vasileios Mezaris</string-name>
          <email>bmezaris@iti.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yiannis Kompatsiaris</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Information Technologies Institute / CERTH</institution>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2014</year>
      </pub-date>
      <fpage>16</fpage>
      <lpage>17</lpage>
      <abstract>
        <p>This paper provides an overview of the Social Event Detection (SED) task that takes place as part of the 2014 MediaEval Benchmark. The task is motivated by the need to mine social events, a common type of real-world activity, in large collections of online multimedia. The task has two subtasks, each related to a different aspect of such a mining procedure: detection of events (by means of clustering) and retrieval of events. It is performed on a large collection of more than 470K Flickr images (development and test sets). We examine the details of the subtasks, the datasets, and the evaluation process.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>The wealth of content uploaded by users on the Internet
is often related to different aspects of real-world human
activities. This presents an important mining opportunity, and
thus there have been many efforts to analyse such data. For
instance, web content has been extensively used for
applications such as detecting breaking news or monitoring
ongoing stories. A very interesting field of work in this direction
involves the detection of social events in multimedia
collections retrieved from the web. By social events we mean
events that are attended by people and are represented
by multimedia content uploaded online by different people.
Instances of such events are concerts, sports events, public
celebrations and even protests. Mining such events may be
of interest to, e.g., professional journalists who would like to
discover new events or new material about known events,
or to casual users who would like to organize their personal
photo collections around attended events.</p>
      <p>
        Indeed, during the last years, the SED problem has
attracted significant interest from the research community.
Indicative of this is the fact that the SED task has been part
of the MediaEval benchmark for the last three years
(2011-2013) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In the following, we present the
details of the subtasks, datasets and evaluation process for the
fourth edition of the task.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. TASK OVERVIEW</title>
      <p>This year, the SED task is organized around two subtasks,
the details of which are provided in the following sections.
Participants are allowed to submit up to five runs for each
of the subtasks. Additionally, participants may opt to
submit their runs in only the subtask that they would like
to focus on; they are, however, encouraged to submit their
runs in both subtasks. As will be detailed in the next
sections, given a large collection of images, the two subtasks
require participants to: a) perform a full clustering of the
images around events, and b) retrieve sets of events according to
specific search criteria.</p>
    </sec>
    <sec id="sec-3">
      <title>3. CHALLENGES</title>
    </sec>
    <sec id="sec-4">
      <title>3.1 Subtask 1: Full clustering</title>
      <p>
        In the first subtask, a collection of images with their
metadata is provided, and participants are asked to produce a full
clustering of the images, so that each cluster corresponds to
a social event. Participants that took part in the 2013
edition of the task [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] should be familiar with this subtask, as
it is a continuation of the first subtask from last year. This
subtask may be treated as a typical clustering problem or
with the help of recently introduced "supervised clustering"
approaches [
        <xref ref-type="bibr" rid="ref1 ref3 ref4">4, 1, 3</xref>
        ].
      </p>
      <p>The main challenges of the first subtask are:</p>
      <p>The number of target clusters is not provided and will
have to be inferred by the clustering methods of the
participants.</p>
      <p>Each photo is accompanied by metadata, which are
potentially helpful for the clustering; however, they are
often missing or of inconsistent quality, and they
therefore introduce a multimodal aspect to the problem.
Some of the metadata is noisy. For example, if the
date is incorrectly set on the device of a user, then the
information about the date on which their pictures were taken
will be incorrect.</p>
    </sec>
    <sec id="sec-5">
      <title>3.2 Subtask 2: Retrieval of events</title>
      <p>In the second subtask, a collection of events is provided;
each event is represented by a set of images with their
metadata, and participants are asked to retrieve those events that
meet some criteria. Please note that this is a new subtask,
appearing for the first time this year.</p>
      <p>The retrieval criteria will be related to the following:</p>
      <p>The location of the event (country, city, venue).</p>
      <p>The type of the event (concert, protest, etc.).</p>
      <p>Entities involved in the event (e.g. a band in a
concert).</p>
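      <p>Conceptually, a query in this subtask is a conjunction of constraints over event metadata. A minimal sketch, assuming hypothetical field names (type, country, city, entities):</p>
      <preformat>
```python
def retrieve_events(events, event_type=None, country=None, city=None, entity=None):
    """Return the ids of events matching every given criterion;
    None means that criterion is unconstrained. Field names are
    illustrative, not the official dataset schema."""
    def matches(e):
        return ((event_type is None or e["type"] == event_type) and
                (country is None or e["country"] == country) and
                (city is None or e["city"] == city) and
                (entity is None or entity in e.get("entities", [])))
    return [e["id"] for e in events if matches(e)]

# hypothetical toy data, loosely modelled on the example queries
events = [
    {"id": "e1", "type": "music",   "country": "Canada", "city": "Toronto",
     "entities": ["Example Band"]},
    {"id": "e2", "type": "protest", "country": "UK", "city": "London",
     "entities": []},
    {"id": "e3", "type": "music",   "country": "Canada", "city": "Montreal",
     "entities": []},
]
print(retrieve_events(events, event_type="music", country="Canada"))
# ['e1', 'e3']
```
      </preformat>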
      <p>For instance, the first test query asks participants to find all music
events that took place in Canada, whereas another asks for
all conferences, exhibitions and technical events that took
place in the U.K.</p>
    </sec>
    <sec id="sec-datasets">
      <title>4. DATASETS</title>
      <p>Two datasets will be used in the task. Both are comprised
of images collected from Flickr using the Flickr API. All
images are covered by a Creative Commons license. For both
datasets, the actual image files and their metadata are made
available. The metadata includes the following: username
of the uploader, date taken, date uploaded, title,
description, tags and geo-location. For both datasets, some of the
metadata is missing; for instance, only roughly 20% of
the images come with their geo-location.</p>
      <p>The first dataset contains 362,578 images and, together
with it, we also provide the grouping of these images into
17,834 clusters that represent social events. The second
dataset contains 110,541 images and, contrary to the first
set, we do not release the grouping of its images into
clusters that represent social events1.</p>
      <p>The first dataset is used as the development set for both
subtasks and as the test set for the second subtask. We will
refer to this dataset as the development set, although it is
also used for testing in the second subtask. For the first
subtask, the development dataset provides to the participants
a large number of examples of correct/target image
clusters corresponding to events. For the second subtask, the
development dataset provides the set of events from which
the participants must retrieve the relevant events for each
query. A number of example queries, together with the ids
of the relevant events, is also provided for development. For
testing, participants are asked to find those events in the
development dataset, but using a different set of criteria.
Importantly, whereas there are 8 development queries, there
are 10 test queries. The 8 development queries have a direct
correspondence to the first 8 test queries: they have similar
criteria. For instance, whereas one development query asks
for all music events that took place in Copenhagen, the
corresponding test query asks for all music events that took
place in Bucharest. However, there are two additional test
queries, for which a corresponding development query is not provided.</p>
      <p>The second dataset is used only in the first subtask, for
testing purposes. That is, participants are asked to find
image-cluster associations in the second dataset, similar (in
nature) to those in the development set.</p>
    </sec>
    <sec id="sec-6">
      <title>5. EVALUATION</title>
      <p>For the first subtask, the submissions will be evaluated
against the ground truth using the following three evaluation
measures:</p>
      <p>F1-score calculated from precision and recall.</p>
      <p>Normalized Mutual Information (NMI).</p>
      <p>
        Divergence from a random baseline. All evaluation
measures will also be reported in an adjusted form
called "divergence from a random baseline" [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which
indicates how much useful learning has occurred and
helps detect problematic clustering submissions.
      </p>
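      <p>As an illustration of the first two measures, the sketch below computes a pair-counting F1 (one common F1 variant for clustering; the task overview does not fix the exact definition used) and NMI from two flat labelings:</p>
      <preformat>
```python
import math
from collections import Counter
from itertools import combinations

def pairwise_f1(pred, truth):
    """Pair-counting F1: a pair of items counts as positive when a
    clustering puts both items in the same cluster."""
    def same_pairs(labels):
        return {(i, j) for (i, a), (j, b) in combinations(enumerate(labels), 2)
                if a == b}
    p, t = same_pairs(pred), same_pairs(truth)
    if not p or not t:
        return 0.0
    inter = len(p.intersection(t))
    precision, recall = inter / len(p), inter / len(t)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def nmi(pred, truth):
    """Normalized Mutual Information between two labelings,
    normalized by the geometric mean of the entropies."""
    n = len(pred)
    cp, ct = Counter(pred), Counter(truth)
    joint = Counter(zip(pred, truth))
    mi = sum((c / n) * math.log(c * n / (cp[a] * ct[b]))
             for (a, b), c in joint.items())
    def entropy(counts):
        return -sum((c / n) * math.log(c / n) for c in counts.values())
    denom = math.sqrt(entropy(cp) * entropy(ct))
    return mi / denom if denom else 1.0

pred  = [0, 0, 1, 1]          # predicted cluster labels
truth = ["a", "a", "b", "b"]  # ground-truth event labels
print(pairwise_f1(pred, truth), round(nmi(pred, truth), 6))
# 1.0 1.0  (perfect agreement)
```
      </preformat>
      <p>The divergence-from-a-random-baseline adjustment then subtracts from each measure the score of a random clustering with the same cluster-size distribution, so a submission that merely mimics cluster sizes scores near zero.</p>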
      <p>The ground truth for the first subtask has been obtained
by taking advantage of machine tags with which users have
labelled the pictures on Flickr. These machine tags associate
the images with distinct events in Last.fm2 and Upcoming3.</p>
      <p>For the second subtask, each event is labelled according
to the search criteria that were listed above (type, location,
etc.), and the correct query results are known by filtering
according to the criteria of each query. The results of the
second subtask will be evaluated using three different
evaluation measures: precision, recall and F1-score. The ground
truth has been obtained by taking into account both the
metadata of events from Last.fm and Upcoming and
manual labelling. In particular, for all events, whether they
are Last.fm or Upcoming events, we know their time and
location from the metadata obtained from the respective
API. Additionally, we know that all Last.fm events are music
events, and that the event metadata contains the name
of the relevant artist. Events from Upcoming may belong
to different categories (e.g. protest, sports, music, etc.) and
were manually classified. Additionally, for Upcoming events
that were classified as music events, the relevant artist was
also manually defined.</p>
      <p>1 We plan, however, to release it after the task is completed.
2 http://www.last.fm/
3 http://en.wikipedia.org/wiki/Upcoming</p>
    </sec>
    <sec id="sec-7">
      <title>6. CONCLUSION</title>
      <p>We presented the subtasks, datasets and evaluation
process for the 2014 SED task. Interestingly, this year a new
subtask is introduced: the event retrieval subtask. Thus,
a new dimension is added to the overall SED problem this
year.</p>
    </sec>
    <sec id="sec-8">
      <title>7. ACKNOWLEDGMENTS</title>
      <p>The work was supported by the European Commission
under contracts FP7-287911 LinkedTV, FP7-318101
MediaMixer and FP7-287975 SocialSensor. We would also like to
thank Timo Reuter for the ReSEED dataset, on which the
development dataset was partly based.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Petkos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Papadopoulos</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kompatsiaris</surname>
          </string-name>
          .
          <article-title>Social event detection using multimodal clustering and integrating supervisory signals</article-title>
          .
          <source>In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR '12</source>
          , pages
          <issue>23:1</issue>
          -
          <issue>23</issue>
          :
          <fpage>8</fpage>
          , New York, NY, USA,
          <year>2012</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>G.</given-names>
            <surname>Petkos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Papadopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Mezaris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Troncy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Reuter</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kompatsiaris</surname>
          </string-name>
          .
          <article-title>Social event detection at MediaEval: a three-year retrospect of tasks and results</article-title>
          .
          <source>In Proceedings of the 2014 Workshop on Social Events in Web Multimedia (in conjunction with ICMR)</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>G.</given-names>
            <surname>Petkos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Papadopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Schinas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kompatsiaris</surname>
          </string-name>
          .
          <article-title>Graph-based multimodal clustering for social event detection in large collections of images</article-title>
          . In MultiMedia Modeling International Conference, MMM
          <year>2014</year>
          , Dublin, Ireland, January 6-10,
          <year>2014</year>
          , Proceedings, Part I, pages
          <fpage>146</fpage>
          -
          <lpage>158</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Reuter</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          .
          <article-title>Event-based classification of social media streams</article-title>
          .
          <source>In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR '12</source>
          , pages
          <issue>22:1</issue>
          -
          <issue>22</issue>
          :
          <fpage>8</fpage>
          , New York, NY, USA,
          <year>2012</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Reuter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Papadopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Petkos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Mezaris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kompatsiaris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>de Vries</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Geva</surname>
          </string-name>
          .
          <article-title>Social event detection at MediaEval 2013: Challenges, datasets, and evaluation</article-title>
          .
          <source>Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop</source>
          , Barcelona, Spain, October 18-19,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C. M.</given-names>
            <surname>De Vries</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Geva</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Trotman</surname>
          </string-name>
          .
          <article-title>Document clustering evaluation: Divergence from a random baseline</article-title>
          .
          <source>CoRR, abs/1208.5654</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>