Social Event Detection at MediaEval 2013: Challenges, Datasets, and Evaluation

Timo Reuter, Philipp Cimiano
Universität Bielefeld, CITEC, Bielefeld, Germany
{treuter,cimiano}@cit-ec.uni-bielefeld.de

Symeon Papadopoulos, Georgios Petkos, Vasileios Mezaris, Yiannis Kompatsiaris
CERTH-ITI, Thermi, Greece
{papadop,gpetkos,bmezaris,ikom}@iti.gr

Christopher de Vries, Shlomo Geva
Queensland University of Technology, Brisbane, Australia
chris@de-vries.id.au, s.geva@qut.edu.au

ABSTRACT
In this paper, we provide an overview of the Social Event Detection (SED) task that is part of the MediaEval Benchmark for Multimedia Evaluation 2013. This task requires participants to discover social events and organize the related media items in event-specific clusters within a collection of Web multimedia. Social events are events that are planned by people, attended by people, and for which the social multimedia are also captured by people. We describe the challenges, datasets, and the evaluation methodology.

1. INTRODUCTION
As social media applications proliferate, an ever-increasing amount of multimedia content is being created on the Web. A lot of this content is related to social events, which we define as events that are organized and attended by people and are illustrated by social media content created by people.

For users, finding digital content related to social events is challenging, as it requires searching large volumes of data, possibly across different sources and sites. Algorithms that can support humans in this task are clearly needed. The proposed task thus consists in developing algorithms that detect event-related media and group them by the events they illustrate or are related to. Such a grouping would provide the basis for aggregation and search applications that foster easier discovery, browsing, and querying of social events.

2. TASK OVERVIEW
For this year's edition of the Social Event Detection task, two challenges, C1 and C2, are defined, which differ from those of SED 2012 [2]. For each challenge, a dedicated dataset of images (and videos in the case of C1) together with their metadata (e.g. timestamp, geographic information, tags) is provided. Participants are allowed to submit up to five runs per task, where each run contains a different set of results, produced either by a different approach or by a variation of the same approach. Each run will be evaluated separately.

3. CHALLENGES

3.1 C1: Full Clustering
"Produce a complete clustering of the image dataset according to events."

• Cluster all images included in the test set according to the events they depict.
• As the target number of events is not given, a subchallenge is to discover it.

The first challenge is a completely data-driven one involving the analysis of a large-scale dataset and requiring the production of a complete clustering of the image dataset according to events (see Figure 1). The task is a supervised clustering task [4, 3] where a set of training events is provided; however, the events in the training and test sets are disjoint. This challenge does not specify a particular event or event class of interest but focuses on grouping images according to the events they are associated with. The ground truth is a single label, such that no image or video can belong to more than one event. A particular difficulty is that the number of events is not known beforehand, and it is up to the participants to decide which images are clustered together into one event. A minimal sketch of one possible clustering scheme is given after Figure 1.

[Figure 1: Clustering of image documents into event clusters.]
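The task does not prescribe any particular method. Purely as an illustration, the following Python sketch shows a single-pass incremental clustering scheme in the spirit of [4], in which the number of events emerges from a similarity threshold instead of being fixed in advance. The Image record, the similarity function, the equal weights, the three-day time window, and the 0.4 threshold are all illustrative assumptions, not the task baseline.

    # A minimal sketch of single-pass incremental clustering for C1, using
    # metadata only (as in the required run). All parameters are assumptions.
    from dataclasses import dataclass, field

    @dataclass
    class Image:
        id: str
        timestamp: float                        # seconds since epoch
        tags: set = field(default_factory=set)  # user-supplied tags

    def similarity(img, cluster):
        """Score an image against a candidate event cluster."""
        rep = cluster[-1]  # compare to the most recent member; events are bursty
        # Temporal proximity, decaying linearly over three days (assumption).
        dt = abs(img.timestamp - rep.timestamp)
        time_sim = max(0.0, 1.0 - dt / (3 * 86400))
        # Tag overlap (Jaccard coefficient).
        union = img.tags | rep.tags
        tag_sim = len(img.tags & rep.tags) / len(union) if union else 0.0
        return 0.5 * time_sim + 0.5 * tag_sim   # weights are an assumption

    def cluster_stream(images, threshold=0.4):
        """Assign each image to the best-matching event, or open a new one.

        The number of clusters is not fixed in advance: a new event is
        created whenever no existing cluster scores above the threshold.
        """
        clusters = []
        for img in sorted(images, key=lambda i: i.timestamp):
            scored = [(similarity(img, c), c) for c in clusters]
            best_score, best = max(scored, default=(0.0, None),
                                   key=lambda s: s[0])
            if best is None or best_score < threshold:
                clusters.append([img])          # discover a new event
            else:
                best.append(img)
        return clusters

At the scale of the C1 dataset (437,370 pictures), exhaustive comparison against all clusters would be too slow; in practice, candidate clusters would be restricted with an index or blocking strategy (e.g. a sliding time window), and the threshold would be tuned on the training events.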
There is a required run for C1 that uses only the metadata; additional data (e.g. visual information from the images) is forbidden in this run. For the other runs, additional data can be used, including the images themselves. It is allowed to use generic external resources such as Wikipedia, WordNet, or visual concept detectors trained on other data. However, it is not allowed to use external data that directly relates to the individual images included in the dataset, such as machine tags (special triple tags that encode extra semantic information for interpretation by computer systems).

3.1.1 Subtask: Full Clustering of Media using Videos
"Assign all videos to the event clusters of the images you have created in Challenge 1."

This subtask is an extension of Challenge 1. Participants should use the event clusters they have created and assign the videos to them. As for the main task, a complete assignment of the videos to events is required.

3.2 C2: Classification of Media into Event Types
"Classify images into event and non-event, and into event types."

• For each image in the dataset, decide whether the image depicts an event or not (in the latter case, assign the no-event label to it).
• For each image in the dataset that is not labelled as no-event, decide what type of event it depicts. The available event types are: concert, conference, exhibition, fashion, protest, sports, theatre/dance, and other.

The second challenge is a supervised classification task, which requires learning what event-related media items look like, both in terms of visual content and accompanying metadata. More specifically, a set of eight event types is defined, and methods should automatically decide to which type (if any) an unknown media item belongs. C2 submissions are subject to the same restrictions as those of C1, with the difference that visual information from the images may be used in all runs.

4. DATASET
The dataset for Challenge 1 consists of pictures from Flickr and 1,327 videos from YouTube, together with their associated metadata. The pictures were downloaded using the Flickr API. We considered pictures with an upload time between January 2006 and December 2012, yielding a dataset of 437,370 pictures assigned to 21,169 events. The events were determined by people, as described in Reuter et al. [4], and include sport events, protest marches, BBQs, debates, expositions, festivals, and concerts. All pictures are published under a Creative Commons license allowing free distribution. As it is a real-world dataset, some features (capture/upload time and uploader information) are available for every picture, while others are available for only a subset of the images: geographic information (45.9%), tags (95.6%), title (97.9%), and description (37.9%). 70% of the dataset is provided for training, including the ground truth; the rest is used for evaluation purposes.

The dataset for Challenge 2 is comparable to that of Challenge 1, except that the pictures were gathered from Instagram using the respective API. The training set was collected between the 27th and 29th of April 2013 based on event-related keywords and consists of 27,754 pictures (after cleaning). The classification of pictures into event types was performed manually by multiple annotators, and several borderline cases were removed. The test set was collected between the 7th and 13th of May 2013, was processed using the same procedure as the training set, and consists of 29,411 pictures. There are eight event types in the dataset: music (concert) events, conferences, exhibitions, fashion shows, protests, sport events, theatrical/dance events (considered as one category), and other events (e.g. parades, gatherings). As in the dataset for Challenge 1, some features are not present for all pictures: 27.9% of the pictures have geographic information, 93.4% come with a title, and almost all pictures (99.5%) have at least one tag.

5. EVALUATION
We evaluate the submissions against ground truth information created by human annotators. The results will be evaluated using three evaluation measures, illustrated in the code sketch after the list:

• F1-score, calculated from Precision and Recall [4] (applicable to both C1 and C2).
• Normalized Mutual Information (NMI). Together with the F1-score, it is used to assess the overlap between clusters and classes (applicable only to C1).
• Divergence from a Random Baseline. All evaluation measures will also be reported in an adjusted form called Divergence from a Random Baseline [1], which indicates how much useful learning has occurred and helps to detect problematic clustering submissions (applicable to both C1 and C2).
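The following sketch illustrates how such measures can be computed with scikit-learn; it is not the official scoring tool, and the toy labels, the macro averaging for the F1-score, and the shuffle-based realization of the random baseline are assumptions on our part.

    # Illustrative computation of the evaluation measures (a sketch only,
    # not the official scoring tool; all labels are toy data).
    import random
    from sklearn.metrics import f1_score, normalized_mutual_info_score

    # C2: event-type classification, scored with the F1-score.
    true_types = ["concert", "protest", "no-event", "sports", "concert"]
    pred_types = ["concert", "protest", "no-event", "concert", "concert"]
    print("C2 F1 (macro):", f1_score(true_types, pred_types, average="macro"))

    # C1: full clustering, scored with NMI, which measures the overlap
    # between submitted clusters and true events regardless of cluster IDs.
    true_events = [0, 0, 0, 1, 1, 2, 2, 2]   # ground-truth event per image
    submission  = [5, 5, 1, 1, 1, 2, 2, 5]   # submitted cluster per image
    nmi = normalized_mutual_info_score(true_events, submission)
    print("C1 NMI:", round(nmi, 3))

    # Divergence from a Random Baseline [1]: the measure is reported relative
    # to a random clustering with the same cluster-size distribution as the
    # submission; shuffling the submitted labels is one way to realize this.
    baseline = submission[:]
    random.shuffle(baseline)
    divergence = nmi - normalized_mutual_info_score(true_events, baseline)
    print("C1 NMI divergence from random baseline:", round(divergence, 3))

In practice, the random baseline would be averaged over many shuffles (or derived as described in [1]) rather than taken from a single permutation.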
6. CONCLUSION
This year's SED edition decomposes the problem of social event detection into two main components: (a) clustering media that depict certain social events, and (b) deciding whether an image is event-related and, if so, what type of event it relates to. Both the scale and the complexity of this year's datasets make the task more challenging and more representative of real-world problems.

Acknowledgments
This work was supported by the European Commission under contracts FP7-287911 LinkedTV, FP7-318101 MediaMixer, FP7-287975 SocialSensor, and FP7-249008 CHORUS+.

7. REFERENCES
[1] C. M. De Vries, S. Geva, and A. Trotman. Document clustering evaluation: Divergence from a random baseline. 2012.
[2] S. Papadopoulos, E. Schinas, V. Mezaris, R. Troncy, and I. Kompatsiaris. Social event detection at MediaEval 2012: Challenges, dataset and evaluation. In Proceedings of the MediaEval 2012 Workshop, 2012.
[3] G. Petkos, S. Papadopoulos, and Y. Kompatsiaris. Social event detection using multimodal clustering and integrating supervisory signals. In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, page 23. ACM, 2012.
[4] T. Reuter and P. Cimiano. Event-based classification of social media streams. In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, page 22. ACM, 2012.