Social Event Detection at MediaEval 2013: Challenges, Datasets, and Evaluation

Timo Reuter, Philipp Cimiano
Universität Bielefeld, CITEC, Bielefeld, Germany
{treuter,cimiano}@cit-ec.uni-bielefeld.de

Symeon Papadopoulos, Georgios Petkos, Vasileios Mezaris, Yiannis Kompatsiaris
CERTH-ITI, Thermi, Greece
{papadop,gpetkos,bmezaris,ikom}@iti.gr

Christopher de Vries, Shlomo Geva
Queensland University of Technology, Brisbane, Australia
chris@de-vries.id.au, s.geva@qut.edu.au

ABSTRACT
In this paper, we provide an overview of the Social Event Detection (SED) task that is part of the MediaEval Benchmark for Multimedia Evaluation 2013. This task requires participants to discover social events and organize the related media items in event-specific clusters within a collection of Web multimedia. Social events are events that are planned by people, attended by people, and for which the social multimedia are also captured by people. We describe the challenges, datasets, and the evaluation methodology.

1. INTRODUCTION
As social media applications proliferate, an ever-increasing amount of multimedia content is being created on the Web. A lot of this content is related to social events, which we define as events that are organized and attended by people and are illustrated by social media content created by people.

For users, finding digital content related to social events is challenging, as it requires searching large volumes of data, possibly across different sources and sites. Algorithms that can support humans in this task are clearly needed. The proposed task thus consists in developing algorithms that detect event-related media and group them by the events they illustrate or are related to. Such a grouping would provide the basis for aggregation and search applications that foster easier discovery, browsing, and querying of social events.

2. TASK OVERVIEW
For this year's edition of the Social Event Detection task, two challenges, C1 and C2, are defined, which differ from those of SED 2012 [2]. For each challenge, a dedicated dataset of images (and videos in the case of C1) together with their metadata (e.g. timestamp, geographic information, tags) is provided. Participants are allowed to submit up to five runs per task, where each run contains a different set of results, produced either by a different approach or by a variation of the same approach. Each run will be evaluated separately.

3. CHALLENGES

3.1 C1: Full Clustering
"Produce a complete clustering of the image dataset according to events."

• Cluster all images included in the test set according to the events they depict.
• As the target number of events is not given, a subchallenge is to discover it.

The first challenge is a completely data-driven one involving the analysis of a large-scale dataset and requiring the production of a complete clustering of the image dataset according to events (see Figure 1). The task is a supervised clustering task [4, 3] where a set of training events is provided; however, the events in the training and test sets are disjoint. This challenge does not specify a particular event or event class of interest but focuses on grouping images according to the events they are associated with. The ground truth is a single label, such that no image or video can belong to more than one event. A particular difficulty is that the number of events is not known beforehand, and it is up to the participants to decide which images are clustered together into one event. A minimal sketch of one possible clustering scheme is given after Figure 1.

[Figure 1: Clustering of image documents into event clusters.]
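The task does not prescribe any particular method. Purely as an illustration, the following Python sketch shows a single-pass incremental clustering scheme in the spirit of [4], in which the number of events emerges from a similarity threshold instead of being fixed in advance. The Image record, the similarity function, the equal weights, the three-day time window, and the 0.4 threshold are all illustrative assumptions, not the task baseline.

    # A minimal sketch of single-pass incremental clustering for C1, using
    # metadata only (as in the required run). All parameters are assumptions.
    from dataclasses import dataclass, field

    @dataclass
    class Image:
        id: str
        timestamp: float                        # seconds since epoch
        tags: set = field(default_factory=set)  # user-supplied tags

    def similarity(img, cluster):
        """Score an image against a candidate event cluster."""
        rep = cluster[-1]  # compare to the most recent member; events are bursty
        # Temporal proximity, decaying linearly over three days (assumption).
        dt = abs(img.timestamp - rep.timestamp)
        time_sim = max(0.0, 1.0 - dt / (3 * 86400))
        # Tag overlap (Jaccard coefficient).
        union = img.tags | rep.tags
        tag_sim = len(img.tags & rep.tags) / len(union) if union else 0.0
        return 0.5 * time_sim + 0.5 * tag_sim   # weights are an assumption

    def cluster_stream(images, threshold=0.4):
        """Assign each image to the best-matching event, or open a new one.

        The number of clusters is not fixed in advance: a new event is
        created whenever no existing cluster scores above the threshold.
        """
        clusters = []
        for img in sorted(images, key=lambda i: i.timestamp):
            scored = [(similarity(img, c), c) for c in clusters]
            best_score, best = max(scored, default=(0.0, None),
                                   key=lambda s: s[0])
            if best is None or best_score < threshold:
                clusters.append([img])          # discover a new event
            else:
                best.append(img)
        return clusters

At the scale of the C1 dataset (437,370 pictures), exhaustive comparison against all clusters would be too slow; in practice, candidate clusters would be restricted with an index or blocking strategy (e.g. a sliding time window), and the threshold would be tuned on the training events.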
There is a required run for C1 that uses only the metadata; additional data (e.g. visual information from the images) is forbidden in this run. For the other runs, additional data can be used, including the images themselves. It is allowed to use generic external resources such as Wikipedia, WordNet, or visual concept detectors trained on other data. However, it is not allowed to use external data that directly relates to the individual images included in the dataset, such as machine tags (special triple tags that encode extra semantic information for interpretation by computer systems).

3.1.1 Subtask: Full Clustering of Media using Videos
"Assign all videos to the event clusters of the images you have created in Challenge 1."

This subtask is an extension of Challenge 1. Participants should use the event clusters they have created and assign the videos to them. As for the main task, a complete assignment of the videos to events is required.

3.2 C2: Classification of Media into Event Types
"Classify images into event and non-event, and into event types."

• For each image in the dataset, decide whether the image depicts an event or not (in the latter case, assign the no-event label to it).
• For each image in the dataset that is not labelled as no-event, decide what type of event it depicts. The available event types are: concert, conference, exhibition, fashion, protest, sports, theatre/dance, and other.

The second challenge is a supervised classification task, which requires learning what event-related media items look like, both in terms of visual content and accompanying metadata. More specifically, a set of eight event types is defined, and methods should automatically decide to which type (if any) an unknown media item belongs. C2 submissions are subject to the same restrictions as those of C1, with the difference that visual information from the images may be used in all runs.

4. DATASET
The dataset for Challenge 1 consists of pictures from Flickr and 1,327 videos from YouTube, together with their associated metadata. The pictures were downloaded using the Flickr API. We considered pictures with an upload time between January 2006 and December 2012, yielding a dataset of 437,370 pictures assigned to 21,169 events. The events were determined by people, as described in Reuter et al. [4], and include sport events, protest marches, BBQs, debates, expositions, festivals, and concerts. All pictures are published under a Creative Commons license allowing free distribution. As it is a real-world dataset, some features (capture/upload time and uploader information) are available for every picture, while others are available for only a subset of the images: geographic information (45.9%), tags (95.6%), title (97.9%), and description (37.9%). 70% of the dataset is provided for training, including the ground truth; the rest is used for evaluation purposes.

The dataset for Challenge 2 is comparable to that of Challenge 1, except that the pictures were gathered from Instagram using the respective API. The training set was collected between the 27th and 29th of April 2013 based on event-related keywords and consists of 27,754 pictures (after cleaning). The classification of pictures into event types was performed manually by multiple annotators, and several borderline cases were removed. The test set was collected between the 7th and 13th of May 2013, was processed using the same procedure as the training set, and consists of 29,411 pictures. There are eight event types in the dataset: music (concert) events, conferences, exhibitions, fashion shows, protests, sport events, theatrical/dance events (considered as one category), and other events (e.g. parades, gatherings). As in the dataset for Challenge 1, some features are not present for all pictures: 27.9% of the pictures have geographic information, 93.4% come with a title, and almost all pictures (99.5%) have at least one tag.

5. EVALUATION
We evaluate the submissions against ground truth information created by human annotators. The results will be evaluated using three evaluation measures, illustrated in the code sketch after the list:

• F1-score, calculated from Precision and Recall [4] (applicable to both C1 and C2).
• Normalized Mutual Information (NMI). Together with the F1-score, it is used to assess the overlap between clusters and classes (applicable only to C1).
• Divergence from a Random Baseline. All evaluation measures will also be reported in an adjusted form called Divergence from a Random Baseline [1], which indicates how much useful learning has occurred and helps to detect problematic clustering submissions (applicable to both C1 and C2).
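The following sketch illustrates how such measures can be computed with scikit-learn; it is not the official scoring tool, and the toy labels, the macro averaging for the F1-score, and the shuffle-based realization of the random baseline are assumptions on our part.

    # Illustrative computation of the evaluation measures (a sketch only,
    # not the official scoring tool; all labels are toy data).
    import random
    from sklearn.metrics import f1_score, normalized_mutual_info_score

    # C2: event-type classification, scored with the F1-score.
    true_types = ["concert", "protest", "no-event", "sports", "concert"]
    pred_types = ["concert", "protest", "no-event", "concert", "concert"]
    print("C2 F1 (macro):", f1_score(true_types, pred_types, average="macro"))

    # C1: full clustering, scored with NMI, which measures the overlap
    # between submitted clusters and true events regardless of cluster IDs.
    true_events = [0, 0, 0, 1, 1, 2, 2, 2]   # ground-truth event per image
    submission  = [5, 5, 1, 1, 1, 2, 2, 5]   # submitted cluster per image
    nmi = normalized_mutual_info_score(true_events, submission)
    print("C1 NMI:", round(nmi, 3))

    # Divergence from a Random Baseline [1]: the measure is reported relative
    # to a random clustering with the same cluster-size distribution as the
    # submission; shuffling the submitted labels is one way to realize this.
    baseline = submission[:]
    random.shuffle(baseline)
    divergence = nmi - normalized_mutual_info_score(true_events, baseline)
    print("C1 NMI divergence from random baseline:", round(divergence, 3))

In practice, the random baseline would be averaged over many shuffles (or derived as described in [1]) rather than taken from a single permutation.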
6. CONCLUSION
This year's SED edition decomposes the problem of social event detection into two main components: (a) clustering media that depict certain social events, and (b) deciding whether an image is event-related and, if so, what type of event it relates to. Both the scale and the complexity of this year's datasets make the task more challenging and more representative of real-world problems.

Acknowledgments
This work was supported by the European Commission under contracts FP7-287911 LinkedTV, FP7-318101 MediaMixer, FP7-287975 SocialSensor, and FP7-249008 CHORUS+.

7. REFERENCES
[1] C. M. De Vries, S. Geva, and A. Trotman. Document clustering evaluation: Divergence from a random baseline. 2012.
[2] S. Papadopoulos, E. Schinas, V. Mezaris, R. Troncy, and I. Kompatsiaris. Social event detection at MediaEval 2012: Challenges, dataset and evaluation. In Proceedings of the MediaEval 2012 Workshop, 2012.
[3] G. Petkos, S. Papadopoulos, and Y. Kompatsiaris. Social event detection using multimodal clustering and integrating supervisory signals. In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, page 23. ACM, 2012.
[4] T. Reuter and P. Cimiano. Event-based classification of social media streams. In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, page 22. ACM, 2012.