=Paper=
{{Paper
|id=Vol-1263/paper5
|storemode=property
|title=Social Event Detection at MediaEval 2014: Challenges, Datasets, and Evaluation
|pdfUrl=https://ceur-ws.org/Vol-1263/mediaeval2014_submission_5.pdf
|volume=Vol-1263
|dblpUrl=https://dblp.org/rec/conf/mediaeval/PetkosPMK14
}}
==Social Event Detection at MediaEval 2014: Challenges, Datasets, and Evaluation==
Georgios Petkos, Symeon Papadopoulos, Vasileios Mezaris, Yiannis Kompatsiaris
Information Technologies Institute / CERTH, 6th Km. Charilaou-Thermis, Thessaloniki, Greece
{gpetkos,papadop,bmezaris,ikom}@iti.gr

Copyright is held by the author/owner(s). MediaEval 2014 Workshop, October 16-17, 2014, Barcelona, Spain.

ABSTRACT

This paper provides an overview of the Social Event Detection (SED) task that takes place as part of the 2014 MediaEval Benchmark. The task is motivated by the need to mine a common type of real-world activity, social events, in large collections of online multimedia. The task has two subtasks, each of which is related to a different aspect of such a mining procedure: detection of events (by means of clustering) and retrieval of events. It is performed on a large image collection of more than 470K Flickr images (development and test set). We examine the details of the subtasks, the datasets, as well as the evaluation process.

1. INTRODUCTION

The wealth of content uploaded by users on the Internet is often related to different aspects of real-world human activities. This presents an important mining opportunity and thus there have been many efforts to analyse such data. For instance, web content has been extensively used for applications such as detecting breaking news or monitoring ongoing stories. A very interesting field of work in this direction involves the detection of social events in multimedia collections retrieved from the web. By social events we mean events that are attended by people and are represented by multimedia content uploaded online by different people. Instances of such events are concerts, sports events, public celebrations and even protests. Mining such events may be of interest to, e.g., professional journalists who would like to discover new events or new material about known events, or to casual users that would like to organize their personal photo collections around attended events.

Indeed, during the last years, the SED problem has attracted significant interest from the research community. Indicative of this is the fact that the SED task has been part of the MediaEval benchmark for the last three years (2011-2013) [2]. In the following, we present the details of the subtasks, datasets and evaluation process for the fourth edition of the task.

2. TASK OVERVIEW

This year, the SED task is organized around two subtasks, the details of which will be provided in the following section. Participants are allowed to submit up to five runs for each of the subtasks. Additionally, participants may opt for submitting their runs in only the subtask that they would like to focus on; they are, however, encouraged to submit runs in both subtasks. As will be detailed in the next section, given a large collection of images, the two subtasks require participants to: a) perform a full clustering of the images around events, and b) retrieve sets of events according to specific search criteria.

3. CHALLENGES

3.1 Subtask 1: Full clustering

In the first subtask, a collection of images with their metadata is provided, and participants are asked to produce a full clustering of the images, so that each cluster corresponds to a social event. Participants that took part in the 2013 version of the task [5] should be familiar with this subtask, as it is a continuation of the first subtask from last year. This subtask may be treated as a typical clustering problem or tackled with the help of recently introduced “supervised clustering” approaches [4, 1, 3].

The main challenges of the first subtask are:
• The number of target clusters is not provided and will have to be inferred by the clustering methods of the participants.
• Each photo is accompanied by metadata, which are potentially helpful for the clustering; however, they are often missing or of inconsistent quality and therefore introduce a multimodal aspect to the problem.
• Some of the metadata is noisy. For example, if the date is incorrectly set on the device of a user, then the date information of his/her pictures will be incorrect.
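For illustration only (this is not an official baseline or part of the task definition), the following minimal Python sketch shows one naive way to group photos around events using their metadata: a photo joins an existing cluster when it is close in time and space to that cluster's first photo, and otherwise opens a new cluster. The record fields ("taken", "lat", "lon") and the thresholds are hypothetical assumptions; missing metadata is simply skipped, and the number of clusters is not fixed in advance.

<pre>
# Illustrative single-pass clustering over photo metadata (not an official baseline).
# Photo records are assumed to be dicts with optional "taken" (datetime),
# "lat" and "lon" fields; these field names are hypothetical.
from datetime import timedelta
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def cluster_photos(photos, max_hours=12, max_km=1.0):
    """Assign each photo to the first compatible cluster, else open a new one."""
    clusters = []  # each cluster is a list of photo dicts
    for p in photos:
        placed = False
        for c in clusters:
            ref = c[0]
            # Time check (skipped if either timestamp is missing).
            if p.get("taken") and ref.get("taken"):
                if abs(p["taken"] - ref["taken"]) > timedelta(hours=max_hours):
                    continue
            # Location check (only a fraction of photos carry geo-location).
            if None not in (p.get("lat"), p.get("lon"), ref.get("lat"), ref.get("lon")):
                if haversine_km(p["lat"], p["lon"], ref["lat"], ref["lon"]) > max_km:
                    continue
            c.append(p)
            placed = True
            break
        if not placed:
            clusters.append([p])
    return clusters
</pre>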
3.2 Subtask 2: Retrieval of events

In the second subtask, a collection of events is provided; each event is represented by a set of images with their metadata, and participants are asked to retrieve those events that meet some criteria. Please note that this is a new subtask, appearing for the first time this year.

The retrieval criteria will be related to the following:
• The location of the event (country, city, venue).
• The type of the event (concert, protest, etc.).
• Entities involved in the event (e.g. a band in a concert).

For instance, the first test query asks participants to find all music events that took place in Canada, whereas another asks for all conferences, exhibitions and technical events that took place in the U.K.
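As a rough illustration of the retrieval criteria listed above (again, not the official evaluation protocol, and the field names below are hypothetical rather than the dataset schema), a simple filter over event records could look as follows:

<pre>
# Illustrative criteria-based event retrieval; "type", "country", "city",
# "entities" and "id" are assumed, hypothetical event attributes.
def retrieve(events, event_type=None, country=None, city=None, entity=None):
    """Return ids of events matching all of the given (optional) criteria."""
    hits = []
    for e in events:
        if event_type and e.get("type") != event_type:
            continue
        if country and e.get("country") != country:
            continue
        if city and e.get("city") != city:
            continue
        if entity and entity not in e.get("entities", []):
            continue
        hits.append(e["id"])
    return hits

# e.g. the first test query ("all music events that took place in Canada"):
# retrieve(events, event_type="music", country="Canada")
</pre>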
4. DATASETS

Two datasets will be used in the task. Both consist of images collected from Flickr using the Flickr API, and all images are covered by a Creative Commons license. For both datasets, the actual image files and their metadata are made available. The metadata includes the following: username of the uploader, date taken, date uploaded, title, description, tags and geo-location. For both datasets, some of the metadata is not available; for instance, only roughly 20% of the images come with their geo-location.

The first dataset contains 362,578 images and, together with it, we also provide the grouping of these images into 17,834 clusters that represent social events. The second dataset contains 110,541 images and, contrary to the first set, we do not release the grouping of its images into clusters that represent social events (we plan to release it after the task is completed).

The first dataset is used as the development set for both subtasks and as the test set for the second subtask. We will refer to this dataset as the development set, although it is also used for testing in the second subtask. For the first subtask, the development dataset provides to the participants a large number of examples of correct/target image clusters corresponding to events. For the second subtask, the development dataset provides the set of events from which the participants must retrieve the relevant events for each query. A number of example queries, together with the ids of the relevant events, are also provided for development. For testing, participants are asked to find events in the development dataset using a different set of criteria. Importantly, whereas there are 8 development queries, there are 10 test queries. The 8 development queries have a direct correspondence to the first 8 test queries: they have similar criteria. For instance, whereas one development query asks for all music events that took place in Copenhagen, the corresponding test query asks for all music events that took place in Bucharest. However, there are two additional test queries for which a corresponding development query is not provided.

The second dataset is used only in the first subtask, for testing purposes. That is, participants are asked to find image-cluster associations in the second dataset, similar (in nature) to those in the development set.

5. EVALUATION

For the first subtask, the submissions will be evaluated against the ground truth using the following evaluation measures:
• F1-score calculated from precision and recall.
• Normalized Mutual Information (NMI).
• Divergence from a random baseline: all evaluation measures will also be reported in an adjusted form, called “divergence from a random baseline” [6], which indicates how much useful learning has occurred and helps detect problematic clustering submissions.

The ground truth for the first subtask has been obtained by taking advantage of machine tags with which users have labelled their pictures on Flickr. These machine tags associate the images to distinct events in Last.fm (http://www.last.fm/) and Upcoming (http://en.wikipedia.org/wiki/Upcoming).

For the second subtask, each event is labelled according to the search criteria that were listed above (type, location, etc.), and the correct query results are known by filtering according to the criteria of each query. The results of the second subtask will be evaluated using three different evaluation measures: precision, recall and F1-score. The ground truth has been obtained by taking into account both the metadata of events from Last.fm and Upcoming and manual labelling. In particular, for all events, whether they are Last.fm or Upcoming events, we know their time and location from the metadata obtained from the respective API. Additionally, we know that all Last.fm events are music events and that their metadata contains the name of the relevant artist. Events from Upcoming may belong to different categories (e.g. protest, sports, music, etc.) and were manually classified. Additionally, for Upcoming events that were classified as music events, the relevant artist was also manually defined.
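The following sketch illustrates, under stated assumptions, how measures of this kind could be computed with scikit-learn: NMI is available directly, the F1 shown here is a common pairwise formulation rather than necessarily the task's official definition, and the “divergence from a random baseline” is approximated by subtracting the score of a randomly shuffled assignment that keeps the submitted cluster sizes (cf. [6]).

<pre>
# Hedged sketch of clustering evaluation measures (illustrative only; the
# official task definitions may differ in detail).
import random
from itertools import combinations
from sklearn.metrics import normalized_mutual_info_score

def pairwise_f1(labels_true, labels_pred):
    """F1 over pairs of items that are placed in the same cluster."""
    n = len(labels_true)
    same_true = {(i, j) for i, j in combinations(range(n), 2)
                 if labels_true[i] == labels_true[j]}
    same_pred = {(i, j) for i, j in combinations(range(n), 2)
                 if labels_pred[i] == labels_pred[j]}
    tp = len(same_true & same_pred)
    if tp == 0:
        return 0.0
    precision = tp / len(same_pred)
    recall = tp / len(same_true)
    return 2 * precision * recall / (precision + recall)

def divergence_from_random(labels_true, labels_pred, measure=pairwise_f1, seed=0):
    """Submission score minus the score of a random baseline that keeps the
    submitted cluster-size distribution (cf. [6])."""
    shuffled = list(labels_pred)
    random.Random(seed).shuffle(shuffled)  # same cluster sizes, random assignment
    return measure(labels_true, labels_pred) - measure(labels_true, shuffled)

if __name__ == "__main__":
    truth = [0, 0, 1, 1, 2, 2]
    pred = [0, 0, 1, 2, 2, 2]
    print("NMI:", normalized_mutual_info_score(truth, pred))
    print("pairwise F1:", pairwise_f1(truth, pred))
    print("divergence from random F1:", divergence_from_random(truth, pred))
</pre>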
6. CONCLUSION

We presented the subtasks, datasets and evaluation process for the 2014 SED task. Interestingly, this year a new subtask is introduced: the event retrieval subtask. Thus, a new dimension is added to the overall SED problem this year.

7. ACKNOWLEDGMENTS

The work was supported by the European Commission under contracts FP7-287911 LinkedTV, FP7-318101 MediaMixer and FP7-287975 SocialSensor. We would also like to thank Timo Reuter for the ReSEED dataset, on which the development dataset was partly based.

8. REFERENCES

[1] G. Petkos, S. Papadopoulos, and Y. Kompatsiaris. Social event detection using multimodal clustering and integrating supervisory signals. In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR '12, pages 23:1–23:8, New York, NY, USA, 2012. ACM.
[2] G. Petkos, S. Papadopoulos, V. Mezaris, R. Troncy, P. Cimiano, T. Reuter, and Y. Kompatsiaris. Social event detection at MediaEval: a three-year retrospect of tasks and results. In Proceedings of the 2014 Workshop on Social Events in Web Multimedia (in conjunction with ICMR), 2014.
[3] G. Petkos, S. Papadopoulos, E. Schinas, and Y. Kompatsiaris. Graph-based multimodal clustering for social event detection in large collections of images. In MultiMedia Modeling International Conference, MMM 2014, Dublin, Ireland, January 6-10, 2014, Proceedings, Part I, pages 146–158, 2014.
[4] T. Reuter and P. Cimiano. Event-based classification of social media streams. In Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR '12, pages 22:1–22:8, New York, NY, USA, 2012. ACM.
[5] T. Reuter, S. Papadopoulos, G. Petkos, V. Mezaris, Y. Kompatsiaris, P. Cimiano, C. de Vries, and S. Geva. Social event detection at MediaEval 2013: Challenges, datasets, and evaluation. In Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, Barcelona, Spain, October 18-19, 2013.
[6] C. M. D. Vries, S. Geva, and A. Trotman. Document clustering evaluation: Divergence from a random baseline. CoRR, abs/1208.5654, 2012.