CUNI at MediaEval 2015 Search and Anchoring in Video Archives: Anchoring via Information Retrieval

Petra Galuščáková and Pavel Pecina
Charles University in Prague
Faculty of Mathematics and Physics
Institute of Formal and Applied Linguistics
Prague, Czech Republic
{galuscakova,pecina}@ufal.mff.cuni.cz

Copyright is held by the author/owner(s).
MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany

ABSTRACT

In this paper we deal with the automatic detection of anchoring segments in a collection of TV programmes. The anchoring segments are intended to serve as a basis for subsequent hyperlinking to other related video segments. The anchoring segments are therefore supposed to be appealing to the users of the collection. Using the hyperlinks, the users can easily navigate through the collection and find more information about the topic of their interest.

We present two approaches: one based on metadata, the other based on the frequencies of proper names and numbers contained in the segments. Both approaches proved to be helpful for different aspects of the anchoring problem: segments which contain a large number of proper names and numbers are interesting for the users, while the segments most similar to the video description are highly informative.

1. INTRODUCTION

Owing to the sharp rise in available multimedia data, speech and video archives have recently been growing rapidly. The amount of information stored in such archives is enormous, but navigation in these archives is often tedious. Therefore, special attention should be paid to proposing and tuning methods for effective navigation in large multimedia archives. Effective navigation can help users not only to find the required information easily but also to find additional information about the topic of their interest using exploratory search.

The Search and Anchoring in Video Archives task organized in the MediaEval Benchmark 2015 (footnote 1) explores effective navigation methods for large video archives. In this paper we deal with the Anchoring sub-task of the task. The main purpose of the sub-task is to automatically label anchoring segments. An anchoring segment should be in some way remarkable for the archive users, so that the users may be interested in finding additional information about it. Hyperlinking [2, 3] can then be applied to the selected segments, and more segments similar to the anchoring ones can be retrieved. The user can thus easily browse the archive using the links between anchoring segments and related data segments and find information about related topics easily (see Figure 1).

Figure 1: Hyperlinking of the anchoring segments and data segments.

The data used in the Anchoring sub-task consist of BBC TV programmes broadcast between 01.04.2008 and 31.07.2008. 38 programmes were used for training and 34 programmes for testing. 90 anchoring segments manually marked in the training data were available for training purposes. The Anchoring sub-task was evaluated using a crowdsourcing campaign. The top 25 results returned by each task participant were manually marked as correct or wrong. Partially overlapping segments returned by the participants were joined before the annotation. We present Precision at 10 (P10), Recall and MRR scores. Details of the task, data, and evaluation process are given in the task description [1].

Footnote 1: http://www.multimediaeval.org

2. SYSTEM DESCRIPTION

We used the subtitles available for each programme in all our experiments. All videos were first segmented into shorter passages, which are the candidate anchoring segments. The passages were created either regularly every 10 seconds (Fixed Segmentation; footnote 2) or using machine learning-based methods which automatically detect probable segment ends (ML Segmentation) [4]. Afterwards, the created segments were sorted according to their likelihood of being a segment of interest (an anchoring segment) for the archive users.

Footnote 2: Each segment was 60 seconds long.

We used two methods for sorting the created segments. In the first method, the manually created metadata available for each programme were converted into queries. The metadata consist of the programme name and a short programme description. An information retrieval (IR) system was then used to sort the created segments according to their similarity to the metadata describing the programme. The most similar segments are expected to contain relevant content and to describe the video recording well.

In the second method, the segments were sorted according to the frequency of numbers (words containing at least one digit) and proper names (words containing an upper case character while the previous word does not end with a dot) contained in each segment. Segments containing more proper names and numbers are assumed to contain some kind of specific information.

In all experiments, we used the Terrier IR Framework version 3.5 [6] and its implementation of the Hiemstra Language Model [5] for retrieval, with its parameter set to 0.35. We used the Porter stemmer and Terrier's stopword list.

The information retrieval system was also used for sorting the segments based on the proper name and number frequencies. As we plan to combine the score from the metadata-based retrieval with the segment preference given by proper name and number frequencies, we first ran the retrieval in the same way as when only metadata were used. The weights corresponding to the frequency of numbers and names were precalculated. Finally, the precalculated weights were linearly combined with the weight acquired as the output of information retrieval. However, in our preliminary experiments, the combination weight was set to strongly prefer the ordering given by proper name and number frequencies.

3. RESULTS

Segmentation  Preference         P10       Recall    MRR
Fixed         none               0.27879   0.25119   0.92609*
ML            none               0.30303   0.24995   0.90303
Fixed         Numbers            0.27879   0.24334   0.89091
Fixed         Names              0.31212*  0.27030   0.86140
Fixed         Numbers and Names  0.31212*  0.27061*  0.83384

Table 1: Results of the anchoring task: a comparison of segmentation strategies and a preference based on frequencies of proper names and numbers contained in the segments. Best results for each measure are marked with an asterisk.

The officially evaluated experiments are displayed in Table 1. The machine learning-based segmentation increased Precision compared to the fixed-length segmentation. Given the large number of overlapping retrieved segments, which is even larger in the case of the machine learning-based segmentation, this Precision confirms the quality of the selection of the top-retrieved results. The fixed-length segmentation without any proper name and number preference achieves the overall highest MRR score. This may confirm our assumption that highly ranked segments retrieved using metadata are the most informative for the recording.

The overall highest Precision and Recall scores are achieved when segments containing the largest number of proper names and numbers are preferred. This supports the assumption that a segment containing a large number of proper names and numbers will probably be interesting for the archive user. Also, the overall largest number of segments marked by the annotators as relevant was retrieved when we sorted the created segments according to how many proper names and numbers they contain.

4. CONCLUSION AND FUTURE WORK

In this paper, we show several preliminary experiments with segmentation methods and a selection process which can be used for anchor selection. The high MRR score achieved by converting metadata into queries and using them to retrieve the most informative part of the document shows that this approach is suitable for anchor selection.

The achieved Precision and Recall scores show that combining this approach with a segment preference based on the frequencies of proper names and numbers may further improve the results. However, this combination would need detailed tuning. The detection of proper names should be handled properly using named entity recognition. The use of other information, such as the frequency of content words and mentions of places and persons, would also need further exploration.

5. ACKNOWLEDGMENTS

This research is supported by the Czech Science Foundation, grant number P103/12/G084, Charles University Grant Agency GA UK, grant number 920913, and by SVV project number 260 224.

6. REFERENCES

[1] M. Eskevich, R. Aly, R. Ordelman, D. N. Racca, S. Chen, and G. J. F. Jones. SAVA at MediaEval 2015: Search and Anchoring in Video Archives. In Proc. of MediaEval, Wurzen, Germany, 2015.
[2] M. Eskevich, R. Aly, D. N. Racca, R. Ordelman, S. Chen, and G. J. F. Jones. The Search and Hyperlinking Task at MediaEval 2014. In Proc. of MediaEval, Barcelona, Spain, 2014.
[3] P. Galuščáková, M. Kruliš, J. Lokoč, and P. Pecina. CUNI at MediaEval 2014 Search and Hyperlinking Task: Visual and Prosodic Features in Hyperlinking. In Proc. of MediaEval, Barcelona, Spain, 2014.
[4] P. Galuščáková and P. Pecina. Experiments with Segmentation Strategies for Passage Retrieval in Audio-Visual Documents. In Proc. of ICMR, pages 217–224, Glasgow, UK, 2014.
[5] D. Hiemstra. Using language models for information retrieval. PhD thesis, University of Twente, Enschede, Netherlands, 2001.
[6] I. Ounis, G. Amati, V. Plachouras, B. He, C. Macdonald, and C. Lioma. Terrier: A High Performance and Scalable Information Retrieval Platform. In Proc. of SIGIR, Seattle, Washington, USA, 2006.
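
ILLUSTRATIVE SKETCH

The frequency-based preference and the linear score combination described in the system description can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in the experiments: whitespace tokenization, skipping the first token of a segment, summing raw counts without normalization, the function names, and the combination weight `lam = 0.9` are all hypothetical choices that the paper does not specify.

```python
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    # Assumption: plain whitespace tokenization of the subtitle text.
    return text.split()


def count_numbers(tokens: List[str]) -> int:
    # A "number" is any word containing at least one digit.
    return sum(1 for tok in tokens if any(c.isdigit() for c in tok))


def count_proper_names(tokens: List[str]) -> int:
    # A "proper name" is a word containing an upper case character
    # while the previous word does not end with a dot.
    count = 0
    for i, tok in enumerate(tokens):
        if i == 0:
            # Assumption: the first token has no previous word and is
            # treated as sentence-initial, so it is not counted.
            continue
        if any(c.isupper() for c in tok) and not tokens[i - 1].endswith("."):
            count += 1
    return count


def preference(text: str) -> int:
    # Assumption: the preference is the raw sum of both counts.
    tokens = tokenize(text)
    return count_numbers(tokens) + count_proper_names(tokens)


def combine(retrieval_score: float, pref: float, lam: float = 0.9) -> float:
    # Linear combination; a large lam (hypothetical value) strongly
    # prefers the ordering given by the name and number frequencies.
    return lam * pref + (1.0 - lam) * retrieval_score


def rank_segments(segments: List[Tuple[str, float]], lam: float = 0.9) -> List[str]:
    # Each segment is (subtitle text, retrieval score from the IR system).
    scored = [(combine(score, preference(text), lam), text) for text, score in segments]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]
```

With a large `lam`, a segment rich in names and numbers can outrank a segment with a higher retrieval score, which mirrors the setting of the preliminary experiments; choosing the weight and a suitable normalization would be part of the tuning mentioned in the conclusion.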