DCU Linking Runs at MediaEval 2013: Search and Hyperlinking Task

Shu Chen
INSIGHT Centre for Data Analytics / CNGL
Dublin City University
Dublin 9, Ireland
shu.chen4@mail.dcu.ie

Gareth J. F. Jones
CNGL, School of Computing
Dublin City University
Dublin 9, Ireland
gjones@computing.dcu.ie

Noel E. O’Connor
INSIGHT Centre for Data Analytics
Dublin City University
Dublin 9, Ireland
Noel.OConnor@dcu.ie

Copyright is held by the author/owner(s).
MediaEval 2013 Workshop, October 18-19, 2013, Barcelona, Spain

ABSTRACT
We describe Dublin City University (DCU)’s participation in the Hyperlinking sub-task of the Search and Hyperlinking of Television Content task at MediaEval 2013. Two methods of video hyperlinking construction are reported: i) using spoken data annotation results to produce a ranked hyperlink list, ii) linking and merging meaningful named entities in video segments to create hyperlinks. The details of algorithm design and evaluation are presented.

Keywords
Hyperlinking, Multimedia Search, Anchor Selection, Information Retrieval

1. INTRODUCTION
This paper presents Dublin City University (DCU)’s participation in the Hyperlinking sub-task of the Search and Hyperlinking of Television Content task at MediaEval 2013. The paper is organized as follows: Section 2 describes our automatic hyperlinking strategies, Section 3 gives experimental results, and Section 4 concludes the paper.

2. HYPERLINKING STRATEGIES

2.1 Hyperlinking Principles
In this subsection we describe the principles underlying our approach to the hyperlinking task. The elements involved in the hyperlinking framework are the query anchor, the target segment, and the hyperlink. The query anchors, which are the input to the hyperlinking framework, are defined in [1]. A target segment is a subset of a video to which a query anchor is supposed to be linked. In our approach, target segments are determined by a fixed window with a duration of 120 seconds and an overlap of 30 seconds. The spoken data in the video is available in three transcripts: automatic speech recognition (ASR) transcripts from LIUM Research [6] and LIMSI/Vocapia [2], and manual subtitles provided by the BBC [1]. Hyperlinks are constructed from the query anchor to a set of target segments using the different hyperlinking strategies described in the following subsections.

2.2 Hyperlinking using Text Annotation
This strategy determines the hyperlinks based on two quality measures: the video-level and the segment-level. The video-level measure aims to determine the relevance between the video containing the query anchor and other videos containing potential target segments based on the text transcripts. DBpedia Spotlight¹, which implements text annotation by supervised learning through the DBpedia Ontology², was used to extract a set of terms to represent the textual content of each video. The method used to annotate terms in DBpedia Spotlight is based on a TF*ICF model [4], where TF (Term Frequency) represents the relevance of a term in the spoken video, and ICF (Inverse Candidate Frequency) is determined by the relevance of a term in DBpedia Ontology resources [4]. Given the video represented by a set of terms, the similarity score is calculated using a TF-IDF algorithm.

The segment-level similarity uses Apache Lucene 3.6.2³ to determine the relevance between the query anchor and the potential target segments. The Lucene standard analyzer was used with the default stop word list⁴ to index ASR transcripts and manual subtitles. The search query consisted of all the spoken data contained in the query anchor. The score calculation mechanism uses a combination of a Boolean AND function filter and ranking using the Vector Space Model [3]. The final score used to rank the hyperlinks was calculated by merging the two results using either Equation 1 or Equation 2.

$$\mathrm{Score} = w_1 S_v + w_2 S_l \tag{1}$$

$$\mathrm{Score} = S_v S_l \tag{2}$$

where $S_v$ is the video-level similarity score and $S_l$ is the segment-level similarity score. Equation 1 is a simple linear fusion of the two scores, in which the weights $w_1$ and $w_2$ are both set to 0.5.

2.3 Hyperlinking using Named Entities
This strategy links named entities contained in query anchors and the potential target segments, and then merges these entities to construct hyperlinks. Apache OpenNLP⁵ was used to tag words in the ASR transcripts and subtitles. All nouns tagged as NN, NP, or NNP were selected as named entities. To describe and link the named entities, a vector space model was constructed by predicting the surrounding words given the current word. We use word2vec⁶ to implement a supervised learning mechanism using a Neural Net Language Model to create the vector model of named entities. We use the ASR transcripts of videos gathered from the blip10000 collection [7] as training data. word2vec receives each named entity as input and outputs a vector $V = \{w_1, w_2, \ldots, w_k\}$, where $w_i$ is a surrounding word of the current entity learned from the training data and the vector dimensionality $k$ is set to 50, based on the experiment described in [5]. Equation 3 is used to calculate the score between different word vectors.

$$S = \frac{2\,(V_i \cap V_j)}{|V_i| + |V_j|} \tag{3}$$

where $V_i \cap V_j$ is the total number of words contained in both $V_i$ and $V_j$, and $|V_i|$ is the length of word vector $i$. All named entities located in the potential target segments are merged using Equation 4 to generate the final score and obtain the ranked hyperlink list.

¹ https://github.com/dbpedia-spotlight
² http://dbpedia.org/Ontology
³ http://lucene.apache.org/
⁴ https://lucene.apache.org/core/3_6_2/api/core/org/apache/lucene/analysis/StopAnalyzer.html
⁵ http://opennlp.apache.org/

Topic (Anchor) ID   4        12       21       23       27       31       32       39       43       45
MAP                 0.8921   0.3733   0.1395   0.4925   0.0060   0.4170   0.5713   0.4127   0.1891   0.5555
P@5                 1.000    1.000    0.600    1.0000   0.0000   0.8000   0.8000   1.0000   0.6000   1.0000
P@10                1.000    0.900    0.600    0.9000   0.0000   0.8000   0.7000   1.0000   0.7000   1.0000
P@20                1.000    0.700    0.550    0.8500   0.0000   0.6000   0.6000   0.8500   0.4000   0.9000

Table 1: Mean Average Precision (MAP) and P@N results for different topics in RUN 3

Run ID   Method                Data     Fuse
1        Text Annotation       M+I+S    Eq.1
2        Text Annotation       M+I+S    Eq.2
3        Text Annotation       M+U+S    Eq.1
4        Named Entities Link   M+L+S    Eq.4

Table 2: Run Details (M: Metadata, I: LIMSI, U: LIUM, S: Subtitle, Eq: Equation)

Run ID      1        2        3        4
MAP value   0.2944   0.2935   0.3109   0.0161
P@5         0.7000   0.7067   0.7267   0.0600
P@10        0.6567   0.6633   0.6567   0.1067
P@20        0.5450   0.5383   0.5433   0.0733

Table 3: Mean Average Precision (MAP) evaluation results

[...] value of each run. This indicates that our hyperlinking strategy based on spoken data annotation performs better. Table 1 shows the P@N and MAP values of Run 3. MAP and P@N are good for most topics, the exception being Topic (Anchor) 27, which describes Shakespeare and Global Theatre. Only two other videos are related to Shakespeare and Global Theatre, and their content is presented as a cartoon. Because our approach uses no visual features, it links to these cartoon segments, whereas real users notice the mismatch between TV shows and cartoons.

4. CONCLUSIONS
This paper presented details of DCU’s participation in the TV Data Hyperlinking task of MediaEval 2013. The evaluation shows that annotating spoken data to construct hyperlinks achieves better results than linking named entities. In our future work, we will examine the use of visual cues to improve hyperlinking performance.

5. ACKNOWLEDGEMENT
This work is funded by the European Commission’s Seventh Framework Programme (FP7) as part of the AXES project (ICT-269980).

6. REFERENCES
[1] M. Eskevich, G. J. F. Jones, S. Chen, R. Aly, and R. Ordelman. Search and Hyperlinking Task at MediaEval 2013. In MediaEval 2013 Workshop, Barcelona, Spain, 2013.
[2] L. Lamel and J.-L. Gauvain. Speech Processing for Audio Indexing. In Advances in Natural Language Processing (LNCS 5221), pages 4–15. 2008.
[3] Lucene 3.6.2 Document. Apache Lucene - Scoring. https://lucene.apache.org/core/3_6_2/scoring.html, Dec. 2012.
[4] P. N. Mendes, M. Jakob, A. García-Silva, and C. Bizer. DBpedia Spotlight: Shedding Light on the Web of Documents. In Proceedings of the 7th International Conference on Semantic Systems, USA, 2011.
[5] T. Mikolov, K. Chen, G. Corrado, and J. Dean. Efficient Estimation of Word Representations in Vector
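The fixed-window segmentation of Section 2.1 (120-second windows with a 30-second overlap, so consecutive windows start 90 seconds apart) can be illustrated with a minimal sketch. The function name `sliding_windows` and the choice to clip the last window at the end of the video are assumptions made for illustration; the paper does not state how the final, shorter window is handled.

```python
def sliding_windows(video_duration, window=120.0, overlap=30.0):
    """Return (start, end) pairs, in seconds, covering a video.

    Hypothetical helper: window and overlap defaults follow Section 2.1;
    clipping the final window to the video length is an assumption.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    step = window - overlap  # 120 - 30 = 90 seconds between window starts
    segments = []
    start = 0.0
    while start < video_duration:
        end = min(start + window, video_duration)
        segments.append((start, end))
        if end >= video_duration:
            break  # the video is fully covered
        start += step
    return segments


if __name__ == "__main__":
    # A 5-minute video yields three overlapping candidate target segments.
    for seg in sliding_windows(300.0):
        print(seg)
```

Each returned pair would correspond to one candidate target segment whose transcript text is then indexed and scored.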
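The two fusion rules of Section 2.2 (Equation 1, linear with both weights at 0.5, and Equation 2, a product) can be compared with a short sketch. The function names and the example scores are illustrative assumptions; the sketch also assumes that the video-level and segment-level scores are already on comparable scales, which the paper does not discuss.

```python
def fuse_linear(s_v, s_l, w1=0.5, w2=0.5):
    """Equation 1: weighted sum of video-level and segment-level scores."""
    return w1 * s_v + w2 * s_l


def fuse_product(s_v, s_l):
    """Equation 2: product of video-level and segment-level scores."""
    return s_v * s_l


if __name__ == "__main__":
    # Hypothetical (video-level, segment-level) scores for two candidates.
    candidates = {"segment_a": (0.8, 0.4), "segment_b": (0.5, 0.6)}
    for name, (s_v, s_l) in candidates.items():
        print(name, fuse_linear(s_v, s_l), fuse_product(s_v, s_l))
```

With the product rule a candidate needs both scores to be reasonably high, whereas the linear rule lets a strong score on one level compensate for a weak score on the other.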
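Equation 3 in Section 2.3 has the form of the Sørensen-Dice coefficient applied to two word lists. A minimal sketch follows, assuming each word vector is treated as a set of distinct words (so that the vector length equals the set size) and that two empty vectors score 0; both are assumptions, and `overlap_score` is a hypothetical name.

```python
def overlap_score(v_i, v_j):
    """Equation 3: 2 * |shared words| / (|v_i| + |v_j|).

    Assumption: words within one vector are distinct, so each vector
    is treated as a set.
    """
    set_i, set_j = set(v_i), set(v_j)
    total = len(set_i) + len(set_j)
    if total == 0:
        return 0.0  # assumption: two empty vectors have no overlap
    return 2.0 * len(set_i & set_j) / total


if __name__ == "__main__":
    # Illustrative surrounding-word lists for two entities (made-up words).
    a = ["king", "queen", "theatre"]
    b = ["theatre", "stage", "king", "play"]
    print(overlap_score(a, b))  # 2 shared words over 3 + 4 entries
```

The score is 1 when the two word lists are identical and 0 when they share no words.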