CUNI at MediaEval 2015 Search and Anchoring in Video Archives: Anchoring via Information Retrieval

Petra Galuščáková and Pavel Pecina
Charles University in Prague
Faculty of Mathematics and Physics
Institute of Formal and Applied Linguistics
Prague, Czech Republic
{galuscakova,pecina}@ufal.mff.cuni.cz

Copyright is held by the author/owner(s).
MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany

ABSTRACT
In this paper we deal with the automatic detection of anchoring segments in a collection of TV programmes. The anchoring segments are intended to serve as a basis for subsequent hyperlinking to other related video segments. They are therefore supposed to be attractive to the users of the collection. Using the hyperlinks, the users can easily navigate through the collection and find more information about the topics of their interest.

We present two approaches: one based on metadata, the other based on the frequencies of proper names and numbers contained in the segments. Both approaches proved to be helpful for different aspects of the anchoring problem: segments which contain a large number of proper names and numbers are interesting for the users, while segments most similar to the video description are highly informative.

1. INTRODUCTION
Thanks to the rapid rise in available multimedia data, speech and video archives have recently been growing quickly. The amount of information stored in such archives is enormous, but navigation in them is often tedious. Therefore, special attention should be paid to proposing and tuning methods for effective navigation in large multimedia archives. Effective navigation can help users not only to find the required information easily but also to find additional information about the topics of their interest through exploratory search.

The Search and Anchoring in Video Archives task organized in the MediaEval Benchmark 2015 (http://www.multimediaeval.org) explores effective navigation methods for large video archives. In this paper we deal with the Anchoring sub-task of this task. The main purpose of the sub-task is to automatically label anchoring segments. An anchoring segment should be in some way remarkable for the archive users, so that they may be interested in finding additional information about it. Hyperlinking [2, 3] can then be applied to the selected segments, and more segments similar to the anchoring ones can be retrieved. The user can thus easily browse the archive using the links between anchoring segments and related data segments and find information about related topics easily (see Figure 1).

Figure 1: Hyperlinking of the anchoring segments and data segments.

Data used in the Anchoring sub-task consist of BBC TV programmes broadcast between 01.04.2008 and 31.07.2008. 38 programmes were used for training and 34 programmes for testing. 90 anchoring segments manually marked in the training data were available for training purposes. The Anchoring sub-task was evaluated using a crowdsourcing campaign. The top 25 results returned by each task participant were manually marked as correct or wrong. Partially overlapping segments returned by the participants were joined before the annotation. We present Precision at 10 (P10), Recall and MRR scores. Details of the task, data, and evaluation process are given in the task description [1].

2. SYSTEM DESCRIPTION
We used the subtitles available for each programme in all our experiments. All videos were first segmented into shorter passages, which are the candidate anchoring segments. The passages were either created regularly every 10 seconds, each 60 seconds long (Fixed Segmentation), or by machine learning-based methods which automatically detect probable segment ends (ML Segmentation) [4]. Afterwards, the created segments were sorted according to their likelihood of being a segment of interest (anchoring segment) for the archive users.

We used two methods for sorting the created segments. In the first method, manually created metadata available for each programme were converted into queries. The metadata consist of the programme name and a short programme description. An information retrieval (IR) system was then used to sort the created segments according to their similarity to the metadata describing the programme. The most similar segments
are expected to contain relevant content and to describe the video recording well.

In the second method, the segments were sorted according to the frequency of numbers (words containing at least one digit) and proper names (words containing an upper-case character while the previous word does not end with a dot) contained in each segment. Segments containing more proper names and numbers are expected to contain some kind of specific information.

In all experiments, we used the Terrier IR Framework version 3.5 [6] and its implementation of the Hiemstra Language Model [5] for retrieval, with its parameter set to 0.35. We used the Porter stemmer and Terrier's stopword list.

The information retrieval system was also used for sorting the segments based on the proper name and number frequencies. As we plan to further combine the score from the metadata-based retrieval with the segment preference given by proper name and number frequencies, we first ran the retrieval in the same way as when only metadata were used. The weights corresponding to the frequency of numbers and names were precalculated. Finally, the precalculated weights were linearly combined with the weight acquired as the output of information retrieval. However, in our preliminary experiments, the combination weight was set to strongly prefer the ordering given by proper name and number frequencies.

3. RESULTS
The officially evaluated experiments are displayed in Table 1. The machine learning-based segmentation increased Precision compared to the fixed-length segmentation. Given the large number of overlapping retrieved segments, which is even larger in the case of the machine learning-based segmentation, this Precision confirms the quality of the selection of the top-retrieved results. The fixed-length segmentation without any proper name and number preference achieves the overall highest MRR score. This may confirm our assumption that highly ranked segments retrieved using metadata are the most informative for the recording.

Segmentation   Preference          P10        Recall     MRR
Fixed          -                   0.27879    0.25119    0.92609*
ML             -                   0.30303    0.24995    0.90303
Fixed          Numbers             0.27879    0.24334    0.89091
Fixed          Names               0.31212*   0.27030    0.86140
Fixed          Numbers and Names   0.31212*   0.27061*   0.83384

Table 1: Results of the anchoring task: a comparison of segmentation strategies and a preference based on frequencies of proper names and numbers contained in the segments. Best results for each measure are marked with an asterisk.

The overall highest Precision and Recall scores are achieved when segments containing the largest number of proper names and numbers are preferred. This supports the assumption that a segment containing a large number of proper names and numbers will probably be interesting for the archive user. Also, the overall largest number of segments marked by the annotators as relevant was retrieved when we sorted the created segments according to how many proper names and numbers they contain.

4. CONCLUSION AND FUTURE WORK
In this paper, we presented several preliminary experiments with segmentation methods and a selection process which can be used for anchor selection. The high MRR score achieved by converting metadata into queries and using them to retrieve the most informative part of the document shows this approach to be suitable for anchor selection.

The achieved Precision and Recall numbers show that combining this approach with a preference for segments based on the frequencies of proper names and numbers may further improve the results. However, this combination would need detailed tuning. The detection of proper names should be handled properly using named entity recognition. The use of other information, such as the frequency of content words and mentions of places and persons, would also need further exploration.

5. ACKNOWLEDGMENTS
This research is supported by the Czech Science Foundation, grant number P103/12/G084, Charles University Grant Agency GA UK, grant number 920913, and by SVV project number 260 224.

6. REFERENCES
[1] M. Eskevich, R. Aly, R. Ordelman, D. N. Racca, S. Chen, and G. J. F. Jones. SAVA at MediaEval 2015: Search and Anchoring in Video Archives. In Proc. of MediaEval, Wurzen, Germany, 2015.
[2] M. Eskevich, R. Aly, D. N. Racca, R. Ordelman, S. Chen, and G. J. F. Jones. The Search and Hyperlinking Task at MediaEval 2014. In Proc. of MediaEval, Barcelona, Spain, 2014.
[3] P. Galuščáková, M. Kruliš, J. Lokoč, and P. Pecina. CUNI at MediaEval 2014 Search and Hyperlinking Task: Visual and Prosodic Features in Hyperlinking. In Proc. of MediaEval, Barcelona, Spain, 2014.
[4] P. Galuščáková and P. Pecina. Experiments with Segmentation Strategies for Passage Retrieval in Audio-Visual Documents. In Proc. of ICMR, pages 217–224, Glasgow, UK, 2014.
[5] D. Hiemstra. Using language models for information retrieval. PhD thesis, University of Twente, Enschede, Netherlands, 2001.
[6] I. Ounis, G. Amati, V. Plachouras, B. He, C. Macdonald, and C. Lioma. Terrier: A High Performance and Scalable Information Retrieval Platform. In Proc. of SIGIR, Seattle, Washington, USA, 2006.
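
The Fixed Segmentation described in Section 2 (a new segment starting every 10 seconds, each 60 seconds long) can be pictured as a sliding window over time-stamped subtitle lines. The following is only an illustrative sketch, not the code used in the experiments: the `Subtitle` and `Segment` tuples, the function name `fixed_segments`, and the rule that a subtitle belongs to a window when its start time falls inside it are all assumptions.

```python
from typing import List, NamedTuple


class Subtitle(NamedTuple):
    start: float  # start time in seconds (hypothetical input format)
    text: str


class Segment(NamedTuple):
    start: float
    end: float
    text: str


def fixed_segments(subs: List[Subtitle], duration: float,
                   step: float = 10.0, length: float = 60.0) -> List[Segment]:
    """Create overlapping windows: one every `step` seconds, `length` seconds long."""
    segments = []
    start = 0.0
    while start < duration:
        # The last windows are truncated at the end of the recording.
        end = min(start + length, duration)
        # Assumption: a subtitle belongs to a window if it starts inside it.
        words = [s.text for s in subs if start <= s.start < end]
        segments.append(Segment(start, end, " ".join(words)))
        start += step
    return segments
```

With these defaults, neighbouring segments overlap by 50 seconds, which is consistent with the large number of overlapping retrieved segments mentioned in Section 3.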
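
The segment preference from Section 2 can be illustrated with a short sketch that counts numbers (words containing at least one digit) and proper names (words containing an upper-case character whose preceding word does not end with a dot) and linearly combines the count with a retrieval score. The whitespace tokenization, the treatment of the first word of a segment, the use of raw counts as the preference weight, the function names, and the value `alpha = 0.9` are assumptions made for illustration. The paper only states that the combination weight strongly preferred the name and number frequencies.

```python
from typing import List, Tuple


def count_numbers(words: List[str]) -> int:
    """Numbers: words containing at least one digit."""
    return sum(any(ch.isdigit() for ch in w) for w in words)


def count_proper_names(words: List[str]) -> int:
    """Proper names: words with an upper-case character whose previous
    word does not end with a dot (i.e. not sentence-initial)."""
    count = 0
    # Assumption: the first word of a segment has no predecessor, so it is
    # treated as sentence-initial and never counted.
    for prev, word in zip(words, words[1:]):
        if any(ch.isupper() for ch in word) and not prev.endswith("."):
            count += 1
    return count


def combined_score(ir_score: float, preference: float, alpha: float = 0.9) -> float:
    """Linear combination; alpha close to 1 favours the preference weight.
    The value 0.9 is hypothetical, not taken from the paper."""
    return alpha * preference + (1.0 - alpha) * ir_score


def rank(segments: List[Tuple[str, float]], alpha: float = 0.9) -> List[str]:
    """Sort (text, ir_score) pairs by the combined score, best first."""
    scored = []
    for text, ir_score in segments:
        words = text.split()
        preference = count_numbers(words) + count_proper_names(words)
        scored.append((combined_score(ir_score, preference, alpha), text))
    return [text for _, text in sorted(scored, key=lambda p: p[0], reverse=True)]
```

As the conclusion notes, this capitalization heuristic is only a stand-in for named entity recognition: it also counts any capitalized word that follows a question or exclamation mark.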
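
For readers unfamiliar with the measures reported in Table 1, the sketch below shows the standard textbook definitions of Precision at 10 and Mean Reciprocal Rank over lists of binary relevance judgements. It is not the official evaluation script: the function names and the idea of averaging over one ranked list per video are assumptions, and the exact aggregation used in the task is described in the task description [1].

```python
from typing import List


def precision_at_k(relevance: List[bool], k: int = 10) -> float:
    """Fraction of the top-k results judged relevant."""
    return sum(relevance[:k]) / k


def reciprocal_rank(relevance: List[bool]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: List[List[bool]]) -> float:
    """Average reciprocal rank over several ranked lists
    (assumption: e.g. one list per video)."""
    return sum(reciprocal_rank(r) for r in runs) / len(runs)
```

A high MRR with a moderate P10, as for the first row of Table 1, means that the first relevant segment tends to appear at or near the top of the list even when fewer of the remaining top-10 results are relevant.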