<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>IRISA and KUL at MediaEval 2014: Search and Hyperlinking Task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
<string-name>Anca-Roxana Şimon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guillaume Gravier</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pascale Sébillot</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
<contrib contrib-type="author">
          <string-name>Marie-Francine Moens</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>KU Leuven</institution>
          ,
          <addr-line>Celestijnenlaan 200A, B-3001 Heverlee</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>IRISA &amp; INRIA Rennes</institution>
          ,
          <addr-line>Univ. Rennes 1</addr-line>
          <email>firstname.lastname@irisa.fr</email>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2014</year>
      </pub-date>
      <fpage>16</fpage>
      <lpage>17</lpage>
      <abstract>
<p>This paper presents our approach and results in the hyperlinking sub-task at MediaEval 2014. A two-step approach is implemented: relying on a topic segmentation technique, the first step consists in generating potential target segments; then, for each anchor, the best 20 target segments are selected according to two distinct strategies: the first focuses on the identification of very similar targets using n-grams and named entities; the second makes use of an intermediate structure built from topic models, which offers the possibility to control serendipity and to explain the links created.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        This paper presents the joint participation of IRISA and
KUL at the MediaEval 2014 Search and Hyperlinking task,
in which we focus on the hyperlinking sub-task [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The
goal is to create hyperlinks between predefined anchor
segments and short video segments, called targets, which
should offer complementary information not found at search
time. Targets have to be automatically extracted from the videos
of a large collection.
      </p>
      <p>The hyperlinking system we propose consists of a two-step
process: first, all potential target segments are extracted
using a topic segmentation technique; then, the most relevant
targets are selected for each anchor, with the help of content
analysis and similarity measures.</p>
      <p>
        In our 2013 participation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], we focused on precise target
selection, and our most efficient system consisted in a direct
comparison between anchor and target segments obtained
through topic segmentation, using bags of n-grams to
represent content, thus resulting in links between anchors and
very similar targets. We continue exploring this direction.
However, we believe that besides precise target selection, a
very important aspect of hyperlinking is to offer
serendipity and to explain why two video segments are linked. To
address these points, we propose an intermediate structure,
obtained from topic models, that allows an indirect
comparison between anchors and targets. Its first advantage is that
segments which do not share a substantial part of their
vocabulary but are semantically related can be linked. Moreover,
this structure provides a basis for investigating why certain
links are created, making link justification possible.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. SYSTEM OVERVIEW</title>
      <p>
        The aim of our approach is to find target segments on the
same topic as the anchor, or on related topics. The hyperlink
generation relies on content-based comparisons exploiting
spoken data obtained from automatic transcripts and
manual subtitles [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Data are first lemmatized, and only nouns,
non-modal verbs and adjectives are kept. Subsections 2.1
and 2.2 respectively detail the two parts of our system: the
generation of potential target segments and the selection of
the top 20 targets for each anchor.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Generating potential target segments</title>
      <p>
        Each video in the test collection is partitioned into
topically coherent segments with the generic topic segmentation
algorithm TextSeg [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], which is domain independent, needs
no a priori information, and is efficient on speech transcripts and
on segments of varying lengths. Its main drawback is
over-segmentation, which in our case is not problematic since
the target segments must not last longer than 2 minutes.
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Selection of hyperlink targets</title>
      <p>Each anchor segment is compared with each topically
coherent segment previously obtained, using similarity
measures. The comparison can be direct or indirect (i.e., using
intermediate structures).</p>
      <p>
        Four methods are proposed to select the hyperlink
targets. The baseline corresponds to the method for which we
obtained our best results in 2013 (Linear+ngrams): a direct
comparison between segments is done, contents being
represented by bags of unigrams, bigrams and trigrams. The
second method extends the previous one by using the Stanford
Named Entity Recognizer (NER) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], a 3-class (person,
organization and location) entity tagger (Linear+ngrams+NEs).
A Jaccard similarity coefficient is used to evaluate
similarities in terms of shared named entities (NEs). The n-gram
and NE similarity scores are combined; weights of 0.3 and
0.7 respectively are chosen to favor precise alignments, thus
not rewarding serendipity.
      </p>
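The combined scoring described above can be sketched as follows. The 0.3/0.7 weights and the Jaccard coefficient come from the text; the particular n-gram similarity (cosine over bag-of-n-gram counts) is an illustrative stand-in, since the paper does not spell out that measure.

```python
# Sketch of Linear+ngrams+NEs scoring: Jaccard overlap on named entities
# combined with an n-gram similarity. The cosine-over-counts n-gram
# measure is an assumption; the 0.3/0.7 weights are from the text.
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_similarity(a_tokens, b_tokens, max_n=3):
    """Cosine similarity over combined uni-, bi- and trigram counts."""
    a = Counter(g for n in range(1, max_n + 1) for g in ngrams(a_tokens, n))
    b = Counter(g for n in range(1, max_n + 1) for g in ngrams(b_tokens, n))
    dot = sum(a[g] * b[g] for g in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def jaccard(a_entities, b_entities):
    """Jaccard coefficient over the sets of shared named entities."""
    a, b = set(a_entities), set(b_entities)
    return len(a & b) / len(a | b) if a | b else 0.0

def anchor_target_score(a_tokens, b_tokens, a_nes, b_nes,
                        w_ngrams=0.3, w_nes=0.7):
    return (w_ngrams * ngram_similarity(a_tokens, b_tokens)
            + w_nes * jaccard(a_nes, b_nes))
```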
      <p>
        For the last two methods, an intermediate structure is built
using Latent Dirichlet Allocation (LDA) probabilistic topic
models [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] learned on the manual transcripts of the
development set (1335 hours of video). Each transcript is
represented as a mixture of K latent topics, where a latent
topic is a probability distribution over the
vocabulary. Contrary to the bag-of-words representation,
this one clusters semantically similar co-occurring terms.
To construct the structure, LDA is trained using Gibbs
sampling, with standard values for the hyperparameters (α = 50/K and
β = 0.01) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] and a varying number of latent topics K: 50,
100, 150, 200, 300, 500, 700. This range for the number of
topics was chosen to learn general to more specific topics.
Using this structure, an indirect comparison between anchor
and target segments can be performed.
      </p>
      <p>Our third method (TopicM ) consists in computing, for
every K, the probabilities of the anchor and target segments
given the topics. For each K and for each anchor-target pair,
two vectors are obtained in which component i corresponds
to the probability of topic i given the words contained in the
segment. Then a similarity measure is computed between
these vectors, leading to a score for each anchor-target pair.
To select the top targets for each anchor, a linear
combination of the scores obtained for each K is computed, giving more
weight to the most specific topics.</p>
      <p>For the last method (HierTopicM ), the previous structure
with topic models is extended to form a tree-like hierarchy
between the topics. This hierarchy relies on a similarity
measure between the topics obtained with Ki and Kj , where
Kj &lt; Ki. Thus, each topic learned with K = 700 is
connected to the most similar topic learned with K = 500,
and so on for the other K values. With this representation,
a path can be selected for each anchor, starting at the
bottom of the tree (K = 700) by choosing the topic with
the highest probability given the words in the anchor, and
then selecting the parents of that topic until
the first level of the tree (K = 50) is reached. The targets
are then selected by a linear combination of the probability
values of the topics in an anchor's path, given the words in
the target. Thus, in the end, only a part of the topics in the
structure is considered, allowing more precise control
of serendipity and justification of links.</p>
      <p>Using the structure of topics (hierarchical or not) to select
the hyperlinks, serendipity can be controlled by giving more
weight to scores obtained with more general or more specific
topics. Moreover, links can be justified by looking at the top
words of the topics that contributed most to the selection of
those links. This structure can also be used a posteriori by other
methods to understand link creation.</p>
      <p>According to the evaluation rules, the target segments
proposed for each anchor should have a duration between 10 sec
and 2 min. Therefore, a post-processing of the final targets
is done to enforce these constraints. If a segment is longer
than 2 min, it is resegmented with a sliding window of 2
min, and the best-scoring window within the segment
becomes the final target segment. If a segment is shorter
than 10 sec, it is combined with the best-scoring neighbor
resulting from the topic segmentation, until the minimum
length is reached.</p>
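The post-processing rules above can be sketched as follows. Segments are represented as (start, end, score) triples; the window step size and the scoring functions are illustrative assumptions.

```python
# Sketch of the duration post-processing: targets over 2 min are re-cut
# with a 2-min sliding window keeping the best-scoring window; targets
# under 10 s absorb their best-scoring neighbouring topic segment until
# long enough. Step size and scoring are assumptions.
MIN_LEN, MAX_LEN = 10.0, 120.0   # seconds

def enforce_duration(seg, neighbours, window_score, step=10.0):
    """seg: (start, end, score); neighbours: list of (start, end, score)
    adjacent topic segments (consumed on merge); window_score(s, e) rates
    a candidate window. Returns the final (start, end)."""
    start, end, _ = seg
    if end - start > MAX_LEN:
        # Slide a 2-min window over the segment, keep the best-scoring one.
        n_starts = int((end - start - MAX_LEN) / step) + 1
        starts = [start + i * step for i in range(n_starts)]
        best = max(starts, key=lambda s: window_score(s, s + MAX_LEN))
        return (best, best + MAX_LEN)
    while end - start < MIN_LEN and neighbours:
        # Merge with the best-scoring adjacent segment, then re-check.
        n = max(neighbours, key=lambda x: x[2])
        neighbours.remove(n)
        start, end = min(start, n[0]), max(end, n[1])
    return (start, end)
```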
    </sec>
    <sec id="sec-5">
      <title>3. RESULTS</title>
      <p>Official evaluation results are reported in Table 1. Only
the top 10 results were evaluated for each method, via
crowdsourcing on Amazon Mechanical Turk (AMT). Other results
were evaluated automatically, based on the evaluation of the
selected results from all participants. The best results are
obtained with the topic segmentation with n-grams on the
manual subtitles (Linear+ngrams). This means that turkers
prefer target segments whose content is closely
related to that of the anchor. As anticipated, precision decreases on the automatic
transcripts. Unexpectedly, adding
NEs to this method, and therefore favoring segments about
the same people, places and organizations, diminished the
precision. From a manual assessment of several target segments
proposed using NEs, it seems that targets speaking
about the same people (e.g., Madonna, Beckham) in different
circumstances (i.e., shows on different subjects: diets,
charity) are not relevant. Giving less weight to the NEs
shared between anchors and targets could possibly
improve precision.</p>
      <p>Regarding the topic model-based methods, the one that
uses all the topics learned (TopicM ) yields better results,
comparable to those obtained with the Linear+ngrams method.
From a manual assessment of the results, it seems that the
topics, even those learned with K = 700, are too general.
The targets that were considered relevant are those for which
the anchor addresses a more general topic (e.g., wildlife).
The problem of generality also appears for the HierTopicM
method; moreover, with only a part of the topics
considered, its results are worse than those of TopicM. Such
intermediate structures could be improved by learning more
specific topics. Still, having the top words of the topics that
best explain the anchors and the targets can help interpret
and justify the links. Additionally, with an intermediate
structure, anchors are linked to targets even without
sharing much vocabulary.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M. I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          .
          <article-title>Latent dirichlet allocation</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <volume>3</volume>
          :
          <fpage>993</fpage>
          –
          <lpage>1022</lpage>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Eskevich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Aly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. N.</given-names>
            <surname>Racca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ordelman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G. J. F.</given-names>
            <surname>Jones</surname>
          </string-name>
          .
          <article-title>The Search and Hyperlinking task at MediaEval 2014</article-title>
          . In <source>Working Notes of the MediaEval 2014 Workshop</source>, Barcelona, Spain,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Finkel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Grenager</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>Incorporating non-local information into information extraction systems by Gibbs sampling</article-title>
          . In <source>Association for Computational Linguistics</source>, Ann Arbor, USA,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.-L.</given-names>
            <surname>Gauvain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lamel</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Adda</surname>
          </string-name>
          .
          <article-title>The LIMSI broadcast news transcription system</article-title>
          .
          <source>Speech Communication</source>
          ,
          <volume>37</volume>
          (
          <issue>1-2</issue>
          ):
          <fpage>89</fpage>
          –
          <lpage>108</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Guinaudeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-R.</given-names>
            <surname>Simon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gravier</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Sébillot</surname>
          </string-name>
          .
          <article-title>HITS and IRISA at MediaEval 2013: Search and hyperlinking task</article-title>
          .
          In <source>Working Notes of the MediaEval 2013 Workshop</source>
          , Barcelona, Spain,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Steyvers</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Griffiths</surname>
          </string-name>
          .
          <article-title>Probabilistic topic models</article-title>
          .
          <source>Handbook of Latent Semantic Analysis</source>
          ,
          <volume>427</volume>
          (
          <issue>7</issue>
          ):
          <fpage>424</fpage>
          –
          <lpage>440</lpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Utiyama</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Isahara</surname>
          </string-name>
          .
          <article-title>A statistical model for domain-independent text segmentation</article-title>
          .
          In <source>Association for Computational Linguistics</source>
          , Toulouse, France,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>