<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>EURECOM @ MediaEval 2011 Social Event Detection Task</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xueliang Liu</string-name>
          <email>xueliang.liu@eurecom.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Benoit Huet</string-name>
          <email>benoit.huet@eurecom.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Raphaël Troncy</string-name>
          <email>raphael.troncy@eurecom.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>EURECOM</institution>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2011</year>
      </pub-date>
      <fpage>1</fpage>
      <lpage>2</lpage>
      <abstract>
        <p>In this paper, we present our approach and results for the MediaEval 2011 social event detection (SED) task. We solve the event detection problem in three steps. First, we retrieve all event instances that took place under the given conditions. Then, an event identification model is proposed to measure the relationship between events and photos. Finally, visual pruning and owner refining heuristics are employed to improve the results.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        The Social Event Detection task at MediaEval 2011 aims
at detecting social events that occurred during May 2009
from a dataset composed of images shared on Flickr [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The
strategy we investigate is to find the event instances that
occurred during this period of time and then try to match
these event instances with photos from the Flickr dataset.
We also study how to employ the visual features and "owner"
metadata from the photos to improve the performance. We
first detail our approach (Section 2) before presenting and
discussing our results (Section 3). Finally, we conclude the
paper in Section 4.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. APPROACH DESCRIPTION</title>
      <p>The challenge of the social event detection task is to find
the photo clusters that are relevant to events held at a given
location during a particular period of time. We tackle this
problem in two steps: first, we attempt to retrieve all of
the events that occurred at a given place and time; second,
we use the extracted information about these events and
attempt to match them to the photo metadata in the dataset.
All of the photos that are matched to the same event can
be grouped in one cluster. Besides these two main steps, we
also improve the detection results with visual features and
"owner" metadata.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Prior Knowledge Acquisition</title>
      <p>We know that it is easier and more accurate for a
computer to identify specific patterns than abstract
concepts. To find the concert or soccer events that may be
hidden in the dataset, we first look for all instances of these two
kinds of events held at a given place and time.</p>
      <p>Soccer games and concerts are popular
activities in people's daily life, and one can find substantial
information online about such scheduled events. For example,
FBLeague<sup>1</sup> lists the official football games
registered with FIFA<sup>2</sup> and UEFA<sup>3</sup>. From this web site, we obtained
461 football games that occurred in May 2009, among which
6 took place in Rome and Barcelona. These 6 soccer events
are our prior knowledge for challenge 1.</p>
      <p>For challenge 2, we extract concert information from event
directories such as Last.fm<sup>4</sup>, Eventful<sup>5</sup>, and Upcoming<sup>6</sup>.
After a manual check, only Last.fm was found to contain descriptions of
events matching the given conditions. Last.fm is a popular
music web site that records concert events held in more than
190 countries. In addition, Last.fm provides an API that lets
developers build applications on its data. Using
this public API, we found 68 events that took place at the
Paradiso and 3 events at Parc del Forum in May 2009.</p>
    </sec>
    <sec id="sec-4">
      <title>2.2 Event Identification Model</title>
      <p>Given the prior knowledge of scheduled event descriptions,
the event detection task becomes a matching problem
in which a model measures the relationship
between events and photos. Here, we consider an event as
something that happens at some place during some period of time. Therefore,
the title, time and location are the three key factors that
identify an event. The corresponding photo metadata are the text
description, the time taken and the place. Assuming that these three factors
are independent, we can measure the probability of a given
photo P being relevant to an event E by</p>
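      <p>
        <disp-formula id="eq1">
          <label>(1)</label>
          <tex-math><![CDATA[p(P \mid E) = p(P.\mathrm{text} \mid E.\mathrm{title}) \; p(P.\mathrm{time} \mid E.\mathrm{time}) \; p(P.\mathrm{geo} \mid E.\mathrm{geo})]]></tex-math>
        </disp-formula>
      </p>
      <p>The first term measures the similarity between a photo's text
description and an event title. Since both of them are short
and sparse, the most straightforward way to measure it
is:
        <disp-formula id="eq2">
          <label>(2)</label>
          <tex-math><![CDATA[p(\mathit{Text}_1 \mid \mathit{Text}_2) = \frac{|\mathit{Text}_1 \cap \mathit{Text}_2|}{|\mathit{Text}_2|}]]></tex-math>
        </disp-formula>
where the function |·| returns the total number of words in a
text vector.</p>
      <p>The second term in Equation 1 measures the difference
between the time a photo was taken and the time the event was held. We
measure this difference using the Dirac delta function:
        <disp-formula id="eq3">
          <label>(3)</label>
          <tex-math><![CDATA[p(\mathit{Time}_1 \mid \mathit{Time}_2) = \delta\left(\frac{\mathrm{date}(\mathit{Time}_2 - \mathit{Time}_1)}{N}\right)]]></tex-math>
        </disp-formula>
where the function date(·) calculates the number of days
in a time span, δ is the Dirac delta function, used here in its discrete (Kronecker) form, which takes the
value 1 if and only if its argument is zero,
and N is used for scaling (its value is discussed in
Section 3).</p>
      <p><sup>1</sup> http://www.fbleague.com
<sup>2</sup> http://www.www.fifa.com
<sup>3</sup> http://www.www.uefa.com
<sup>4</sup> http://www.last.fm
<sup>5</sup> http://www.eventful.com
<sup>6</sup> http://upcoming.yahoo.com</p>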
      <p>The third term in Equation 1 measures the distance
between the photo geo tags and the event location. The most natural distance
measure would seem to be the L2 distance between the two
locations. However, a significant number of photos do not have
geo tags and, when provided, GPS data in the Flickr dataset
can be inaccurate. Consequently, we only use the city/venue
name to measure the location feature, with the textual
metric formalized in Equation 2.</p>
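      <p>As an illustration, the matching defined by Equations 1 to 3 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used for the runs: the Photo and Event classes and all function names are hypothetical, texts are compared as sets of lowercase whitespace-separated words, the day difference is taken in absolute value, and the quotient in Equation 3 is assumed to be truncated to an integer so that N widens the matching window.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Event:  # hypothetical container for a scheduled event
    title: str
    day: date
    venue: str


@dataclass
class Photo:  # hypothetical container for the photo metadata
    text: str
    taken: date
    place: str


def text_similarity(text1: str, text2: str) -> float:
    """Equation 2: share of the words of text2 that also occur in text1.

    Assumption: words are lowercased and counted once (set semantics).
    """
    words1 = set(text1.lower().split())
    words2 = set(text2.lower().split())
    if not words2:
        return 0.0
    return len(words1 & words2) / len(words2)


def time_similarity(taken: date, held: date, n: int) -> float:
    """Equation 3: 1.0 when the scaled day difference is zero, else 0.0.

    Assumption: the quotient is truncated to an integer, so photos taken
    fewer than n days from the event still match. n must be positive.
    """
    days = abs((held - taken).days)
    return 1.0 if days // n == 0 else 0.0


def relevance(photo: Photo, event: Event, n: int = 1) -> float:
    """Equation 1: product of the text, time and location terms.

    The location term reuses the textual metric on the venue name.
    """
    return (
        text_similarity(photo.text, event.title)
        * time_similarity(photo.taken, event.day, n)
        * text_similarity(photo.place, event.venue)
    )


# Hypothetical example data, not taken from the dataset.
event = Event("Example Band live", date(2009, 5, 10), "Paradiso Amsterdam")
photo = Photo("example band live at paradiso", date(2009, 5, 11), "paradiso amsterdam")
strict = relevance(photo, event, n=1)   # photo is one day off: rejected
relaxed = relevance(photo, event, n=3)  # same photo falls inside the wider window
```
]]></preformat>
      <p>Under these assumptions, a photo taken one day after the event is rejected with N = 1 and accepted with N = 3, which is how a larger N can absorb erroneous time stamps.</p>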
      <p>This method finds many photos with a clear description
and association to events. However, text-based matching
also introduces noise, and it cannot deal with photos without
any text description. We therefore employ visual features to remove
the noisy photos and "owner" metadata to find relevant
photos without text description.</p>
    </sec>
    <sec id="sec-5">
      <title>2.3 Visual Pruning</title>
      <p>
        Visual pruning is employed to remove the noisy photos
from the results of the Event Identification Model [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We
assume that the photos corresponding to the same
event should be visually similar. The method used here
is quite straightforward. Given a set of photo features
{<italic>f<sub>i</sub></italic>, <italic>i</italic> ∈ [1, <italic>N</italic>]}, the distance between each feature <italic>f<sub>i</sub></italic> and the
mean vector <italic>m</italic> is measured by the L1 distance.
      </p>
      <p>
        <disp-formula id="eq4">
          <label>(4)</label>
          <tex-math><![CDATA[d_i = \mathrm{sum}(|f_i - m|)]]></tex-math>
        </disp-formula>
Photos are then sorted according to the distance <italic>d<sub>i</sub></italic>. The
larger the distance, the less similar the photo is to
the photo cluster, so we prune the photos with the largest
distances. Experimentally, we remove the 5% of photos that are
farthest from the center in the visual feature space.</p>
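      <p>The pruning step described above can be sketched as follows. This is an illustrative sketch only: the function name and the rounding of the 5% quota are assumptions, and the visual features are taken to be plain lists of numbers of equal length, since the feature type is not specified here.</p>
      <preformat><![CDATA[
```python
from typing import List, Sequence


def prune_visual_outliers(
    features: Sequence[Sequence[float]], fraction: float = 0.05
) -> List[int]:
    """Return the indices of the photos kept after visual pruning.

    Each photo is scored by the L1 distance between its feature vector and
    the mean vector of the cluster (Equation 4); the given fraction of
    photos with the largest distances is dropped.
    Assumption: the number of photos to drop is round(count * fraction).
    """
    count = len(features)
    if count == 0:
        return []
    dims = len(features[0])
    # Mean vector m of the cluster.
    mean = [sum(f[k] for f in features) / count for k in range(dims)]
    # d_i = sum(|f_i - m|) for every photo.
    distances = [sum(abs(f[k] - mean[k]) for k in range(dims)) for f in features]
    n_remove = round(count * fraction)
    # Sort photo indices from closest to farthest and keep the closest ones.
    ranked = sorted(range(count), key=lambda i: distances[i])
    return sorted(ranked[: count - n_remove])


# Hypothetical cluster: 19 identical photos and one visual outlier.
cluster = [[0.0, 0.0] for _ in range(19)] + [[10.0, 10.0]]
kept = prune_visual_outliers(cluster)  # the outlier at index 19 is dropped
```
]]></preformat>
      <p>Returning indices rather than feature vectors makes it easy to map the kept entries back to the photos of the cluster.</p>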
    </sec>
    <sec id="sec-6">
      <title>2.4 Owner Refinement</title>
      <p>
        Owner refinement is another way to improve the detection
results [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. We assume that a person cannot attend more
than one event simultaneously. Therefore, all the photos
that have been taken by the same owner during the event
duration should be assigned to the same cluster. Using this
heuristic, it is possible to retrieve photos which do not have
any textual description.
      </p>
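      <p>A minimal sketch of this heuristic is given below. The Photo class, the event_spans mapping and the function name are hypothetical, and the event duration is assumed to be available as a start and end time for each event.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass
class Photo:  # hypothetical container for the photo metadata
    photo_id: str
    owner: str
    taken: datetime
    event_id: Optional[str] = None  # set by the text-based matching, else None


def refine_by_owner(
    photos: List[Photo], event_spans: Dict[str, Tuple[datetime, datetime]]
) -> int:
    """Assign unmatched photos to an event their owner is known to attend.

    A photo without an event is added to an event when its owner already
    has a matched photo in that event and the photo was taken within the
    event duration. Returns the number of newly assigned photos.
    """
    # (owner, event) pairs established by the earlier matching step.
    attended = {(p.owner, p.event_id) for p in photos if p.event_id is not None}
    added = 0
    for photo in photos:
        if photo.event_id is not None:
            continue
        for event_id, (start, end) in event_spans.items():
            if (photo.owner, event_id) in attended and start <= photo.taken <= end:
                photo.event_id = event_id
                added += 1
                break
    return added


# Hypothetical example data, not taken from the dataset.
spans = {"e1": (datetime(2009, 5, 10, 19), datetime(2009, 5, 10, 23))}
photos = [
    Photo("p1", "alice", datetime(2009, 5, 10, 20), "e1"),  # matched by text
    Photo("p2", "alice", datetime(2009, 5, 10, 21)),        # no text, same evening
    Photo("p3", "alice", datetime(2009, 5, 12, 21)),        # outside the event
    Photo("p4", "bob", datetime(2009, 5, 10, 21)),          # owner not seen at e1
]
added = refine_by_owner(photos, spans)
```
]]></preformat>
      <p>In this example only the second photo is added to the event: it shares its owner with a matched photo and falls inside the event duration.</p>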
    </sec>
    <sec id="sec-7">
      <title>3. EXPERIMENTS AND RESULTS</title>
      <p>Based on the proposed approach and the event instances
obtained previously, we design our runs as follows.</p>
      <p>Challenge 1:</p>
      <list list-type="simple">
        <list-item><p>run1: The parameter N in Equation 3 is set to 3, and
the basic Event Identification Model is run.</p></list-item>
        <list-item><p>run2: Owner Refinement is performed on the results
of run1.</p></list-item>
      </list>
      <p>Challenge 2:</p>
      <list list-type="simple">
        <list-item><p>run1: The parameter N in Equation 3 is set to 1, and
the basic Event Identification Model is run.</p></list-item>
        <list-item><p>run2: Owner Refinement is performed on the results
of run1.</p></list-item>
        <list-item><p>run3: The parameter N in Equation 3 is set to 3 to
reduce the impact of erroneous time stamps, and the
basic Event Identification Model is run.</p></list-item>
        <list-item><p>run4: Owner Refinement is performed on the results
of run3.</p></list-item>
        <list-item><p>run5: Visual Pruning and Owner Refinement are
performed on the results of run3.</p></list-item>
      </list>
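      <p>A summary of the results is given in Table 1.</p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>Summary of the results.</p>
        </caption>
        <table>
          <thead>
            <tr><th rowspan="2">Run</th><th colspan="2">Results</th><th rowspan="2">NMI</th></tr>
            <tr><th>Events</th><th>Photos</th></tr>
          </thead>
          <tbody>
            <tr><td>run 1.1</td><td>2</td><td>216</td><td>0.2420</td></tr>
            <tr><td>run 1.2</td><td>2</td><td>222</td><td>0.2472</td></tr>
            <tr><td>run 2.1</td><td>18</td><td>1133</td><td>0.4516</td></tr>
            <tr><td>run 2.2</td><td>18</td><td>1172</td><td>0.4697</td></tr>
            <tr><td>run 2.3</td><td>24</td><td>1502</td><td>0.5987</td></tr>
            <tr><td>run 2.4</td><td>24</td><td>1556</td><td>0.6171</td></tr>
            <tr><td>run 2.5</td><td>24</td><td>1546</td><td>0.6139</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>As shown in Table 1, 2 events are found for challenge 1, with
216 photos identified by the Event Identification Model. 6
additional photos are found by the "Owner Refinement"
approach. For challenge 2, there are two main groups
of runs. The first group (run1, run2) uses the parameter
N=1, and 18 events are found from the set of 69 events
previously obtained. In the second group (run3, run4, run5), 24
events are found with the parameter N=3. In general, the
results for challenge 1 are only average, since only 6
football games were found as prior knowledge and we suppose
that several other games have been missed. For
challenge 2, the results are more promising and competitive.</p>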
    </sec>
    <sec id="sec-8">
      <title>4. CONCLUSION</title>
      <p>In this paper, we propose a framework to detect social
events within a media dataset. In our approach, the event
instances are first retrieved as prior knowledge, and then an
Event Identification Model is used to measure the similarity
between events and photos. The solution uses multi-modal features
such as text, time, visual features and "owner" metadata.</p>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>This work is supported by the project AAL-2009-2-049
"Adaptable Ambient Living Assistant" (ALIAS) co-funded by the
European Commission and the French Research Agency (ANR)
in the Ambient Assisted Living (AAL) programme.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Troncy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Huet</surname>
          </string-name>
          .
          <article-title>Finding Media Illustrating Events</article-title>
          .
          <source>In 1st ACM International Conference on Multimedia Retrieval (ICMR'11)</source>
          , Trento, Italy,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Papadopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Troncy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Mezaris</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Huet</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Kompatsiaris</surname>
          </string-name>
          .
          <article-title>Social Event Detection at MediaEval 2011: Challenges, Dataset and Evaluation</article-title>
          .
          <source>In MediaEval 2011 Workshop</source>
          , Pisa, Italy, September 1-2,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>