=Paper=
{{Paper
|id=Vol-233/paper-17
|storemode=property
|title=Evaluation of a Video Annotation Tool Based on the LSCOM Ontology
|pdfUrl=https://ceur-ws.org/Vol-233/p35.pdf
|volume=Vol-233
|dblpUrl=https://dblp.org/rec/conf/samt/GarnaudSK06
}}
==Evaluation of a Video Annotation Tool Based on the LSCOM Ontology==
Emilie Garnaud, Alan F. Smeaton and Markus Koskela

E. Garnaud is with Institut EURECOM, 2229 Route des Crêtes, BP 193, 06904 Sophia Antipolis Cedex, France. A. Smeaton and M. Koskela are with the Centre for Digital Video Processing, Dublin City University, Glasnevin, Dublin 9, Ireland. Email: Alan.Smeaton@dcu.ie
Abstract— In this paper we present a video annotation tool based on the LSCOM ontology [1], which contains more than 800 semantic concepts. The tool provides four different ways for the user to locate appropriate concepts to use, namely basic search, search by theme, tree traversal, and one which uses pre-computed concept similarities to recommend concepts for the annotator to use. A set of user experiments is reported demonstrating the relative effectiveness of the different approaches.

Index Terms— Video annotation, ontology, LSCOM, semantic concept distances.
I. INTRODUCTION

In visual media processing, a lot of progress has been made in automatically analysing low-level visual features in order to obtain a description of the content. However, annotations by humans are still often needed to extract accurate, deep semantic information. Indeed, manual tagging of visual content has become widespread on the internet through what is known as "folksonomy", in which human annotators provide descriptive content tags.

One of the challenges in the area of human annotation is achieving consistency across annotations, in terms of both the vocabulary used and the way it is used. The common approach here is to provide users with an ontology, an organisation of allowable semantic tags or concepts. This is popular in enterprises such as photo and video stock archives, where only a small number of people actually perform the annotation and are thus familiar with the ontology and the way it is used. In more open-ended applications, such as social tagging or tagging by untrained users, ontologies are regarded as too restrictive and too hard to learn in a short period of time, so such applications favour free-form tagging at the expense of the consistency an ontology brings.

Here, we address the issue of how an untrained user could use a pre-defined ontology to index video in the domain of broadcast TV news. Specifically, we use the LSCOM ontology [1] of about 850 concepts to help index media by semantics.

II. VIDEO ANNOTATION TOOL

Traditional annotation tools based on a lexicon or ontology usually provide a full list of concepts with no, or very poor, ways to navigate it. This works quite well for a small lexicon or for users who are trained to use it, but it does not scale to a larger ontology or to the case where the users are untrained. Thus, in order to use the LSCOM or any other large ontology to index video, we need to support different ways for the user to navigate it in order to complete the annotation process. In our annotation tool there are four distinct ways to annotate content, described as follows.

A. Basic search

An alphabetically-ordered list of the ontology and a search box to find matching concepts is provided. This is simple but effective when users have a good knowledge of the ontology.
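The paper does not specify the matching logic; a minimal sketch of such a lookup, assuming a case-insensitive substring match over an alphabetically sorted concept list (the concept names below are illustrative, not the full LSCOM vocabulary), might look like this:

```python
# Minimal sketch of the "basic search" lookup: a case-insensitive
# substring match over an alphabetically ordered concept list.
# The concept names here are illustrative, not the full LSCOM set.
CONCEPTS = sorted([
    "Airplane", "Bank", "Boat Ship", "Harbors", "House Of Worship",
    "Lakes", "Office", "Store",
])

def basic_search(query: str) -> list[str]:
    """Return all concepts whose name contains the query string."""
    q = query.lower()
    return [c for c in CONCEPTS if q in c.lower()]

print(basic_search("bo"))  # ['Boat Ship', 'Harbors']
```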
B. Search by themes

More than 700 concepts of the ontology have been arranged into 19 different themes, such as Arts & Entertainment, Business & Commerce, News, Politics, Wars & Conflicts, etc., so an annotator can search for a concept by first selecting a theme that seems to fit with the shot.
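A hypothetical sketch of this theme-based lookup follows; the theme-to-concept assignments are illustrative, since the actual 19-theme grouping is not listed in the paper:

```python
# Sketch of the theme-based lookup: each theme maps to the subset of
# concepts filed under it. The assignments below are illustrative;
# the actual grouping covers more than 700 concepts in 19 themes.
THEMES = {
    "Business & Commerce": ["Bank", "Office", "Store"],
    "Wars & Conflicts": ["Explosion", "Soldier", "Weapons"],
}

def concepts_for_theme(theme: str) -> list[str]:
    """Return the concepts filed under a theme, or [] if unknown."""
    return sorted(THEMES.get(theme, []))

print(concepts_for_theme("Business & Commerce"))  # ['Bank', 'Office', 'Store']
```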
time and so such applications favour free form tagging at the
expense of the consistency the use of an ontology brings.
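The paper does not state how similarity to a *set* of already-used concepts is aggregated; the sketch below assumes pairwise scores are summed, with `similarity` standing in for the pre-computed concept-concept values from [2]:

```python
# Sketch of the "recommended concepts" mechanism: rank the concepts not
# yet used on the shot by their similarity to the set already used, and
# keep the top 15. Summing pairwise scores is an assumption made here;
# `similarity` stands in for the pre-computed values from [2].
def recommend(annotated: set[str],
              similarity: dict[tuple[str, str], float],
              all_concepts: list[str],
              k: int = 15) -> list[str]:
    def score(candidate: str) -> float:
        # Total similarity of the candidate to every concept already used.
        return sum(similarity.get((a, candidate),
                                  similarity.get((candidate, a), 0.0))
                   for a in annotated)
    unused = [c for c in all_concepts if c not in annotated]
    return sorted(unused, key=score, reverse=True)[:k]

# The top-15 list is refreshed each time the annotator adds a concept:
#   annotated.add(new_concept)
#   recommendations = recommend(annotated, similarity, all_concepts)
```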
D. Tree organization

A hierarchical version of the ontology has recently been completed, so we introduced some of its elements into our tool by creating an area where a user can navigate among the different trees of the ontology.
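A minimal sketch of such a tree view, using nested dictionaries for an illustrative subtree (not the actual LSCOM hierarchy):

```python
# Sketch of the tree-navigation area: a fragment of a hierarchical
# ontology as nested dictionaries. The subtree shown is illustrative.
TREE = {
    "Location": {
        "Indoor": {"Office": {}, "Store": {}},
        "Outdoor": {"Harbors": {}, "Lakes": {}},
    },
}

def walk(node: dict, depth: int = 0) -> None:
    """Print each concept indented by its depth, mimicking an
    expandable tree widget."""
    for name, children in node.items():
        print("  " * depth + name)
        walk(children, depth + 1)

walk(TREE)
```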
III. EXPERIMENTS AND ANALYSIS

We performed preliminary experiments involving 10 native English-speaking users who each annotated 40 shots using different functionalities of the tool, either in a restricted timeframe or with unlimited time to complete. To replicate the scenario of an untrained user annotating material on the internet, our users did not receive any special training in using the annotation tool. Shots to be annotated were selected randomly, and users were assigned functionalities in a Latin-squares protocol so as not to bias the results.
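The exact counterbalancing design is not given in the paper; a cyclic Latin square, as sketched below, is one standard way to rotate the four tool configurations across users so that no configuration is systematically encountered first or last:

```python
# Sketch of a Latin-square assignment for counterbalancing: each user
# sees the four tool configurations in a cyclically rotated order. The
# exact design used in the experiment is not specified in the paper.
CONFIGS = ["Search Only", "Search + Themes", "Search + Recmd.", "Entire Tool"]

def latin_square_order(user_index: int) -> list[str]:
    """Rotate the configuration list by the user's index (mod 4)."""
    shift = user_index % len(CONFIGS)
    return CONFIGS[shift:] + CONFIGS[:shift]

for user in range(4):
    print(user, latin_square_order(user))
```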
We analyzed four different aspects of the annotation process, namely the overall time spent on annotating, the number of annotations per shot, the shot annotation rate, and the number of annotations during the first minute. Results are shown in the table below.

                                 Search    Search +    Search +    Entire
                                 Only      Themes      Recmd.      Tool
 Average time per shot           1m 53s    2m 06s      1m 53s      1m 59s
 # annotations per shot (avg)    6.9       7.2         11.3        10.9
 Annotation rate                 6.1       5.8         10.1        9.2
 Avg annotations in 1st minute   6.3       5.2         7.7         7.7
The best annotation performance is obtained using the "recommended concepts" feature: the time spent in free annotation is the same as for the "search only" version (representing the traditional approach), but the number of annotations is greater when recommendations are used. Using the "themes" feature seems to slow down the annotation process without increasing the number of annotations, probably due to a lack of knowledge of the ontology and of the way concepts had been organised into the different themes. Also, some shots are really well suited to annotation by themes while others are not, which is why themes are a good complement to searching for concepts to annotate.

We also found an unexpected result from the "entire tool" experiment, which surprisingly does not seem to be the most effective! Once more, this seems to be due to a lack of knowledge of the tool by users; our whole point of using untrained users is to replicate the common situation of untrained users annotating resources on the internet. If we examine the number of annotations made during the first minute, then "recommended concepts" and "entire tool" have the same performance, but after the first minute people lost time searching the ontology for additional concepts, as they did not have enough knowledge to know when to stop and searching the ontology does not provide any kind of closure to the process.

IV. CONCLUSIONS AND FUTURE WORK

The approach of using recommended concepts as a way of annotating seems promising, though the size of our experiment is small. The "recommended concepts" could be improved by collecting more data to link associated concepts. Indeed, some associated concepts are really good (like "store", "landlines", "bank", "office" and "female person" for "administrative assistant") but others are not (such as "harbors", "boat ship", "business people", "canal" and "lakes" for "house of worship").

The tool seems to be well suited to various user profiles. For beginners, it helps them to learn the ontology; for experts, it provides a way to annotate with concepts that they are not used to annotating, which improves their knowledge of the ontology.

ACKNOWLEDGMENT

This work was supported by Science Foundation Ireland under grant 03/IN.3/I361, by the EC under contract FP6-027026 (K-Space), and by IRCSET.

REFERENCES

[1] M. Naphade, J.R. Smith, J. Tesic, S.-F. Chang, W. Hsu, L. Kennedy, A. Hauptmann and J. Curtis. Large-Scale Concept Ontology for Multimedia. IEEE Multimedia, 13(3), July-Sept 2006, pp. 86–91.
[2] M. Koskela and A.F. Smeaton. Clustering-Based Analysis of Semantic Concept Models for Video Shots. In Proc. IEEE International Conference on Multimedia & Expo (ICME 2006), Toronto, Canada, July 2006.