Evaluation of a Video Annotation Tool Based on the LSCOM Ontology

Emilie Garnaud, Alan F. Smeaton and Markus Koskela

E. Garnaud is with Institut EURECOM, 2229 Route des Crêtes, BP 193, 06904 Sophia Antipolis Cedex, France. A. Smeaton and M. Koskela are with the Centre for Digital Video Processing, Dublin City University, Glasnevin, Dublin 9, Ireland. Email: Alan.Smeaton@dcu.ie

Abstract— In this paper we present a video annotation tool based on the LSCOM ontology [1], which contains more than 800 semantic concepts. The tool provides four different ways for the user to locate appropriate concepts, namely basic search, search by theme, tree traversal, and one which uses pre-computed concept similarities to recommend concepts for the annotator to use. A set of user experiments is reported demonstrating the relative effectiveness of the different approaches.

Index Terms— Video annotation, ontology, LSCOM, semantic concept distances.

I. INTRODUCTION

In visual media processing, much progress has been made in automatically analysing low-level visual features in order to obtain a description of the content. However, annotations by humans are still often needed to extract accurate, deep semantic information. Indeed, manual tagging of visual content has become widespread on the internet through what is known as "folksonomy", in which human annotators provide descriptive content tags.

One of the challenges in the area of human annotation is achieving consistency across annotations in terms of both the vocabulary used and the way it is used. The common approach here is to provide users with an ontology, an organisation of allowable semantic tags or concepts. This is popular in enterprises such as photo and video stock archives, where only a small number of people actually perform the annotation and thus they are familiar with the ontology and the way it is used. In more open-ended applications, such as social tagging or tagging by untrained users, ontologies are regarded as too restrictive and too hard to learn in a short period of time, so such applications favour free-form tagging at the expense of the consistency that the use of an ontology brings.

Here, we address the issue of how an untrained user could use a pre-defined ontology to index video in the domain of broadcast TV news. Specifically, we use the LSCOM ontology [1] of about 850 concepts to help index media by semantics.

II. VIDEO ANNOTATION TOOL

Traditional annotation tools based on a lexicon or ontology usually provide a full list of concepts with no, or very poor, ways to navigate it. This works quite well for a small lexicon or for users who are trained to use it, but it does not scale to a larger ontology or to the case where the users are untrained. Thus, in order to use LSCOM or any other large ontology to index video, we need to support different ways for the user to navigate it in order to complete the annotation process. In our annotation tool there are four distinct ways to annotate content, described as follows.

A. Basic search

An alphabetically-ordered list of the ontology and a search box to find matching concepts is provided. This is simple but effective when users have a good knowledge of the ontology.

B. Search by themes

More than 700 concepts of the ontology have been arranged into 19 different themes such as Arts & Entertainment, Business & Commerce, News, Politics, Wars & Conflicts, etc., so an annotator can search for a concept by first selecting a theme that seems to fit the shot.

C. Recommended concepts

In previous work [2] we computed the similarity among all pairs of concepts in the LSCOM ontology using a combination of usage co-occurrence, gathered as the ontology was used to index a corpus of 80 hours of video, and visual shot-shot (and by implication, annotation-annotation) similarities. We used these concept-concept similarities to generate "recommended concepts" at any point after a shot has been annotated with at least one concept. This works by determining the 15 concepts most similar to the set of concepts already used to annotate the shot, and this top-15 is refreshed every time an additional concept is used in annotating the shot.

D. Tree organization

A hierarchical version of the ontology has recently been completed, so we introduced some of its elements into our tool by creating an area where the user can navigate among different trees of the ontology.
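As an illustration, the recommendation step described above can be sketched as follows. The similarity values and concept names here are hypothetical, and scoring a candidate by its summed similarity to the already-used concepts is an assumption; the paper does not specify how similarity to a *set* of concepts is aggregated.

```python
def recommend(used, similarity, all_concepts, k=15):
    """Return the k concepts most similar to the set of concepts
    already used to annotate a shot.  Aggregation by summed pairwise
    similarity is an assumption, not taken from the paper."""
    candidates = [c for c in all_concepts if c not in used]

    def score(c):
        # Unordered concept pairs are keyed by frozenset; unknown
        # pairs default to zero similarity.
        return sum(similarity.get(frozenset((c, u)), 0.0) for u in used)

    return sorted(candidates, key=score, reverse=True)[:k]

# Hypothetical concept-concept similarities for illustration only.
sim = {
    frozenset(("administrative assistant", "office")): 0.9,
    frozenset(("administrative assistant", "store")): 0.7,
    frozenset(("office", "landlines")): 0.6,
    frozenset(("store", "landlines")): 0.2,
}
concepts = ["office", "store", "landlines", "canal"]

# After the shot receives its first annotation, refresh the top-k list.
print(recommend({"administrative assistant"}, sim, concepts, k=2))
# → ['office', 'store']
```

Each time the annotator adds a concept to the shot, the tool would call `recommend` again with the enlarged `used` set, so the suggestion list adapts as the annotation grows.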
III. EXPERIMENTS AND ANALYSIS

We performed preliminary experiments involving 10 native English-speaking users who each annotated 40 shots using different functionalities of the tool, either within a restricted timeframe or with unlimited time to complete. To replicate the scenario of an untrained user annotating material on the internet, our users did not receive any special training in using the annotation tool. Shots to be annotated were selected randomly, and users were assigned functionalities in a Latin squares protocol so as not to bias the results. We analysed four different aspects of the annotation process, namely the overall time spent on annotating, the number of annotations per shot, the shot annotation rate, and the number of annotations during the first minute. Results are shown below.

                               Search   Search +   Search +   Entire
                               Only     Themes     Recmd.     Tool
Average time per shot          1m 53s   2m 06s     1m 53s     1m 59s
# annotations per shot (avg)   6.9      7.2        11.3       10.9
Annotation rate                6.1      5.8        10.1       9.2
Avg annotations in 1st minute  6.3      5.2        7.7        7.7

The best annotation performance is obtained using the "recommended concepts" feature: the time spent in free annotation is the same as for the "search only" version (representing the traditional approach), but the number of annotations is greater when recommendations are used. Using the "themes" feature seems to slow down the annotation process without increasing the number of annotations, probably due to a lack of knowledge of the ontology and of the way concepts had been organised into different themes. Also, some shots are well suited to annotation by themes while others are not, which is why themes are a good complement to searching for concepts to annotate.

We also found an unexpected result from the "entire tool" experiment, which surprisingly does not seem to be the most effective. Once more, this seems to be due to a lack of knowledge of the tool by users; our whole point in using untrained users is to replicate the common situation of untrained users annotating resources on the internet. If we examine the number of annotations made during the first minute, then "recommended concepts" and "entire tool" have the same performance, but after the first minute people lost time searching the ontology for additional concepts, as they did not have enough knowledge to know when to stop; searching the ontology does not provide any kind of closure to the process.

IV. CONCLUSIONS AND FUTURE WORK

The approach of using recommended concepts as a way of annotating seems to be promising, though the size of our experiment is small. The "recommended concepts" could be improved by collecting more data to link associated concepts. Indeed, some associated concepts are really good (like "store", "landlines", "bank", "office" and "female person" for "administrative assistant") but others are not, such as "harbors", "boat ship", "business people", "canal" and "lakes" for "house of worship".

The tool seems to be useful for various user profiles. For beginners, it helps them to learn the ontology, and for experts it provides a way to annotate concepts that they are not used to annotating, which improves their knowledge of the ontology.

ACKNOWLEDGMENT

This work was supported by Science Foundation Ireland under grant 03/IN.3/I361, by the EC under contract FP6-027026 (K-Space) and by IRCSET.

REFERENCES

[1] M. Naphade, J.R. Smith, J. Tesic, S.-F. Chang, W. Hsu, L. Kennedy, A. Hauptmann and J. Curtis. Large-Scale Concept Ontology for Multimedia. IEEE Multimedia, 13(3), July-Sept. 2006, pp. 86-91.
[2] M. Koskela and A.F. Smeaton. Clustering-Based Analysis of Semantic Concept Models for Video Shots. In Proc. IEEE International Conference on Multimedia & Expo (ICME 2006), Toronto, Canada, July 2006.