An Approach to Self-Annotating Content

Adrian Matellanes, Freddy Snijder and Barbara Schmidt-Belz

Manuscript received 18 September 2006. This research was supported by the European IST project aceMedia (http://www.acemedia.org) under contract FP6-001765. Adrian Matellanes is with Motorola Labs, Basingstoke, UK (phone: +44 1256 484794; e-mail: adrian.matellanes@motorola.com). Freddy Snijder is with Philips Research, Eindhoven, The Netherlands (e-mail: freddy.snijder@philips.com). Barbara Schmidt-Belz is with Fraunhofer-FIT, Sankt Augustin, Germany (e-mail: Barbara.Schmidt-Belz@fit.fraunhofer.de).

Abstract—aceMedia content analysis capabilities are centered around the concept of the ACE. The ACE is composed of a content layer, a metadata layer and an intelligence layer. In this paper we show one application of the ACE Intelligence layer and how its proactive behavior can help in the complex task of adding semantic metadata to multimedia content.

Index Terms—multimedia content analysis, proactive content, self-annotation

I. INTRODUCTION

Digital multimedia content management is a very complex task. Huge efforts in research and development are being carried out in industry and academia to alleviate this complexity and to provide solutions that help end-users and professionals easily manage their collections of multimedia content.

The aceMedia project tackles this problem with a wide range of technologies, from multimedia knowledge representation [6], multimedia content analysis [3] and personalized search and browsing [4] to content adaptation [8], to cite a few references. Fundamental to aceMedia's approach is the introduction of the Autonomous Content Entity (ACE) [1]. The ACE is a multimedia object comprising three layers: the first layer is the multimedia content itself; the second layer is the metadata layer, which includes manual and automatic annotations; and the third layer is a programmable layer, called the "Intelligence layer", that provides proactiveness to the ACE. The intelligence layer is envisaged to help in the complex problem of multimedia content management by enabling content items to perform actions on behalf of the user, wherever they reside.

This paper briefly describes one of the applications of content proactiveness enabled by aceMedia technologies, namely the creation of self-annotating content. Content autonomy is not limited to self-annotation; other activities based on the ACE intelligence layer are also carried out [9].

II. MOTIVATION

A. Content flow

Nowadays, content flows from one device to another. Devices are connected to Personal Area Networks (PANs), Local Area Networks (LANs) and Wide Area Networks (WANs), allowing multimedia content to be easily transferred and shared. Devices supporting multimedia content range from powerful media servers to desktop PCs, set-top-boxes (thin and thick) and small devices such as mobile phones and PDAs. These devices have different characteristics, specific uses and processing capabilities.

This flow of content from one device to another motivates the idea of a content item (an ACE in the aceMedia context) annotating itself whenever it reaches a target device, provided that device has annotation capabilities. There are different scenarios in which content can enrich its metadata as it moves from one device to another. Different annotation capabilities can be found on different devices, e.g. device A does not have a certain content classification module that is present on device B; moreover, the same annotation modules can have different capabilities depending on the device on which they reside, e.g. device A's face recognition module may know a different set of persons than device B's.

B. User participation

Purely automatic annotation still has a long way to go before it can provide the user with accurate semantic annotations. The semantic metadata associated with the content can be improved with the help of the user. Some of our studies, contrary to some common beliefs, showed that users are willing to "help the system" with their manual annotations [2].

To incorporate users' manual annotations we will create proactive content that analyzes its own automatic semantic annotations and asks the user pertinent questions to resolve ambiguities or add unknown information, e.g. a detected face is not known to the face recognition module, so the ACE asks the user "Who is the person whose face is inside the bounding box?". The user, always in control, can obviously ignore these questions, as the system does not strictly depend on them.
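As a minimal illustration of this interaction, the following Python sketch shows how an ACE's intelligence layer might turn an unrecognized face annotation into a question for the user. All names here (FaceAnnotation, ask_user, resolve_unknown_faces) are hypothetical and do not reflect the actual aceMedia interfaces.

    # Hypothetical sketch of the user-participation loop described above.
    # None of these types or functions are part of the actual aceMedia
    # framework; they only illustrate the idea.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class FaceAnnotation:
        bounding_box: tuple      # (x, y, width, height) in image coordinates
        person: Optional[str]    # None if the face recognizer did not know it

    def ask_user(question: str) -> Optional[str]:
        """Present a question to the user; an empty reply means 'skip'."""
        reply = input(question + " (press Enter to skip) ")
        return reply or None

    def resolve_unknown_faces(annotations):
        # The ACE inspects its own automatic annotations and asks the user
        # only about the ambiguous ones; unanswered questions are simply
        # dropped, so the user always stays in control.
        for ann in annotations:
            if ann.person is None:
                ann.person = ask_user(
                    "Who is the person whose face is inside "
                    f"{ann.bounding_box}?")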
III. THE SELF-ANNOTATING PROCESS

A. Proactiveness

In the previous section we presented some motivations for giving autonomy and proactiveness to multimedia content when we try to add manual or automatic semantic annotations to an ACE. It is important to emphasize here that the whole process of self-annotation, and the ultimate decision to add semantic annotations to an ACE, resides in the ACE itself.

We will not go into detail about the software architecture that enables ACEs to run their programmable Intelligence layer in order to give them autonomy; a brief description can be found in [1].

B. The AnnotationManager runs the analysis

Content is analyzed by different content analysis modules that produce semantic metadata, which in turn is added to the ACE metadata layer. A typical, application-driven annotation process is described in [2].

In our case of self-annotation, it is important to clarify that the ACE intelligence layer is in charge of starting and stopping the annotation process and decides which content analysis, if any, needs to be run. The ACE programmable intelligence layer does not itself include the analysis algorithms that analyze the content and produce new metadata.

As explained in the previous section, the modules in charge of analyzing the content and adding new metadata can differ from one device to another; they are offered to the ACE intelligence layer through a common framework called the AnnotationManager.

The AnnotationManager interacts with the ACE intelligence layer and runs the requested analysis modules in the appropriate order. The AnnotationManager also deals with dependencies, e.g. a face recognition module may depend on a face detection module.

Once the AnnotationManager has called the analysis modules requested by the ACE, it will always run the Multimedia Reasoning module to ensure metadata consistency, remove ambiguities and derive new annotations if possible. A sketch of this orchestration is given below.
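The following minimal Python sketch illustrates this orchestration: analysis modules declare their prerequisites, the manager runs the requested analyses in dependency order, and the reasoning step always runs last. The class, method and module names are our own assumptions for illustration, not the actual aceMedia APIs.

    # Hypothetical sketch of an AnnotationManager that orders the requested
    # analysis modules by their declared dependencies and always finishes
    # with the reasoning step. All names are illustrative only.
    from dataclasses import dataclass, field

    @dataclass
    class ACE:
        content: bytes                                 # the content layer
        metadata: dict = field(default_factory=dict)   # the metadata layer

    def multimedia_reasoning(metadata):
        """Placeholder for the Multimedia Reasoning step: ensure metadata
        consistency, remove ambiguities, derive new annotations."""

    class AnnotationManager:
        def __init__(self):
            self.modules = {}   # category -> (prerequisites, analysis fn)

        def register(self, category, prerequisites, run):
            self.modules[category] = (prerequisites, run)

        def annotate(self, ace, requested):
            ordered, seen = [], set()

            def schedule(category):     # depth-first dependency resolution
                if category in seen:
                    return
                seen.add(category)
                for dep in self.modules[category][0]:  # prerequisites first
                    schedule(dep)
                ordered.append(category)

            for category in requested:
                schedule(category)
            for category in ordered:
                prerequisites, run = self.modules[category]
                ace.metadata[category] = run(ace.content)
            multimedia_reasoning(ace.metadata)   # always runs last

    # Example: face recognition depends on face detection, so requesting
    # only "face_recognition" still runs "face_detection" first.
    manager = AnnotationManager()
    manager.register("face_detection", [], lambda content: [])
    manager.register("face_recognition", ["face_detection"], lambda content: [])
    manager.annotate(ACE(content=b""), ["face_recognition"])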
C. Self-annotation process

In this section we describe a typical self-annotation process. As explained in the previous section, the self-annotating ACE is in control of the annotation process, but it performs neither the analysis nor the annotation itself. This way, the ACE can benefit from the different capabilities offered by different devices and contexts (see Section II).

When an ACE is transferred to a different device, its self-annotation process is started. The self-annotating Intelligence layer asks the device what semantic annotation capabilities are present on the device. This request is received by the AnnotationManager.

The AnnotationManager analyzes the kind of content stored in the ACE, i.e. whether it is a still image, a video clip or any other type of media. Based on this analysis, the AnnotationManager decides what analysis capabilities it can offer, e.g. face detection, face recognition, speech recognition, knowledge-assisted analysis, etc.

The ACE checks whether each of these types of annotation has already been performed, and creates a list of the missing annotation categories it is interested in. The ACE sends this list to the AnnotationManager, which calls the analysis modules in the appropriate order and resolves dependencies if needed. The resulting semantic annotations are then added to the ACE metadata layer and finally, as explained before, the Multimedia Reasoning module is called.
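From the ACE's point of view, this device-arrival workflow can be sketched as follows, continuing the hypothetical ACE and AnnotationManager of Section III.B. The capability query and the category-keyed metadata layout are again our own assumptions, not the actual aceMedia interfaces.

    # Hypothetical continuation of the previous sketch: the workflow the
    # ACE intelligence layer runs when the ACE arrives on a new device.
    def capabilities_for(manager, content):
        """Assumed device query: which annotation categories can this
        device offer for this kind of content? Here, simply every
        registered module, regardless of media type."""
        return list(manager.modules)

    def on_device_arrival(ace, manager):
        # 1. Ask the device what semantic annotation capabilities it has.
        offered = capabilities_for(manager, ace.content)
        # 2. Keep only the annotation categories still missing from the
        #    ACE metadata layer.
        missing = [category for category in offered
                   if category not in ace.metadata]
        # 3. Request the missing analyses; the AnnotationManager orders
        #    them, resolves dependencies and calls reasoning last.
        if missing:
            manager.annotate(ace, missing)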
We have just outlined a very simple implementation of a self-annotating ACE. The ACE self-annotating intelligence layer can indeed be programmed to perform more complex tasks and to take other decisions, such as raising questions to the user (see Section II) or preventing certain analyses from being performed (because of privacy issues, for example).

IV. CONCLUSION

One of the objectives of aceMedia is to explore advanced content management techniques through the concept of the ACE and its Intelligence layer. aceMedia has successfully created a framework for the deployment of Autonomous Content Entities (ACEs). These ACEs can have proactive behavior that helps users in their digital media management.

We have briefly outlined in this paper one of the applications of the ACE Intelligence layer: the creation of self-annotating content. We have presented the motivation which led us to make ACEs self-annotating, as opposed to being annotated passively. Finally, we have outlined the process and workflow of self-annotation.

Within aceMedia we are investigating other applications of the ACE Intelligence layer that are outside the scope of this paper, such as self-organizing ACEs and self-governing ACEs.

REFERENCES

[1] A. Matellanes, T. May, F. Snijder, P. Villegas, E. O. Dijk and A. Kobzhev, "An architecture for multimedia content management," in EWIMT 2005, London, UK.
[2] A. Matellanes, A. Evans and B. Erdal, "Creating an application for automatic annotation of images and video," in SWAMM 2006, Edinburgh, UK.
[3] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection," in IEEE CVPR 2005, San Diego, USA, June 2005.
[4] S. Bloehdorn, K. Petridis, C. Saathoff, N. Simou, V. Tzouvaras, Y. Avrithis, S. Handschuh, Y. Kompatsiaris, S. Staab and M. G. Strintzis, "Semantic Annotation of Images and Videos for Multimedia Analysis," in ESWC 2005, Heraklion, Greece, May 2005.
[5] D. Vallet, M. Fernández, P. Castells, P. Mylonas and Y. Avrithis, "Personalized Information Retrieval in Context," in MRC 2006 at AAAI 2006, Boston, USA, July 2006.
[6] K. Petridis, S. Bloehdorn, C. Saathoff, N. Simou, S. Dasiopoulou, V. Tzouvaras, S. Handschuh, Y. Avrithis, I. Kompatsiaris and S. Staab, "Knowledge Representation and Semantic Annotation of Multimedia Content," IEE Proceedings on Vision, Image and Signal Processing, Special Issue on Knowledge-Based Digital Media Processing, Vol. 153, No. 3, pp. 255-262, June 2006.
[7] J. Malobabic, H. Le Borgne, N. Murphy and N. O'Connor, "Detecting the Presence of Large Buildings in Natural Images," in 4th International Workshop on Content-Based Multimedia Indexing (CBMI 2005), Riga, Latvia, 21-23 June 2005.
[8] N. Sprljan, M. Mrak, G. C. K. Abhayaratne and E. Izquierdo, "A Scalable Coding Framework for Efficient Video Adaptation," in Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS 2005), Montreux, Switzerland, April 13-15, 2005.
[9] P. Charlton and J. Teh, "A self-governance approach to supporting privacy preference-based content sharing in distributed environments," in SOAS 2006, Erfurt, Germany, 18-20 September 2006.