Towards a User-aware Enrichment of Multimedia Metadata

Ana-Maria Manzat, Romulus Grigoras, Florence Sèdes

Université de Toulouse – IRIT UMR 5505, 118 Route de Narbonne, 31062 Toulouse, France
{Ana-Maria.Manzat, Florence.Sedes}@irit.fr, Romulus.Grigoras@enseeiht.fr

Abstract. A recent trend in multimedia information retrieval systems is the integration of users, through their preferences and interests, into the retrieval process. Generally, such systems consider the user only after the query’s execution, during the presentation of the results. We propose to consider the user as a source of metadata, by exploiting his behaviour to enrich the document’s metadata with usage metadata. We introduce the concept of temperature, associated with each metadata descriptor, which denotes the popularity of the multimedia document’s metadata. An algorithm for the computation, the increase and the decrease of this temperature is described in detail. We also present how this algorithm can be used to enrich each metadata descriptor according to the user’s interactions with the multimedia content and the metadata.

Keywords: user’s behaviour, multimedia metadata enrichment, metadata popularity, multimedia systems

1 Introduction

Nowadays, we are constantly surrounded by multimedia contents and devices. Thus, we are continuously creating and consuming multimedia data. Usually, before creating a multimedia document, the user has an idea of the kind of information he wants to include in his document, and then he searches for the multimedia contents that correspond to his needs [1]. Hence, the management of multimedia documents, which includes their storage, indexation and retrieval processes, is very important.

A recent trend in the information retrieval domain is the user’s integration in the retrieval process. The user’s preferences, interests and behaviour are analysed and modelled in order to improve the performance of the system.
This improvement is realised by providing better results to a user query and by recommending to him other interesting documents accessed by users who have similar profiles [2].

In this context, we focus on the user’s integration in the metadata management process. We want to provide a solution for enriching the metadata through their usage and through the user’s interaction with the multimedia document to which they are associated. This enrichment is accomplished through the concept of temperature, which is associated with each metadata descriptor related to the multimedia document and with the multimedia document itself. This temperature can be considered as popularity metadata that is updated each time the document or a part of it is consumed. Thus, the more a document is consumed, the hotter it and its metadata get.

In this paper we focus on the presentation of: (1) an algorithm that exploits this concept by specifying the manner in which the temperature can be increased or decreased, and (2) the algorithm’s application in several scenarios. This kind of metadata can have several uses: the recommendation of a certain document or only a part of it; the execution of the user’s query, by taking into account the document’s temperature in the computation of its score; the creation of the document’s summary to be displayed in the results list; the selection of a video’s key-frames according to the user’s profile.

The remainder of the paper is structured as follows. We begin with an overview of multimedia metadata and the user’s interaction in multimedia information systems, in Section 2. Then, in Section 3, we present a metadata framework that includes the concept of temperature. The proposed solution for the metadata enrichment according to their usage is described in Section 4. Finally, some preliminary results and conclusions are given.

2 State of the art

From our daily experience, we can deduce that the best way to find desired information in a huge collection of documents is to look not at the information itself but rather at a much smaller and more focused set of data. In the context of multimedia retrieval systems, this concise information is the metadata. Metadata can be classified into: (1) content metadata (low-level, high-level, structure, life-cycle, identification and localization, and management metadata) and (2) user metadata (user interaction and user context) [3]. In recent years, the number and the heterogeneity of metadata formats have increased steeply. The majority of these standards are content centred, e.g., Dublin Core, XMP, MPEG-7, TV-Anytime.

In general, an information system in charge of managing and retrieving multimedia contents is composed of [4, 5]: (1) a multimedia collection which contains several multimedia contents; (2) a metadata collection which contains information about the media characteristics (e.g., size, name) and their contents; (3) an indexation engine which includes several indexing algorithms to be applied on the multimedia collection in order to enrich the metadata collection.

The indexing algorithms automatically applied on the multimedia contents produce metadata encoded in different standards and formats. These metadata are further employed in the retrieval process. This makes the management of the metadata and the query execution very important tasks for a multimedia information retrieval system.

In [6], the metadata is placed at the centre of the multimedia document lifecycle, which makes metadata creation and management a very important issue in the handling of multimedia documents. In addition, the metadata is consumed and produced at every stage of the document lifecycle [7]. This leads to a constant user interaction with the metadata, in a direct or indirect manner.
Thus, the user can be considered as an auxiliary source of metadata, which could improve the metadata obtained from the indexation process. He can produce metadata in an explicit or implicit manner.

By attaching annotations and tags [8] to multimedia documents, the user creates explicit metadata. The drawback of using this approach for enriching the metadata is that users are usually busy and annotating documents demands a lot of time and effort; consequently, the created metadata is very poor.

In order to obtain more information from users, other strategies have been developed. One of them is to analyse the user’s behaviour and to infer his interests [9] and his intentions [10]. These interests are used, for example, to adapt the presentation of the multimedia documents [11] and of the query results list [12], or to enrich the user query [13].

Apart from the implicit and explicit metadata, we can also consider the attention [14] and usage metadata. This information is associated with the document and not with the user, as is the case for the interests. In [15], the authors propose an algorithm for determining such metadata. They determine the popularity of multimedia documents according to the number of users that access the documents. This popularity information is attached to entire documents, and not to parts of documents. Also, since it is computed only from the number of users that access the document, the users’ interests and preferences are not taken into account.

The behaviour of the user is also used in other domains, such as the adaptive hypermedia domain [16], where the presentation of the documents is modified according to the user, and user-centric multimedia databases [17], where the user behaviour is captured through the analysis of the query logs.

As can be seen, the research fields where the user is taken into account are varied and vast, from the adaptation of the presentation to multimedia information retrieval. The user’s behaviour is studied in order to adapt the documents or the query’s results, but the metadata associated with the multimedia contents are not enriched.

Before presenting our approach for the metadata enrichment, we describe in the next section the metadata framework developed in order to incorporate the notion of temperature.

3 Metadata Framework

In the domain of metadata interoperability, many studies have been carried out [18, 19, 20] in order to make it possible to use different and heterogeneous metadata standards and formats within an application, and also to allow the exchange of metadata between systems and applications. All these approaches are focused on the interoperability problem, and they do not offer any possibility to enrich the metadata according to their usage.

In this paper, we do not focus on the metadata model, but rather on the temperature concept. In order to illustrate this concept, we present a preliminary metadata framework that allows the integration of existing metadata models and provides the possibility of enriching them through their usage. Our approach takes into account the users and their behaviour regarding the consumption of the retrieved documents and their associated metadata. The notion of temperature can be applied to any hierarchical metadata model.

In our model (Fig. 1), we couple each multimedia content with a unique metadata file that contains the whole set of metadata related to that document. The link between the two documents is made through the documentSrc attribute of the Meta_Document metadata.
As a multimedia document can be composed of different media types, its metadata can be formed by many Meta_Documents, each one corresponding to one medium of the multimedia content.

Each Meta_Document is divided into two parts: (1) General_Metadata, which corresponds to the general metadata, such as the life-cycle and the identification metadata (e.g., the creator, the description); (2) Media_Metadata, which corresponds to the media-specific metadata.

In order to be as generic as possible and to allow the integration of different existing metadata standards, we decomposed the two parts presented above into Units. Each Unit represents a metadata element, e.g., the author. It has as attributes the name of the metadata, its type and, optionally, a definition or a reference to its definition provided in a thesaurus. Depending on the application’s needs, a Unit can be decomposed into one or more Units. The actual value of each Unit is specified in a different element, Value, which has as attribute the source of the value, e.g., the metadata standard that provided the metadata element.

Fig. 1. Metadata framework

The usage metadata, the temperature, is associated with each element of the metadata format presented above. More precisely, every metadata element of the proposed framework has two kinds of temperature associated with it: (1) one computed for each group of users that interacts with the multimedia content, and (2) an average one for each metadata element, computed from the temperatures of all the groups.

In this paper, we do not focus on the determination of the users’ groups that we use in our approach. We consider that these groups are already established and that they can evolve over time. In our work, the different groups can be disjoint or not; a user belongs to at least one group and, over time, he can migrate from one group to another.
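
The framework described above can be pictured as a small data model. The following is a minimal sketch only, assuming Python dataclasses; the attribute names (`unit_type`, `group_temperature`, `avg_temperature`) are illustrative choices that do not come from the paper, and the arithmetic mean is used for the average temperature.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class Value:
    """Actual value of a Unit, tagged with the source that provided it."""
    content: str
    source: str  # e.g. the metadata standard the value came from


@dataclass
class Unit:
    """One metadata element; it may be decomposed into further Units."""
    name: str
    unit_type: str
    definition: Optional[str] = None  # definition or thesaurus reference
    values: List[Value] = field(default_factory=list)
    children: List["Unit"] = field(default_factory=list)
    # Assumed storage: one temperature per user group, keyed by group id.
    group_temperature: Dict[str, float] = field(default_factory=dict)

    @property
    def avg_temperature(self) -> float:
        """Arithmetic mean over the groups (0 when no group has a value)."""
        temps = list(self.group_temperature.values())
        return mean(temps) if temps else 0.0


@dataclass
class MetaDocument:
    """Metadata of one medium, linked to it through documentSrc."""
    document_src: str
    general_metadata: List[Unit] = field(default_factory=list)
    media_metadata: List[Unit] = field(default_factory=list)
    group_temperature: Dict[str, float] = field(default_factory=dict)
```

Keeping the per-group temperatures in a dictionary lets groups appear or disappear over time without changing the structure of the metadata element itself.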
An approach for the creation of such groups, based on the users’ interests, is defined in [21]. The advantage of using groups of users is that the temperature can then be used for personalisation purposes. The algorithms presented in the remainder of the paper work regardless of the number of user groups defined; they work as well for single users.

4 Metadata enrichment

We consider the user as an important source of implicit metadata, because he can produce metadata by interacting with the multimedia documents he obtains as results to his query. In our proposal we focus on exploiting the user’s behaviour.

In order to be able to respond to as many user queries as possible, many different indexing algorithms are applied in an information retrieval system. Thus, the multimedia metadata obtained are heterogeneous, from simple low-level features to more complex semantic high-level features. Usually, not all the generated metadata are used in the retrieval process: some metadata are used more often than others. For this reason, we propose to enrich the metadata obtained after the indexation process with the concept of temperature. Thus, the more the documents or their associated metadata are used, the hotter they are.

We have attached the temperature to (1) the multimedia document (at the Meta_Document level in the metadata framework presented in the previous section) and also to (2) its associated metadata (the temperature attached to each metadata element in the proposed framework).

This popularity metadata can be used, for example, in the query process. In the execution of a query, the popularity metadata is taken into consideration in the computation of the results’ score. This way, the popular documents and segments of documents are better ranked.

Fig. 2.
User’s actions in an information retrieval system

The figure above summarises the actions that a user performs when interacting with an information retrieval system. Based on these considerations, we propose to realise the metadata enrichment by taking into consideration the user’s interaction with the metadata associated with the query results (step 3 in Fig. 2) and with the multimedia document (step 6 in Fig. 2).

First, we describe in Section 4.1 the metadata enrichment algorithm. Then, in the next sections, we present its concrete application based on the user’s interaction with the results list (Section 4.2) and with the multimedia document (Section 4.3).

4.1 Metadata enrichment algorithm

Regardless of how the decision to increase the temperature is taken, the temperature is computed for each period of time ∆t and depends on the number of users that have consumed the metadata in that period. The temperature is defined as a value t with 0 ≤ t ≤ 100. The initial value of the temperature of all the documents and of their associated metadata is 0.

The algorithm used for the increase of the temperature is presented in Table 1. The parameters of the proposed algorithm are: the metadata whose temperature has to be increased, the number of users that consumed the metadata and the identifier of the group these users belong to. The first step of the algorithm is the computation of the metadata’s temperature corresponding to the user group received as parameter. Afterwards, the average temperature of the metadata element is computed as the arithmetic mean of the temperatures associated with this metadata for each user group in the system. A weighted mean can also be used for the computation of this average temperature.

Each time the temperature of a metadata is modified using the increaseTemperature method, this modification is propagated to all its children metadata.
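
The first two steps described above (updating one group’s temperature, then averaging over all groups) can be sketched as follows. This is an illustration under stated assumptions, not the paper’s implementation: the linear rule of `step` degrees per user is a placeholder for computeGroupTemperature, and the function names and the `weights` parameter are hypothetical. Only the bounds 0 ≤ t ≤ 100, the initial value 0 and the arithmetic or weighted mean come from the text.

```python
from statistics import mean
from typing import Dict

T_MIN, T_MAX = 0.0, 100.0  # bounds stated in the text: 0 <= t <= 100


def clamp(t: float) -> float:
    """Keep a temperature inside the allowed range."""
    return max(T_MIN, min(T_MAX, t))


def increase_group_temperature(temps: Dict[str, float], group_id: str,
                               n_users: int, step: float = 1.0) -> float:
    """Raise one group's temperature and return the applied variation.

    ASSUMPTION: the increment is `step` degrees per user that consumed the
    metadata during the period; the real rule may differ.
    """
    old = temps.get(group_id, T_MIN)  # the initial temperature is 0
    new = clamp(old + step * n_users)
    temps[group_id] = new
    return new - old


def average_temperature(temps: Dict[str, float]) -> float:
    """Arithmetic mean of the temperatures of all the groups."""
    values = list(temps.values())
    return mean(values) if values else T_MIN


def weighted_average_temperature(temps: Dict[str, float],
                                 weights: Dict[str, float]) -> float:
    """Weighted mean variant; `weights` maps each group id to its weight."""
    total = sum(weights[g] for g in temps)
    return sum(temps[g] * weights[g] for g in temps) / total
```

Returning the variation rather than the new value makes it easy to hand the same quantity to a propagation step afterwards.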
The propagation method is presented in Table 2. It follows the same steps as the first algorithm. The temperature of each child metadata is changed by a value that is directly proportional to the variation of the temperature at the first level and to the level in the metadata hierarchy where the current element is. This propagation can be limited to a certain level in the hierarchy, specified by the MAXLevel constant.

We apply the same reasoning for the propagation of the temperature to all the ancestors of the metadata element that initiated the process of temperature increase. The propagation method is presented in Table 3. In the computation of the new temperature we follow the same rules as for the propagation to the child elements.

Table 1. The algorithm for the increase of the metadata’s temperature

Algorithm 1: increaseMetadataTemperature
Input: the metadata, MD, whose temperature has to be increased; the number of users, n, who used the metadata; the identifier of the group, gID, to which the users belong.
Output: the metadata with the temperature increased, for all children and ancestors.
Δtemp ← computeGroupTemperature(MD, gID, n);
md ← setGroupTemperature(MD, gID, Δtemp);
setHistory(MD, gID, Δtemp);
avgTemp ← computeAvgTemperature(MD);
md ← setAvgTemperature(md, avgTemp);
if MD has children then
└ md ← propagateTemperatureDown(md, gID, Δtemp, 1);
if MD has parent then
└ md ← propagateTemperatureUp(MD.parent, gID, n, Δtemp, 1);
return md;

Table 2.
The algorithm for the propagation of the metadata’s temperature to all its children

Algorithm 2: propagateTemperatureDown
Input: the metadata, MD, for which we want to increase the temperature of the children; the identifier of the group, gID, for which the temperature has to be increased; the temperature Δtemp, that is used for the computation of the new temperature; the level of the recursive call.
Output: the metadata with the temperature of all its children increased.
inc ← computeTemperature(Δtemp);
foreach child of MD do
│ md ← md ∪ setGroupTemperature(child, gID, inc);
│ setHistory(child, gID, inc);
│ avgTemp ← computeAvgTemperature(child);
│ md ← md ∪ setAvgTemperature(child, avgTemp);
│ if level

; and . For the last two elements, the temperature will be increased by a smaller value than for the first one, because they were not expanded down to the last leaf.

Fig. 3. a) Metadata displayed with a query result; b) The same metadata after the user interaction with it

4.3 The interaction with the multimedia document

After studying the results list, the user chooses a multimedia document and begins to interact with it: he explores the document, he studies a part of the multimedia content in more detail, he spends a significant period of time examining the document, etc. This behaviour illustrates his interest in the document and in its components. We increase the temperature of the metadata that correspond to the component of the multimedia document the user is interested in. When the temperature of a component is changed, the temperatures of the document and of the other metadata elements that describe the component are modified as well. The information is also propagated to the higher levels in the metadata hierarchy.

In order to illustrate this metadata enrichment, we can consider the SMIL presentation from Fig. 4. This presentation is composed of a video and an audio content, and of the presentation’s slides as images.
The organization in time of the presentation and the possible audio and video segments are presented in Fig. 4.

For this example, we also consider an information retrieval system where two groups of users were identified. Several users belonging to the same group have used the system at the same time and have obtained the same SMIL presentation as a result of their different queries. They have all selected the presentation and have watched it from 1’20’’ until 4’50’’. In this case, the temperature of the metadata associated with all the multimedia contents displayed in this period of time will be modified. From the timeline presented in Fig. 4 we can deduce that the video segments seg_Video1 and seg_Video2, the audio segments seg_Audio1, seg_Audio2 and seg_Audio3 and the images img2.jpg and img3.jpg are candidates for having their temperature modified.

At this point, several strategies can be established for choosing the segments to use for the metadata enrichment. For example, if the component was watched for at least half of its length, then its temperature will be modified. In this case, the temperature of the audio segments seg_Audio1 and seg_Audio3 will not be modified, because they were listened to for less than half of their length. Another possibility would be to increase the temperature of all the segments that were displayed, by a value proportional to the time for which they were watched.

Fig. 4. The timeline structure for the SMIL presentation

Through the proposed algorithm for the temperature increase, the temperature of the entire presentation will be increased, as a consequence of the consumption of a part of it. We can note that the more a document is watched, the hotter it gets.

In the next section, we present some possible utilizations of the temperature concept in the context of a broadcast use case.
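
The two selection strategies discussed in Section 4.3 (a half-length threshold, and an increase proportional to the watched time) can be sketched as below. The segment boundaries are hypothetical values chosen for illustration, since the real ones are those of Fig. 4; only the watched interval (1’20’’ to 4’50’’, i.e. 80 s to 290 s) comes from the text. The function names are illustrative, and every segment is assumed to have a strictly positive length.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds, with end > start


def overlap(segment: Interval, watched: Interval) -> float:
    """Length in seconds of the part of `segment` lying inside `watched`."""
    return max(0.0, min(segment[1], watched[1]) - max(segment[0], watched[0]))


def segments_to_heat(segments: Dict[str, Interval], watched: Interval,
                     min_ratio: float = 0.5) -> List[str]:
    """Threshold strategy: keep segments watched for at least `min_ratio`
    of their own length."""
    selected = []
    for name, (start, end) in segments.items():
        if overlap((start, end), watched) >= min_ratio * (end - start):
            selected.append(name)
    return selected


def proportional_weights(segments: Dict[str, Interval],
                         watched: Interval) -> Dict[str, float]:
    """Proportional strategy: each displayed segment gets the fraction of
    its length that was actually watched."""
    weights = {}
    for name, (start, end) in segments.items():
        seen = overlap((start, end), watched)
        if seen > 0:
            weights[name] = seen / (end - start)
    return weights


# Hypothetical timeline (seconds); NOT the actual values of Fig. 4.
timeline = {
    "seg_Video1": (60.0, 200.0),
    "seg_Audio1": (0.0, 120.0),
    "seg_Audio2": (120.0, 240.0),
    "seg_Audio3": (240.0, 360.0),
}
watched = (80.0, 290.0)  # 1'20'' to 4'50''

hot_segments = segments_to_heat(timeline, watched)
weights = proportional_weights(timeline, watched)
```

With these assumed boundaries, the threshold strategy leaves out seg_Audio1 and seg_Audio3, which matches the outcome described in the text, while the proportional strategy still assigns them a reduced weight.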

5 Implementation and discussions

In order to validate our proposal, we have applied the concept of temperature to a web site. In this case we used the algorithm according to the users’ interaction with the multimedia content. We consider a page of the site as a document and we associate metadata with it. The users’ interactions with the web page (e.g., clicks) are collected in a database. For the tests carried out, we considered a single group of users. We have instantiated the metadata framework using the XML and XSD technologies, and the algorithm was implemented in Java.

Fig. 5 shows the results obtained for the computation of a web page’s temperature. The results show that the time granularity is very important in the application of our algorithm. For the same user interactions with the page, the temperature obtained for the whole page differs according to the strategy employed: computing the temperature every 24 hours, every 12 hours, every hour or at intervals of less than an hour. The decrease strategy is also important when the time granularity chosen for the computation of the temperature is small. These choices are use-case dependent. The curves in Fig. 5 show that these considerations influence the evolution of the temperature. Thus, making the right choice is important for the evolution of the temperature.

Fig. 5. Experiments results

6 Conclusion

In this paper, we have presented a method for multimedia metadata enrichment based on the users’ interaction with the multimedia content and with its associated metadata. This enrichment is done in two steps: (1) according to the users’ interaction with the metadata and the results list and (2) according to the users’ behaviour with the multimedia document.
We intend to implement and test our proposal in the context of the LINDO project (Large scale distributed INDexation of multimedia Objects) (http://lindo-itea.eu/) in order to determine the best parameters of the algorithm (e.g., time granularity, decrease strategy, the level of propagation). These parameters cannot be set without the intervention of the user; thus, we will conduct qualitative interviews with a set of volunteers. First, we will implement the second scenario for the computation of the temperature (presented in Section 4.3). Once the parameters are specified, we will take the experiments further by using the temperature for metadata management in a distributed system [23].

Acknowledgments: This work has been supported by the EUREKA Project LINDO (ITEA2 – 06011).

References

1. Hardman L., Obrenovic Z., Nack F., Kerhervé B., Piersol K., Canonical Processes of Semantically Annotated Media Production. In: Multimedia Systems Journal, 14(6): 327-340, (2008).
2. Candillier L., Jack K., Fessant F., Meyer F., State-of-the-Art Recommender Systems. In Collaborative and Social Information Retrieval and Access: Techniques for Improved User Modeling, Chevalier M., Julien C., Soule-Dupuy C. (Eds.), IGI Global, 1-22, (2008).
3. Pereira F., Vetro A., Sikora T., Multimedia Retrieval and Delivery: Essential Metadata Challenges and Standards, Proceedings of the IEEE 96(4), 721-744, (2008).
4. Buckland M. K., Plaunt Ch., On the construction of selection systems. Library Hi Tech, 12, pp. 15-28, (1994).
5. Lancaster F. W., Information Retrieval Systems. Wiley, New York, (1979).
6. Smith J. R., Schirling P., Metadata Standards Roundup, In IEEE MultiMedia, 13(2): 84-88, (2006).
7. Kosch H., Boszormenyi L., Doller M., Libsie M., Schojer P., Kofler A., The Life Cycle of Multimedia Metadata, IEEE MultiMedia 12(1): 80-86, (2005).
8. Kahan, J., Koivunen, M.-R., Prud’Hommeaux, E., Swick, R.
R., Annotea: an open RDF infrastructure for shared Web annotations. Computer Networks, 32(5):589-608, (2002).
9. Kelly D., Teevan J., Implicit feedback for inferring user preference: A bibliography, in SIGIR Forum, volume 37, pp. 18-28, (2003).
10. Lux M., Kofler Ch., Marques O., A classification scheme for user intentions in image search. In the 28th international conference extended abstracts on Human factors in computing systems (CHI EA '10), pp. 3913-3918, ACM, (2010).
11. Plesca C., Charvillat V., Grigoras R., User-aware adaptation by subjective metadata and inferred implicit descriptors, in Multimedia Semantics: The Role of Metadata, vol. 101 of Studies in Computational Intelligence, pp. 127-147, Springer, (2008).
12. Kofler Ch., Lux M., Dynamic presentation adaptation based on user intent classification. In the 7th Int. conference on Multimedia (MM '09), ACM, pp. 1117-1118, (2009).
13. Zayani C., Péninou A., Canut M.-F., Sèdes F., An adaptation approach: query enrichment by user profile. In Signal-Image Technology & Internet-Based Systems (SITIS 2006), Hammamet - Tunisie, pp. 24-35, IEEE, (2006).
14. Memmel M., Dengel A., Sharing Contextualized Attention Metadata to Support Personalized Information Retrieval, In the Int. Workshop on Contextualized Attention Metadata: Personalized Access to Digital Resources, ACM/IEEE, pp. 19-26, (2007).
15. Brunie L., Pierson J.M., Coquil D., Semantic collaborative web caching, In the 3rd Int. Conference on Web Information Systems Engineering, (2002).
16. Brusilovsky P., Adaptive hypermedia. In User Modeling and User-Adapted Interaction, 11(1-2):87-110, (2001).
17. Limam L., Coquil D., Brunie L., Kosch H., Query Log Analysis for User-Centric Multimedia Databases, in The 2008 International Conference on New Media Technology, I-Media'08, pp. 441-444, (2008).
18. Arndt R., Troncy R., Staab S., Hardman L., Vacura M., COMM: designing a well-founded multimedia ontology for the web. In ISWC+ASWC, pp. 30-43, (2007).
19.
Brut M., Laborie S., Manzat A.-M., Sèdes F., A Generic Metadata Framework for the Indexation and the Management of Distributed Multimedia Contents. In New Technologies, Mobility and Security, pp. 1-5, IEEE Computer Society, (2009).
20. Saathoff C., Scherp A., Unlocking the semantics of multimedia presentations in the web with the multimedia metadata ontology. In WWW 2010, pp. 831-840.
21. Tchuente D., Canut M.F., Baptiste Jessel N., Péninou A., El Haddadi A., Visualizing the evolution of users’ profiles from online social networks, The 2010 IEEE International Conference on Advances in Social Networks Analysis and Mining, ASONAM’10.
22. Funahashi T., Fujiwara T., Koshimizu H., Face and eye tracking for gaze analysis. In Int. Conference on Control, Automation and Systems (ICCAS '07), pp. 1337-1341, (2007).
23. Laborie S., Manzat A.-M., Sèdes F., Managing and querying efficiently distributed semantic multimedia metadata collections, In IEEE MultiMedia special issue on multimedia-metadata and semantic management 16, 4, 12-20, (2009).