Why Real-World Multimedia Assets Fail to Enter the Semantic Web

Tobias Bürger ∗
Digital Enterprise Research Institute (DERI)
Technikerstrasse 21a
6020 Innsbruck, Austria
tobias.buerger@deri.at

Michael Hausenblas
Joanneum Research
Steyrergasse 17
8010 Graz, Austria
michael.hausenblas@joanneum.at

∗ Tobias Bürger is also affiliated with Salzburg Research

ABSTRACT
Making multimedia assets first-class objects on the Semantic Web while keeping them conformant to existing multimedia standards is a non-trivial task. Most proprietary media asset formats are binary, optimized for streaming or storage. However, the semantics carried by the media assets are not directly accessible. In addition, multimedia description standards lack the expressiveness to gain a semantic understanding of the media assets. An array of requirements already exists regarding both media assets and the Semantic Web. Based on a critical review of these requirements we investigate how ontology languages fit into the picture. We finally analyse the usefulness of formal accounts to describe spatio-temporal aspects of multimedia assets in a practical context.

Categories and Subject Descriptors
H.5.1 [Information Systems]: Multimedia Information Systems; I.7.4 [Document and Text Processing]: Electronic Publishing

General Terms
Multimedia Semantics, Semantic Web

Keywords
Multimedia Assets for the Semantic Web, Multimedia Models, Requirements Analysis

1. INTRODUCTION
Today a huge explosion of content can be experienced on the Web, generated by and for home users [33]: an increasing number of people produce media assets (such as photos, video clips, etc.) and share them on popular sites such as Flickr¹ and YouTube².

¹ http://www.flickr.org
² http://www.youtube.com

More recently, popular attention has shifted from image sharing to richer content sharing of videos. This can be seen in the launch of video portals like iFilm.com, Ziddio.com or the dozens of other portals that appeared recently to compete with YouTube.

Unsurprisingly there is already a portal called VideoRonk³ that tries to combine other portals by providing a MetaSearch interface, which is quite a help as one does not want to search on ten or more different sites. However, what is missing is the link between the contents of all these sites, enabling distributed recommendations, cross-linking, etc.

³ http://www.videoronk.com

Still, a cross-site search on the semantic level, for example, is close to impossible. The most obvious reason is a lack of metadata coming along with the content. The power of providing metadata along with content on the Web can be seen in prospering mashups that not only combine APIs (provided by parties such as Google⁴) but also try to mash up things on a semantic level. This can be observed, for example, at Joost [40]. Having metadata about everything, such as video content, blog posts, news feeds and the users of the system, makes this new experience of watching TV through the Internet possible. To take this one step further: if every stream or video available on the Internet were described in more detail, even content on the Internet could be matched with user profiles from applications like Joost and offered for viewing.

⁴ http://code.google.com/apis/

As pointed out in [46, 36], high-quality metadata is essential for multimedia applications. Our recent work within initiatives [47] and research projects⁵ has shown that there is a need for going beyond current metadata standards to annotate media assets.

⁵ such as in the EU project SALERO, http://www.salero.info

Current XML-based standards [24] are diverse, often proprietary and not ad hoc interoperable; cf. also [45]. In SALERO, for example, we face the problem of offering a semantic search facility over a diverse set of multimedia assets, e.g., images, videos, 3D objects or character animations. The same is true for the Austrian project GRISINO⁶, where we aim to realize a semantic search facility for cultural heritage collections.

⁶ http://www.grisino.at

Automating the handling of metadata for these collections and automating linkage between parts of these collections is hard, as the vocabularies used to describe them are mostly diverse and do not offer facilities to attach formal descriptions.

A Motivating Scenario. Imagine a person who wants to watch recent clips similar to the ones of his favourite experimental artist. Tons of clips are potentially distributed on the Web, making a search for them time consuming and laborious. Thus a central facility to search for and negotiate content is needed. This facility should allow the user to formulate a search goal, including the characteristics, the subject matter, a maximum price, and the preferred encoding and file format of the clip. In a next step, all portal offerings will be scanned in order to retrieve and negotiate content that matches the user's intention. Note that parts of a video may also match his intention, which means that videos need to be described at a fine granularity and in sufficient detail.

In order for this scenario to work, the descriptions of (1) the goal formulation, (2) the media content provided by all content owners and (3) the negotiation semantics have to be compatible. Three important focal points of these semantic descriptions are:

• Expressivity for high-level semantic descriptions of content, as typical users do not think in terms of colour histograms and spatial / temporal constructs. The characteristics of the media should be described in sufficient detail.

• The need for rules: To effectively identify the part of the content that matches the user's intention, rules are needed to map high-level semantic concepts to spatial and temporal segments of the video (e.g., because ratings and classifications may only apply to parts of the content, i.e., a scene including crime is only suitable for adults).

• Fine-grained semantic descriptions, as transferring the whole content may not be possible for reasons of bandwidth, user effort, or cost. Thus parts of the content should be described in sufficient detail.

To this end, we want to provide answers to the question: Why do we need rich semantic descriptions of media assets on the Web, and (why) is there a need to bundle these descriptions together with the multimedia assets? At the same time, we want to provide answers to the questions: How can descriptions be provided? Why are the metadata features of multimedia standards not enough?

Consequently, we elaborate on the answer to the question stated in the title of this paper, "Why Multimedia Assets Fail to Enter the Semantic Web?", by first collecting the requirements for the description of multimedia assets (section 2), secondly by analysing the environment (section 3), and thirdly by collecting requirements for multimedia assets on the Semantic Web (section 4). In section 5 we analyse existing ontology languages for their usefulness regarding the requirements. We conclude in section 6 with a discussion of the question stated in the title of this paper and give a brief outlook on the open issues.

2. REQUIREMENTS FOR THE DESCRIPTION OF MULTIMEDIA ASSETS
Requirements for multimedia content descriptions have been researched in a number of papers [17, 46, 36, 6], and investigations of the combination of multimedia descriptions with features of the Semantic Web are numerous [27, 3, 42, 44, 2]. In the following, we summarise the proposed requirements and add two additional ones (Authoring & Consumption and Performance & Scalability).

Representational Issues. A basic prerequisite is the formal grounding and neutral representation of the format used to describe multimedia assets.

• Neutral Representation: The ideal multimedia metadata format has a platform- and application-independent representation, and is both human and machine processable;

• Formal Grounding: Knowledge about media assets must be represented in formal languages, as it must be interpretable by machines to allow for automation.

Extensibility & Reusability. The format at hand should be extensible, e.g., via an extension mechanism as found in MPEG-7. It should be possible to integrate or reference existing vocabularies [24].

Multimedia Characteristics and Linking. The format should reflect the characteristics of media assets, and hence allow linking between data and annotations:

• Description Structures. The format should support description structures at various levels of detail, including a rich set of structural, cardinality, and multimedia data-typing constraints;

• Granularity. The language has to support the definition of the various spatial, temporal, and conceptual relationships between media assets in a commonly agreed-upon format;

• Linking. It has to facilitate a diverse set of linking mechanisms between the annotations and the data being described, including a way to segment temporal media.

Authoring & Consumption. A major drawback of existing metadata approaches is their lack of support for authors in creating annotations, along with the lack of benefits gained from the generated annotations.

• Engineering support. Appropriate tools are a prerequisite for the uptake of new vocabularies. There is a need for at least authoring and consumption environments that make use of the vocabularies to demonstrate their usefulness.

• Deployment.
Multimedia assets need to be exchangeable, and there must be ways to deploy descriptions along with the assets.

Performance & Scalability. The language should yield descriptions that can be stored, processed, exchanged and queried effectively and efficiently.

MPEG-7. MPEG-7 [35] is a powerful and flexible way to describe media assets at several levels of granularity; on the other hand, MPEG-7 bears some intrinsic complexity and interoperability issues [4, 46, 36, 43]. Because the MPEG-7 standard is not grounded on formal semantics for the descriptions, variability in the syntactic representation of the descriptions may cause interoperability issues.

3. ENVIRONMENT ANALYSIS: THE SEMANTIC WEB
A good starting point for the analysis of our targeted hosting environment, the Semantic Web, is the Architecture of the World Wide Web [28], in which its three main building blocks are discussed: identification, interaction, and data formats. The Semantic Web, as an extension of the well-known Web, roughly has the following characteristics:

• It is a highly distributed system. Identification of resources is based on URIs, for both data and services;

• There is no single, central "registry", viz. authorities are decentralised; data and metadata are under the control of many distinct parties (companies, standardisation bodies, private individuals, etc.);

• As in the Web, the fundamental building blocks are relations between data, whereas the relations in the Semantic Web are named, may be of any granularity and allow the automatic interchange of data;

• Contribusers⁷ inhabit it; each participant may play different roles at once: consuming content and contributing via comments, links, etc.;

• Finally, there exists a number of standards, such as RDF allowing formal definitions of the intended meaning, SPARQL for querying, RDF(S), OWL or SKOS to classify content, and OWL, WSML, or RIF for describing logical relationships.

⁷ a portmanteau word; contributor and user

Any multimedia metadata format that aims at successful application on the Semantic Web has to be in line with the characteristics listed above. While some requirements, such as formats (e.g. XML), are rather easy to meet, others can pose serious problems regarding the integration into the Semantic Web.

4. MULTIMEDIA ASSETS ON THE SEMANTIC WEB
Firstly, addressing the environmental requirements together with an efficient layering of the semantic descriptions on top of the existing metadata (subsymbolic level - symbolic level - semantic level) is a necessary prerequisite for multimedia assets to enter the Semantic Web successfully. Secondly, from the requirements gathered in section 2 and the environmental analysis done in section 3 we deduce the following characteristics for multimedia assets on the Semantic Web:

Formality of Descriptions. Formal descriptions are the basic building blocks of the Semantic Web. To enable automatic handling, such as retrieval and negotiation of multimedia assets, formality of descriptions is a prerequisite.
Three different (semantic) levels of multimedia metadata can be identified [17]: (1) At the subsymbolic layer, covering the raw multimedia information, typically binary formats are used which are optimized for storage or streaming and which mostly do not provide metadata. (2) The symbolic layer provides an additional structural layer for the binary essence stream. For this level, standards like MPEG-7, Dublin Core or MPEG-21 can be used. The semantics of the information encoded with these standards are only specified within each standard's framework. (3) Therefore the semantic and logical layer is needed to provide the semantics for the symbolic layer. This layer should be formally described using ontology languages as proposed in this paper.

Efficient layering and referencing of descriptions. It is necessary to support different levels of meaning attached to multimedia assets, i.e., meaning at the bit level, traditional metadata and semantic (high-level) information. As there are already widely adopted standards available for the description of multimedia assets, the semantic layer must be efficiently put on top of those traditional description layers and should not aim to replace them. Furthermore, semantic descriptions from these traditional layers shall be re-used. As content, parts of content, and traditional and semantic descriptions may be distributed, efficient referencing mechanisms for multimedia content must be present.

Based on recent discussions⁸ we summarise possible approaches in the following. The multimedia asset is denoted by A; for the multimedia metadata (M3) format, such as MPEG-7, we write M; the ontology (language) is written as O; and finally an external reference mechanism⁹ is labelled R. The linking is depicted with ↪:

• M ↪ A. The content is referenced from the M3 format; the ontology layer has to deal with it separately;

• M ↪ O. The M3 format references the ontology layer;

• O ↪ M. The ontology layer references the M3 format;

• O ↪ A. The ontology layer references the content directly;

• O, M ↪_R A. The ontology layer and the M3 format use a common reference mechanism to link to the content.

⁸ http://lists.w3.org/Archives/Public/public-xg-mmsem/2007Apr/0002.html
⁹ http://www.annodex.net/TR/URI_fragments.html

However, it has to be noted that there is no standardised way for the layering or the referencing yet.

Interoperability among descriptions. The many formats used in various communities cause interoperability problems when dealing with multimedia content. To overcome this, an RDF-based semantic layer should be added on top of these numerous formats to ease their semantic and syntactic integration. However, there are some open problems regarding the integration of existing annotation standards and semantic approaches [46, 36]: The stack of Semantic Web languages and technologies provided by the W3C is well suited to the formal, semantic description of the terms in a multimedia document's annotation. But, as also pointed out in [41], the Semantic Web based languages lack the structural advantages of the XML-based approaches. Additionally, a huge amount of work has already been done on multimedia document annotation within the framework of other standards. This is why a combination of the existing standards is the most promising path for multimedia document description in the near future.

Subjectivity and granularity of descriptions. Opinions and views on the content differ among users because of their personal background, culture or previous experiences. As many users are potential contributors to descriptions of assets, opinions may differ. These opinions often cannot be merged into a single, unified view. This is why it should be possible to attach these opinions to multimedia assets separately and keep them separate.

Trust and IPR issues. The Web consists of decentralized authorities and a huge number of contribusers. As descriptions of content, especially in the new, changing Web 2.0 environment, are subject to vandalism, there need to be ways to guarantee the validity of the descriptions and to protect descriptions that should be read-only for a given user group. Popular portals like Flickr or YouTube show that there is no need to own content in order to annotate it. Furthermore, copyright is critical when dealing with multimedia content.

Functional Descriptions. The fact that metadata is created to support some specific function is sometimes forgotten when summarizing the requirements for a metadata schema. For the metadata creator it should be clear beforehand for what purpose the metadata will be used and what benefits he gains from it [34], i.e., whether using this part of the metadata scheme enhances retrieval, raises social attention or helps to protect the assets.
This in turn also applies to the consumer of the metadata: functional descriptions of what type of information can be inferred from the attached metadata, or what type of actions can be performed on the content, are essential. This is especially true for information that is obfuscated prior to a possible negotiation phase of the content.

Engineering Support. The presence of metadata is a prerequisite to make multimedia assets accessible and deployable on the Semantic Web, and hence to enable their automated processing. From a developer's perspective, there must be tools and standards enabling an integrated authoring, testing, and deployment of multimedia assets along with their associated metadata. In the following the most important areas of engineering support are listed:

• Edit & Visualise. To aid the engineer in handling the annotations, editor tools and IDEs¹⁰ are needed. These may include validator services¹¹, converters or mappers, and visualisation modules.

• Libraries & Applications. When developing applications, the availability of APIs is a core requirement. In particular for Semantic Web applications, interface and mapping issues are of importance [19].

• Deployment. Multimedia containers such as HTML, SMIL, etc. require the metadata either to be referenced from within the media assets, or to be embedded into them. As the data model needs to be RDF, in contrast to existing, flat technologies (tags, etc.), upcoming approaches such as RDFa [1] need to be utilised thoroughly.

¹⁰ as for example http://www.topbraidcomposer.com/
¹¹ http://phoebus.cs.man.ac.uk:9999/OWL/Validator

5. FORMAL DESCRIPTIONS OF MULTIMEDIA ASSETS
In this part we introduce ontology languages that are candidates for addressing the advanced requirements identified in the previous sections. At its core it comprises a comparison of two families of ontology languages against the requirements postulated in section 4.

The reader is invited to note that not all of the existing languages have the same expressiveness and not all have the same inferential capabilities. Further, the underlying knowledge representation paradigms can differ (e.g., Description Logics, Logic Programming, etc.). Corcho and Gomez-Perez [20] present a framework that allows for analysing and comparing the expressiveness and reasoning capabilities of ontology languages, which can be used in the decision process. The process of choosing the appropriate ontology language includes questions about, e.g., the expressiveness, inference mechanisms, translators or exchange formats offered for an ontology language. We take these questions into consideration and simultaneously verify whether the languages meet the requirements discussed in section 4.

5.1 Ontology Languages
A number of logical languages have been used for the description of different kinds of knowledge (i.e. ontologies and rules) on the Semantic Web: First Order Logic, Description Logics, Logic Programming and Frame-based Logics. Each of these allows the description of different statements and implies different complexity results for certain reasoning tasks.

In this section we introduce two of the most promising ontology language families, i.e., the OWL and the WSML family of languages. The OWL family of languages is a standardisation effort of the W3C and the WSML family of languages is an effort of the WSMO working group, where WSML is a formal language for the description of ontologies and Semantic Web Services. Other ontology languages like F-Logic [30], OIL [7] or DAML+OIL¹² were not taken into consideration because of their lack of support for recent W3C recommendations like RDF.

¹² DAML+OIL Reference Description, see: http://www.daml.org/2001/03/reference

5.1.1 Web Ontology Language (OWL) Family
The Web Ontology Language (OWL) family was designed in a W3C standardisation process because of the need for an ontology language that can be used to formally describe the meaning of terminology used in Web documents, thus making it easier for machines to automatically process and integrate information available on the Web. This language should be layered on top of XML and RDF (W3C's Resource Description Framework¹³) in order to build on XML's ability to define customized tagging schemes and RDF's approach to representing data.

¹³ http://www.w3.org/TR/rdfprimer/

Currently OWL 1.1¹⁴ is under development; it extends OWL DL in several ways: the underlying DL is now SROIQ, which provides increased expressive power with respect to properties and cardinality restrictions. Further, OWL 1.1 has user-defined datatypes and restrictions involving datatype predicates, and a weak form of meta-modelling known as punning.

¹⁴ http://owl1_1.cs.manchester.ac.uk/owl_specification.html

The usage of rules in combination with DL has been investigated for some time [14, 21]; in the Semantic Web stack, it is expected that a rule language will complement the ontology layer.

5.1.2 The WSML family of languages
The activities of the WSMO Working group¹⁵ have yielded proposals of new ontology languages, namely WSML (WSML-Core, WSML-DL, WSML-Flight, WSML-Rule, WSML-Full), OWL⁻ ("OWL minus") [8] and OWL Flight [10]. In [16] unique key features of WSML in comparison to other language proposals are presented. Compared to OWL, key features include that (1) WSML offers one syntactic framework for a set of layered languages, and (2) it separates conceptual and logical modelling. An overview of the different variants of the WSML framework can be found in [32]. One has to note that WSML-Flight incorporates a rule language while still allowing efficient decidable reasoning, and that WSML-Rule allows unsafe rules. The relation of WSML to OWL is discussed in [9].

¹⁵ http://www.wsmo.org

5.2 Rules
Due to the well-known limitations of the expressive power of the Description Logics language family [25, 26], the need for a richer set of descriptions w.r.t. properties emerges. As rule systems are widely deployed, the harmonisation efforts have not been successful so far. A relatively new W3C initiative, the Rule Interchange Format Working Group, is now working on defining a core rule language for exchanging rules. This Rule Interchange Format Core¹⁶ (RIF Core) language aims at achieving maximum interoperability while preserving rule semantics; from a theoretical perspective, RIF Core corresponds to the language of definite Horn rules. As standardisation is still in its infancy, we will not go further into detail regarding rules, but one has to note that the careful integration of ontology languages is an issue to be addressed; for example, the usage of DL concepts in a rule has to be well-defined.

¹⁶ http://www.w3.org/TR/rif-core/

5.3 Comparing Formal Descriptions Regarding the Requirements
In the following a high-level comparison of formal description paradigms for multimedia assets is performed. We chose OWL+RIF on the one side, and WSML/OWL-Flight on the other, to achieve a somewhat realistic scenario; the result can be found in Table 1¹⁷. The table indicates for which requirement an ontology language (OWL or WSML, respectively) can be utilised to overcome the identified shortcomings of traditional approaches and thus fulfill the requirements stated in section 4.

Requirement                 OWL 1.1 + RIF   WSML-/OWL-Flight
Formal Description          ++              ++
Layering of Descriptions    +               +
Interoperability            ++              +
Granularity                 -               -
Trust & IPR issues          -               -
Functional Descriptions     -               *
Engineering Support         ++              +
Datatype Support            +               ++

Table 1: Comparison of Formal Descriptions for Media Assets.

¹⁷ ++ . . . good support, + . . . available, - . . . not supported, * . . . supported because of WSML's constructs for the description of Semantic Web Services

In the following, we elaborate in detail on each of the items in Table 1 and justify our findings regarding the comparison of OWL 1.1 + RIF vs. WSML/OWL-Flight.

5.3.1 Formal Description
Both OWL and WSML provide a framework for the formal (machine-processable) description of ontologies. An ontology in WSML consists of the elements concept, relation, instance, relationInstance and axiom. The primary elements of an OWL ontology concern classes and their instances, properties, and relationships between these instances. The formality of the descriptions is based on logics that allow machines to reason on the information. Whereas OWL is based on Description Logics, the WSML family members are based on different logic languages (i.e. Description Logics, Logic Programming or First Order Logic).
Despite the fact that OWL is more widely adopted and used, we believe that WSML with its layered framework is conceptually superior to OWL. A major difference between ontology modeling in WSML and ontology modeling in OWL is that WSML separates conceptual modelling for the non-expert user from logical modeling for the expert user, as it (unlike OWL) uses an epistemology which abstracts from the underlying logical language, making the surface syntax nicer. Even if an application later requires OWL, one is able to use WSML tools to convert ontologies that reside in popular logic/language fragments automatically into equivalent OWL ontologies. Furthermore, the WSML family framework enables one to choose exactly the language with the needed expressiveness, and later allows an easy switch to another family member as a consequence of its common grounding. WSML Rule and WSML Flight also include rule support. Thus, unlike with OWL, no additional rule language is needed.

5.3.2 Layering of Descriptions
An array of existing multimedia metadata (M3) formats have been used for years in diverse application areas. However, when one aims at using these formats (such as MPEG-7, ID3, etc.) in the context of the Semantic Web, the options are limited. Hence, to enable an efficient layering of RDF-based vocabularies on top of existing multimedia metadata, one may use hybrid techniques.

As a result of our work in the media semantics area, we recently proposed the RDFa-deployed Multimedia Metadata (ramm.x) specification [22]. ramm.x is a light-weight framework allowing existing multimedia metadata to hook into the Semantic Web using RDFa [1]. Ontologies based on WSML and OWL are typically used in ramm.x to formalise an M3 format; this is especially important due to their interoperability features (see 5.3.3).

A different but equally Web-compatible approach is described in [31]. There, the authors propose the concept of semantic documents; semantic documents include any information regarding the document and its relationships to other documents. The concept is realised by including XMP descriptions in PDF documents, which can be rendered in any browser with available plugins. XMP is a format for embedding metadata in documents using RDF.

5.3.3 Interoperability
To adhere to the architecture of the WWW, OWL uses (1) URIs for naming and (2) RDF to provide extensible descriptions. (3) OWL builds on RDF and RDF Schema and adds additional vocabulary for describing properties and classes. (4) The datatype support for OWL is grounded on XML Schema.

WSML has a number of features which allow it to be integrated seamlessly in the Web: (1) WSML uses IRIs¹⁸ [15] for the identification of resources. (2) WSML adopts the namespace mechanism of XML, and WSML and XML Schema datatypes are compatible. (3) WSML has an XML- and RDF-based syntax for exchange over the Web.
To reach compatibility between WSML and OWL, WSML has a set of defined translators between OWL and WSML [11, 12].

¹⁸ IRIs are the successors of URIs

5.3.4 Granularity
As stated above, by granularity we understand the support for the definition of various spatial, temporal, and conceptual relationships regarding annotations. In this sense, OWL and WSML meet the minimal requirements, but do not explicitly address this issue. Depending on the granularity, scalability and performance issues obviously arise. In this respect, again, OWL and WSML can be perceived as comparable.

5.3.5 Trust and IPR
In an interdependent, interconnected environment such as the Semantic Web, two important aspects immediately arise: data provenance and trust [5]. Requirements regarding trust issues gathered from [37, 18] include costs and benefits w.r.t. implementation, technology-driven vs. social networking, etc.

Neither WSML nor OWL has explicit provisions for handling trust or IPR issues.

5.3.6 Functional Descriptions
OWL and the part of WSML used for the description of ontologies do not support this kind of description.
However, WSML is a language for the specification of ontologies and different aspects of Web services. As such it not only provides means for the modeling and description of ontologies but also for functional (service) descriptions, i.e. the description of a service capability by means of preconditions, assumptions, postconditions and effects [29].

5.3.7 Engineering Support
Tool support for WSML and especially OWL is constantly growing. However, the number of tools available for OWL [48] far exceeds that for WSML [13]. As OWL is a W3C Recommendation, the support for it is huge.

5.3.8 Data Type Support
The reader is invited to note that both OWL and WSML ground their datatype support on XML Schema. In WSML, XML Schema primitive datatypes, simple types and XML Schema derived datatypes are supported [39]; OWL adopts the RDF(S) specification of datatypes [38], though some XML Schema built-ins are problematic.

6. CONCLUSIONS & OUTLOOK
The first question we kept open is "What are real-world multimedia assets?" Real-world multimedia assets are multimedia objects which can currently be found embedded in HTML pages on the Web, such as images, videos, etc. We see three main reasons why media assets fail to enter the Semantic Web:

1. There is a lack of a critical mass of annotated content, mainly because the large-scale automation of (semantic) visual analysis has not advanced far enough. This is why the user is the central person in the process, providing manual annotations. Motivating users to attach complex annotations to content is not easy to achieve.

2. Current traditional and Web 2.0 based approaches to multimedia annotation are not suited to achieving the goals of the Semantic Web: The most important aspects that the Semantic Web intends to solve are (i) Annotation (i.e., how to associate metadata with a resource), (ii) Information Integration (i.e., how to integrate information about resources), and (iii) Inference (i.e., reasoning over known facts to unleash hidden facts).
Existing multimedia metadata standards such as MPEG-7 can be used to annotate but leave a certain amount of ambiguity in these annotations. As MPEG-7 is a standard, it allows easy integration based on it (a requirement for that is that everyone adheres to this standard!), but inference is not possible with the information attachable to an MPEG-7 file. The problem with tagging is manifold; there are open issues, such as consistency among tags, reconciliation of tags, and how to associate tags with parts of the tagged content. This huge amount of uncertainty allows neither reliable information integration nor reasoning on it.

3. As we argued in this paper, more requirements have to be fulfilled, which cannot be met solely by traditional or Web 2.0 based approaches and which make more formalized descriptions of content necessary. However, as long as these descriptions cannot be attached directly to the media being described, multimedia assets will not be able to enter the Semantic Web.

Regarding deployment of M3 formats on the Semantic Web, we recently proposed to use ramm.x in the Cultural Heritage domain [23].

Acknowledgements
The research leading to this paper was partially supported by the European Commission under contract FP6-027026, "Knowledge Space of semantic inference for automatic annotation and retrieval of multimedia content - K-Space" and SALERO (contract number FP6-027122).

7. REFERENCES
[1] B. Adida and M. Birbeck. RDFa Primer 1.0 - Embedding RDF in XHTML. W3C Working Draft, W3C RDF in XHTML Taskforce, 2007.
[2] R. Arndt, R. Troncy, S. Staab, L. Hardman, and M. Vacura. COMM: Designing a Well-Founded Multimedia Ontology for the Web. In Proceedings of the 6th International Semantic Web Conference (ISWC'2007), Busan, Korea, November 11-15, 2007, (forthcoming), 2007.
[3] T. Athanasiadis, V. Tzouvaras, K. Petridis, F. Precioso, Y. Avrithis, and Y. Kompatsiaris. Using a Multimedia Ontology Infrastructure for Semantic Annotation of Multimedia Content. In Proc. of 5th International Workshop on Knowledge Markup and Semantic Annotation (SemAnnot '05), Galway, Ireland, November 2005, 2005.
[4] W. Bailer, P. Schallauer, M. Hausenblas, and G. Thallinger. MPEG-7 Based Description Infrastructure for an Audiovisual Content Analysis and Retrieval System. In Proceedings of SPIE - Storage and Retrieval Methods and Applications for Multimedia, pages 284–295, San Jose, California, USA, 2005.
[5] C. Bizer and R. Oldakowski. Using context- and content-based trust policies on the Semantic Web. In Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters, pages 228–229. ACM Press, 2004.
[6] T. Bürger and R. Westenthaler. Mind the gap - requirements for the combination of content and knowledge. In Poster Proceedings of the SAMT 2006 Conference, Athens, Greece, 2006.
[7] D. Fensel, F. van Harmelen, I. Horrocks, D. McGuinness, and P. Patel-Schneider. OIL: An ontology infrastructure for the semantic web. IEEE Intelligent Systems and their applications, 16(2):38–44, 2001.
[8] J. de Bruijn and A. Polleres (eds.). OWL⁻. WSML Deliverable D20.1v0.2 WSML Working Draft 05-15-2005, http://www.wsmo.org/TR/d20/d20.1/v0.2/, 2005.
[9] J. de Bruijn, H. Lausen, A. Polleres, and D. Fensel. The Web Service Modeling Language WSML: An Overview. In ESWC, pages 590–604, 2006.
[10] J. de Bruijn (ed.). OWL Flight. D20.3v0.1 OWL Flight WSML Working Draft 23-08-2004, http://www.wsmo.org/2004/d20/d20.3/v0.1/, 2004.
[11] DERI. OWL - WSML Translator v1.0. http://tools.deri.org/wsml/owl2wsml-translator/v0.1/, 2007.
[12] DERI. WSML - OWL Translator v1.0. http://tools.deri.org/wsml/wsml2owl-translator/v0.1/, 2007.
[13] DERI. WSML Tools. http://tools.deri.org/wsml/, 2007.
[14] F. M. Donini, M. Lenzerini, D. Nardi, and A. Schaerf. AL-log: Integrating Datalog and Description Logics. Journal of Intelligent Information Systems, 10(3):227–252, 1998.
[15] M. Duerst and M. Suignard. Internationalized Resource Identifiers (IRIs). IETF RFC 3987, 2005. http://www.ietf.org/rfc/rfc3987.txt.
[16] D. Fensel, H. Lausen, A. Polleres, J. de Bruijn, M. Stollberg, D. Roman, and J. Domingue. Enabling Semantic Web Services: The Web Service Modeling Ontology. Springer, November 2006.
[17] J. Geurts, J. van Ossenbruggen, and L. Hardman. Requirements for practical multimedia annotation. In Proceedings of the Workshop on Multimedia and the Semantic Web, May 2005, Heraklion, Crete, pages 4–1, 2005.
[18] J. Golbeck, B. Parsia, and J. Hendler. Trust Networks on the Semantic Web. In Proceedings of Cooperative Intelligent Agents 2003, 2003.
[19] N. M. Goldman. Ontology-Oriented Programming: Static Typing for the Inconsistent Programmer. In Proceedings of the Second International Semantic Web Conference - ISWC 2003, pages 850–865, 2003.
[20] Gomez-Perez, Fernandez-Lopez, and Corcho-Garcia. Ontological Engineering. Springer, Berlin, 2004.
[21] B. Grosof, I. Horrocks, R. Volz, and S. Decker. Description Logic Programs: Combining Logic Programs with Description Logics. In 12th International World Wide Web Conference (WWW'03), Budapest, Hungary, 2003.
[22] M. Hausenblas, W. Bailer, and T. Bürger. Deploying Multimedia Metadata on the Semantic Web - RDFa-deployed Multimedia Metadata (ramm.x). Specification, ramm.x Working Group, 2007.
[23] M. Hausenblas, W. Bailer, and H. Mayer. Deploying Multimedia Metadata in Cultural Heritage on the Semantic Web. In First International Workshop on
[37] K. O'Hara, H. Alani, Y. Kalfoglou, and N. Shadbolt. Trust Strategies for the Semantic Web. In ISWC Workshop on Trust, Security, and Reputation on the Semantic Web, 2004.
[38] J. Z. Pan. Description Logics: Reasoning Support for the Semantic Web. PhD thesis, School of Computer Science, The University of Manchester, 2004.
[39] D. Roman, H. Lausen, and U. Keller.
Web Service Cultural Heritage on the Semantic Web, collocated Modeling Ontology (WSMO), WSMO Deliverable with the 6th International Semantic Web Conference D2v1.0., WSMO Working Draft 20 September 2004, (ISWC07), Busan, South Korea, 2007. September 2004. [24] M. Hausenblas, S. Boll, T. Bürger, O. Celma, [40] L. Simons. RDF at the Venice Project. C. Halaschek-Wiener, E. Mannens, and R. Troncy. http://www.leosimons.com/2006/ Multimedia Vocabularies on the Semantic Web. W3C rdf-at-the-venice-project.html, 2006. Blog Post. Incubator Group Report, W3C Multimedia Semantics [41] G. Stamou, J. van Ossenbruggen, J. Z. Pan, and Incubator Group, 2007. G. Schreiber. Multimedia annotations on the semantic [25] I. Horrocks, P. F. Patel-Schneider, S. Bechhofer, and web. IEEE MultiMedia, 13(1):86–90, 2006. D. Tsarkov. OWL rules: A proposal and prototype [42] R. Troncy. Integrating Structure and Semantics into implementation. Journal of Web Semantics, Audio-visual Documents. In Proceedings of the 2nd 3(1):23–40, 2005. International Semantic Web Conference (ISWC’03), [26] I. Horrocks, P. F. Patel-Schneider, and F. van volume LNCS 2870, pages 566–581, 2003. Harmelen. From SHIQ and RDF to OWL: The [43] R. Troncy, W. Bailer, M. Hausenblas, P. Hofmair, and making of a web ontology language. Journal of Web R. Schlatte. Enabling Multimedia Metadata Semantics, 1(1):7–26, 2003. Interoperability by Defining Formal Semantics of [27] J. Hunter. Adding Multimedia to the Semantic Web - MPEG-7 Profiles. In 1st International Conference on Building an MPEG-7 Ontology. In First International Semantics And digital Media Technology (SAMT’06), Semantic Web Working Symposium (SWWS’01), pages 41–55, Athens, Greece, 2006. Stanford, California, USA, 2001. [44] C. Tsinaraki, P. Polydoros, and S. Christodoulakis. [28] I. Jacobs and N. Walsh. Architecture of the World Integration of OWL ontologies in MPEG-7 and Wide Web, Volume One. TVAnytime compliant Semantic Indexing. 
In http://www.w3.org/TR/webarch/, 2004. Proceedings of the 16th International Conference on [29] U. Keller, H. Lausen, and M. Stollberg. On the Advanced Information Systems Engineering (CAiSE), Semantics of Functional Descriptions of Web Services. 2004. In The Semantic Web: Research and Applications [45] V. Tzouvaras (ed.). Multimedia Annotation (Proceedings of ESWC 2006), pages 605–619, 2006. Interoperability Framework; MMSEM XG Report. [30] M. Kifer and G. Lausen. F-logic: a higher-order http://www.w3.org/2005/Incubator/mmsem/wiki/ language for reasoning about objects, inheritance, and Semantic_Interoperability, 2007. scheme. In B. L. J. Clifford and D. Maier, editors, [46] J. van Ossenbruggen, F. Nack, and L. Hardman. That Proceedings of the 1989 ACM SIGMOD international Obscure Object of Desire: Multimedia Metadata on Conference on Management of Data, pages 134–146, the Web (Part I). IEEE Multimedia, 11(4), 2004. New York, NY, 1989. ACM Press. [47] W3C. Multimedia Semantics Incubator Group. [31] H. Kim, H. Kim, J. H. Choi, and S. Decker. http://www.w3.org/2005/Incubator/mmsem/, 2007. Translating Documents into Semantic Documents [48] W3C. Semantic Web Development Tools. using Semantic Web and Web 2.0. In Proceedings of http://esw.w3.org/topic/SemanticWebTools, 2007. the 1st Semantic Authoring and Annotation Workshop (SAAW2006), 2006. [32] H. Lausen, J. de Bruijn, A. Polleres, and D. Fensel. WSML - a Language Framework for Semantic Web Services. In Proceedings of the W3C Workshop on Rule Languages for Interoperability, 2005. [33] J. Markoff. Web content by and for the masses. New York Times Online, June 2005. [34] A. Morgan and M. Naaman. Why we tag: motivations for annotation in mobile and online media. In CHI ’07: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 971–980, New York, NY, USA, 2007. ACM Press. [35] MPEG-7. Multimedia Content Description Interface. Standard No. ISO/IEC n◦ 15938, 2001. [36] F. Nack, J. 
van Ossenbruggen, and L. Hardman. That Obscure Object of Desire: Multimedia Metadata on the Web (Part II). IEEE Multimedia, 12(1), 2005.