Metadata Based Annotation Infrastructure offers Flexibility and Extensibility for Collaborative Applications and Beyond

Marja-Riitta Koivunen
World Wide Web Consortium
MIT Laboratory for Computer Science
marja@w3.org

Ralph Swick
World Wide Web Consortium
MIT Laboratory for Computer Science
swick@w3.org

ABSTRACT
In this position paper, we describe three user scenarios that benefit from a metadata based annotation infrastructure. We explain how a basic annotation schema can be extended to support new scenarios. We also describe and evaluate some other features and modifications that are useful when implementing these scenarios. The most laborious part of the scenarios is the design and implementation of new user interfaces; the metadata infrastructure itself easily supports the needs of the different applications and new schemas.

Keywords
Annotation infrastructure, metadata, collaboration scenarios

1 INTRODUCTION
The World Wide Web is a collaborative space that lets users share their thoughts, their work, their images, and other aspects of their life by publishing Web pages. But publishing is not enough; feedback and interaction are needed for collaboration. E-mail and netnews distributed and archived in discussion lists are two of the earliest and most important collaborative applications of the Internet. Other applications such as IRC (http://www.ietf.org/rfc/rfc2812.txt), NetMeeting (http://www.microsoft.com/windows/netmeeting/), and "buddy list" applications provide a real-time sense of presence, communication and sharing of resources.

Sharing content through Web pages is important but also limited, as readers can seldom share comments or questions by writing back to the pages, even when they are members of a closed collaborative group. Instead, with the Web today we still observe much effort spent by users on forming and trying to understand different e-mail conventions for commenting on documents that are on-line in the Web.

Shared annotations that do not require write access to the annotated page can support very rich communications about the Web pages. When these annotations are seen as metadata about the pages or parts of them, and when the metadata vocabulary is grounded in semantically rich ontologies that are themselves published in the Web, many possibilities open up that extend beyond basic annotation capabilities.

This paper describes a simple collaborative annotation scenario and then broadens the scope of the annotations in a couple of additional scenarios. We briefly explain the basic metadata infrastructure for annotations that is provided by our system, known as Annotea [1], and the features that are needed to support the additional scenarios.

2 SCENARIOS
We present three scenarios describing the use of annotations in different illustrative contexts. The first scenario explains the use of annotations for basic collaboration, the second one shows an interpretation of shared bookmarks as annotations, and the last scenario examines the use of annotations for communicating evaluation results.

2.1 Scenario: Using Annotations for Collaboration
The University of Oslo organizes a seminar focusing on writing research reports and collaboration. The goal of the seminar is not only to produce a report but also to learn from other students' use of research methods and collaborative techniques and their approaches to problem solving.

One student group elects to write a report on the communication of whales. They collaborate by using the Web to publish new material, to search and share hypertext links to references and to annotate the material they uncover. The group's discussions of their research material are facilitated through a threading mechanism that links together some of their annotations in chronological order. They use an annotation (metadata) server dedicated to this seminar in conjunction with other annotation servers to which they normally subscribe.

The group gathers lists of references on a shared Web page. The lists include an estimation of the papers' relevance and a preliminary categorization of each reference. As the students read each paper they mark the paper as interesting or uninteresting and refine the categorization. They use annotations to mark or question unclear text, point out interesting perspectives, add keywords and share other general comments with each other.

Later they dedicate one person to write more detailed replies to selected research questions pointed out in the annotations and to write a short summary. This starts fruitful discussions in the context of the reference document and the new summaries. By using annotations to conduct their commentary on their reading, the group avoids contention for write access to a single shared document and potential loss of data from conflicting updates.

2.2 Scenario: Using Annotations for Shared Bookmarking
In the first stage of gathering references for their report on whale communication, the group uses traditional Web search tools to locate references on the Web. They create 'bookmark' annotations in their dedicated seminar annotation server for those references that appear relevant. When they create these bookmarks they also select a category from a list of categories defined by a shared ontology or, if no existing category is a good match, they define new categories, adding each such category to a special seminar ontology that is stored in their shared Web space. The classification category is more metadata about the bookmark annotation, one of a variety of such extensions that the group can store with their metadata.

When a user goes to a bookmarked page she sees the existing bookmarks as annotations. The user can also ask for a list of bookmarks, in which case a page is dynamically created showing bookmarks under different categories. The user may query all the bookmark annotations on the annotation servers or filter the list to show only certain bookmarks. The user may also ask for just the bookmarks that belong to the concepts in a given ontology.

2.3 Scenario: Using Annotations to Present Evaluation Results
Kim is a teaching assistant in a collaborative seminar. He wants to make sure that the students remember that the readers of their documents may have different physical or cognitive abilities in receiving and interacting with the information. Kim uses the Web Accessibility Initiative (http://www.w3.org/WAI/) guidelines and some automatic tools for assessing the markup used within Web pages. These accessibility assessment tools rely on EARL, a metadata language expressing what is or may be wrong in a page, citing by URI the specific guideline that describes the accessibility issue.

Kim stores the EARL analysis of each document in the same annotation server that holds the seminar's other annotations. Kim also adds to the server some inferencing rules that represent a transformation from the EARL vocabulary to the annotation vocabulary. The EARL vocabulary is a superset of the annotation vocabulary, so Kim includes some style rules that instruct presentation clients in the rendering of the extra properties of the EARL metadata.

When students view their pages they see the EARL report items as annotations on the pages as a result of processing the inferencing rules. Now they can address the accessibility issues in the pages and add additional metadata to the annotations to note them as fixed or to request help from Kim. When Kim helps the group, he sends a mail to the mailing list explaining the problem and adds a link to the EARL annotation so that others in the group can benefit from the example.

When the work is done the group can run the accessibility evaluation tools again. The document author can choose to delete the earlier report annotations at this time or she may just mark them as obsolete. The group may also freeze a copy of the evaluated page with the original annotations.

3 ANNOTEA METADATA INFRASTRUCTURE
The metadata infrastructure of the Annotea project makes it easy to support the annotation scenarios presented above. The Annotea infrastructure provides flexibility and an easy framework to extend the annotation capabilities to other applications. The basic infrastructure and the extensions needed for the previous scenarios are discussed in the following sections.

3.1 Basic Annotea Annotations
In the first scenario, the students annotate Web pages and use reply threads as supported by the Annotea infrastructure.

Annotea sees annotations as metadata about a whole document or a part of a document. This metadata is written in RDF/XML [2], and can be stored in annotation servers using the HTTP protocol. An annotation client queries annotations related to a document from one or several annotation servers and presents them in document context.

The Annotea annotation model uses multiple RDF schemas, e.g. Dublin Core (dc:, http://dublincore.org/) together with the Annotation schema, to define the basic annotation properties (see Figure 1). The annotates property refers to the annotated document, the context property refers to the actual place of the annotation within the document, the body property contains the content of the annotation, and the dc:title property is a descriptive annotation title. The other properties further describe the annotation.

[Figure 1 is an RDF graph diagram that is not reproduced here. Legible node labels: Annotation, 3ACF6D754, XDoc.html, "This is great", Ralph, postit.html, 2000-01-10T17:20Z (twice). Legible arc labels: rdf:type, annotates, context, body, dc:title, dc:creator, created, dc:date.]
Figure 1: The basic annotation schema

With RDF it is also easy to add new properties to the annotations. The DAML+OIL (http://www.daml.org/2001/03/daml+oil-index) ontology construction vocabulary [3] provides a framework for describing new properties with precise semantics and placing those semantics in the Web.

3.2 Extending the Annotation Schema for Reply Threads
Annotea has a concept of a reply that relates to an annotation or another reply. Replies can form discussion threads that start from an annotation.

The reply schema looks similar to an annotation schema. It has two new properties: the reply-to property, which defines which annotation or reply was the previous one in the thread, and the root-of-thread property, which is the first annotation in the thread. The generic metadata-based design of our annotation server made it easy to incorporate these additional properties.

[Figure 2 is an RDF graph diagram that is not reproduced here. Legible node labels: Reply, 3ACF6D754, 2BCA7D661, "I totally agree", Jose, postit.html, reply.html, 2000-01-10T17:20Z. Legible arc labels: rdf:type, root-of-thread, reply-to, dc:title, dc:creator, annotation:context, annotation:body, annotation:created.]
Figure 2: The reply schema

3.3 Using Annotea for Shared Bookmark Annotations
Shared bookmarks can be easily seen as annotations of type bookmark. In addition, they need a category property. Again, no changes are needed to our annotation server. The bookmark annotations can be presented as annotations on the pages with a special icon to visually differentiate them. For that an icon property can be added to the metadata.

Addition of new properties for annotation schemas necessitates a user interface change so that the client can present them. The presentation style for a property can be described in the same metadata framework as properties of the annotation properties. We expect to work on a schema for describing presentation characteristics as part of future development.

Existing ontology construction applications provide user interfaces for ontology definitions and these are well suited to the definition of categories for classifying bookmarks.

The generic metadata approach to describing bookmarks naturally lends itself to supporting a variety of views on the bookmark database. User-customizable queries can select bookmarks by any criteria desired.

3.4 Accessibility Evaluation Report Items as Annotea Annotations
Annotations can also be used to present automatically generated report items, such as accessibility evaluation items or markup validation items. If the report items are described in the metadata format it is straightforward to map them to an annotation schema. For instance, the EARL report item reporting an accessibility problem has semantics that map easily into an annotation of a part or the whole of the evaluated Web page. This mapping can be expressed as a collection of inference rules over the properties produced by the EARL tools.

The generic metadata framework provides the necessary flexibility to decide on a case by case basis whether to archive, delete, or revise annotations when a document is reprocessed through the evaluation tool. The tool can maintain state information for successive runs in the same metadata store.

4 CONCLUSIONS
A metadata based annotation infrastructure such as Annotea can easily support a broad range of different annotation needs. The generic property mechanism of RDF allows us to construct ontology-neutral data stores. Applications can use several ontologies simultaneously to describe different aspects of their annotations.

Most work in the scenarios is needed in the customization of the user interfaces for the different annotation applications. More research is needed to ease the presentation of the metadata, especially new properties from ontologies the application (or user) may not have previously seen.

The RDF model provides a convenient mechanism on which to layer client-side or server-side inferencing for mapping between ontologies. Further work to build effective end-user tools to take advantage of this capability is in progress.

ACKNOWLEDGMENTS
We thank Jose Kahan, Eric Prud'hommeaux, Art Barstow, Eric Miller and other W3C staff for their many ideas that have contributed to this paper. We also thank Charles McCathieNevile for his help when we were experimenting with EARL scenarios.

REFERENCES
1. José Kahan, Marja-Riitta Koivunen, Eric Prud'Hommeaux, Ralph R. Swick, Annotea: An Open RDF Infrastructure for Shared Web Annotations, in Proc. of the WWW10 International Conference, Hong Kong, May 2001 (http://www10.org/cdrom/papers/488/index.html).

2. Ora Lassila and Ralph R. Swick, eds., Resource Description Framework (RDF) Model and Syntax Specification, W3C Recommendation, February 1999 (http://www.w3.org/TR/1999/REC-rdf-syntax-19990222).

3. Frank van Harmelen, Peter F. Patel-Schneider and Ian Horrocks, eds., Reference description of the DAML+OIL (March 2001) ontology markup language, Joint United States / European Union ad hoc Agent Markup Language Committee (http://www.daml.org/2001/03/reference.html).
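
APPENDIX: ILLUSTRATIVE SKETCH OF THE SCHEMAS
The properties named in Sections 3.1 to 3.3 can be pictured with a small in-memory model. The sketch below is only an illustration under stated assumptions, not the Annotea implementation: the class names, field names, the helper functions thread_for and bookmarks_by_category, and most sample values are hypothetical, and plain Python objects stand in for the RDF statements an annotation server would store. It shows how the root-of-thread property is enough to collect a discussion thread in chronological order, and how a category property lets bookmark annotations be grouped for a list view.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Annotation:
    """Basic annotation properties from Section 3.1 (hypothetical class)."""
    uri: str                        # identifier of the annotation resource
    annotates: str                  # the annotated document
    context: str                    # place of the annotation within the document
    body: str                       # content of the annotation (here a URI)
    title: str                      # dc:title
    creator: str                    # dc:creator
    created: str                    # creation time as an ISO 8601 string
    kind: str = "Annotation"        # rdf:type; "Bookmark" for Section 3.3
    category: Optional[str] = None  # extra property used by bookmarks


@dataclass
class Reply:
    """Reply properties from Section 3.2 (hypothetical class)."""
    uri: str
    reply_to: str        # previous annotation or reply in the thread
    root_of_thread: str  # first annotation in the thread
    body: str
    title: str
    creator: str
    created: str


def thread_for(root: Annotation, replies: List[Reply]) -> List[Reply]:
    """Return the replies in root's thread, oldest first."""
    members = [r for r in replies if r.root_of_thread == root.uri]
    # Same-format ISO 8601 UTC strings sort chronologically as plain text.
    return sorted(members, key=lambda r: r.created)


def bookmarks_by_category(items: List[Annotation]) -> Dict[str, List[Annotation]]:
    """Group bookmark annotations under their category for a list view."""
    groups: Dict[str, List[Annotation]] = defaultdict(list)
    for item in items:
        if item.kind == "Bookmark" and item.category is not None:
            groups[item.category].append(item)
    return dict(groups)


# Sample values loosely echo Figures 1 and 2; everything else is made up.
note = Annotation(uri="3ACF6D754", annotates="XDoc.html", context="XDoc.html#intro",
                  body="postit.html", title="This is great", creator="Ralph",
                  created="2000-01-10T17:20Z")
replies = [
    Reply(uri="r2", reply_to="r1", root_of_thread="3ACF6D754", body="reply2.html",
          title="One more point", creator="Student A", created="2000-01-11T09:00Z"),
    Reply(uri="r1", reply_to="3ACF6D754", root_of_thread="3ACF6D754", body="reply.html",
          title="I totally agree", creator="Jose", created="2000-01-10T17:20Z"),
]
bookmark = Annotation(uri="b1", annotates="whales.html", context="whales.html",
                      body="bookmark.html", title="Whale song survey", creator="Student B",
                      created="2000-01-09T10:00Z", kind="Bookmark", category="Acoustics")

print([r.title for r in thread_for(note, replies)])       # ['I totally agree', 'One more point']
print(sorted(bookmarks_by_category([note, bookmark])))    # ['Acoustics']
```

Because both helpers read only generic properties, adding a further property such as the icon mentioned in Section 3.3 would not require changing them, which mirrors the paper's point that the cost of new schemas falls mainly on the user interface.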