=Paper= {{Paper |id=Vol-209/paper-1 |storemode=property |title=Translating Documents into Semantic Documents using Semantic Web and Web2.0 |pdfUrl=https://ceur-ws.org/Vol-209/saaw06-full05-kim.pdf |volume=Vol-209 |dblpUrl=https://dblp.org/rec/conf/semweb/KimKCD06 }} ==Translating Documents into Semantic Documents using Semantic Web and Web2.0== https://ceur-ws.org/Vol-209/saaw06-full05-kim.pdf
Translating Documents into Semantic Documents using Semantic Web and Web2.0

Hak Lae Kim
Digital Enterprise Research Institute, National University of Ireland, Galway
IDA Business Park, Galway, Ireland
+353-91-495016
haklae.kim@deri.org

Hong Gee Kim
Seoul National University
28-22 Yeonkun-dong, Jongro-gu, Seoul, Korea
+82-2-7707452
hgkim@snu.ac.kr

Jae Hwa Choi
Dankook University
29, Anseo-Dong, Chonan, Chungnam, Korea
+82-41-550-3368
jchoi@dankook.ac.kr

Stefan Decker
Digital Enterprise Research Institute, National University of Ireland, Galway
IDA Business Park, Galway, Ireland
+353-91-495016
stefan.decker@deri.org


ABSTRACT

Managing the metadata of documents is a difficult and slippery task for desktop users. A wide variety of technologies have been applied to support the requirements of metadata management, ranging from acquisition and creation to maintenance, retrieval, reuse, and publishing of metadata.

We introduce the essential concepts of a semantic document and implement the necessary functionality of the metadata management process. We also propose three tasks that are required to facilitate an unambiguous representation of metadata in documents: using XMP to store metadata with the file itself, using ontologies to represent semantic concepts, and using Social Web services to interact with web-based resources. Our approach thus allows a user to interact with and share resources between a Desktop and the Web more easily.

Categories and Subject Descriptors
I.7.1 [Document and Text Editing]

General Terms
Management, Documentation, Design, Reliability, Human Factors, Languages.

Keywords
Semantic Document, Semantic Desktop, Web2.0, Folksonomy, Semantic Web.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SAAW’06, November 6, 2006, Athens, GA, USA
Copyright 2006 ACM 1-58113-000-0/00/0004…$5.00.

1. INTRODUCTION

Managing electronic documents in a Desktop is a challenging task for end users [5]. There are many kinds of applications and software components for managing electronic documents in a Desktop, but it is very difficult to organize documents consistently and to find the expected ones precisely.

There have been many efforts [2, 3, 5, 6, 13, 19, 23] to reduce the complexity of metadata operations by implementing automatic tools for acquisition, extraction, storage, and annotation. The Social Semantic Desktop [1] and Web2.0 are also technologies that promise solutions for metadata management.

The Social Semantic Desktop is a new computing paradigm that provides an advanced way to create, automate, and structure information, and “the technology convergences including the social network and community services, P2P services” [1, 3]. It can transform a typical desktop system into a collaborative environment that supports both personal computing and information sharing via social and organizational channels [17]. Approaches in this direction include Haystack¹, Gnowsis², and IRIS³.

¹ http://haystack.lcs.mit.edu/
² http://nepomuk.semanticdesktop.org/xwiki/bin/Main1/
³ http://www.openiris.org/
⁴ http://www.flickr.com
⁵ http://del.icio.us
⁶ http://www.technorati.com

Web2.0 comprises technologies and services that enable users to collaborate and share social content. From a technical point of view, it includes social software, content syndication, and messaging protocols, such as weblogs, wikis, podcasts, and RSS feeds. Social software focuses not only on connecting people but also on sharing data. It therefore plays an important role in building social networks on the web. Well-known Web2.0 sites include Flickr⁴, del.icio.us⁵, and Technorati⁶, and the majority of such sites connect people into communities, creating networks of shared experience using folksonomies and RSS [10]. In general terms, a folksonomy is a set of tags, each containing one or more keywords. Users create tags from their own knowledge; other people then use the same terms, and the content is
linked. Hence, Social Web Services combine the features of web services and social software through a folksonomy.

1.1 Problems

As the discussion of Semantic Documentation in Section 2 illustrates, desktop environments pose several critical problems for document management [6]:

Heavyweight cognitive activity. The hierarchical file structure of desktop systems allows users to find documents easily and also reminds them of the task each document belongs to. There are, however, critical limitations in using the file structure to manage information resources in a Desktop application. Whatever their working habits, users need to remember a document’s name, the directory it was saved in, and the time it was saved, among other details. Because most of these activities are carried out by the users themselves, they require heavyweight cognitive activity.

Multiple semantics. A hierarchical file system does not provide multiple semantics for a single directory. How should a user save a paper that concerns both a conference and a location? The user could create a folder named “Conference_Location” or “ConferenceLocation”. This approach is ambiguous and does not reflect the multiple semantics correctly. In other words, a computer cannot process the relationships between file names and directory names if they are named differently.

Poor updatability and interoperability. Compared with web content, Desktop content is difficult to modify without its owner’s intervention. If users spent a significant amount of time adding and modifying documents, the updatability of desktop content might be high. However, the majority of people do not spend time adding information to their documents. It is also hard to share documents with other users, despite P2P and instant messenger applications, both of which are supposed to provide file sharing services.

Editing problem. Metadata-oriented approaches provide enriched functionality such as managing, searching, and even sharing information in information systems. A variety of metadata schemes, such as RDF, Dublin Core, and vCard, exist as de facto standards. But these approaches are not a panacea. Operations on metadata are complex and time-consuming. Moreover, metadata is stored separately from the document and is connected to it by external references or links such as XPointers. When a document is edited, deleted, or copied, maintaining these links becomes a problem. The Open Hypermedia community has termed this the editing problem. A straightforward solution to the “editing problem” [4] is to embed the metadata in the document itself.

1.2 Contributions

We present three contributions. (i) We propose an architecture and implement a tool for interaction between a Desktop and the Web. It bootstraps the management of metadata and encourages users to participate in information management activity. (ii) We show how desktop documents can be enriched using existing technologies such as the Semantic Web and Web2.0. Ontology-based and folksonomy-based metadata are important parts of our system. Metadata generated by a user can be saved in the document itself as XMP, so that other users can easily reuse and share it. (iii) We provide a user-friendly interface for extracting or creating metadata and for efficient navigation through ontologies and tags.

1.3 Outline of the paper

The main part of this paper is about how desktop systems can use resources to enrich the metadata in documents. We use the Social Semantic Desktop and Web2.0 technologies to make semantic documents in a Desktop. In particular, we focus on PDF (Portable Document Format), one of the best-known document formats, and on XMP, which represents embedded metadata in PDF.

The remainder of this paper is structured as follows: Section 2 defines Semantic Documentation and proposes the Semantic Document Model for our research. Section 3 then explains the design principles. Section 4 describes the system architecture and the metadata management process for a semantic document. Finally, the paper concludes with Section 5.

2. Semantic Documentation

2.1 Semantic Document

Lawrence (Lawrence et al., 2004) defines semantic annotation as “the process of mapping instance data” to a semantic structure such as an ontology. A semantic document includes any information regarding the document and its relationship with other documents [27]. A semantic annotation of documents formally identifies concepts and relations between concepts in documents, and is intended primarily for use by machines [28]. Semantic annotation is therefore a key notion and a basic technology for the realization of a semantic document. It is an augmentation of data that facilitates automatic recognition of the underlying semantic structure, such as document structure (title, section, paragraph, etc.), linguistic structure (dependency, coordination, thematic role, coreference, etc.), and so forth. It is based on semantic links between the information stored within a document and the ontology. Ontologies are conceptualizations of a domain that are typically represented using a domain vocabulary.

2.2 PDF and XMP

PDF is an open document format developed by Adobe. Most authors and publishers use it to store and view documents. Using PDF as the basis for semantic documents has several advantages. PDF supports on-line viewing and printing while containing semantic information linked to the document itself [26], and it provides an extensible way to add new information inside a document using XMP.

In a nutshell, XMP (eXtensible Metadata Platform) is a format for embedding metadata in documents. It is a labeling technology that allows users to embed data about a file, known as metadata, into the file itself [10, 11, 15]. It consists of a data model, a storage model, and schemas. The data model is a useful and flexible way of describing metadata in documents. It defines the kinds of metadata values and concepts that can be represented. The storage model, as the implementation of the data model, includes the serialization of the metadata as a stream of XML and XMP Packets, a means of packaging the data in files [10]. Schemas are predefined sets of metadata property definitions that are relevant for a wide range of applications, including all of Adobe’s editing and publishing products, as well as for applications from a wide variety of vendors.

The specific serialization syntax is less important than the data model: as long as the mapping to the data model is well defined, it is reasonably easy to convert between different ways of writing the metadata [11]. XMP makes use of the Resource Description Framework (RDF), serialized as XML. By adopting the RDF standard, XMP benefits from the documentation, tools, and shared implementation experience that come with an open W3C standard [7-10].
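To make the storage model more concrete, the following minimal sketch reads Dublin Core properties from an XMP packet using only the Python standard library. The sample packet, its values, and the helper names `extract_xmpmeta` and `read_dc_properties` are illustrative assumptions and are not taken from the system described in this paper (which uses the Jena RDF API); the RDF and Dublin Core namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical XMP packet, roughly as it might be embedded in a PDF file.
SAMPLE_PACKET = """<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:title><rdf:Alt><rdf:li xml:lang="x-default">Sample paper</rdf:li></rdf:Alt></dc:title>
   <dc:subject><rdf:Bag><rdf:li>semantic desktop</rdf:li><rdf:li>folksonomy</rdf:li></rdf:Bag></dc:subject>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
<?xpacket end="w"?>"""


def extract_xmpmeta(data: str) -> str:
    """Cut the <x:xmpmeta> element out of the surrounding packet wrapper."""
    start = data.index("<x:xmpmeta")
    end = data.index("</x:xmpmeta>") + len("</x:xmpmeta>")
    return data[start:end]


def read_dc_properties(packet: str) -> dict:
    """Return the Dublin Core properties of a packet as name -> list of values."""
    root = ET.fromstring(extract_xmpmeta(packet))
    props = {}
    for desc in root.iterfind(".//rdf:Description", NS):
        for child in desc:
            if not child.tag.startswith("{" + NS["dc"] + "}"):
                continue  # skip properties from other schemas
            name = child.tag.split("}", 1)[1]
            # Array-valued properties keep their items in rdf:li elements.
            items = [li.text for li in child.iterfind(".//rdf:li", NS)]
            props[name] = items if items else [child.text]
    return props


if __name__ == "__main__":
    print(read_dc_properties(SAMPLE_PACKET))
```

Because the metadata is ordinary RDF/XML, any XML or RDF toolkit can read it once the packet has been located in the file; the sketch does not cover locating the packet inside a real PDF.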

2.3 Semantic Document Model

In this section, we describe the Semantic Document Model, in which users manage the metadata of their documents. Most users carry out their information management activity with both desktop and web applications; we therefore describe a conceptual model for managing metadata using desktop resources and the resources of social web sites. The Semantic Document Model consists of a number of ontologies that define a metadata structure. We propose the document schema ontology⁷ for describing the metadata of a document. It holds the locally maintained, interlinked, and highly structured semantic information of each document. We propose the document type ontology to describe the publication types of research communities and relevant concepts: proceedings, thesis, article, technical report, etc. A domain ontology describes a certain subject that is closely related to the content of a document. Users may extend it as they need. Furthermore, users can obtain valuable tags from various sources such as social web sites and users’ blogs.

⁷ http://www.blogweb.co.kr/research/ontology

Figure 1 shows the Semantic Document Model, which defines the types of information. It contains the physical information and basic content metadata of a document, which conventional file systems already support. In addition, a semantic document contains social information and ontological information.

Figure 1 Semantic Document Model

3. Design Principles

In this section, we describe the basic design principles, which are founded on the general problems sketched in the introduction. The key processes are extraction, creation, storage and indexing, and search. Table 1 gives an overview as a matrix: it shows which processes are mainly used to answer each of the challenges set forth in the introduction.

Table 1 Design Principles

Problems \ Processes             Extraction   Creation   Storage & Index   Search
Heavyweight cognitive activity       X           X                           X
Poor updatability                    X                          X            X
Multiple semantics                               X                           X
Editing problem                                                 X            X

Extraction. To reduce the heavyweight cognitive activity of a user, the extraction process uses semi-automatic or automatic methods. Its results include the metadata of documents and physical information such as file name, size, and date. In addition, this process should extract metadata from weblogs or social web services.
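As a minimal sketch of the automatic part of extraction, the physical information mentioned above (file name, size, and date) can be read directly from the file system. The function name `physical_info` and the keys of the returned dictionary are hypothetical and chosen only for illustration; they are not part of the tool described in this paper.

```python
import os
from datetime import datetime, timezone


def physical_info(path: str) -> dict:
    """Collect the physical information a file system keeps about a document."""
    st = os.stat(path)
    return {
        "file_name": os.path.basename(path),
        "folder": os.path.dirname(os.path.abspath(path)),
        "size_bytes": st.st_size,
        # Last modification time as an ISO 8601 string in UTC.
        "modified": datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat(),
    }
```

Everything in this record is available without user effort, which is why it addresses the heavyweight cognitive activity problem; metadata from weblogs or social web services would need a separate, network-based step that is not sketched here.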
Creation. To generate or modify metadata, users can draw on various sources such as ontologies, tags, and even physical information. Users can define their own knowledge structures, which are called domain ontologies. Tagging is another, newer approach to creating metadata. Creating this metadata must be supported by tools.

Storage & Index. The metadata of a document must reside in the document itself to avoid the editing problem. The metadata should also contain the URIs of web resources, which become a starting point for connecting to the Web.

Search. This process must cover ontology-based and tag-based search. The search results must be connected to other resources through URIs. For example, a user assigns tags, together with the URIs of web resources, at a particular time. When the user searches later, the tags may return results that the user did not anticipate, because tags and folksonomies evolve on their own. This addresses the problem of poor updatability and interoperability in a Desktop.

4. Implementation

Figure 2 illustrates our architecture, designed in response to the opportunities for functionality identified in the previous section. In this architecture, the metadata of documents is created from two different sources: ontologies and folksonomies. The idea behind this is based on the following observations. Ontologies are “intentional models” of information contents with a well-defined logical basis which can be used for reasoning [13]. A folksonomy provides a shared meaning through collaborative work on the Web. Although ontologies and folksonomies take different approaches to establishing meaning, they can supplement each other in the process of creating and searching metadata.

Figure 2 Architecture

The metadata of a document is first extracted from the document itself. The Metadata Extractor parses the metadata inside the document and delivers it to the Metadata Explorer. When creating metadata, users can also obtain valuable information from various sources such as folksonomies, users’ blogs, or even ontologies. All kinds of metadata are then saved in the PDF file itself as XMP.

Each document, including its metadata, is indexed automatically. This allows users to search using the domain ontology or tags. Search results contain relevant data such as raw file information, ontology concepts, and tags from the embedded metadata. If users want to see web resources together with the relevant results, they can retrieve the lists of terms from specific blogs or social web service sites.

To solve the general problems and support the processes mentioned in the introduction, we provide core UIs such as the Metadata Explorer, the Ontology Editor, and the Tag Generator. Tool support is an essential component of the semantic document approach.

• Metadata Extractor: extract metadata from a document
• Metadata Explorer: view, create, and modify metadata
• Ontology Editor: view and edit an ontology
• Tag Generator: create and view tags
• Search: keyword-based and ontology-based search

In the following subsections we explain the concrete realization and processes.

4.1 Metadata Extraction

Metadata Extraction is an internal process. Users do not need to know how it works, since XMP is machine-readable metadata. The XMP handler extracts the XMP metadata using the Jena RDF API and displays each item in the Metadata Explorer (see Figure 3).

Figure 3 Metadata Explorer

The Metadata Extractor automatically extracts embedded metadata if a document contains any, and the Metadata Explorer shows the metadata items. Users can add or modify metadata directly in its editable fields. Unfortunately, some items (subject, tags, etc.) have to be added manually. The following section describes two ways to add metadata to a document. Because the tool provides a user-friendly interface, users save time and effort when creating metadata.

4.2 Metadata Creation

Insert ontology concepts. Users can define their own ontology using the Ontology Editor. It provides functionality for editing and browsing an ontology and allows users to define and update the ontology in a tree structure. The Subject item, which corresponds to [dc:subject] in Dublin Core, is related to a specific domain ontology in our system. The Type item, which corresponds to [dc:type], is related to the document type ontology and describes the type of a document. In the Ontology Editor, users select a node to insert it into the Subject or Type item of the Metadata Explorer.

Figure 4 Tag Generator

Insert tags. We provide several functions for adding tags. Users can add tags from social web services using the TagCloud⁸ interface, which shows folksonomies from Flickr, Del.icio.us, etc. In addition, users who want to create tags automatically can use the Tag Generator (see Figure 4). It is based on Yahoo’s Content Analysis web service⁹, which is a context extraction web service. This service retrieves terms extracted from a given text [13]. The tags that users select are added to the Keyword item in the Metadata Explorer.

⁸ http://www.tagcloud.com
⁹ http://developer.yahoo.net/search/content/V1/

After the relevant items have been inserted, the metadata can be saved in the file as well-defined data in RDF format. One of the main advantages of serializing XMP as RDF is its potential to become ubiquitous as a cross-platform container for machine-readable and machine-processable metadata [20].

Ontological concepts and tags can be assigned to a document, so a document on the desktop no longer has to be confined to a single folder. This removes the restriction on multiple semantics in the desktop. In addition, the tags carry relevant URIs or feeds on the Web, so the metadata can evolve without human intervention. Desktop documents can thus evolve by being connected to Social Web services.
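The result of metadata creation can be sketched as a small RDF/XML fragment. The example below is a sketch under stated assumptions, not the paper's implementation: the function name `build_xmp_rdf` is hypothetical, ontology concepts are written to `dc:subject` and the document type to `dc:type` as described above, and storing the selected tags in `pdf:Keywords` is an assumption, since the text only names a Keyword item. Embedding the fragment into an actual PDF file is not shown.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
PDF_NS = "http://ns.adobe.com/pdf/1.3/"  # Adobe PDF schema namespace

# Use the conventional prefixes when serializing.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)
ET.register_namespace("pdf", PDF_NS)


def _bag(parent: ET.Element, tag: str, values: list) -> None:
    """Append an unordered array (rdf:Bag) property to parent."""
    prop = ET.SubElement(parent, tag)
    bag = ET.SubElement(prop, f"{{{RDF_NS}}}Bag")
    for value in values:
        ET.SubElement(bag, f"{{{RDF_NS}}}li").text = value


def build_xmp_rdf(doc_type: str, concepts: list, tags: list) -> str:
    """Serialize a document type, ontology concepts, and tags as RDF/XML."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF_NS}}}Description",
                         {f"{{{RDF_NS}}}about": ""})
    _bag(desc, f"{{{DC_NS}}}type", [doc_type])      # Type item
    _bag(desc, f"{{{DC_NS}}}subject", concepts)     # Subject item
    # Assumption: the Keyword item is stored as a comma-separated string.
    ET.SubElement(desc, f"{{{PDF_NS}}}Keywords").text = ", ".join(tags)
    return ET.tostring(rdf, encoding="unicode")


if __name__ == "__main__":
    print(build_xmp_rdf("article", ["Semantic Web"], ["xmp", "folksonomy"]))
```

Keeping concepts and tags in separate properties lets a search component treat ontology-based and tag-based metadata differently while both travel inside the same file.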
                                                      Figure 5 Unified Search View


                                                                      one computer at this moment. So if users want to make multiple
4.3 Indexing and Search                                               one, they should select upper level folder.
                                                                         Users may search for more specific information regarding the
We build an index using XMP which already embedded in PDF             topics or keywords, but are not sure how to narrow their search.
file. We use Jena 10 to parse the XMP data and Jakarta Lucene 11 to   Although they are typing in several terms, they cannot sure
index metadata. This is the most popular document indexing and        results. Our tools are able to help users in narrowing down their
search library available for Java and .Net. Since Lucene by itself    search range using the Ontology Editor and to search related
will accept and process only plain text, some kind of adapter must    items using the results.
be used that can extract plain text from PDF files in order for
those files’ content to be added to a Lucene index. This process is   Ontology-Based Search. The search component executes a
done using the XMP Parser class module in Jena. With                  search across the ‘subject’, ‘title’, ‘keyword’ and ‘description’
Jena/Jakarta Lucene user can select a folder they want to build an    metadata fields as well as the text of PDF files. If a user cannot
index. This is quiet simple. User clicks the Browser button, and      find a start term, he or she can use the Ontology Editor. The
then chooses the folder. But we don’t provide multiple indexes in     search results display the ‘file name’, ‘title’, ‘description’, ‘date’,
                                                                      ‘format’ and ‘weighted score’ and ‘format’ metadata fields. The
                                                                      weighted score is a weighted primary according to the subject
10
     http://jena.sourceforge.net                                      filed in the metadata. The Ontology Viewer is used for a refined
11
     http://lucene.apache.org/java/docs/                              searching. If user chooses several terms in the Ontology Editor,
then results change automatically. It allows user to combine any       [3] Sauermann, L., The gnowsis semantic desktop for information
fields such as subject, title, description.                               integration, In: 1st Workshop on Intelligent office appliances,
                                                                          2005
                                                                       [4] Leslie. C, Timothy. M.B, and Arouna. W, The Case for
Tag-Based Search. This function gathers RSS feeds from a set of           Explicit Knowledge in Documents”, DocEng’04, 2004.
selected remote tags. When a user chooses a keyword in their           [5] H.L. Kim, H.G. Kim, and K.M. Park, Ontalk:Ontology-Based
results, it collects the related feeds with the selected keyword       Personal Document Management System, WWW2004.
from the remote web blog. The data is collected simultaneously
when the search executes. Currently we selected a list of RSS          [6] H.L. Kim, H.G. Kim, and Decker,S., Semantic Documentation
feeds consisting of several web blog sites. The tag-based search          using Semantic Web Technologies and Social Web Services,
interacts with the information published in user’s blog. It tries to      In:Proc. International Conference on Next Generation Web
enrich users’ metadata with associated information in web.                Services Practices (NWeSP'06), 2006

Figure 5 shows the search results, which include file information,
ontology, and folksonomy. That is, our tool provides a unified
search view. First, a user can see the physical information of
files. Even though Windows Explorer already provides this function,
the Result View is useful because it includes not only the file name
and folder but also the content's title, keywords, and concepts.
Second, if a user wants to see more detailed metadata, they click an
item in the result list, which opens the Metadata Explorer. Finally,
a user is able to reuse keywords from the tag clouds in the blog,
which are attached to raw files as metadata. If a user wants to see
blog entries relevant to the results, they click a keyword in the
results and get the full list of entries for the clicked term.

5. Conclusions and Future Work

    This paper describes a means of managing semantic documents by
leveraging two kinds of metadata: ontology-based and tag-based. To
enable documents to be used unambiguously by both humans and
machines, metadata should be represented as an explicit part of the
document. The document schema ontology contains ontological concepts
as well as social, collective tags. Furthermore, metadata can exist
as an embedded object in the document rather than being stored
separately from it. Embedded metadata stays with the file content
itself regardless of whether the file is moved or modified. The
documents can then be indexed and searched by semantic tools. Hence,
making semantic documentation an explicit, embedded part of the
document makes the metadata management process easier to support.
    We have focused mainly on the PDF format, but we plan to process
other formats such as JPEG, GIF, and Microsoft Office formats. Our
future work will focus in more detail on the mechanisms for
interaction and feedback between the Desktop and the Web. The
approach, model, and techniques of this research will be explored
further in our future work.

6. ACKNOWLEDGMENTS
We also thank our colleague Dr. Handschuh for his continued
guidance and his assistance with information for this paper.
                                                                       Annotation Workshop, Florida, USA. (2003) Part of the Second
7. REFERENCES                                                          International Conference on Knowledge Capture, K-CAP 2003.
[1] Decker, S., Frank, M. The networked semantic desktop, In:          [21] Fillies, C., Wood-Albrecht, G., Weichardt, F.: A Pragmatic
   Workshop on application design, development and                     Application of the Semantic Web using SemTalk. In: Proceedings
   implementation issues in the semantic web. 2004.                    of the Eleventh International World Wide Web Conference,
[2] D.Quan, D.Huynh, and D.R. Karger., Haystack: A Platform            Honolulu, Hawaii, USA. (2002) 686-692
   for Authoring End User Semantic Web Applications, In
                                                                       [22] Ontoprise GmbH: OntoOffice Tutorial.
   International Semantic Web Conference 2003, 2003
                                                                       http://www.ontoprise.de/documents/tutorial ontooffice.pdf (2003)
[23] Carr, L., Miles-Board, T., Wills, G., Woukeu, A. and Hall, W.   [28] J. Heflin, J. Hendler and S. Luke: SHOE: A Knowledge
(2004) Towards a Knowledge-Aware Office Environment. In              Representation Language for Internet Applications, Technical
Proceedings of 5th International Conference on Practical Aspects     Report CS-TR-4078 (UMIACS TR-99-71), 1999.
of Knowledge Management (PAKM 2004) LNAI 3336, pp. 129-              [29] Guoren, W., Bin, W., Donghong, H., and Baiyou, Q.: Design
140, Vienna, Austria. Karagiannis, D. and Reimer, U., Eds.           and Implementation of a Semantic Document Management
[24] Martin, P & Eklund, P: Embedding Knowledge in Web               System, Information Technology Journal 4 (1): 21-31, 2005
Documents, In: Proceedings of the 8th Int. World Wide Web Conf.      [30] Uren, Victoria; Cimiano, Philipp; Iria, Jose; Handschuh,
(WWW’8), Toronto, May 1999, 1403-1419                                Siegfried; Vargas-Vera, Maria; Motta, Enrico; Ciravegna, Fabio.;
[25] Anita, D., W., Gerard, T.: The ABCDE Format: Enabling           Semantic Annotation for Knowledge Management: Requirements
Semantic Conference Proceedings,                                     and a Survey of the State of the Art, Journal of Web Semantics 4
[26] Henrik Eriksson: A PDF Storage Backend for Protégé,             (1):14-28, 2006
http://protege.stanford.edu/conference/2006/submissions/abstracts    [31] Lawrence Reeve, Hyoil Han: Technical Report: Semantic
/9.4_Protege-2006-Eriksson.pdf                                       Annotation Platforms,
[27] S. Staab, A. Maedche, and S. Handschuh.: An annotation          http://www.pages.drexel.edu/~lhr24/pubs/2004SemanticAnnotatio
framework for the semantic web. In Proceedings of the First          nTechnicalPaper.pdf, 2004.
Workshop on Multimedia Annotation, Tokyo, Japan, January 30-
31, 2001.