<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SALT: Semantically Annotated LaTeX</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tudor Groza</string-name>
          <email>tudor.groza@deri.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Siegfried Handschuh</string-name>
          <email>siegfried.handschuh@deri.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hak Lae Kim</string-name>
          <email>haklae.kim@deri.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Digital Enterprise Research Institute</institution>
          ,
          <addr-line>IDA Business Park, Lower Dangan, Galway</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>11</lpage>
      <abstract>
        <p>Machine-understandable data constitutes the basis for the Semantic Desktop. In this paper we provide means to author and annotate Semantic Documents on the Desktop. In our approach, the PDF file format is the basis for semantic documents, which store both a document and the related metadata in a single file. To achieve this we provide a framework, SALT, that extends the LaTeX writing environment and supports the creation of metadata for scientific publications. SALT lets the scientific author create metadata while putting together the content of a research paper. We discuss some of the requirements one has to meet when developing such an ontology-based writing environment and we describe a usage scenario.</p>
      </abstract>
      <kwd-group>
        <kwd>LaTeX</kwd>
        <kwd>semantic annotation</kwd>
        <kwd>semantic document</kwd>
        <kwd>authoring</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>The vision of the Semantic Desktop aims at integrated
personal information management as well as at information
distribution and collaboration. This will be enabled by the use of
ontologies, semantic metadata (that is, machine-understandable data)
and semantic web protocols. Hence, semantic metadata constitutes
the basis for the Semantic Desktop. Authoring and annotating
semantic documents on the desktop is one means of creating semantic
metadata.</p>
      <p>In this paper we provide means to author and annotate Semantic
Documents on the Desktop. In our approach, the PDF file format
is the basis for semantic documents, which store both a document
and the related metadata in a single file. To achieve this we provide
a framework, SALT, that extends the LaTeX writing environment and
supports the creation of metadata for scientific publications. SALT
lets the scientific author create metadata while putting together the
content of a research paper.</p>
      <p>
        Previous work on the creation of semantic metadata and the
annotation of documents has mainly concentrated on the annotation of
HTML documents for the semantic web. Most of these HTML
annotation tools [
        <xref ref-type="bibr" rid="ref14 ref26 ref5">14, 26, 5</xref>
        ] follow an a posteriori
approach: in order to provide metadata about the contents of a web
page, the author must first create the content and then annotate
it in an additional, a posteriori, annotation step.
      </p>
      <p>
        The a posteriori approach is reasonable when the annotator is not
the owner of the web document, as is commonly the case on the
web. However, a posteriori annotation puts an additional load on
the author when the author is also the annotator. A way out
of this problem is to combine the authoring of a
document with the creation of the metadata describing its content.
First steps towards this for HTML documents in the web context
are described in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        HTML is the document format for the web, and thus research on
semantic annotation is centered around it. However, an important and
dominant format on the desktop is the Portable Document Format (PDF).
PDF can currently be seen as the de facto standard for
electronic publishing, especially in research. Nevertheless,
we observed that only a small number of solutions exist for the
a posteriori semantic annotation of PDF documents ([
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]). Also, to
our knowledge, there is no clearly defined approach yet for a priori
PDF annotation.
      </p>
      <p>Our approach proposes a method for creating a priori
annotations for PDF documents by exploiting the rich environment
provided by LaTeX. We support the method with a document ontology
mapping the internal structure of the document, a rhetorical
structure ontology describing the argumentative structure of research
papers, and an annotation ontology gluing the annotation to the
document and providing additional metadata information. The
annotation process takes place while writing, and the actual integration is
realized at syntax level by exploiting regular LaTeX commands plus
the introduction of special annotation commands. The final result is
a semantic PDF document encapsulating instances
of the aforementioned ontologies.</p>
      <p>In the following we describe the preliminaries of this work
(Section 2) and sketch a use-case (Section 3). Then, we give an overview
of the annotation and publication process (Section 4). In Section 5,
we describe the modularization of the ontologies used and
introduce the annotation syntax. Before we conclude, we give an overview
of related work and discuss some aspects of our solution.</p>
    </sec>
    <sec id="sec-2">
      <title>PRELIMINARIES</title>
      <p>In this section we provide definitions of important terms that we use
subsequently and we explain basic design decisions.</p>
      <p>Terminology</p>
      <p>• Semantic Document: A semantic document includes any
information regarding the document and its relationship with
other documents. In our case this is a PDF document
enriched with semantic annotations. A Semantic Document
can explicitly refer to another document by using
ontological relations. For example, document A refers to a claim in
document B, by referring to the URI of the claim, and
provides counter arguments.</p>
      <p>• Semantic Annotation: The term Semantic Annotation
describes a process as well as the outcome of the process. Hence
it describes i) the process of adding semantic data or
metadata to the document given an agreed ontology and ii)
the semantic data or metadata itself as a result of
this process. In our context a semantic annotation is a set of
instantiations attached to a PDF document. We distinguish
i) instantiations of RDF classes, ii) instantiated properties
from one class instance to a datatype instance (also called
attribute instances), and iii) instantiated properties from one class
instance to another class instance.</p>
      <p>• Annotation Ontology: We use this term to denote a
vocabulary which relates instances of a document ontology with
annotations. The annotations could be instances of an
arbitrary ontology. In our case these are either instances of i)
the rhetorical structure ontology or ii) a domain ontology
associated with the topic of the document (e.g. about biology).
The annotation ontology describes what an annotation is and
which relations are possible between the subject and the
object of annotation. Further, the annotation contains attributes
that can be used to describe the metadata of a document,
such as its author or title (cf. Section 5.1.2).</p>
      <p>• Document Structure and Type Ontology: In our context this
is an explicit shared formal specification of a document. It
contains the document structure, the type, the organization and
the relationship between documents and other concepts. We
will call this ontology the Document Ontology (cf. Section 5.1.1)
for short.</p>
      <p>• Rhetorical Structure Ontology: We use this term to
denote a vocabulary modeling the rhetorical structure of the
text (RST) inside a document (cf. Section 5.1.3). RST
captures the roles of every part of the text and tries to provide a
plausible reason for its presence. RST describes the text on a
generic and on a specific level.</p>
      <p>
        – The generic level describes parts of a scientific
document such as motivation, background, scenario or
contribution. The generic level is thus a modification and
extension of the ABCDE format [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. As opposed to the
ABCDE format, we did not have an application for the
Annotation and Entities parts, since these are covered by
our Annotation Ontology. However, the ABCDE format lacks other parts
such as Motivation and Scenario.
– The specific level denotes rhetorical relations, for
example, Concession, Circumstance or Means (cf. [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]),
and thus allows a fine-grained description of the
argumentation in a scientific document.
      </p>
    </sec>
    <sec id="sec-3">
      <title>HTML, PDF and XMP</title>
      <p>
        HTML documents offer access to their
constituent objects, such as text or images, because of their implicitly
structured, text-based format. The same cannot be said of
PDF documents. They have a totally different internal organization,
representing a combination of several types of complex objects and
streams [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] together with their associated properties. Thus,
post-creation analysis of the content depends on a handful of parameters,
such as access rights and the accuracy of image analysis or text retrieval
algorithms.
      </p>
      <p>
        A similar situation arises when analyzing the
annotation support. HTML documents allow metadata (annotations),
including instances of ontologies, to be stored directly inside them
without the need for complex operations. In the case of PDF documents,
this support is split between capturing metadata using a limited set
of DublinCore [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] elements in the XMP [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] field, and creating
annotations in the form of, for example, notes or markups. There is
no natural way of embedding instances of ontologies in PDF
documents without either changing the internal structure of the document,
which can be done using the Adobe SDK [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], or re-modeling the
XMP field. Our approach follows the second possibility, by
encapsulating in the XMP field instances of the document ontology, the annotation
ontology and the rhetorical structure ontology, as well as arbitrary
annotations of the user.
      </p>
    </sec>
    <sec id="sec-4">
      <title>USE-CASE</title>
      <p>In the following, we describe a use-case<sup>1</sup> that is supported by
SALT and that has guided our development of the framework. The
use-case requires the generation of metadata given by a PDF
document.</p>
      <p>The use-case shows how a semantic document enables easy,
low-effort information distribution, collaboration and integration
for the purpose of innovative online workshop proceedings. The
goal is not only to ease the creation of the online
proceedings but also to provide added value to their readers,
so that the scientific contributions in the papers
are easier to read and browse.</p>
      <p>The online publication of the accepted workshop
papers is usually done manually. The editor typically creates a list
composed of the authors and the titles and links the corresponding
PDF document to each entry<sup>2</sup>.</p>
      <p><sup>1</sup> The use-case is inspired by discussions with Anita de Waard, see
also http://wiki.ontoworld.org/index.php/ABCDEF</p>
      <p>However, additional information can easily be retrieved, provided
that each author uses the SALT framework for
writing the scientific document. SALT enables a combination of
i) automatically retrieved annotations based on an analysis of the
LaTeX commands used, ii) annotations from the user about the rhetorical
structure of the document, and iii) arbitrary annotations of the
document. Hence, among other things, the semantic metadata will describe the
underlying ideas in the paper, which can easily be exploited when
presenting the proceedings.</p>
      <p>Figure 1 depicts the information workflow in the current
scenario. We assume that the accepted papers are enriched with instances of our
rhetorical structure ontology. We take advantage of this in the first
processing phase and generate an individual HTML page for each
paper, containing the usual metadata plus the annotations captured
by the rhetorical structure. The second phase of the process iterates
over all created pages and generates an entry point in the form of
an index page.</p>
      <p>The index page gives a short overview of all papers, but more
information – generated from the metadata – is available. Readers
can quickly glance through the contribution and skip to the section
they are interested in. For example, the context of each paper is
shown, the background and the contribution, but also the individual
claims are available.</p>
    </sec>
    <sec id="sec-5">
      <title>ANNOTATION AND PUBLISHING</title>
      <p>We implemented SALT and the workshop proceedings
publication scenario as two independent modules. The first module creates
and embeds the metadata into the document, while the second one
uses them to achieve the needed functionality. Figure 2
presents the organization of the two modules together with the
third-party components used. In the following, we detail them separately.</p>
    </sec>
    <sec id="sec-6">
      <title>The SALT Process</title>
      <p>The SALT module is responsible for embedding the instances
of the mentioned ontologies into the resulting PDF document.
Reaching the final result requires a series of processing steps,
described in the following.</p>
      <p>Syntax analysis and annotation extraction. This first step takes
as input the LaTeX document, parses it (Parser component)
and extracts the annotations present in it (Syntax Analyzer
component), based on the three types of syntactic
modifications detailed in Section 5.2. The result of this analysis
process is a second LaTeX document (in an intermediary stage)
and two sets of metadata: one which serves the population
of the semantic layer, i.e. the ontologies with instances, and
the other creating the foundation for the PDF visual notes.
Based on the output provided by this step, the following two
steps could theoretically be performed in parallel.</p>
      <p>PDF notes embedding. The Syntax Transformer component takes
the second metadata set (as described above) and, based on its
analysis, creates the appropriate PDF visual notes by
making use of the special commands provided by LaTeX. All the
annotations are then introduced into the LaTeX intermediary
document in their original positions (extracted together with
the original annotations).</p>
      <p><sup>2</sup> For examples see the workshop online proceedings at
CEUR-WS.org</p>
      <p>
        Annotation analysis and ontology instantiation. In parallel to the
previous step, the first metadata set is also analyzed. In this
case, the focus is on the N3-like statements introduced in the
usual LaTeX commands and on the commands pointing to the
rhetorical structure of the document. Using the Syntax
Analyzer in combination with Jena’s [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] N3 to RDF transformer,
this step creates the appropriate
instances of the ontologies in RDF format.
      </p>
      <sec id="sec-6-1">
        <title>Final PDF document compilation</title>
        <p>
          This final step has as input the
LaTeX intermediary document and the instances created in the
previous step. Its goal is to combine a pdfLaTeX compiler (in
our case MiKTeX<sup>3</sup>) with the XMP LaTeX package [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] and
transform the input set from LaTeX to PDF. The resulting PDF
document incorporates the instances of the two
ontologies and the visual notes.
        </p>
        <p>The whole module is packaged as a stand-alone component and
can be used from a command line interpreter or integrated as a
library in a writing environment.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>The publishing Process</title>
      <p>The publishing module takes as input a PDF document, or a list
of PDF documents, and provides as output one or several HTML
documents. The transformation process contains the following steps:
• extraction of the instances of the ontologies embedded in the PDF document(s)
• interpretation of the extracted metadata
• creation of the HTML documents based on some preferences
expressed by the user</p>
      <p>The first step is realized using the BFO PDF library<sup>4</sup>, which
provides the means for metadata extraction from PDF files. The
resulting stream is passed to the Metadata Extractor component, which
separates the instances of the document, annotation and rhetorical
ontologies and prepares them for interpretation.</p>
      <p>For publication, the user can specify a series of parameters
dealing with visual aspects of the publication, like font sizes,
positioning or color, and with content aspects, for example, which
annotated parts (or metadata) should be published. All these preferences
are taken into account when interpreting the extracted instances and
applied during the creation of the HTML documents. The whole
process is iterative, starting from the first specified file and ending with the last
one. As a finishing touch, the HTML Builder creates
the index file pointing to all previously created HTML
documents.</p>
      <p><sup>3</sup> http://www.miktex.org/</p>
      <p><sup>4</sup> http://big.faceless.org/products/pdf/</p>
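      <p>The last two steps of the publishing process can be illustrated with a minimal sketch, assuming that the extracted metadata of each paper is already available as a plain dictionary. The function names, the dictionary keys and the include preference are hypothetical and chosen only for illustration; they are not the interface of the HTML Builder, and the PDF extraction step is left out.</p>
      <preformat><![CDATA[
```python
from html import escape


def build_paper_page(paper):
    """Render the usual metadata plus the selected rhetorical blocks of one paper."""
    blocks = "".join(
        "<h2>{}</h2><p>{}</p>".format(escape(name), escape(text))
        for name, text in paper["rhetorical_blocks"].items()
    )
    return "<html><body><h1>{}</h1><p>{}</p>{}</body></html>".format(
        escape(paper["title"]), escape(", ".join(paper["authors"])), blocks
    )


def build_proceedings(papers, include=None):
    """Return a mapping from file name to HTML text, index page included.

    `include` is a hypothetical user preference: the set of rhetorical
    blocks to publish (None publishes all of them).
    """
    pages = {}
    entries = []
    for number, paper in enumerate(papers, start=1):
        selected = dict(paper)
        if include is not None:
            selected["rhetorical_blocks"] = {
                name: text
                for name, text in paper["rhetorical_blocks"].items()
                if name in include
            }
        file_name = "paper{}.html".format(number)
        pages[file_name] = build_paper_page(selected)
        entries.append(
            '<li><a href="{}">{}</a></li>'.format(file_name, escape(paper["title"]))
        )
    # The index page is created last and points to every generated page.
    pages["index.html"] = "<html><body><ul>{}</ul></body></html>".format(
        "".join(entries)
    )
    return pages
```
]]></preformat>
      <p>In this sketch the content preference is applied before a page is rendered, so that parts the user did not select never reach the generated HTML.</p>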
    </sec>
    <sec id="sec-9">
      <title>THE SYNTACTIC AND SEMANTIC LAYERS</title>
      <p>As briefly discussed in Section 2.2 one can embed annotations
in PDF documents by filling the XMP field with DublinCore
metadata elements or by making use of notes, bookmarks or markups.
We propose a method for the creation of Semantic Documents by
exploiting and extending the two aforementioned approaches. The
actual transformation combines two interlinked layers: a semantic
layer and a syntactic layer.</p>
      <p>The semantic layer consists of the three ontologies: the document
ontology, the annotation ontology and the rhetorical structure
ontology (cf. Section 2.1). The metadata based on these ontologies is placed
in the XMP field, thus extending the regular DublinCore
elements of a PDF document.</p>
      <p>The syntactic layer proposes the enrichment of the LaTeX
syntax with i) an analysis of the commands used, ii) the provision of
additional commands and iii) arbitrary annotation of the document
based on N3 statements. This level has the goal of creating a semantic
bridge between the actual document and its metadata.</p>
      <p>The motivation for introducing these two layers lies in the
need for a much richer platform for embedding semantic
annotations, which should also benefit from the visual impact offered by the
usual PDF annotation means. In the following, we detail both the
semantic and the syntactic layers.</p>
    </sec>
    <sec id="sec-10">
      <title>The semantic layer</title>
      <p>The goal of the semantic layer is to define a proper semantic
framework supporting the entire annotation process. We used three
levels, each level represented by an ontology:</p>
      <p>Document structure level, capturing the ordinary structure of the
document.</p>
      <p>Annotation level, creating the bridge between the rhetorical
structure and the ordinary structure. It also captures additional
metadata about the document.</p>
      <p>Rhetorical level, which models the document in terms of rhetorical
elements and builds its rhetorical structure.</p>
      <p>An overall image of the organization of the semantic layer is
presented in Figure 3. In the following we detail each of the three
ontologies.</p>
      <sec id="sec-10-1">
        <title>The Document Ontology</title>
        <p>
          The document ontology, depicted in Figure 4, captures the
structural layout of the document and maintains instances of the
annotated parts of the document. This represents an intermediary
solution until we are able to use the XPointer framework [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] (cf.
Section 7).
        </p>
        <p>The motivation behind this level of decomposition is
the need to instantiate the annotated parts of the text. Also, the
sentence currently represents the finest granularity for
creating annotations and the reference base for the construction of the
rhetorical structure. As an example, a populated instance of the
document ontology will contain instances for all the words
annotated during the writing process.</p>
      </sec>
      <sec id="sec-10-2">
        <title>The Annotation Ontology</title>
        <p>As mentioned before, the main role of the annotation ontology
(Figure 5) is to relate the document ontology and the rhetorical
structure ontology. Conceptually, the rhetorical structure represents
an annotation of the ordinary structure. Thus, one is able to enrich
the document with rhetoric elements by attaching semantic
annotations to it. In ontological terms, this would translate to creating
instances of the Annotation concept and attaching them to the
appropriate parts of the text.</p>
        <p>A second role of the ontology is to provide metadata about the
publication as a whole. This part can be seen as an alignment to
the DublinCore initiative, showing also our support for it. Each of
the concepts that are part of the metadata has a direct correspondence in
a DublinCore element. For the future, we intend to maintain this
alignment by extending the ontology in parallel with the evolution
of the DublinCore schema.</p>
      </sec>
      <sec id="sec-10-3">
        <title>The Rhetorical Structure Ontology</title>
        <p>The rhetorical structure ontology unites three aspects: the
knowledge captured by the rhetorical relations created
between parts of the text, the rhetorical structure modeling
the positioning of the contained information chunks, and the
argumentative support providing the means for building a stable
foundation for the rhetoric elements. In the following, we analyze these three
sides of the ontology.</p>
        <p>
          The first side of the ontology deals with modeling the
information chunks present in the document as rhetoric elements. This
approach has its roots in Rhetorical Structure Theory (RST) [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ], which describes the text in terms of the rhetoric
relations existing between a Nucleus (modeled by us as the Claim) and
a Satellite (in our case, the Explanation). Although the theory
contains around 30 such relations, we considered only the ones which
have a bigger impact (and relevance) when annotating a scientific
document (e.g. Antithesis, Concession or Means). The main role
of these rhetoric relations (modeled by us as concepts) is to
provide a reason for the existence of the claims and the explanations in
the text. Furthermore, we considered their placement in the frame
created by the rhetorical structure (captured by the second side of
the ontology) as a natural integration, and thus we introduced a
relation between the rhetorical relation concept and the rhetorical structure
concept.
        </p>
        <p>
          The second side of the Rhetorical Structure Ontology takes care
of capturing the rhetorical structure of the document. It represents
an extension of the ABCDE format proposed in [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] that stands for:
Annotation, Background, Contribution, Discussion, Entities.
        </p>
        <p>As a starting point, this organization reflects a good image of
a typical scientific document. But we argue that it is not enough.
Therefore we propose its modification and extension with a small
number of concepts, giving birth to a comprehensive rhetorical
structure which could be adopted for all scientific documents.</p>
        <p>The modification is the replacement of the Annotation concept
with the Abstract concept, since the whole rhetorical structure
represents in the end an annotation of the document. In terms of
extension, we propose the introduction of the following concepts:
Motivation, Scenario and Conclusion, which have as foundation
rhetorical relations, but we considered that by using them as concepts
of the rhetorical structure, we are able to model a complete best
practice structure for scientific documents.</p>
        <p>The two sides of the ontology described above are part of the a
priori annotation process. The third side deals with the
discussions, in terms of Arguments and CounterArguments, that can be
initiated based on the existing claims. The motivation lies in
building a stable foundation for the claims by augmenting them
with positive and negative argumentation. Therefore, we have
foreseen the need for a posteriori annotations modeling these
discussions and provided, as part of the Rhetorical Structure Ontology,
the Argument and CounterArgument concepts, together with
their subconcepts and relations.</p>
        <p>To give a full understanding of what the result of the
annotation process looks like from the point of view of the Rhetorical Structure
Ontology, Figure 7 provides an example of instantiation.
The example shows how a part of the text can be modeled in terms
of rhetorical elements, and how the rhetorical relations can be
created.</p>
        <p>Consider the given phrase: ... the visual system resolves
confusion by applying some tricks that reflect a built-in knowledge of
properties of the physical world. The writer splits it into the Claim
and the Explanation, and therefore instantiates two rhetorical
elements, which can be further identified by their unique associated
ID. Now, based on the definition of the Means rhetorical relation<sup>5</sup>,
the writer can make the reader aware of it, and thus emphasize his
idea, by creating an instance of the concept modeling this relation.
This instance is then linked with the appropriate concept from the
rhetorical structure, exemplified in this case by Contribution.
In terms of argumentative discussions, the example shows how
the claim can afterwards be linked to instances of positive or negative
arguments and how the counter argument instances are modeled
in relation to the initial arguments.</p>
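        <p>The instantiation described in this example can be approximated by a minimal sketch. The class names, the property names (hasClaim, hasExplanation, partOf), the identifier scheme and the exact point at which the phrase is split are assumptions made for illustration; they do not reproduce the vocabulary of the Rhetorical Structure Ontology or the content of Figure 7.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass
class RhetoricalElement:
    identifier: str  # unique ID used for later referencing, e.g. "c1"
    kind: str        # "Claim" (nucleus) or "Explanation" (satellite)
    text: str


@dataclass
class RhetoricalRelation:
    name: str                        # e.g. "Means"
    claim: RhetoricalElement
    explanation: RhetoricalElement
    structure: str                   # rhetorical block, e.g. "Contribution"


def to_triples(relation):
    """Flatten one relation instance into (subject, predicate, object) tuples."""
    rel_id = "{}_{}_{}".format(
        relation.name.lower(),
        relation.claim.identifier,
        relation.explanation.identifier,
    )
    return [
        (relation.claim.identifier, "rdf:type", relation.claim.kind),
        (relation.explanation.identifier, "rdf:type", relation.explanation.kind),
        (rel_id, "rdf:type", relation.name),
        (rel_id, "hasClaim", relation.claim.identifier),              # hypothetical property
        (rel_id, "hasExplanation", relation.explanation.identifier),  # hypothetical property
        (rel_id, "partOf", relation.structure),                       # hypothetical property
    ]


# Assumed split of the example phrase into Claim and Explanation.
claim = RhetoricalElement("c1", "Claim", "the visual system resolves confusion")
explanation = RhetoricalElement(
    "e1",
    "Explanation",
    "by applying some tricks that reflect a built-in knowledge of "
    "properties of the physical world",
)
means = RhetoricalRelation("Means", claim, explanation, "Contribution")
triples = to_triples(means)
```
]]></preformat>
        <p>Keeping the relation as an object of its own, rather than as a direct link from Claim to Explanation, mirrors the decision to model rhetorical relations as concepts that can in turn be linked to the rhetorical structure.</p>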
      </sec>
    </sec>
    <sec id="sec-11">
      <title>The syntactic layer</title>
      <p>The second layer introduced for embedding annotations into
PDF documents is the syntactic layer. Since we are targeting a
priori annotations, created manually during the writing process,
our approach proposes the enrichment of the LaTeX syntax in three
ways:
• through command syntax extension
• by embedding N3-like statements in usual commands
• by introducing new commands</p>
      <p>Our goal for this modified syntax is to keep it lightweight
and as close as possible to the usual one, in order to avoid
overburdening ordinary users. Therefore, the first two types of
modification, i.e. command syntax extension and the integration of
N3-like statements, maintain the syntactical core of the command,
while the third one introduces simple new commands with a
syntax similar to the usual ones. The resulting mixture of commands
has the most natural LaTeX form possible.</p>
      <p>Command syntax extension. The syntax extension was
developed for the commands whose main goal is the
structuring of the document. Therefore, commands like
abstract, section or subsection were extended with a new field
meant for assigning comments (free-text annotations) to the
corresponding part of the document. The field is delimited
by a pair of curly brackets.</p>
      <p>Example: \section{Introduction}{[...]}</p>
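      <p>How such an extended command could be recognized during syntax analysis is shown in the following minimal sketch. It assumes that the annotation field directly follows the title field and contains no nested braces; the regular expression and the function names are hypothetical and do not describe the actual Parser or Syntax Analyzer components.</p>
      <preformat><![CDATA[
```python
import re

# Hypothetical pattern: a structuring command followed by a title group and an
# annotation group, e.g. \section{Introduction}{free-text comment}.
# Nested braces are deliberately not supported in this sketch.
EXTENDED_COMMAND = re.compile(
    r"\\(abstract|section|subsection)\{([^{}]*)\}\{([^{}]*)\}"
)


def extract_annotations(latex_source):
    """Return (command, title, annotation) tuples found in the source."""
    return [match.groups() for match in EXTENDED_COMMAND.finditer(latex_source)]


def strip_annotations(latex_source):
    """Drop the annotation field so that only the usual command form remains."""
    return EXTENDED_COMMAND.sub(r"\\\1{\2}", latex_source)
```
]]></preformat>
      <p>Separating extraction from stripping reflects the two outputs of the analysis step: the annotations feed the metadata sets, while the stripped text corresponds to an intermediary document in the usual command form.</p>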
      <sec id="sec-11-1">
        <title>N3-like statements integration</title>
        <p>
          This second type of modification
is the use of N3-like statements in conjunction with LaTeX
commands. These statements model information about the
document as subject, or about an arbitrary subject. This
enrichment was inspired by the N3 notation [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ], and we believe
that it represents a notation lightweight and easy enough to
be adopted for creating semantic annotations in a scientific
document.
        </p>
        <p><sup>5</sup> The Means rhetorical relation states that the Explanation presents
a method or instrument which tends to make realization of the
Claim more likely</p>
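        <p>As a rough illustration of how statements with the document or an arbitrary resource as subject could be turned into triples, consider the following sketch. It handles only flat "subject predicate object ." statements; the keyword this as a shorthand for the document, the prefixes used in the usage note and the function name are assumptions made for illustration. It is a simplified stand-in and not a description of the SALT statement syntax or of the N3 to RDF transformation performed with Jena.</p>
        <preformat><![CDATA[
```python
import re

# One flat statement: subject, predicate, then a quoted literal or a single
# token as object, terminated by a full stop.
STATEMENT = re.compile(r'\s*(\S+)\s+(\S+)\s+("[^"]*"|\S+)\s*\.')


def parse_statements(text, document_uri):
    """Return (subject, predicate, object) tuples for each statement in text."""
    triples = []
    for subject, predicate, obj in STATEMENT.findall(text):
        if subject == "this":  # hypothetical shorthand for the document itself
            subject = document_uri
        triples.append((subject, predicate, obj.strip('"')))
    return triples
```
]]></preformat>
        <p>For instance, under these assumptions the input this dc:title "SALT" . with the document identifier doc:paper1 yields a single triple whose subject is doc:paper1.</p>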
        <p>New commands. In order to be able to manually annotate the
document with the rhetorical structure, we introduced a series of
new commands, similar to the usual LaTeX ones, for
example \Background or \Motivation. All the newly introduced
commands also support the extension described above.</p>
        <p>Figure 8 depicts the result of the overall annotation process using
SALT. The first operation is the parsing of the existing LaTeX document
and the metadata extraction from the usual commands and the
N3-like statements. In the figure these are represented by the Author
and Title commands and the N3 statements about the topic of the
paper, having as foundation the SWRC ontology.</p>
        <p>The second operation is the instantiation of the Document
Ontology (also presented in the figure), based on the document’s
structural information. Next, SALT analyzes the command
extensions (like the one for the Use-case section in the figure) and the newly
introduced commands and environments (like claim, explanation or
the scenario environment). It builds the rhetorical structure based
on them and represents it as an RDF graph. Note that
each rhetoric element has a label attached (here c1, e1 and
p1, respectively) for the purpose of future referencing.</p>
        <p>The final step is embedding the necessary information in the PDF
document for the creation of the visual notes. The example shows
the visual note attached to the Use-case section and the visual notes
representing the beginning and the end of the Scenario rhetorical
branch. The latter also contain the information about the rhetorical
relations found as part of this branch.</p>
        <p>In general, the three main phases of the overall process are:</p>
      </sec>
      <sec id="sec-11-2">
        <title>The creation of the semantic annotations and thus the document enrichment during the authoring process.</title>
        <p>The ontology instantiation from the created annotations, together
with the creation of the semantic links between the three
levels of the semantic layer.</p>
        <p>The visual representation of some of the annotations in the
resulting PDF document.</p>
        <p>In conclusion, we briefly analyze the modifications.
First, switching from the usual LaTeX commands to the
extended ones by adding the annotation field should be
straightforward, and should be considered an enrichment rather than a
source of confusion for ordinary users. The second modification, i.e. the
introduction of the N3-like statements, enables the author to insert
arbitrary annotations. The last category of modifications, represented
by the addition of new commands, was necessary in order to
represent the rhetorical structure of the document.</p>
      </sec>
    </sec>
    <sec id="sec-12">
      <title>RELATED WORK</title>
      <p>
        To ease reasoning over, and retrieval of, documents published on the
Desktop or Web, the documents should be classified in a way that
users find helpful and meaningful. Several activities
focus on semantic annotation as a way to enrich a document,
making it machine-readable and also accessible to humans. The
Writing in the Context of Knowledge (WiCK) project aims to
produce a novel writing tool that helps authors improve the coherence and
consistency of the documents they are creating by helping to
assimilate key knowledge in each new document [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. CREAM is a
comprehensive framework specialized for populating HTML
pages with ontological concepts. It allows authors to build
documents by dragging and dropping concepts and properties from the
ontology browser to a text editor [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>
        Most activities have proposed their own semantic structure based
on ontologies. Ontological structures provide not only the fundamental
values for semantic annotation, but also additional possibilities
such as inferencing or semantic retrieval [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]. The Semantic Web
Research Community (SWRC) ontology originates from
OntoWeb and can be used to provide detailed information about
research work. It models the Semantic Web research community,
including researchers, publications, tools, and topics.
      </p>
      <p>
        Generally speaking, semantic documents include any
information regarding the document and its relationship with other
documents [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. A semantic annotation of documents therefore
formally identifies concepts and relations between concepts in
documents, and is intended primarily for use by machines [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]. Several
efforts are related to semantic documents, such as
SemanticWord [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], OntoOffice [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and SemTalk [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Eriksson [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
proposes the PDF backend approach, which uses PDF as the basis
for a Protege storage backend. It allows users to store ontologies and
knowledge bases inside PDF files. In some previous work, however,
ontological information or metadata is kept in a different
place than the document itself. XMP is a format for embedding
knowledge in documents [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Adobe’s XMP [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] is a labeling
technology that allows RDF constructs to be embedded in HTML, PDF
documents and all Adobe formats.
      </p>
      <p>
        In terms of the rhetorical structure of the text, [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] provides a
deep analysis of the application domains in which it is used, e.g.
computational linguistics, cross-linguistic studies or dialogue and
multimedia. From our perspective, the work done by Geurts et
al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and Uren et al. [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] is of particular interest, because these are,
to our knowledge, among the only references that try to model
the rhetorical structure as an ontology.
      </p>
      <p>
        [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] describes a framework for sensemaking tools in the context
of the Scholarly Ontologies Project. Their starting point is
the set of requirements for a discourse ontology, with the structure
of the claim as its foundation. The resulting ontology has
its roots in Cognitive Coherence Relations (CCR) theory and
models the rhetorical links in terms of similarity, causality or
challenges. Their goal is to create and visualize claim networks
from scholarly documents (represented as HTML files) using a
central knowledge server. One of our future goals is also to create
such knowledge networks, but using active references embedded in
the semantic document, as opposed to their centralized approach.
Also, our solution places the annotations in their natural
environment, i.e. as part of the document to which they are attached,
thus transforming it into a semantic document.
      </p>
      <p>
        The second reference of interest is [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. It models
the process of transforming semantic graphs into multimedia
presentations, using domain knowledge and discourse analysis. Their
work focuses more on using parts of the text for presentation
purposes, whereas ours provides a method for
enriching ordinary documents with semantic annotations, also based
on discourse analysis.
      </p>
      <p>
        In this paper, we propose a document ontology that expresses much
richer semantics in documents, including an extension of the ABCDE
format [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] for the semantic structure of the document. From a
representational and technical perspective, our approach differs from
other approaches in that ontologies support more sophisticated
modeling for specifying relations of scientific documents. Moreover,
embedding with XMP provides efficient sharing
support, which makes it possible to share information about the document itself.
      </p>
    </sec>
    <sec id="sec-13">
      <title>DISCUSSION</title>
      <p>In this section we raise some of the most interesting issues
that appeared while researching the concepts presented in this
paper. Although the list could be much longer, we limit
ourselves to two of them: i) document instance maintenance and
ii) object identification and reference, the latter also being a source of
problems for the former.</p>
      <p>
        Our current approach solves the document instance maintenance
issue by creating an instance for every annotated information chunk,
the finest granularity being the word. The main reason is the
(general) lack of a proper reference mechanism inside the PDF
document, especially when created from LaTeX. Analyzing this
solution, we could argue that it presents a possible advantage for
future development but at the same time also a quite clear
disadvantage. The advantage is the possibility of representing
whole documents as instances of the document ontology and
then using the instances for versioning purposes and semantic diff
operations. Obviously, the semantics of the diff operation first has to be
defined in a proper context, perhaps in a way similar to that
realized in [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]. The disadvantage of this approach is the explosion
in the size of the document, considering the number of triples that
need to be created for each word.
      </p>
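      <p>To make the space cost concrete, the following minimal Python sketch creates one instance per word and counts the resulting triples. The property names (doc:Word, doc:partOf, doc:position, doc:text) and the URI scheme are hypothetical placeholders chosen for illustration; they are not taken from the SALT document ontology, and the real number of triples per word may differ.</p>
      <preformat>
```python
def word_triples(sentence_uri: str, sentence: str) -> list:
    """Create one instance per word and describe it with a few triples.

    All property names below are hypothetical, not the SALT vocabulary.
    """
    triples = []
    for position, word in enumerate(sentence.split(), start=1):
        word_uri = f"{sentence_uri}/w{position}"
        triples.append((word_uri, "rdf:type", "doc:Word"))
        triples.append((word_uri, "doc:partOf", sentence_uri))
        triples.append((word_uri, "doc:position", position))
        triples.append((word_uri, "doc:text", word))
    return triples


if __name__ == "__main__":
    sample = "SALT annotates LaTeX documents"
    print(len(word_triples("ex:s1", sample)), "triples for", len(sample.split()), "words")
```
      </preformat>
      <p>Under these assumptions each word costs a fixed number of triples, so the metadata grows linearly with the length of the text and can easily exceed the size of the text itself.</p>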
      <p>
        The second issue deals with object identification and reference.
PDF documents have an internal organization represented by
tree-based structures of complex objects and streams [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], together with
their associated properties. Post-creation analysis of the document,
and thus the reconstruction of this internal organization, is a
hard task, because it depends on a handful of parameters, such
as access rights, image analysis or the accuracy of text retrieval
algorithms. As a consequence, object referencing inside the document
also becomes hard to accomplish.
      </p>
      <p>On the other hand, we are dealing with a priori annotations,
which makes the situation even more complex. The annotation
process takes place during the writing process, in the LaTeX
environment, and thus the targeted PDF document does not even exist
yet. Still, to be able to reference the annotated parts of the
document, we adopted the following solution: the document structure
is captured in the document ontology, which gives us
the means of referencing information chunks at sentence
granularity. For referencing inside the sentence (word granularity)
we introduced a base and an offset, pointing to the needed part of
the sentence.</p>
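      <p>The base-and-offset idea can be sketched as follows. This is only an illustration under stated assumptions: the base is taken to be a character index into the sentence and the offset the length of the referenced span, and the names ChunkRef, resolve and ref_for_word are hypothetical rather than part of SALT.</p>
      <preformat>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRef:
    """Hypothetical reference: a sentence id plus a base and an offset."""
    sentence_id: str  # identifier of the sentence instance (assumed naming)
    base: int         # character index where the referenced span starts
    offset: int       # length of the referenced span in characters


def resolve(ref: ChunkRef, sentences: dict) -> str:
    """Return the text that the reference points to."""
    sentence = sentences[ref.sentence_id]
    end = ref.base + ref.offset
    valid_start = ref.base in range(len(sentence) + 1)
    valid_end = end in range(ref.base, len(sentence) + 1)
    if not (valid_start and valid_end):
        raise ValueError("reference falls outside the sentence")
    return sentence[ref.base:end]


def ref_for_word(sentence_id: str, sentence: str, word: str) -> ChunkRef:
    """Build a reference to the first occurrence of a word in a sentence."""
    base = sentence.index(word)  # raises ValueError if the word is absent
    return ChunkRef(sentence_id, base, len(word))


if __name__ == "__main__":
    text = "SALT extends the LaTeX writing environment."
    ref = ref_for_word("s1", text, "LaTeX")
    print(ref, resolve(ref, {"s1": text}))
```
      </preformat>
      <p>Because such a reference is relative to the sentence rather than to a position in the generated PDF, it stays meaningful before the PDF exists, which is the situation described above.</p>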
      <p>
        As a future improvement of this process, i.e. reference inside
the document, we intend to build a DOM-like model (or a B-Tree
model) of the LaTeX document and map its structure to the
tree-based internal structure of the PDF document. This approach would
give us the following advantages:
• In terms of identification, we would be able to provide a
unique identification for each information chunk, in the
context of the document.
• In terms of reference, we would have the opportunity of
using the XPointer framework [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] in conjunction with the
document’s model.
      </p>
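      <p>As a rough illustration of how a DOM-like model could yield unique identifiers, the sketch below addresses nodes of a small document tree with child sequences in the style of the XPointer element() scheme, e.g. element(/1/2/1). The Node class and the helper functions are hypothetical, and the tree is built by hand rather than parsed from LaTeX.</p>
      <preformat>
```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a hypothetical DOM-like model of a LaTeX document."""
    kind: str                       # e.g. "document", "section", "paragraph"
    text: str = ""
    children: list = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        self.children.append(child)
        return child


def child_sequence(root: Node, target: Node):
    """Return the 1-based child positions from root to target, or None."""
    if root is target:
        return []
    for position, child in enumerate(root.children, start=1):
        path = child_sequence(child, target)
        if path is not None:
            return [position] + path
    return None


def element_pointer(root: Node, target: Node) -> str:
    """Format the path as an element() child sequence."""
    path = child_sequence(root, target)
    if path is None:
        raise ValueError("target is not part of this document model")
    # The leading /1 selects the document (root) element itself.
    return "element(/1" + "".join(f"/{step}" for step in path) + ")"


def resolve(root: Node, pointer: str) -> Node:
    """Follow an element(/1/...) pointer back to the node it identifies."""
    steps = [int(s) for s in pointer[len("element("):-1].split("/")[1:]]
    node = root
    for step in steps[1:]:          # skip the leading 1 that names the root
        node = node.children[step - 1]
    return node


if __name__ == "__main__":
    doc = Node("document")
    intro = doc.add(Node("section", "Introduction"))
    intro.add(Node("paragraph", "First paragraph."))
    discussion = doc.add(Node("section", "Discussion"))
    claim = discussion.add(Node("paragraph", "Object identification."))
    print(element_pointer(doc, claim))
```
      </preformat>
      <p>A positional identifier of this kind is unique within one version of the document; keeping it stable across edits would need additional machinery that this sketch does not address.</p>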
      <p>
        The combination of the two aforementioned issues could start a
new direction for creating semantic knowledge networks using
information chunks from documents, by means of active references
rather than the existing static links. One would be able to directly
embed a certain information object, or discuss a certain claim, in
her scientific document by providing only its active reference. The
resulting semantic network tends to come close to Ted Nelson’s Xanadu
vision [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
    </sec>
    <sec id="sec-14">
      <title>CONCLUSION</title>
      <p>In this paper we have described the authoring and annotation of
semantic documents in order to provide semantic annotations for the
desktop. SALT leaves semantic data where it can be handled best:
together with the document. Thus SALT provides a means to create
Semantic Documents in a way that is comparatively simple and intuitive
for LaTeX authors.</p>
      <p>To attain this objective we have defined a SALT process, the
appropriate ontologies and the architecture. We have incorporated
the means for rhetorical markup of a document, which allows, for
example, the scientific author to explicitly mark up his contribution,
the claims he makes and the support for these claims. This explicit
annotation provides, as shown in our scenario, an innovative and
improved presentation and navigation of online proceedings.
Furthermore, it will enable other authors to explicitly and directly reference
these claims and other related information. In the end this will lead
to interconnected Semantic Documents.</p>
      <p>For the future, there is a long list of open issues concerning the
authoring of semantic PDF documents, from the more mundane,
though important, ones (top) to far-reaching ones (bottom):
1. PDF referencing, as described in Section 7.
2. Creation of semantic knowledge networks using PDF
documents, by means of active references, also introduced in Section 7.
3. Automatic derivation of markup.
4. Other information structures (or formats), for example,
incorporating not only the annotations created on the text, but
also the ones created for the pictures that are part of the Semantic
Document.</p>
      <p>We believe that these options make SALT a rather intriguing
approach on which a considerable number of scientific semantic
documents might be built.</p>
    </sec>
    <sec id="sec-15">
      <title>Acknowledgments</title>
      <p>This work is funded by the European Commission 6th Framework
Programme in the context of the EU IST NEPOMUK IP - The
Social Semantic Desktop Project, FP6-027705. Special thanks to Big
Faceless Organization (big.faceless.org) for providing the
PDF library used in the metadata analysis process. Further we
thank Anita de Waard for fruitful discussions at ISWC 2005 and
ESWC 2006.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] Dublin Core Metadata Initiative. http://dublincore.org/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Tim</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          .
          <article-title>An readable language for data on the web - notation 3, 1998</article-title>
          . http://www.w3.org/DesignIssues/Notation3.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Carr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miles-Board</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wills</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Woukeu</surname>
          </string-name>
          , and
          <string-name>
            <given-names>W.</given-names>
            <surname>Hall</surname>
          </string-name>
          .
          <article-title>Towards a Knowledge-Aware Office Environment</article-title>
          . In D. Karagiannis and U. Reimer, editors,
          <source>Proceedings of 5th International Conference on Practical Aspects of Knowledge Management (PAKM</source>
          <year>2004</year>
          ),
          <source>volume LNAI 3336</source>
          , pages
          <fpage>129</fpage>
          -
          <lpage>140</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Carroll</surname>
          </string-name>
          , I. Dickinson,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dollin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Reynolds</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Seaborne</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          .
          <article-title>Jena: Implementing the Semantic Web Recommendations</article-title>
          .
          <source>Technical Report HPL-2003-146</source>
          , Hewlett-Packard,
          <year>Dec 2003</year>
          . http://www.hpl.hp.com/techreports/2003/HPL-2003-146.html.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          , Alexiei Dingli, Daniela Petrelli, and
          <string-name>
            <given-names>Yorick</given-names>
            <surname>Wilks</surname>
          </string-name>
          .
          <source>User-System Cooperation in Document Annotation Based on Information Extraction</source>
          . volume
          <volume>2473</volume>
          , pages
          <fpage>122</fpage>
          +,
          <year>January 2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Anita</given-names>
            <surname>de Waard</surname>
          </string-name>
          and
          <string-name>
            <given-names>Gerard</given-names>
            <surname>Tel</surname>
          </string-name>
          .
          <article-title>The ABCDE Format - Enabling Semantic Conference Proceedings</article-title>
          .
          <source>In Proceedings of the 1st Workshop "SemWiki2006 - From Wiki to Semantics"</source>
          , Budva, Montenegro,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Henrik</given-names>
            <surname>Eriksson</surname>
          </string-name>
          .
          <article-title>A PDF Storage Backend for Protege</article-title>
          .
          <source>In Proceedings of the 9th Protege International Conference</source>
          , Stanford, California, USA,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Fillies</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wood-Albrecht</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Weichardt</surname>
          </string-name>
          .
          <article-title>A Pragmatic Application of the Semantic Web using SemTalk</article-title>
          .
          <source>In Proceedings of the Eleventh International World Wide Web Conference</source>
          , Honolulu, Hawaii, USA., pages
          <fpage>686</fpage>
          -
          <lpage>692</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Joost</given-names>
            <surname>Geurts</surname>
          </string-name>
          , Stefano Bocconi, Jacco van Ossenbruggen, and
          <string-name>
            <given-names>Lynda</given-names>
            <surname>Hardman</surname>
          </string-name>
          .
          <article-title>Towards Ontology-driven Discourse: From Semantic Graphs to Multimedia Presentations</article-title>
          .
          <source>Technical report, Centrum voor Wiskunde en Informatica (INS-R0305)</source>
          , May 31,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10] Ontoprise GmbH.
          <source>OntoOffice Tutorial</source>
          ,
          <year>2003</year>
          . http://www.ontoprise.de/documents/tutorial ontooffice.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Grosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Maler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Marsh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Walsh</surname>
          </string-name>
          .
          <source>XPointer element() Scheme</source>
          ,
          <year>2003</year>
          . http://www.w3.org/TR/xptr-element/.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>W.</given-names>
            <surname>Guoren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Bin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Donghong</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Q.</given-names>
            <surname>Baiyou</surname>
          </string-name>
          .
          <article-title>Design and Implementation of a Semantic Document Management System</article-title>
          .
          <source>Information Technology Journal</source>
          <volume>4</volume>
          ,
          <issue>1</issue>
          :
          <fpage>21</fpage>
          -
          <lpage>31</lpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Handschuh</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          .
          <article-title>Authoring and Annotation of Web Pages in CREAM</article-title>
          .
          <source>In Proceedings of the 11th International World Wide Web Conference, WWW 2002</source>
          , Honolulu, Hawaii, May 7-11,
          <year>2002</year>
          , pages
          <fpage>462</fpage>
          -
          <lpage>473</lpage>
          . ACM Press,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Handschuh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Maedche</surname>
          </string-name>
          .
          <article-title>CREAM - Creating Relational Metadata with a Component-Based, Ontology-Driven Annotation Framework</article-title>
          .
          <source>In Proceedings of the First International Conference on Knowledge Capture (K-Cap 2001)</source>
          , pages
          <fpage>76</fpage>
          -
          <lpage>83</lpage>
          , Victoria, B.C., Canada,
          <year>October 2001</year>
          . ACM Press.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15] Adobe Systems Incorporated.
          <article-title>Adobe Acrobat SDK</article-title>
          . http://partners.adobe.com/public/developer/acrobat/sdk/index.html.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16] Adobe Systems Incorporated. Extensible Metadata Platform. http://www.adobe.com/products/xmp/.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17] Adobe Systems Incorporated.
          <source>PDF Reference - Adobe Portable Document Format</source>
          ,
          <year>April 2004</year>
          . http://partners.adobe.com/public/developer/en/pdf/PDFReference16.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Ted</given-names>
            <surname>Nelson</surname>
          </string-name>
          .
          <source>Literary Machines: The report on, and of, Project Xanadu concerning word processing, electronic publishing, hypertext, thinkertoys, tomorrow's intellectual... including knowledge, education and freedom</source>
          . Mindful Press, Sausalito, California,
          <year>1981</year>
          edition: ISBN 089347052X,
          <year>1981</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Maarten</given-names>
            <surname>Sneep</surname>
          </string-name>
          .
          <source>The XMP inclusion package</source>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Maedche</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Handschuh</surname>
          </string-name>
          .
          <article-title>An annotation framework for the semantic web</article-title>
          .
          <source>In Proceedings of the First Workshop on Multimedia Annotation</source>
          , Tokyo, Japan, January 30-31,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>Maite</given-names>
            <surname>Taboada</surname>
          </string-name>
          and
          <string-name>
            <given-names>William C.</given-names>
            <surname>Mann</surname>
          </string-name>
          .
          <article-title>Applications of Rhetorical Structure Theory</article-title>
          .
          <source>Discourse Studies</source>
          ,
          <volume>8</volume>
          , No.
          <volume>4</volume>
          :
          <fpage>567</fpage>
          -
          <lpage>588</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Maite</given-names>
            <surname>Taboada</surname>
          </string-name>
          and
          <string-name>
            <given-names>William C.</given-names>
            <surname>Mann</surname>
          </string-name>
          .
          <article-title>Rhetorical Structure Theory: looking back and moving ahead</article-title>
          .
          <source>Discourse Studies</source>
          ,
          <volume>8</volume>
          , No.
          <volume>3</volume>
          :
          <fpage>423</fpage>
          -
          <lpage>459</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>Marcello</given-names>
            <surname>Tallis</surname>
          </string-name>
          .
          <article-title>Semantic Word Processing for Content Authors</article-title>
          .
          <source>In Proceedings of the Knowledge Markup &amp; Semantic Annotation Workshop</source>
          , Florida, USA. Part of the Second International Conference on Knowledge Capture (K-CAP 2003),
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>Victoria</given-names>
            <surname>Uren</surname>
          </string-name>
          , Philipp Cimiano, José Iria, Siegfried Handschuh, Maria Vargas-Vera,
          <string-name>
            <given-names>Enrico</given-names>
            <surname>Motta</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Fabio</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          .
          <article-title>Semantic Annotation for Knowledge Management: Requirements and a Survey of the State of the Art</article-title>
          .
          <source>Journal of Web Semantics</source>
          <volume>4</volume>
          ,
          <issue>1</issue>
          :
          <fpage>14</fpage>
          -
          <lpage>28</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Victoria</given-names>
            <surname>Uren</surname>
          </string-name>
          , Simon Buckingham Shum,
          <string-name>
            <given-names>Gangmin</given-names>
            <surname>Li</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Michelle</given-names>
            <surname>Bachler</surname>
          </string-name>
          .
          <article-title>Sensemaking Tools for Understanding Research Literatures: Design, Implementation and User Evaluation</article-title>
          .
          <source>Int. Jnl. Human Computer Studies</source>
          ,
          <volume>64</volume>
          , No.
          <volume>5</volume>
          :
          <fpage>420</fpage>
          -
          <lpage>445</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>M.</given-names>
            <surname>Vargas-Vera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Motta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Domingue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lanzoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Stutt</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          .
          <article-title>MnM: Ontology Driven Semi-Automatic and Automatic Support for Semantic Markup</article-title>
          .
          <source>In EKAW02, 13th International Conference on Knowledge Engineering and Knowledge Management, LNCS/LNAI 2473</source>
          , pages
          <fpage>379</fpage>
          -
          <lpage>391</lpage>
          , Sigüenza, Spain,
          <year>October 2002</year>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>Max</given-names>
            <surname>Voelkel</surname>
          </string-name>
          and
          <string-name>
            <given-names>Tudor</given-names>
            <surname>Groza</surname>
          </string-name>
          .
          <article-title>SemVersion: RDF-based Ontology Versioning System</article-title>
          .
          <source>In Proceedings of the IADIS International Conference WWW/Internet (ICWI</source>
          <year>2006</year>
          ), Murcia, Spain,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>