<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>MnM: Ontology-Driven Tool for Semantic Markup</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Maria Vargas-Vera</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Enrico Motta</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>John Domingue</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mattia Lanzoni</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arthur Stutt</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fabio Ciravegna</string-name>
          <email>f.ciravegna@dcs.shef.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Sheffield</institution>
          ,
          <addr-line>Regent Court 211</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>The Open University</institution>
          ,
          <addr-line>Walton Hall, Milton Keynes, MK7 6AA</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>An important precondition for realising the goal of a semantic web is the ability to annotate web resources with semantic information. In order to carry out this task, users need appropriate representation languages, ontologies, and support tools. In this paper we present MnM, an annotation tool which provides both automated and semi-automated support for annotating web pages with semantic contents. MnM integrates a web browser with an ontology editor and provides open APIs to link to ontology servers and for integrating information extraction tools. MnM can be seen as an early example of the next generation of ontology editors, being web-based, oriented to semantic markup and providing mechanisms for large-scale automatic markup of web pages.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        An important pre-condition for realising the goal of the
semantic web is the ability to annotate web resources with
semantic information. In order to carry out this task, users
need appropriate knowledge representation languages,
ontologies, and support tools. The knowledge
representation language provides the semantic interlingua
for expressing knowledge precisely. RDF ([
        <xref ref-type="bibr" rid="ref12">11</xref>
        ] and [
        <xref ref-type="bibr" rid="ref17">16</xref>
        ])
and RDFS [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] provide the basic framework for expressing
metadata on the web, while current developments in
web-based knowledge representation, such as DAML+OIL
(DAML+OIL, 2001) and WebOnt (http://www.w3.org),
are building on the RDF base framework to provide more
sophisticated knowledge representation support. Ontologies
([
        <xref ref-type="bibr" rid="ref10">9</xref>
        ]) provide the mechanism to support interoperability at a
conceptual level. In a nutshell, the idea of interoperating
more emphasis will have to be put on facilitating semantic
markup by ‘ordinary’ web users (people who are neither
experts in language technologies nor 'power knowledge
engineers'). In particular, automated knowledge extraction
technologies are likely to play an increasingly important
role, as a crucial technology for tackling the semantic web
version of the knowledge acquisition bottleneck.
      </p>
      <p>
        In this paper we present MnM, an annotation tool which
provides both automated and semi-automated support for
marking up web pages with semantic contents. MnM
integrates a web browser with an ontology editor and
provides open APIs to link to ontology servers and for
integrating information extraction tools. MnM can be seen
as an early example of the next generation of ontology
editors, being web-based, oriented to semantic markup and
providing mechanisms for large-scale automatic markup of
web pages.
      </p>
      <p>The rest of the paper is organised as follows: in the next
section we will show the process model underlying the
design of the tool. Finally, Sections 3 and 4 discuss related
work and restate the main tenets and results of our
research.</p>
    </sec>
    <sec id="sec-2">
      <title>PROCESS MODEL</title>
      <p>Within this work we have focused on creating a generic
process model for developing semantically enriched web
content. The component tools which are used in MnM are
ontology servers, Information Extraction (IE) tools and
augmented web browsers. During our initial work in this
area we found that either the existing tools did not directly
support the creation of semantic web content or the
mapping between the tasks to be carried out and the toolset
was non-trivial. Hence, within MnM, we adopted a generic
process model, which can be easily understood by web
developers who are not necessarily expert ontology
engineers or human language technology experts.</p>
      <p>Another key feature of our process model is that it is
generic with respect to the specific ontology server and IE
technologies used.</p>
      <p>There are five main activities supported by MnM:</p>
      <sec id="sec-2-1">
        <title>Browse. A specific set of knowledge components is chosen from a library of knowledge models on an ontology server.</title>
      </sec>
      <sec id="sec-2-2">
        <title>Markup. The chosen set of knowledge components is selected to form the basis of an IE mechanism. A corpus of documents is manually marked up.</title>
      </sec>
      <sec id="sec-2-3">
        <title>Learn. A learning algorithm is run over the marked up corpus to learn the extraction rules.</title>
      </sec>
      <sec id="sec-2-4">
        <title>Test. The IE mechanism is run over a test corpus to assess its precision and recall measures.</title>
      </sec>
      <sec id="sec-2-5">
        <title>Extract. An IE mechanism is selected and run over a set of documents.</title>
        <p>We will now provide more details of each of the above activities in turn.</p>
        <sec id="sec-2-5-1">
          <title>Browse</title>
          <p>In this activity the user browses a library of knowledge
models which sit on a web-based ontology server. The user
can see an overview of the existing models and can select
which one to focus on (i.e., which ontology to use to
initiate the markup process). Within a selected ontology the
user can browse the existing items - for example the classes
and instances. Items within an ontology can be selected as
the starting point for selecting an IE mechanism. More
specifically, the selected class forms the basis for a
template which will eventually be matched against a corpus
of documents and instantiated in the extraction activity.</p>
        </sec>
        <sec id="sec-2-5-2">
          <title>Mark-Up</title>
          <p>Once a class has been selected, a training corpus of
manually marked up pages needs to be created. Here the
user views appropriate documents within MnM’s built-in
web browser and annotates segments of text using tags
based on the class’s slots as given in the ontology (i.e.,
ontology-driven mark-up). As the text is selected, MnM
inserts the relevant XML tags into the document.</p>
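          <p>The tag insertion step can be pictured with a minimal Python sketch. It is an illustration only, not MnM's implementation: the function <monospace>insert_slot_tags</monospace>, the character-offset representation of a selection and the sample sentence are assumptions made for the example, and the slot names are borrowed from the visiting-a-place-or-people class. Overlapping selections are not handled.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from typing import List, Tuple

# Hypothetical representation of a user selection: (start, end, slot_name),
# with character offsets into the plain text of the page.
Selection = Tuple[int, int, str]


def insert_slot_tags(text: str, selections: List[Selection]) -> str:
    """Wrap each selected span in an XML tag named after an ontology slot."""
    # Work from the end of the text backwards so that earlier offsets
    # remain valid after each insertion.
    for start, end, slot in sorted(selections, reverse=True):
        if not 0 <= start < end <= len(text):
            raise ValueError(f"invalid span ({start}, {end}) for slot {slot!r}")
        text = f"{text[:start]}<{slot}>{text[start:end]}</{slot}>{text[end:]}"
    return text


# Invented sample sentence; the slots are "visitor" and "place".
sentence = "John Smith visited the Open University."
marked = insert_slot_tags(sentence, [(0, 10, "visitor"), (23, 38, "place")])
print(marked)
```
]]></preformat>
          <p>Processing the selections from right to left keeps the remaining offsets stable, which is why no offset bookkeeping is needed in this sketch.</p>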
        </sec>
        <sec id="sec-2-5-3">
          <title>Learning</title>
          <p>MnM integrates web browsing, ontology browsing and IE
development. It does not have a built-in IE tool but
provides a plug-in interface which allows IE tools to be
integrated easily.</p>
          <p>
            In a previous version of MnM we integrated Marmot,
Badger and Crystal from the University of Massachusetts
([
            <xref ref-type="bibr" rid="ref22">21</xref>
            ]) and our own NLP components (i.e., OCML
preprocessor). A full description of this version can be
found in ([
            <xref ref-type="bibr" rid="ref24">23</xref>
            ] and [
            <xref ref-type="bibr" rid="ref25">24</xref>
            ]). However, in this paper we will
concentrate on the recent integration work that we have
carried out with Amilcare, a tool for adaptive information
extraction ([
            <xref ref-type="bibr" rid="ref4">3</xref>
            ]).
          </p>
          <p>Amilcare is designed to support active annotation of
documents. It performs IE by enriching texts with XML
annotations. To use Amilcare in a new domain the user
simply has to manually annotate a training set of
documents. No knowledge of Natural Language
Technologies is necessary.</p>
          <p>
            Amilcare is designed to accommodate the needs of
different user types. While naïve users can build new
applications without delving into the complexity of Human
Language Technology, IE experts are provided with a
number of facilities for tuning the final application.
Induced rules can be inspected, monitored and edited to
obtain some additional accuracy, if required. The interface
also allows precision (P) and recall (R) to be balanced.
The system can be run on an annotated unseen corpus and
users are presented with statistics on accuracy, together
with details on correct matches and mistakes. Retuning the
P&amp;R balance does not generally require major retraining;
facilities for inspecting the effect of different P&amp;R balances
are provided. Although the current interface for balancing
P&amp;R is designed for IE experts, a future version will
provide support for naïve users ([
            <xref ref-type="bibr" rid="ref7">6</xref>
            ]).
          </p>
          <p>
            At the start of the learning phase Amilcare preprocesses
texts using Annie, the shallow IE system included in the
Gate package ([
            <xref ref-type="bibr" rid="ref18">17</xref>
            ], www.gate.ac.uk). Annie performs text
tokenization (segmenting texts into words), sentence
splitting (identifying sentences), part-of-speech tagging
(lexical disambiguation), gazetteer lookup (dictionary
lookup) and named entity recognition (recognition of
people and organization names, dates, etc.).
          </p>
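          <p>To make three of these preprocessing steps concrete, the following Python sketch shows toy versions of sentence splitting, tokenization and gazetteer lookup. It is a stand-in written for illustration and does not use or reproduce Annie: the regular expressions, the function names and the two gazetteer entries are all assumptions.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import re
from typing import Dict, List, Optional, Tuple

# Toy gazetteer; the entries are invented for the example.
GAZETTEER: Dict[str, str] = {"Sheffield": "location", "Jane": "first_name"}


def split_sentences(text: str) -> List[str]:
    """Split after '.', '!' or '?' followed by whitespace (a crude heuristic)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence: str) -> List[str]:
    """Separate words from punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def gazetteer_lookup(tokens: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each token with its gazetteer class, or None if it is unknown."""
    return [(tok, GAZETTEER.get(tok)) for tok in tokens]


text = "The seminar is in Sheffield. Jane Doe will speak."
sentences = split_sentences(text)
annotated = [gazetteer_lookup(tokenize(s)) for s in sentences]
print(annotated)
```
]]></preformat>
          <p>Part-of-speech tagging and named entity recognition are omitted because they need trained models or larger resources than a short sketch can carry.</p>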
          <p>
            Amilcare then induces rules for information extraction. The
learning system is based on LP2, a covering algorithm for
supervised learning of IE rules based on Lazy-NLP ([
            <xref ref-type="bibr" rid="ref4">3</xref>
            ]
and [
            <xref ref-type="bibr" rid="ref5">4</xref>
            ]). This is a wrapper induction methodology ([
            <xref ref-type="bibr" rid="ref16">15</xref>
            ])
that, unlike other wrapper induction approaches, uses
linguistic information in the rule generalization process.
The learning system starts by inducing wrapper-like rules that
make no use of linguistic information, where rules are sets
of conjunctive conditions on adjacent words. Then the
linguistic information provided by Annie is used to
create generalised rules: conditions on words are
replaced with conditions on the linguistic information
(e.g. conditions matching either the lexical category or
the class provided by the gazetteer) ([
            <xref ref-type="bibr" rid="ref5">4</xref>
            ]).
          </p>
          <p>
            All the generalizations are tested in parallel by using a
variant of the AQ algorithm ([
            <xref ref-type="bibr" rid="ref20">19</xref>
            ]) and the best
generalizations are kept for IE. The idea is that the
linguistic-based generalisation is deployed only when the
use of NLP information is reliable or effective. The
measure of reliability here is not linguistic correctness, but
effectiveness in extracting information using linguistic
information as opposed to using shallower approaches.
Lazy NLP-based systems learn the best strategy
for each piece of information and context separately. For example, they
may decide that using the output of a part-of-speech tagger
is the best strategy for recognising the speaker in seminar
announcements, but not for spotting the seminar location. This
strategy is quite effective for analysing documents with
mixed genres, a common situation in web documents ([
            <xref ref-type="bibr" rid="ref6">5</xref>
            ]).
The learning system induces two types of rules: tagging
rules and correction rules. A tagging rule is composed of a
left hand side, containing a pattern of conditions on a
connected sequence of words, and a right hand side that is
an action inserting an XML tag in the texts. Correction
rules shift misplaced annotations (inserted by tagging rules)
to the correct position. These are learnt from the errors
found whilst attempting to re-annotate the training corpus
using the induced tagging rules.
          </p>
          <p>Correction rules are identical to tagging rules, but (1) their
patterns also match the tags inserted by the tagging rules
and (2) their actions shift misplaced tags rather than adding
new ones. The output of the training phase is a collection
of rules for IE that are associated with the specific scenario.</p>
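          <p>The two rule types can be illustrated with a small Python sketch. It is a simplified model written for this explanation, not Amilcare's rule format: the classes <monospace>TaggingRule</monospace> and <monospace>CorrectionRule</monospace>, their fields and the sample tokens are assumptions, and the conditions test only word identity, whereas induced rules may also test linguistic information.</p>
          <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Callable, List

# A condition is a predicate over one token (a simplification: real rules
# may also test linguistic information such as lexical category).
Condition = Callable[[str], bool]


def word(expected: str) -> Condition:
    """Condition that matches a token by exact string equality."""
    return lambda token: token == expected


def matches(pattern: List[Condition], window: List[str]) -> bool:
    return len(window) == len(pattern) and all(
        cond(tok) for cond, tok in zip(pattern, window))


@dataclass
class TaggingRule:
    pattern: List[Condition]  # conditions on a connected sequence of words
    offset: int               # insertion point, relative to the match start
    tag: str                  # tag to insert, e.g. "<speaker>"

    def apply(self, tokens: List[str]) -> List[str]:
        n = len(self.pattern)
        points = {i + self.offset
                  for i in range(len(tokens) - n + 1)
                  if matches(self.pattern, tokens[i:i + n])}
        out: List[str] = []
        for i, tok in enumerate(tokens):
            if i in points:
                out.append(self.tag)
            out.append(tok)
        if len(tokens) in points:  # insertion after the last token
            out.append(self.tag)
        return out


@dataclass
class CorrectionRule:
    pattern: List[Condition]  # may also match tags inserted by tagging rules
    tag_index: int            # position of the misplaced tag within the match
    shift: int                # tokens to move the tag by (negative = left)

    def apply(self, tokens: List[str]) -> List[str]:
        tokens = list(tokens)
        n = len(self.pattern)
        i = 0
        while i + n <= len(tokens):
            if matches(self.pattern, tokens[i:i + n]):
                src = i + self.tag_index
                tokens.insert(src + self.shift, tokens.pop(src))
                i += n
            else:
                i += 1
        return tokens


tokens = ["Speaker", ":", "Jane", "Doe", "will", "talk"]
# This tagging rule deliberately places the tag one token too early.
tagged = TaggingRule([word("Speaker")], offset=1, tag="<speaker>").apply(tokens)
# The correction rule matches the inserted tag and shifts it past the colon.
fix = CorrectionRule([word("<speaker>"), word(":")], tag_index=0, shift=1)
corrected = fix.apply(tagged)
print(tagged)
print(corrected)
```
]]></preformat>
          <p>The correction rule has the same shape as the tagging rule; the only differences are that its pattern refers to an already inserted tag and that its action moves that tag instead of adding a new one.</p>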
        </sec>
        <sec id="sec-2-5-4">
          <title>Testing</title>
          <p>MnM provides various mechanisms for selecting a test
corpus and distinguishing it from a training corpus. The
user can manually select training and test corpora, which
can be local files or pages on the web. In addition,
it is also possible to simply select a corpus (either locally or
on the web) and let the system create the test and training
corpora randomly.</p>
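          <p>A random split of this kind, together with the precision and recall measures used to assess the IE mechanism, might be sketched as follows in Python. The function names, the 30% test fraction, the fixed seed and the sample data are assumptions chosen for the example; precision is taken as the share of extracted items that are correct and recall as the share of expected items that were extracted.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import random
from typing import List, Sequence, Set, Tuple


def split_corpus(docs: Sequence[str], test_fraction: float = 0.3,
                 seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly partition documents into (training, test) corpora."""
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def precision_recall(extracted: Set[tuple], gold: Set[tuple]) -> Tuple[float, float]:
    """Compare extracted (slot, value) pairs with the manually marked ones."""
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


docs = [f"page{i}.html" for i in range(10)]
train, test = split_corpus(docs)

# Invented annotations for one test page.
gold = {("visitor", "John Smith"), ("place", "Open University"),
        ("visitor", "Jane Doe")}
extracted = {("visitor", "John Smith"), ("place", "Milton Keynes")}
p, r = precision_recall(extracted, gold)
print(len(train), len(test), p, r)
```
]]></preformat>
          <p>Here one of the two extracted pairs is correct and one of the three expected pairs is found, so precision is 1/2 and recall is 1/3.</p>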
        </sec>
        <sec id="sec-2-5-5">
          <title>Extraction</title>
          <p>
            After the training phase Amilcare has a library of induced
rules which can be used to extract information from texts.
When working in extraction mode, Amilcare receives as
input a (collection of) text(s) with the associated scenario
(including the rules induced during the training phase); a
scenario is the set of tags that the user inserts in the
training corpora. It preprocesses the text(s) using Annie,
then applies its rules and returns the original text
with the added annotations. The Gate annotation schema is
used for annotation ([
            <xref ref-type="bibr" rid="ref18">17</xref>
            ]).
          </p>
          <p>Once that is done, the extracted information is presented to
the user for approval. The extracted information is then
sent to the ontology server, which populates the selected
ontology.</p>
          <p>During the population step the IE mechanism fills
predefined slots associated with an extraction template.
Each template consists of slots of a particular class as
defined in the selected ontology, for instance, the class
visiting-a-place-or-people has the slots: visitor, place, etc.</p>
          <p>Our goal is to automatically fill as many slots as
possible. However, some of the slots may still require
manual intervention. There are several reasons for this
problem:
            <list list-type="bullet">
              <list-item>
                <p>there is information that is not contained in the text;</p>
              </list-item>
              <list-item>
                <p>none of the rules from our IE libraries match
the sentence that might provide the information
(incomplete set of rules). This means that the learning
phase needs to be tuned.</p>
              </list-item>
            </list>
          </p>
          <p>The extracted information is also validated using the
ontology. This is possible because each slot in each class
of the ontology has a type associated with it. Therefore,
extracted information which does not match the type
definition of the slot in the ontology can be highlighted as
incorrect.</p>
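          <p>The type check can be sketched in a few lines of Python. This is an assumption-laden illustration, not the ontology server's API: slot types are modelled as Python types in a hand-written table, the function <monospace>invalid_slots</monospace> is hypothetical, and the <monospace>date</monospace> slot is invented for the example (only visitor and place are named in the text).</p>
          <preformat preformat-type="code"><![CDATA[
```python
from datetime import date
from typing import Any, Dict, List

# Hypothetical slot-type table for one class; in practice the ontology
# would supply these definitions.
SLOT_TYPES: Dict[str, type] = {"visitor": str, "place": str, "date": date}


def invalid_slots(instance: Dict[str, Any],
                  slot_types: Dict[str, type]) -> List[str]:
    """Return the filled slots whose value violates the slot's declared type."""
    return [slot for slot, value in instance.items()
            if slot not in slot_types
            or not isinstance(value, slot_types[slot])]


# Invented extraction result: the date slot holds free text, not a date.
extracted = {"visitor": "John Smith", "place": "Open University",
             "date": "next week"}
flagged = invalid_slots(extracted, SLOT_TYPES)
print(flagged)
```
]]></preformat>
          <p>Slots returned by the check would be the ones highlighted to the user as incorrect.</p>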
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>RELATED WORK</title>
      <p>
        A number of annotation tools for producing semantic
markup exist. The most interesting of these are Annotea
([
        <xref ref-type="bibr" rid="ref14">13</xref>
        ]); SHOE Knowledge Annotator ([
        <xref ref-type="bibr" rid="ref13">12</xref>
        ]); the COHSE
annotator ([
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]); AeroDAML ([
        <xref ref-type="bibr" rid="ref15">14</xref>
        ]); and OntoMat, a tool
being developed using the CREAM annotation framework
([
        <xref ref-type="bibr" rid="ref11">10</xref>
        ]). A commercial version of OntoMat is available as
OntoAnnotate
(http://www.ontoprise.de/com/co_produ_tool2.htm).
Annotea provides RDF-based markup but it does not
support information extraction nor is it linked to an
ontology server. It does, however, have an annotation
server which makes annotations publicly available. SHOE
Knowledge Annotator allows users to mark up pages in
SHOE guided by ontologies available locally or via a URL.
These marked up pages can be reasoned about by
SHOE-aware tools such as SHOE Search. The COHSE annotator
uses an ontology server to mark up pages in DAML+OIL.
The results can be saved as RDF. AeroDAML is available
as a web page. The user simply enters a URL and the
system automatically returns DAML annotations on a web
page using a predefined ontology based on WordNet.
Of the systems listed above, OntoMat is closest to MnM
both in spirit and in functionality. Both can provide some
form of automated extraction. However, while MnM makes
it possible to access ontology servers through APIs, such as
OKBC, and also to access ontologies specified in a markup
format, such as RDF and DAML+OIL, OntoMat only
provides the latter functionality. In contrast with OntoMat,
MnM can handle multiple ontologies at the same time,
which makes it very easy to switch from one to another,
and also allows inherited definitions to be displayed for
ontology editing and browsing. On the other hand,
OntoMat can store pages annotated in DAML+OIL using
OntoBroker as an annotation server. It also provides
crawlers which can search the Web for marked up pages
for addition to its internal knowledge base.
      </p>
      <p>Although MnM and OntoMat are very similar, they
illustrate a slight difference of emphasis in providing tools
for the Semantic Web. While OntoMat adopts the
philosophy that the markup which indicates the knowledge
content of a web resource should be included as part of
that resource, MnM’s annotations are stored both as
markup on a page and as items in a knowledge base held
on the WebOnto combined ontology and knowledge base
server.</p>
    </sec>
    <sec id="sec-4">
      <title>CONCLUSIONS</title>
      <p>In this paper we have described MnM, an ontology-based
annotation tool which provides both automated and
semi-automated support for annotating web pages with semantic
contents. The first prototype of the system has now been
completed and tested with both Amilcare and the UMass
set of tools. The early results are encouraging in terms of
the quality and robustness of our current implementation;
however, there is clearly a lot more work needed to make
this technology easy to use for our target user base (people
who are neither experts in language technologies nor
'power knowledge engineers'). In particular, all the
activities associated with automated markup tend to be very
sensitive to the quality of markup and to the
appropriateness of the chosen corpora. Amilcare already
attempts to address some of these issues through its
adaptive mechanisms; however, more work is needed in
this area. In addition, we also plan to do more work on the
user interface, in particular with respect to the integration
of markup, ontology browsing and the 'semantic
navigation' of web pages. Currently, ontology and web
browsing are integrated with respect to content annotation,
but ontologies do not inform the web browsing component
of MnM directly. Our vision for the semantic web is one in
which new forms of 'conceptual navigation' will emerge,
where association between resources will be semantic as
well as hypertextual. We plan to experiment with these
ideas and extend the interface of MnM to support novel,
markup-driven forms of web browsing, as well as the
standard HTML-based ones.</p>
    </sec>
    <sec id="sec-5">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This work was funded by the Advanced Knowledge
Technologies (AKT) Interdisciplinary Research
Collaboration (IRC), which is sponsored by the UK
Engineering and Physical Sciences Research Council under
grant number GR/N15764/01. The AKT IRC comprises the
Universities of Aberdeen, Edinburgh, Sheffield,
Southampton and the Open University. The authors would
like to thank Maruf Hassan and Simon Buckingham Shum
for their invaluable help in reviewing the first draft of this
paper.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bechhofer</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Goble</surname>
          </string-name>
          ,
          <article-title>Towards Annotation Using DAML+OIL</article-title>
          . First International Conference on Knowledge Capture (K-CAP
          <year>2001</year>
          ). Workshop on Semantic Markup and Annotation. Victoria, BC., Canada,
          <year>October 2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Brickley</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Guha</surname>
          </string-name>
          ,
          <article-title>Resource Description Framework (RDF) Schema Specification 1.0</article-title>
          .
          Candidate recommendation
          ,
          <source>World Wide Web Consortium</source>
          ,
          <year>2000</year>
          . URL: http://www.w3.org/TR/2000/CR-rdf-schema-20000327.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>F.</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          ,
          <article-title>Adaptive Information Extraction from Text by Rule Induction and Generalisation</article-title>
          ,
          <source>Proc. of 17th International Joint Conference on Artificial Intelligence (IJCAI</source>
          <year>2001</year>
          ), Seattle,
          <year>August 2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>F.</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          ,
          <article-title>LP2 an Adaptive Algorithm for Information Extraction from Web-related Texts</article-title>
          .
          <source>Proc. of the IJCAI-2001 Workshop on Adaptive Text Extraction and Mining held in conjunction with the 17th International Conference on Artificial Intelligence (IJCAI-01)</source>
          ,
          <year>August</year>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          ,
          <article-title>Challenges in Information Extraction from Text for Knowledge Management</article-title>
          . In
          <source>IEEE Intelligent Systems and Their Applications</source>
          ,
          <year>November 2001</year>
          , (Trends and Controversies)
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>F.</given-names>
            <surname>Ciravegna</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Petrelli</surname>
          </string-name>
          ,
          <article-title>User Involvement in Adaptive Information Extraction: Position Paper</article-title>
          . In
          <source>Proceedings of the IJCAI-2001 Workshop on Adaptive Text Extraction and Mining held in conjunction with the 17th International Conference on Artificial Intelligence (IJCAI-01)</source>
          ,
          <year>August</year>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Feigenbaum</surname>
          </string-name>
          ,
          <article-title>The art of artificial intelligence 1: Themes and case studies of knowledge engineering</article-title>
          .
          <source>Technical report</source>
          , Pub. no.
          <source>STAN-CS-77-621</source>
          , Stanford University, Department of Computer Science,
          <year>1977</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>D.</given-names>
            <surname>Fensel</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Motta</surname>
          </string-name>
          ,
          <article-title>Structured Development of Problem Solving Methods</article-title>
          .
          <source>Transactions on Knowledge and Data Engineering</source>
          <volume>13</volume>
          (
          <issue>6</issue>
          ):
          <fpage>913</fpage>
          -
          <lpage>932</lpage>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T. R.</given-names>
            <surname>Gruber</surname>
          </string-name>
          ,
          <article-title>A Translation Approach to Portable Ontology Specifications</article-title>
          .
          <source>Knowledge Acquisition</source>
          <volume>5</volume>
          (
          <issue>2</issue>
          ),
          <fpage>199</fpage>
          -
          <lpage>220</lpage>
          ,
          <year>1993</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Handschuh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Maedche</surname>
          </string-name>
          ,
          <article-title>CREAM- Creating relational metadata with a component-based, ontology-driven annotation framework</article-title>
          .
          <source>First International Conference on Knowledge Capture (K-CAP</source>
          <year>2001</year>
          ),
          Victoria, B.C.
          ,
          <year>October 2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hayes</surname>
          </string-name>
          , RDF Model Theory, W3C Working Draft,
          <year>February 2002</year>
          URL: http://www.w3.org/TR/rdf-mt/.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Heflin</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          ,
          <article-title>A Portrait of the Semantic Web in Action</article-title>
          .
          <source>IEEE Intelligent Systems</source>
          ,
          <volume>16</volume>
          (
          <issue>2</issue>
          ),
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kahan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Koivunen</surname>
          </string-name>
          , E. Prud'Hommeaux and
          <string-name>
            <given-names>R.</given-names>
            <surname>Swick</surname>
          </string-name>
          ,
          <article-title>Annotea: Open RDF Infrastructure for Shared Web Annotations</article-title>
          .
          <source>In Proc. of the WWW10 International Conference. Hong Kong</source>
          ,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Kogut</surname>
          </string-name>
          and
          <string-name>
            <given-names>W.</given-names>
            <surname>Holmes</surname>
          </string-name>
          ,
          <article-title>AeroDAML: Applying Information Extraction to Generate DAML Annotations from Web Pages</article-title>
          . First International Conference on Knowledge Capture (K-CAP
          <year>2001</year>
          ).
          <source>Workshop on Knowledge Markup and Semantic Annotation</source>
          , Victoria,
          B.C., Canada
          ,
          <year>October 2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>N.</given-names>
            <surname>Kushmerick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Weld</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Doorenbos</surname>
          </string-name>
          ,
          <article-title>Wrapper induction for information extraction</article-title>
          ,
          <source>Proc. of 15th International Conference on Artificial Intelligence, IJCAI-97.</source>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>O.</given-names>
            <surname>Lassila</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>Swick</surname>
          </string-name>
          ,
          <article-title>Resource Description Framework (RDF): Model and Syntax Specification</article-title>
          . Recommendation, World Wide Web Consortium,
          <year>1999</year>
          . URL: http://www.w3.org/TR/REC-rdf-syntax/.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>D.</given-names>
            <surname>Maynard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tablan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Saggion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bontcheva</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wilks</surname>
          </string-name>
          ,
          <article-title>Architectural Elements of Language Engineering Robustness</article-title>
          .
          <source>Journal of Natural Language Engineering - Special Issue on Robust Methods in Analysis of Natural Language Data</source>
          , forthcoming,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>S.</given-names>
            <surname>McIlraith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. C.</given-names>
            <surname>Son</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <article-title>Semantic Web Services</article-title>
          ,
          <source>IEEE Intelligent Systems, Special Issue on the Semantic Web</source>
          , Volume
          <volume>16</volume>
          , No.
          <issue>2</issue>
          , pp.
          <fpage>46</fpage>
          -
          <lpage>53</lpage>
          , March/April,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Michalski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Mozetic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lavrac</surname>
          </string-name>
          ,
          <article-title>The multi-purpose incremental learning system AQ15 and its testing application to three medical domains</article-title>
          ,
          <source>Proceedings of the 5th National Conference on Artificial Intelligence</source>
          , Philadelphia. Morgan Kaufmann,
          <year>1986</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>E.</given-names>
            <surname>Motta</surname>
          </string-name>
          ,
          <source>Reusable Components for Knowledge Modelling</source>
          . IOS Press, Amsterdam,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>E.</given-names>
            <surname>Riloff</surname>
          </string-name>
          ,
          <article-title>An Empirical Study of Automated Dictionary Construction for Information Extraction in Three Domains</article-title>
          .
          <source>The AI Journal</source>
          ,
          <volume>85</volume>
          ,
          <fpage>101</fpage>
          -
          <lpage>134</lpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Staab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mädche</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Handschuh</surname>
          </string-name>
          ,
          <article-title>An Annotation Framework for the Semantic Web</article-title>
          . In: S. Ishizaki (ed.),
          <source>Proc. of the First International Workshop on MultiMedia Annotation</source>
          , Tokyo, Japan, January 30-31,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M.</given-names>
            <surname>Vargas-Vera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Domingue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kalfoglou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Motta</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Buckingham-Shum</surname>
          </string-name>
          ,
          <article-title>Template-driven information extraction for populating ontologies</article-title>
          .
          <source>Proc. of the IJCAI'01 Workshop on Ontology Learning</source>
          , Seattle, WA, USA,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>M.</given-names>
            <surname>Vargas-Vera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Motta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Domingue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Buckingham Shum</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Lanzoni</surname>
          </string-name>
          ,
          <article-title>Knowledge Extraction by using an Ontology-based Annotation Tool</article-title>
          .
          <source>First International Conference on Knowledge Capture (K-CAP 2001), Workshop on Knowledge Markup and Semantic Annotation</source>
          , Victoria, B.C., Canada, October
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>