<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Rabbit to OWL: Ontology Authoring with a CNL-based Tool</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ronald Denaux</string-name>
          <email>rdenaux@comp.leeds.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vania Dimitrova</string-name>
          <email>vania@comp.leeds.ac.uk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anthony G. Cohn</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Catherine Dolbear</string-name>
          <email>Catherine.Dolbear@ordnancesurvey.co.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Glen Hart</string-name>
          <email>Glen.Hart@ordnancesurvey.co.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Ordnance Survey Research</institution>
          ,
          <addr-line>Romsey Rd, Southampton, SO16 4GU</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Computing, University of Leeds</institution>
          ,
          <addr-line>Woodhouse Lane, Leeds, LS2 9JT</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <kwd-group>
        <kwd>CNL for ontology engineering</kwd>
        <kwd>CNL-based tools</kwd>
        <kwd>Semantic Web</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. CNL-based Ontology Authoring</title>
      <p>
        There is a recent trend of using controlled natural language (CNL) interfaces to provide
more intuitive ways for entering abstract knowledge constructs [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This can reduce the
complexity of knowledge formulation, which can lead to wider user involvement in
ontology authoring and improved efficiency of the knowledge engineering process.
However, existing CNL-based tools for ontology engineering focus solely on providing a CNL
interface, ignore the wider ontology construction process, and still require
good knowledge engineering skills. We have developed a novel approach in which a
CNL-based tool is designed to support the involvement of domain experts
without a knowledge engineering background in the overall ontology authoring process.
This work was inspired by the ontology authoring experience at Ordnance Survey,
the mapping agency of Great Britain.
      </p>
      <p>
        The Ordnance Survey is developing a modular topographic domain ontology to
facilitate the description and reuse of its topographic data by third parties [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. At the
heart of ontology development is the active involvement of domain experts (e.g.
geographers and ecologists), which is reflected in the Ordnance Survey’s methodology
for ontology construction [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], comprising several steps:
• Identifying the scope, purpose and other requirements of the ontology;
• Gathering source knowledge and documents and identifying ontologies for reuse;
• Capturing the ontology content in a knowledge glossary;
• Formally defining core concepts and relationships between concepts by using
structured English sentences;
• Converting the structured English sentences into OWL<sup>1</sup>;
• Ontology verification and validation.
1 OWL (Web Ontology Language) is a W3C standard for authoring ontologies intended to be
used in the Semantic Web. See http://www.w3.org/TR/owl-features/
      </p>
      <p>Following this methodology, the domain expert is engaged in the construction of a
conceptual ontology, which involves the first four steps. The knowledge engineer
then performs the last two steps, which focus on the logical level of the ontology.</p>
      <p>
        A crucial component of the Ordnance Survey methodology is the use of a controlled
language for authoring the conceptual ontology: a CNL called Rabbit has been
developed for this purpose [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Rabbit is intended to be easy for domain experts to read
and write, allowing them to express what they need in order to describe their
domain. The design and evaluation of Rabbit are presented elsewhere [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]; a comparison
with other controlled languages is given in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>This paper focuses on the support for creating OWL ontologies in Rabbit. A tool
called ROO (Rabbit to OWL Ontology authoring) is outlined, pointing at aspects of
Rabbit parsing and the CNL-enabled interaction.</p>
    </sec>
    <sec id="sec-2">
      <title>3. Rabbit to OWL Parsing in ROO</title>
      <p>ROO is a Protégé plug-in that helps domain experts build conceptual ontologies.
ROO includes the following usability features to support ontology authoring:
• provide easy-to-understand suggestions and task-specific messages to help the user
enter correct Rabbit constructs;
• show feedback about the parsed structure to help the user recognize CNL patterns;
• show warnings when the user's sentences are parsed but there may be ambiguity
when selecting the corresponding knowledge constructs in OWL;
• facilitate the knowledge input process by providing a list of Rabbit templates.</p>
      <p>
        ROO provides a customized interface to support the ontology authoring process by
entering Rabbit sentences. See [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for an overview of the system architecture and
example screenshots. A key module of ROO is the Rabbit parser performing two main
tasks: parsing Rabbit sentences and converting parsed sentences to OWL.
      </p>
      <p>Parsing Rabbit Sentences and Conversion to OWL</p>
      <p>
        The Rabbit parser uses a chain of linguistic Processing Resources. The current
implementation of the parsing API follows the CLOnE [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] approach and is based on
libraries from the GATE<sup>2</sup> text processing environment. Specifically, ANNIE<sup>3</sup> performs
basic natural language preprocessing such as tokenization, sentence splitting, Part of
Speech (POS) and morphological tagging. As an extra preprocessing step we use a
Rabbit key phrase gazetteer to add annotations to key phrases used in Rabbit.
      </p>
      <p>JAPE<sup>4</sup> transducers parse Rabbit constructs based on the
annotations gathered during the preprocessing phase. Most constructs contain key
phrases, which make the constructs unambiguous. For example: &lt;instance&gt; and
&lt;instance&gt; are different, where we have underlined the key phrases.
However, constructs like concepts, relationships and instances are not known a priori
and can consist of multiple words, e.g. Natural Body of Water. JAPE rules
restrict the possible set of these constructs based on the initial POS and morphological
annotations (e.g. a noun phrase is expected to be a concept and a relationship is
expected to be a verb phrase). This allows for some ambiguity at the initial stage of
parsing, as a piece of text may be linked to different Rabbit constructs. Compare, for example,
Transport of Water and Body of Water: both have the same linguistic
structure, but on a conceptual level Body of Water is interpreted as a single
concept, while Transport of Water is a composed concept describing a subclass
of concept Transport restricted by its relationship to concept Water. The Rabbit
parser uses the ontology being defined to decide which interpretation should be used.
2 http://gate.ac.uk
3 Information extraction library included in GATE http://gate.ac.uk/sale/tao/index.html#annie
4 Java Annotation Pattern Engine, which provides a language for finding annotation patterns
during the parsing process. It also provides hooks for invoking Java methods during the
parsing of a text. See http://gate.ac.uk/sale/tao/index.html#jape</p>
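      <p>The way the ontology can settle the choice between a single concept and a composed concept may be illustrated with a minimal sketch. It is not the ROO implementation, which is written in Java on top of GATE and JAPE; the function name interpret_noun_phrase, the restriction to phrases of the form "X of Y", and the returned tuples are all assumptions made for illustration.</p>
      <preformat>
```python
def interpret_noun_phrase(phrase, known_concepts):
    """Interpret a noun phrase against the concepts already in the ontology.

    Hypothetical helper: a phrase that is itself a known concept is kept
    whole; otherwise a phrase of the form 'X of Y' is read as concept X
    restricted by a relationship to concept Y, provided both are known.
    """
    if phrase in known_concepts:
        # The whole phrase is already defined, e.g. "Body of Water".
        return ("concept", phrase)

    head, separator, rest = phrase.partition(" of ")
    if separator and head in known_concepts and rest in known_concepts:
        # Both parts are defined, e.g. "Transport" and "Water".
        return ("composed", head, rest)

    # Nothing in the ontology supports either reading.
    return ("unknown", phrase)


ontology_concepts = {"Body of Water", "Transport", "Water"}
print(interpret_noun_phrase("Body of Water", ontology_concepts))
print(interpret_noun_phrase("Transport of Water", ontology_concepts))
```
      </preformat>
      <p>In this sketch the two phrases share the same surface structure, and only the content of the (assumed) concept set separates the single-concept reading from the composed one.</p>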
      <p>The end result of the parsing process is a Java representation of Rabbit sentences.
At this level, the parser removes any ambiguity by using the ontology being built, as
well as heuristic rules. Consider the sentence Every Irrigation Canal
contains Water for Irrigation. The parsed result of the sentence depends
on whether the ontology defines relation contains water for or relation
contains, and concepts Irrigation Canal, Water, Irrigation or Water
for Irrigation. The parser will detect when any of these relations and concepts
are missing and, if necessary, will prompt the user to define the missing part. The user
will also be warned about ambiguity when several possibilities are already defined in
the ontology since the heuristic used to disambiguate might not be correct and this can
result in an unwanted OWL translation. The advantage of this approach is that the
parser is able to correctly detect sentences even when there are missing parts in the
ontology or when parts are linguistically ambiguous. This also enables the parser to
provide better error messages and suggestions to the user. A possible drawback is that
the user has to be careful not to introduce names which may bring undesired
ambiguity, but the parser helps to avoid this by giving warning messages.</p>
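      <p>The Irrigation Canal example can be worked through with a small sketch that enumerates the possible subject, relation and object splits of a sentence and keeps those supported by the ontology. This is an illustration under stated assumptions, not the ROO parser: the function names readings and check, the restriction to sentences of the form "Every ...", the lower-casing of relation names, and the three status labels are hypothetical.</p>
      <preformat>
```python
def readings(sentence, concepts, relations):
    """List every (subject, relation, object) split the ontology supports."""
    words = sentence.split()
    if not words or words[0] != "Every":
        raise ValueError("this sketch only handles sentences starting with 'Every'")
    words = words[1:]

    found = []
    # i marks the end of the subject, j the end of the relation.
    for i in range(1, len(words) - 1):
        for j in range(i + 1, len(words)):
            subject = " ".join(words[:i])
            relation = " ".join(words[i:j]).lower()
            obj = " ".join(words[j:])
            if subject in concepts and relation in relations and obj in concepts:
                found.append((subject, relation, obj))
    return found


def check(sentence, concepts, relations):
    """Classify a sentence as 'ok', 'ambiguous' or 'undefined'."""
    found = readings(sentence, concepts, relations)
    if len(found) == 1:
        return ("ok", found)
    if found:
        # Several readings are defined: the user should be warned.
        return ("ambiguous", found)
    # No reading is supported: some relation or concept is still missing.
    return ("undefined", found)


sentence = "Every Irrigation Canal contains Water for Irrigation"
print(check(sentence, {"Irrigation Canal", "Water for Irrigation"}, {"contains"}))
```
      </preformat>
      <p>Adding the concept Irrigation and the relation contains water for to the assumed ontology makes the same sentence come back as ambiguous, which corresponds to the situation in which the user is warned.</p>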
      <p>The conversion to OWL is based on the Java representation produced by the
parser. The Rabbit language specification defines the OWL equivalent of each Rabbit
sentence. For instance, declaring concept Water adds the following axioms to the
ontology: SubClassOf(Water owl:Thing) and
EntityAnnotation(Class(Water) Label(“Water”)). The current implementation of the
conversion to OWL is based on OWLAPI<sup>5</sup>, because that is the API used by Protégé.
The architecture allows for an easy reimplementation using a different API
(such as Jena).</p>
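      <p>The mapping from a concept declaration to the two axioms quoted above can be sketched as follows. The sketch only builds the axiom strings and does not use OWLAPI; the function name declare_concept and the rule for turning a multi-word concept name into an identifier (capitalising each word and removing the spaces) are assumptions, not part of the Rabbit specification as described here.</p>
      <preformat>
```python
def declare_concept(name):
    """Return the axioms for a declared concept as functional-style strings."""
    # Assumed identifier scheme: "Irrigation Canal" becomes "IrrigationCanal".
    identifier = "".join(word[:1].upper() + word[1:] for word in name.split())
    return [
        "SubClassOf({} owl:Thing)".format(identifier),
        'EntityAnnotation(Class({}) Label("{}"))'.format(identifier, name),
    ]


for axiom in declare_concept("Water"):
    print(axiom)
```
      </preformat>
      <p>Keeping the original wording in the label while deriving a separate identifier is one possible design; it lets the Rabbit name be shown to the domain expert unchanged.</p>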
    </sec>
    <sec id="sec-3">
      <title>4. Evaluation and Current State</title>
      <p>
        Continuous user studies are performed at Ordnance Survey to examine the usability
and the use of Rabbit to define sample domain ontologies [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. The ROO tool was
evaluated in a recent study at the University of Leeds with 16 volunteers from the
departments of Geography (8 students) and Earth and Environment (8 students) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]
following ontology modeling tasks conducted by domain experts associated with
Ordnance Survey. To examine the benefits of the support offered in ROO, the
interaction with ROO was compared to another authoring tool, ACE View [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ],
which provides similar means of interaction.
5 http://sourceforge.net/projects/owlapi/
      </p>
      <p>The study showed that even with a minimal amount of training, in both ROO and
ACE View, domain experts were able to perform ontology authoring tasks without the
need to learn ontology languages such as OWL. Knowledge engineering skills are likely
to remain necessary for complex ontologies. However, the study indicated
that if methodical, intelligent support for ontology authoring is embedded in the
authoring tool, as is done in ROO, domain experts are able to engage actively
in the ontology authoring process. There was strong evidence that the support
and guidance offered in ROO led to better usability and had a positive effect on the
quality of the resultant ontologies. The study also identified requirements for further
support that could be provided by improving the natural language analysis and by
considering common error patterns when using a CNL to define ontological constructs.</p>
      <p>The current version of ROO<sup>6</sup> provides support for the full set of Rabbit constructs.
Further studies with ROO are planned for early 2009.</p>
      <p>During the 1 hour presentation at CNL09, we would like to present more details on
the topics discussed in this abstract. We will include a demonstration of how the tool is
typically used, highlighting the advantages of using a CNL-based tool for ontology
construction. We would also like to discuss issues we have come across during the
implementation of ROO such as expressing complex logical constructs using Rabbit
without introducing ambiguity and evaluating the resultant ontology.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>V.</given-names>
            <surname>Dimitrova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Denaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dolbear</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.G.</given-names>
            <surname>Cohn</surname>
          </string-name>
          .
          <article-title>Involving Domain Experts in Authoring OWL Ontologies</article-title>
          .
          <source>In Proc. of the International Semantic Web Conference (ISWC 2008)</source>
          , Germany,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>C.</given-names>
            <surname>Dolbear</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Goodwin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kovacs</surname>
          </string-name>
          .
          <article-title>The Rabbit language: description, syntax and conversion to OWL</article-title>
          .
          <source>Ordnance Survey Research Labs Tech. Report IRI-0004</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>A.</given-names>
            <surname>Funk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tablan</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Bontcheva</surname>
          </string-name>
          .
          <article-title>Controlled Language IE Components version 2</article-title>
          .
          <source>SEKT project deliverable D.2.2.2</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>A.</given-names>
            <surname>Funk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tablan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Bontcheva</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Davis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Handschuh</surname>
          </string-name>
          .
          <article-title>CLOnE: Controlled Language for Ontology Editing</article-title>
          .
          <source>In Proc. of the International Semantic Web Conference (ISWC 2007)</source>
          , Korea,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>G.</given-names>
            <surname>Hart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Johnson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dolbear</surname>
          </string-name>
          .
          <article-title>Rabbit: Developing a Control Natural Language for Authoring Ontologies</article-title>
          .
          <source>In Proceedings of the European Semantic Web Conference (ESWC2008)</source>
          , Spain,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>K.</given-names>
            <surname>Kaljurand</surname>
          </string-name>
          .
          <article-title>Attempto Controlled English as a Semantic Web Language</article-title>
          .
          <source>PhD thesis</source>
          , Faculty of Mathematics and Computer Science, University of Tartu,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>K.</given-names>
            <surname>Kovacs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dolbear</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Goodwin</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Mizen</surname>
          </string-name>
          .
          <article-title>A Methodology for Building Conceptual Domain Ontologies</article-title>
          .
          <source>Ordnance Survey Research Labs Technical Report IRI-0002</source>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>R.</given-names>
            <surname>Schwitter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kaljurand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cregan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dolbear</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hart</surname>
          </string-name>
          .
          <article-title>A Comparison of three Controlled Natural Languages for OWL 1.1</article-title>
          .
          <source>In Proc. of the 4th International Workshop on OWL Experiences and Directions (OWLED 2008-DC)</source>
          , Washington DC, USA,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          6 ROO is developed within the Confluence project and is distributed as open source. It can be downloaded from: http://www.comp.leeds.ac.uk/confluence/
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>