<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>OntoVerbal: a Protégé plugin for verbalising ontology classes</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shao Fen Liang</string-name>
          <email>fennie.liang@cs.man.ac.uk</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Robert Stevens</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Donia Scott</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alan Rector</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science, University of Manchester</institution>
          ,
          <addr-line>Oxford Road, Manchester, M13 9PL</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Engineering and Informatics, University of Sussex</institution>
          ,
          <addr-line>Falmer, Brighton, BN1 9QH</addr-line>
          ,
          <country country="UK">UK</country>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>3</lpage>
      <abstract>
        <p>OntoVerbal attempts to reduce the difficulties that non-ontology experts face in 'reading' ontologies, and the burden that ontology authors face in writing natural language definitions of classes. It does this by verbalising (i.e., automatically generating as natural language) the axioms of OWL classes. Its method relies on natural language generation (NLG) to present naturalistic descriptions of ontology classes as textual paragraphs. OntoVerbal has been implemented as a Protégé plugin that offers an 'English' view of a class as an alternative to the graphical views provided by various other Protégé plugins. The plugin provides automatic RDF label generation for ontology entities and a natural language description for each class, in both its asserted and 'inferred' forms. We have made OntoVerbal, version 1.0, available for Protégé 4.1 via http://swatproject.org/demos.asp.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Ontology development involves at least two ‘hard’ authoring
activities: creating axioms in a new ontology and editing existing
axioms. Thus it is fundamental to using an ontology that the author
is able to understand its content. As a consequence, managing
ontologies is a highly skilled task that tends to be carried out by
specialists. A richly axiomatised ontology can be hard to read,
either in a native OWL syntax or in some graphical presentation.
Given the growing importance and proliferation of ontologies in
the biomedical and other fields, the lack of ready access to their
content is a major stumbling block to wider use. Also, natural
language descriptions are a desirable feature of ontologies and are
mandated by the Open Biomedical Ontologies consortium; as
these are time-consuming to write, support for their production can
be valuable (
        <xref ref-type="bibr" rid="ref3">Stevens et al. (2011)</xref>
        ).
      </p>
      <p>
        OntoVerbal has been developed to help address these problems
(
        <xref ref-type="bibr" rid="ref1 ref2">Liang et al. (2011a)</xref>
        ). It has applied methods from linguistics,
psycholinguistics and computational linguistics to achieve its
language generation (
        <xref ref-type="bibr" rid="ref1 ref2">Liang et al. (2011b)</xref>
        ). In particular,
OntoVerbal arranges the axioms of a selected class into a discourse
structure. The axioms can thus be transformed into a set of sentences
and then into a structured and well-ordered paragraph that represents
the class. OntoVerbal’s aim is not perfect natural language, but
a generic approach to producing acceptable English for a class’s
axioms.
      </p>
    </sec>
    <sec>
      <title>ONTOVERBAL IN PROTÉGÉ</title>
      <p>OntoVerbal generates a natural language paragraph for any selected
class. For the illustrative examples in this paper, we use the heart
ontology<sup>1</sup>, which describes the anatomy of a human heart. The
ontology’s axioms relating to the Valve class are:</p>
      <preformat>(&lt;AnatomicalCavity&gt; DisjointClasses &lt;Valve&gt;)
(&lt;TricuspidValve&gt; SubClassOf &lt;Valve&gt;)
(&lt;PartialValve&gt; SubClassOf &lt;Valve&gt;)
(&lt;Valve&gt; SubClassOf &lt;AnatomicalConcept&gt;)
(&lt;SemiLunarValve&gt; SubClassOf &lt;Valve&gt;)
(&lt;VestigialCardiacValve&gt; SubClassOf &lt;Valve&gt;)
(&lt;MitralValve&gt; SubClassOf &lt;Valve&gt;)
(&lt;AtrioVentricularValve&gt; EquivalentTo (&lt;Valve&gt; and (&lt;hasValveInput&gt; some &lt;AtriumCavity&gt;) and (&lt;hasValveOutput&gt; some &lt;VentricularCavity&gt;)))</preformat>
      <p>OntoVerbal structures and orders these axioms into an
English paragraph (Figure 1) according to Rhetorical Structure
Theory, as follows:</p>
      <p>A valve is a kind of anatomical concept. More specialised
kinds of valve are mitral valve, partial valve, semi lunar valve,
tricuspid valve and vestigial cardiac valve. Also, a valve is
different from an anatomical cavity. Another relevant aspect
of a valve is that an atrio ventricular valve is defined as a valve
that has valve input an atrium cavity and has valve output a
ventricular cavity.</p>
      <p><sup>1</sup> Retrieved from http://owl.cs.manchester.ac.uk/repository/download?ontology=http://smi.stanford.edu/people/dameron/ontology/anatomy/heart&amp;format=RDF/XML (downloaded April 2012).</p>
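      <p>The step from grouped axioms to an ordered paragraph can be illustrated with a minimal template-based sketch. It is not OntoVerbal’s implementation: the tuple encoding of axioms, the function names (<monospace>article</monospace>, <monospace>join_list</monospace>, <monospace>verbalise</monospace>) and the naive first-letter article rule are assumptions made for illustration only. The EquivalentTo axiom is left out, since rendering class expressions needs more machinery than fits here.</p>
      <preformat preformat-type="code">
```python
# Hypothetical axiom encoding for illustration: (axiom type, subject, object),
# with class names already turned into lower-case labels.
AXIOMS = [
    ("DisjointClasses", "anatomical cavity", "valve"),
    ("SubClassOf", "tricuspid valve", "valve"),
    ("SubClassOf", "partial valve", "valve"),
    ("SubClassOf", "valve", "anatomical concept"),
    ("SubClassOf", "semi lunar valve", "valve"),
    ("SubClassOf", "vestigial cardiac valve", "valve"),
    ("SubClassOf", "mitral valve", "valve"),
]


def article(noun):
    """Prefix a noun phrase with 'a' or 'an' using a naive first-letter rule."""
    return ("an " if noun[0].lower() in "aeiou" else "a ") + noun


def join_list(items):
    """Render ['x', 'y', 'z'] as 'x, y and z'."""
    items = list(items)
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]


def verbalise(target, axioms):
    """Group the axioms that mention `target` by role, then emit one sentence per group."""
    parents, children, disjoint = [], [], []
    for kind, subject, obj in axioms:
        if kind == "SubClassOf" and subject == target:
            parents.append(obj)
        elif kind == "SubClassOf" and obj == target:
            children.append(subject)
        elif kind == "DisjointClasses" and target in (subject, obj):
            disjoint.append(obj if subject == target else subject)

    sentences = []
    if parents:
        subject_np = article(target)
        sentences.append(subject_np[0].upper() + subject_np[1:]
                         + " is a kind of " + join_list(sorted(parents)) + ".")
    if len(children) == 1:
        sentences.append("A more specialised kind of " + target
                         + " is " + children[0] + ".")
    elif children:
        sentences.append("More specialised kinds of " + target
                         + " are " + join_list(sorted(children)) + ".")
    if disjoint:
        sentences.append("Also, " + article(target) + " is different from "
                         + join_list(article(d) for d in sorted(disjoint)) + ".")
    return " ".join(sentences)


print(verbalise("valve", AXIOMS))
```
</preformat>
      <p>Run on the seven axioms above, the sketch prints the first three sentences of the paragraph quoted earlier. Sorting the class labels alphabetically is a choice of this sketch that happens to reproduce the order in that paragraph; it is not a claim about how OntoVerbal orders them.</p>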
      <p>The names of the ontology’s classes provide much of the lexical
content of the generated English, so if the ontology does not
use well-formed labels, but only URI fragments, the quality of
OntoVerbal’s verbalisation suffers; for example,
instead of reading <italic>A valve is a kind of anatomical concept</italic>, the
reader will be confronted with <italic>A &lt;URI#Valve&gt; is a kind of
&lt;URI#AnatomicalConcept&gt;</italic>. In the latter case, OntoVerbal will
also lose some of its abilities in paragraph generation, such as
putting articles in the right places. For this reason, OntoVerbal will,
when necessary, make its own labels from URI fragments. The
natural language generation engine supplies labels for ontology
classes, object properties, data properties and individuals. It breaks
entity URI fragments written in CamelCase, with underscores (under_score), or in a mixture
of both, into separate words.</p>
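      <p>Splitting a URI fragment into words can be sketched in a few lines. The following is an illustrative approximation rather than the plugin’s own code: the function name <monospace>make_label</monospace>, the <monospace>http://example.org/heart#</monospace> namespace and the decision to lower-case every word are assumptions of this sketch.</p>
      <preformat preformat-type="code">
```python
import re


def make_label(uri):
    """Derive a lower-case, space-separated label from the fragment of a URI."""
    # Keep only the text after the last '#' (or, failing that, the last '/').
    fragment = uri.rsplit("#", 1)[-1].rsplit("/", 1)[-1]
    # Underscores separate words: "Apex_of_heart" becomes "Apex of heart".
    text = fragment.replace("_", " ")
    # A lower-case letter or digit followed by a capital marks a CamelCase
    # boundary: "hasValveInput" becomes "has Valve Input".
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    # Normalise case and collapse any repeated whitespace.
    return " ".join(text.lower().split())


for name in ["AtrioVentricularValve", "hasValveInput", "Left_AtriumCavity"]:
    print(make_label("http://example.org/heart#" + name))
```
</preformat>
      <p>Lower-casing keeps the labels usable in mid-sentence positions, at the cost of flattening acronyms and proper names; a fuller treatment would need to preserve those.</p>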
      <p>OntoVerbal can also provide descriptions for classes after
reasoning. The description for the Valve class after running a
reasoner becomes:</p>
      <p>A valve is a kind of anatomical concept. A more specialised
kind of valve is partial valve. Also, a valve is different from a
left atrium cavity, a coronary artery, a vestigial cardiac valve,
a conus artery, a right marginal artery, a pulmonary valve, ...
an apex of heart, an anterior part of wall of right ventricle, a
valve of coronary sinus, a left circumflex artery and an aorta.
Another relevant aspect of a valve is that an atrio ventricular
valve is defined as a valve that has valve input an atrium cavity
and has valve output a ventricular cavity.</p>
      <p>After reasoning, much more is known about the class, and this
has an effect on the verbalisation. A reader may nevertheless
need verbalisations of both views at different times, as is provided
in tools such as Protégé. The inferred description (Figure 2) in fact
contains 64 disjoint classes, some of which are omitted in this
paper. The classes shown in red in Protégé are unsatisfiable,
but OntoVerbal ignores the red colouring and still
generates descriptions using the inferred axioms.</p>
    </sec>
    <sec id="sec-2">
      <title>DISCUSSION</title>
      <p>Currently, OntoVerbal uses lightweight linguistic approaches for its
natural language paragraph generation. The main reason is that OntoVerbal is
intended as a real-time application, and employing heavy linguistic
methods would slow it down. Also, since the aim is not
to produce perfect English, but rather English that is acceptable
for the purpose of revealing clearly the content of the ontology,
the output of OntoVerbal will at times include incorrect articles
(as seen above) and/or plurals, and be clumsy in places. Since
OntoVerbal is intended to be faithful to its input, in contexts where
the selected class contains many related axioms, it will sometimes
produce excessively long paragraphs. Given that our aim is for rapid
generation of coherent English text for any class, we feel that these
compromises are acceptable.</p>
      <p>The OntoVerbal Description tab can generate paragraphs for
classes without RDF labels, but the text will be of lower
quality than that for classes with hand-crafted labels. The OntoVerbal
Description tab can also provide more specific descriptions for
classes if a reasoner is used. OntoVerbal will not replace
hand-crafted natural language descriptions, but can provide a substitute in
their absence. It also provides an alternative view of an ontology’s
axioms in a reasonably familiar natural language form that seeks to
‘ease’ access to often complex ontologies.</p>
    </sec>
    <sec id="sec-3">
      <title>ACKNOWLEDGEMENTS</title>
      <p>This work is part of the Semantic Web Authoring Tool (SWAT)
project (see www.swatproject.org), which is supported by the UK
Engineering and Physical Sciences Research Council (EPSRC)
grant EP/G032459/1, to the University of Manchester, the
University of Sussex and the Open University. We are grateful for
the comments received from our colleagues on the project.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>S. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevens</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scott</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Rector</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2011a</year>
          ).
          <article-title>Automatic verbalisation of SNOMED classes using OntoVerbal</article-title>
          .
          <source>Proceedings of the 13th Conference on Artificial Intelligence in Medicine, AIME 2011</source>
          , pages
          <fpage>338</fpage>
          -
          <lpage>342</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Liang</surname>
            ,
            <given-names>S. F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Scott</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stevens</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Rector</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2011b</year>
          ).
          <article-title>Unlocking medical ontologies for non-ontology experts</article-title>
          .
          <source>Proceedings of the 2011 Workshop on Biomedical Natural Language Processing, ACL-HLT 2011</source>
          , pages
          <fpage>174</fpage>
          -
          <lpage>181</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Stevens</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Malone</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Power</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Third</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          (
          <year>2011</year>
          ).
          <article-title>Automating generation of textual class definitions from OWL to English</article-title>
          .
          <source>Journal of Biomedical Semantics</source>
          ,
          <volume>2</volume>
          (
          <issue>Suppl 2</issue>
          ),
          <fpage>S5</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>