<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Terminology-Based Patterns for Natural Language Definitions in Ontologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dagmar Gromann</string-name>
          <email>dgromann@wu.ac.at</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Vienna University of Economics and Business</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Natural language content in ontologies is crucial to any human interaction with them, but is scarcely available. Terminology science centers on best practices in domain-specific natural languages. Hence, ontologies can benefit from the systematic approach of terminology to natural language definitions. This paper proposes an Annotation Ontology Design Pattern, named Natural Language Definition ODP, that provides natural language definitions for ontology classes. A (semi-)automated method for implementing this pattern, combining ontology verbalization and information extraction, is investigated herein and exemplified in the domain of finance.</p>
      </abstract>
      <kwd-group>
        <kwd>Annotation ODPs</kwd>
        <kwd>Natural Language Definition</kwd>
        <kwd>Terminology</kwd>
        <kwd>Automatic Extraction of ODPs</kwd>
        <kwd>Domain-Specific ODP Application</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        A growing number of application scenarios for Semantic Web (SW) ontologies
render reusable, high-quality solutions to their design increasingly important.
For this purpose, Ontology Design Patterns (ODPs) define a formal methodology
for various aspects of ontological design, ranging from Logical to Presentation
ODPs [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The latter seek to increase the usability and readability of ontologies
from a user’s perspective, which are vital to multi-lingual scenarios [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and
interactions with domain experts and users [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and are divided into Annotation and
Naming ODPs. Annotation ODPs provide best practices for homogeneous
natural language (NL) expressions (rdfs:label) and definitions (rdfs:comment),
while Naming ODPs focus on naming conventions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The lack of an
operational approach to NL definition authoring, together with its time-intensive nature, has led to
scarce and frequently inconsistent NL definitions in ontologies. Thus, this paper
investigates their (semi-)automated generation by means of the proposed
Annotation ODP, the Natural Language Definition ODP, which is based on established methods
from terminology science.
      </p>
      <p>Ever since its advent, terminology science has recognized the need to
provide a systematic approach to NL definitions. Concept-centered terminologies as
defined by ISO 704:2009 and 1087:2000 consist of sets of terminology concepts
in specialized domains. NL definitions are required to form these terminology
concepts and their interrelations. In contrast, ontology concepts are formally
defined by means of logics. Combining the formal ontological definition with the
terminological NL definition authoring method results in a multidimensional
approach formalized as the proposed Annotation ODP. The ISO 704
method combines the denomination of the superordinate concept with one or more
characteristics that delimit the concept to be defined from its related concepts.
This method can be (semi-)automated and is the foundation of the proposed pattern. Given
domain-specific, axiomatized ontology elements with a minimum of NL coverage
in labels or fragment identifiers, the superordinate concept’s denomination can
be identified and applied by using its subsumption hierarchy. For the non-trivial
purpose of obtaining characteristics, three mutually complementing approaches
are proposed: ontology verbalization, utilization of existing NL content, and
information extraction. An example is provided by applying the pattern to the partially
available NL content of Fadyart’s Finance Ontology <sup>1</sup>.</p>
    </sec>
    <sec id="sec-2">
      <title>Natural Language Definition ODP</title>
      <p>The objective of the NL Definition ODP is to define an ontology concept in
natural language(s). Ontologies represent knowledge by formalizing
vocabularies of terms as well as their interrelations and define their meaning formally.
Terminologies mostly rely on NL characteristics to establish NL definitions for
concepts and interrelate concepts designated by terms, appellations, or symbols.
The two most important types of definitions as specified by ISO 704 are
extensional definitions, listing the instances of a concept, and intensional definitions.
The latter combine the superordinate concept with manually
identified delimiting characteristic(s) for concepts related generically.</p>
      <p>The intensional approach offers the most explicit, consistent, and precise
method of definition formation. It is intended to provide the minimum of
information needed for human users to differentiate one terminology concept from
another. To facilitate its automation, the basic textual description of ISO has
been adopted and formalized as the NL Definition ODP, which is introduced
in Definition 1 and illustrated in Example 1.</p>
      <p>The pattern defines the NL definition of an entry term, which corresponds
to the label of the ontology class. The singular form of the term is preferred,
unless it is only available in the plural, e.g. liabilities. The pattern utilizes the label or fragment
identifier of the superordinate concept, which for the experiment herein is
restricted to Noun Phrases (NP). Thereby, the definition obtains a context and implicitly
inherits the characteristics of the superclass. The NP is connected to
characteristics by utilizing a finite set of relative pronouns, verbs, and, where applicable,
verbalized object properties. The same elements and a coordinating conjunction
are needed to string together several characteristics.</p>
      <p>Obtaining the characteristic(s) relies on a three-tiered, mutually
complementing approach of ontology verbalization, utilization of existing NL content, and
information extraction from structured Web resources. All three help to
specify the relative pronoun and linking verb to be used for the concept to be defined.</p>
      <sec id="sec-2-1">
        <title>1 http://fadyart.com/ version 3.04</title>
        <p>Definition 1: NL Definition ODP</p>
        <p>Entry Term: [A/An] NP&lt;superclass&gt; [which/that/who/whose] [(can) be/include/belong to/classify as OR
&lt;objectProperty&gt;] [(&lt;characteristic(s)&gt;) and] &lt;characteristic&gt;</p>
        <p>Example 1: NL Definition of Concept Card (Fadyart Finance Ontology)</p>
        <p>Card: [A] payment instrument [that] [has as card type] &lt;a credit card or debit
card&gt; and [has as card data] &lt;a start date, sequence number, holder name, expiry date, issuer
name, card number, security code&gt;</p>
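        <p>As a minimal sketch of how the slots of Definition 1 could be filled programmatically, the following Python function assembles a definition string from a superclass NP and a list of characteristics. The function names, the parameters, and the vowel-based article heuristic are illustrative assumptions and not part of the pattern itself.</p>
        <preformat><![CDATA[
```python
from typing import Sequence, Tuple


def article_for(noun_phrase: str) -> str:
    """Choose 'A' or 'An' with a simple vowel-letter heuristic (an assumption)."""
    return "An" if noun_phrase.lower().startswith(tuple("aeiou")) else "A"


def build_definition(
    superclass_np: str,
    characteristics: Sequence[Tuple[str, str]],
    relative_pronoun: str = "that",
    plural_only: bool = False,
) -> str:
    """Fill the slots of Definition 1.

    Each characteristic is a (linking phrase, characteristic text) pair,
    e.g. ("has as card type", "a credit card or debit card").
    """
    if not characteristics:
        raise ValueError("at least one characteristic is required")
    if plural_only:
        # Plural-only superclass terms (e.g. liabilities) take no article.
        head = superclass_np[:1].upper() + superclass_np[1:]
    else:
        head = f"{article_for(superclass_np)} {superclass_np}"
    # Several characteristics are strung together with a coordinating conjunction.
    parts = [f"{link} {text}" for link, text in characteristics]
    return f"{head} {relative_pronoun} " + " and ".join(parts)
```
]]></preformat>
        <p>Called with the superclass NP payment instrument and the two characteristics of Example 1, this sketch returns the definition of Card without the slot brackets.</p>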
        <p>
          Ontology verbalization refers to the translation of ontology concepts,
relations, and axioms to (controlled) natural languages, such as Attempto Controlled
English [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. In contrast to controlled natural language, the objective herein is to
use verbalized ontology elements to identify the appropriate verb and relative
pronoun linking the definition’s characteristics. For this purpose, verbalization
patterns have been identified, of which selected ones are provided below.
        </p>
        <p>P1 - ObjectUnionOf: [a/an OR ObjectMinCardinality] [(NP&lt;class&gt;,)* or] NP&lt;class&gt;</p>
        <p>P2 - ObjectMinCardinality: at least &lt;number&gt;</p>
        <p>P3 - ObjectSomeValuesFrom: NP&lt;class&gt;(domain) [&lt;ObjectSomeValuesFrom&gt; that &lt;ObjectProperty&gt; [a/an] NP&lt;class&gt;(range(s))]</p>
        <p>P4 - ObjectProperty with has is split into two parts: NP&lt;class&gt;(domain) &lt;ObjectProperty:has&gt; as &lt;ObjectProperty:rest&gt; [a/an] NP&lt;class&gt;(range)</p>
        <p>
Should the label of the object property already contain the concept label in the
range, the concept label is not reiterated in the NL definition, e.g. hasManager
pointing to Manager. The above list is not exhaustive and requires NLP
methods for its implementation, e.g. tokenization. The application of these patterns
to characteristic formation is exemplified in the next section.
        </p>
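        <p>Patterns P1, P2, and P4 can be sketched in Python as follows, assuming camel-case fragment identifiers. All function names and the small number-word table are hypothetical, the regular-expression tokenizer merely stands in for a full NLP tokenizer, and the rule for object properties that already contain the range label (e.g. hasManager) is not covered.</p>
        <preformat><![CDATA[
```python
import re
from typing import List, Optional, Sequence

NUMBER_WORDS = {1: "one", 2: "two", 3: "three"}  # illustrative, not exhaustive


def tokenize_identifier(identifier: str) -> List[str]:
    """Split a camel-case identifier such as 'hasCardType' into lower-case tokens."""
    pattern = r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+"
    return [token.lower() for token in re.findall(pattern, identifier)]


def verbalize_min_cardinality(number: int) -> str:
    """P2: 'at least <number>'."""
    return f"at least {NUMBER_WORDS.get(number, str(number))}"


def verbalize_union(class_ids: Sequence[str], quantifier: Optional[str] = None) -> str:
    """P1: an article or cardinality phrase, then the classes joined with 'or'."""
    phrases = [" ".join(tokenize_identifier(c)) for c in class_ids]
    if not phrases:
        raise ValueError("a union needs at least one class")
    if len(phrases) <= 2:
        listing = " or ".join(phrases)
    else:
        listing = ", ".join(phrases[:-1]) + ", or " + phrases[-1]
    if quantifier is None:
        # Default to an indefinite article chosen by a vowel-letter heuristic.
        quantifier = "an" if listing.startswith(tuple("aeiou")) else "a"
    return f"{quantifier} {listing}"


def verbalize_has_property(property_id: str, range_phrase: str) -> str:
    """P4: split 'has<Rest>' into 'has as <rest>' and append the range phrase."""
    tokens = tokenize_identifier(property_id)
    if len(tokens) < 2 or tokens[0] != "has":
        raise ValueError("P4 expects a property identifier of the form has<Rest>")
    rest = " ".join(tokens[1:])
    return f"has as {rest} {range_phrase}"
```
]]></preformat>
        <p>For instance, with the assumed identifiers hasCardType, CreditCard, and DebitCard, combining verbalize_has_property with verbalize_union yields has as card type a credit card or debit card, which corresponds to the first characteristic of Example 1.</p>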
        <p>In the next step, the existing rdfs:comment of the ontology class is linked to
the NP and, where applicable, to the verbalized content by means of a coordinating
conjunction and the identified relative pronoun, which, if not available in the
comment, can be obtained from Wiktionary.</p>
        <p>If no NL content is available, re-using existing structured Web resources, such
as DBpedia, has been considered. The tentative information extraction process
herein relies on string matching and an immediate subsumption to top
DBpedia ontology concepts (e.g. Organization, Resource). Reducing NL definitions to
DBpedia information might result in quality issues. For instance, circular
definitions are frequent on DBpedia, i.e., a term is defined by itself or by a second term
that refers back to the first term. Debtor, for example, is defined as "Debtor
owes a debt to someone ...". Applying the proposed pattern ensures the proper
context for the concept, i.e., the superordinate concept, while DBpedia information
provides useful additional details.</p>
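        <p>A simple check for the two kinds of circularity described above might look as follows. This is only a sketch under strong assumptions: terms are single words, matching is by exact lower-cased word form without lemmatization, and the function name is_circular is hypothetical.</p>
        <preformat><![CDATA[
```python
import re
from typing import Dict, Set


def _words(text: str) -> Set[str]:
    """Lower-cased alphabetic word forms occurring in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def is_circular(term: str, definitions: Dict[str, str]) -> bool:
    """Detect direct and one-step indirect circular definitions.

    Direct: the term occurs in its own definition.
    Indirect: the definition mentions a second defined term whose
    definition refers back to the first term.
    """
    key = term.lower()
    own_words = _words(definitions[term])
    if key in own_words:
        return True
    for other, text in definitions.items():
        if other != term and other.lower() in own_words and key in _words(text):
            return True
    return False
```
]]></preformat>
        <p>Candidate definitions flagged in this way could be routed to a domain expert instead of being added to the ontology automatically.</p>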
        <p>Due to its systematic nature, the described pattern enables a consistent
formation of NL definitions, which strongly enhances the human readability of
the ontologies it is applied to. The proposed pattern is illustrated for the English
language and requires minor adaptations for its realization in other NLs that are
syntactically similar to English, provided lexical resources are available.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Example Application</title>
      <p>An OWL ontology serves as the input to the intended system design, here
exemplified with Fadyart’s Finance Ontology in English. By means of the OWL API,
the ontology can be parsed, the subclass relations and object properties
identified, and an annotation property added. Starting from ⊤ (owl:Thing), the subsumption
hierarchy is traversed to the first concept not directly subsumed by it. If its
superordinate concept has no label, its fragment identifier is tokenized (using,
e.g., Stanford CoreNLP) and represents the NP&lt;superclass&gt; of Definition
1. In Example 2, the class ClientPortfolio is the subclass of AccountsPayable.</p>
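      <p>The traversal and the label fallback described above can be sketched in Python over a plain child-to-parent mapping, which here stands in for an ontology parsed with the OWL API. The function names, the single-parent assumption, and the regular-expression tokenizer are illustrative assumptions.</p>
      <preformat><![CDATA[
```python
import re
from collections import defaultdict, deque
from typing import Dict, List

TOP = "owl:Thing"  # the top concept


def split_identifier(identifier: str) -> str:
    """Turn 'ShortTermLiabilities' into 'short term liabilities'."""
    return re.sub(r"([a-z])([A-Z])", r"\1 \2", identifier).lower()


def classes_to_define(parents: Dict[str, str]) -> List[str]:
    """Walk the hierarchy breadth-first from TOP and collect every class
    that is not a direct subclass of TOP."""
    children = defaultdict(list)
    for child, parent in parents.items():
        children[parent].append(child)
    result = []
    queue = deque((child, 1) for child in children[TOP])
    while queue:
        cls, depth = queue.popleft()
        if depth > 1:
            result.append(cls)
        queue.extend((child, depth + 1) for child in children[cls])
    return result


def superclass_np(cls: str, parents: Dict[str, str], labels: Dict[str, str]) -> str:
    """Use the superclass label if present, else its tokenized fragment identifier."""
    parent = parents[cls]
    return labels.get(parent) or split_identifier(parent)
```
]]></preformat>
      <p>With ClientPortfolio mapped to AccountsPayable and the label Accounts payable available, superclass_np returns that label. For AccountsPayable itself, whose superclass ShortTermLiabilities carries no label in Example 2, the fallback returns short term liabilities.</p>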
      <p>To ensure the correct grammatical number and relative pronoun, the
superclass term is queried in Wiktionary, e.g. by means of the Java-based Wiktionary Library <sup>2</sup>. Here,
the query returns plural only and the relative pronoun that <sup>3</sup> for Accounts
Payable. Subsequently, tokenization and verbalization pattern P4 defined in the
previous section are applied to the object property of Example 2. Its range
consists of a union of three classes, which is verbalized using pattern P1. Finally, the
existing comment is added to the definition obtained so far. By means
of Wiktionary, Clients in the existing rdfs:comment is identified as a countable
noun, so its singular form can be combined with the obtained definition using a
coordinating conjunction and the relative pronoun identified above. The derived
definition can be added to the concept ClientPortfolio as rdfs:comment.
      </p>
      <p>Example 2: Class ClientPortfolio in Manchester Syntax</p>
      <p>Original Input in OWL</p>
      <preformat>Ontology: &lt;http://www.fadyart.com/Finance.owl&gt; ...

ObjectProperty: hasClientPortfolioBeneficialOwnerOfIncome
    SubPropertyOf: hasAccountDomain
    Domain: ClientPortfolio
    Range: PartyHolder or PartyLegalRepresentative or PartyUsufructuary

Class: AccountsPayable
    Annotations: rdfs:label "Accounts payable"@en
    SubClassOf: ShortTermLiabilities

Class: ClientPortfolio
    Annotations: rdfs:label "Client accounts"@en,
        rdfs:comment "The clients of the financial institution for who’s account the securities handling operations are performed."^^xsd:string
    SubClassOf: AccountsPayable</preformat>
      <p>Resulting Definition</p>
      <p>Client Account: Accounts payable that has as client
portfolio beneficial owner of income at
least one party holder, party legal
representative, or party usufructuary and
that is a client of the financial
institution for who’s account the securities
handling operations are performed.</p>
      <p>Where a class has several object properties or very large unions,
domain experts need to decide which verbalization most adequately defines the concept.
Additionally, comments are at times utilized for supplementary information
rather than NL definitions of concepts, which is why in some cases the comments
might not be re-used in the definition formation process.</p>
    </sec>
    <sec id="sec-4">
      <title>Related Work</title>
      <p>
        Glosses for ontology concepts reuse existing lexical resources, e.g. WordNet, to
provide ontology engineers with various linguistic descriptions to choose from for
a specific concept [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Approaches grounding existing ontologies in lexical and
linguistic descriptions (e.g. lemon <sup>4</sup>) either re-use glosses for NL descriptions or
derive meaning by pointing to the semantic object in the ontology.
      </p>
      <sec id="sec-4-1">
        <title>2 http://code.google.com/p/jwktl/ 3 Money that is owed ...</title>
        <p>
          Ontology
verbalization utilizes formalized knowledge in ontologies to derive NL descriptions.
For instance, the SWAT project <sup>5</sup> facilitates the understandability of verbalized
entailments by providing individual inference steps in the English language [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
Instead of providing an external meta-model or ontology engineering support,
the proposed pattern seeks to re-use existing resources and use verbalization
patterns in order to provide NL definitions for existing domain-specific ontology
classes. As a standard-based approach, it reflects established best practices and
accepted semiotic theories.
        </p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Discussion and Future Work</title>
      <p>This paper proposes an NL definition ODP on the basis of definition
formation methods from terminology science. Subsequent to defining the pattern, a
(semi-)automated design for obtaining NL definitions by means of ontology
verbalization, utilization of existing NL comments, and information extraction has
been exemplified in the financial domain. In terms of future work, the degree to
which the pattern can be generalized to other domains will be tested. As regards
information extraction, a profound disambiguation process will be considered.
Furthermore, the formalization of the pattern for submission to the ontology design pattern
repository is planned.</p>
      <sec id="sec-5-1">
        <title>4 http://lemon-model.net</title>
        <p>5 http://swatproject.org</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Presutti</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blomqvist</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daga</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gangemi</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Pattern-Based Ontology Design</article-title>
          . In Suarez-Figueroa,
          <string-name>
            <given-names>M.C.</given-names>
            ,
            <surname>Gomez-Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            ,
            <surname>Motta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            ,
            <surname>Gangemi</surname>
          </string-name>
          , A., eds.: Ontology Engineering in a Networked World. Volume
          <volume>12</volume>
          . Springer (
          <year>2012</year>
          )
          <fpage>35</fpage>-<lpage>64</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCrae</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sintek</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>LexInfo: A Declarative Model for the Lexicon-Ontology Interface</article-title>
          .
          <source>Web Semantics: Science, Services and Agents on the World Wide Web</source>
          <volume>9</volume>
          (
          <issue>1</issue>
          ) (
          <year>2011</year>
          )
          <fpage>29</fpage>-<lpage>51</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Damljanovic</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Agatonovic</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cunningham</surname>
          </string-name>
          , H.:
          <article-title>Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-Based Lookup through the User Interaction</article-title>
          . In Sure, Y.,
          <string-name>
            <surname>Domingue</surname>
          </string-name>
          , J., eds.:
          <source>The Semantic Web: Research and Applications</source>
          . Volume
          <volume>19</volume>
          . Springer (
          <year>2010</year>
          )
          <fpage>106</fpage>-<lpage>120</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kaljurand</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kuhn</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>A Multilingual Semantic Wiki Based on Attempto Controlled English and Grammatical Framework</article-title>
          . In Corcho, P.C.O.,
          <string-name>
            <surname>Hollink</surname>
            ,
            <given-names>V.P.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudolph</surname>
          </string-name>
          , S., eds.:
          <source>The Semantic Web: Semantics and Big Data</source>
          . Volume
          <volume>17</volume>
          . Springer (
          <year>2013</year>
          )
          <fpage>427</fpage>-<lpage>441</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Jarrar</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Position paper: Towards the Notion of Gloss, and the Adoption of Linguistic Resources in Formal Ontology Engineering</article-title>
          .
          <source>In: Proceedings of the 15th international conference on World Wide Web, ACM</source>
          (
          <year>2006</year>
          )
          <fpage>497</fpage>-<lpage>503</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>T.A.T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Power</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Piwek</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Williams</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Predicting the Understandability of OWL Inferences</article-title>
          . In Corcho, P.C.O.,
          <string-name>
            <surname>Hollink</surname>
            ,
            <given-names>V.P.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rudolph</surname>
          </string-name>
          , S., eds.:
          <source>The Semantic Web: Semantics and Big Data</source>
          . Volume
          <volume>17</volume>
          . Springer (
          <year>2013</year>
          )
          <fpage>109</fpage>-<lpage>123</lpage>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>