<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>OntoLex and Onomasiological Ordering: Supporting Topical Thesauri</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <aff>Leiden University, Leiden, The Netherlands</aff>
        </contrib>
      </contrib-group>
      <abstract>
        <p>The OntoLex vocabulary has been designed to capture lexicons and to add their lexicographical knowledge to ontologies in the Semantic Web. Although the specification of the vocabulary posits that OntoLex allows lexicons to be ordered onomasiologically, it does so for a very specific kind of onomasiological ordering only. As a consequence, the vocabulary is currently insufficient for capturing a large proportion of the existing topical thesauri. This paper demonstrates the current expressivity and this shortcoming of OntoLex through two case studies: The Historical Thesaurus of the Oxford English Dictionary and The Scots Thesaurus. In order for OntoLex to offer full support for topical thesauri and their ordering principles, this paper proposes the addition of a single property to the vocabulary: ontolex:isSenseIn. . . .</p>
      </abstract>
      <kwd-group>
        <kwd>OntoLex</kwd>
        <kwd>Lemon</kwd>
        <kwd>onomasiological ordering</kwd>
        <kwd>thesaurus</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The Lexicon Model for Ontologies vocabulary has been designed to capture
lexicons and to add their lexicographical knowledge to ontologies in the Semantic
Web [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The vocabulary has seen a number of updates, and was published as
a W3C vocabulary by the OntoLex community group in May 2016 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This
version, henceforth OntoLex, has since been picked up by a number of bodies,
including the Global WordNet Association, to represent and link existing lexical
resources on the Semantic Web [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
      </p>
      <p>
        The specification of OntoLex puts forward a manner in which "lexicons can be
ordered onomasiologically, that is by meanings rather than by lemmas" [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. For
publishers of topical thesauri, this is good news indeed. Such support is essential
for these lexicographical works, which order their words by meaning instead of
from A to Z as is common in typical dictionaries. Yet the OntoLex vocabulary
supports only a very specific kind of onomasiological ordering. As a consequence,
the vocabulary is currently insufficient for capturing the knowledge from a large
proportion of the existing topical thesauri. The current paper demonstrates this
shortcoming of OntoLex and proposes a way forward for the vocabulary.
      </p>
    </sec>
    <sec id="sec-2">
      <title>Methodology</title>
      <p>
        To provide insight into the current support of OntoLex for the
onomasiological ordering of topical thesauri, this paper presents two case
studies. The first is based on the Historical Thesaurus of the Oxford English
Dictionary [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]; the second on The Scots Thesaurus [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Both lexicographical
works employ an onomasiological ordering for their lexicon. The
first-mentioned thesaurus is considered to be distinctive and contains sets
of synonyms. The second is not distinctive but cumulative and refrains from
indicating synonymy [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        This paper expresses samples from both thesauri in the OntoLex vocabulary.
The manner in which OntoLex is applied is in line with the specification of the
vocabulary [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and the approach outlined by the Global WordNet Association
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This approach has been adopted by several projects, amongst which the
Open Dutch Wordnet [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Namespaces relevant for this paper are provided in
Listing 1. The RDF snippets in subsequent listings are specified in the Turtle
RDF syntax [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Sample data from the case studies correspond to resources
between angle brackets in the RDF snippets (that is to say, their namespace
is left unspecified for the present purpose).
      </p>
      <p>Listing 1. Namespaces
@prefix ontolex: &lt;http://www.w3.org/ns/lemon/ontolex#&gt; .
@prefix owl: &lt;http://www.w3.org/2002/07/owl#&gt; .
@prefix rdfs: &lt;http://www.w3.org/2000/01/rdf-schema#&gt; .
@prefix skos: &lt;http://www.w3.org/2004/02/skos/core#&gt; .
@prefix wn: &lt;http://wordnet-rdf.princeton.edu/ontology#&gt; .</p>
      <p>Case study: Historical Thesaurus of the OED</p>
      <p>
        The first case study presented here is that of the Historical Thesaurus of the
Oxford English Dictionary (HTOED). HTOED captures the English lexis that
has existed throughout its 1300-year history, from Old English up to Modern
English. This topical thesaurus groups together lexical items that are considered
near-synonymous and provides insight into their use in time and place. HTOED
was first published in print in 2009 [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and in the following year also electronically
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Fig. 1. Example HTOED content. Categories shown: Society, Communication, Authority, Lack of subjection, Permission, Freedom/liberty. Synonyms listed under Freedom/liberty:
freedom, n. (in sense 3)
† freeship, n. (in sense 2)
† franchise, n. (in sense 1a)
liberty, n. (in sense 1b of homonym 1)
...</p>
      <p>
        Expressing categories of the topical system of HTOED in OntoLex is
relatively straightforward. Each HTOED category corresponds with a lexical
concept in OntoLex. The latter is defined as a "mental abstraction, concept or
unit of thought that can be lexicalized by a given collection of senses" [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This
definition appears highly applicable to categories from topical thesauri. As
lexical concepts are asserted to be specializations of SKOS concepts, it is
possible to capture the hierarchy between categories using the
broader/narrower relations from SKOS [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Listing 2 contains the RDF for
expressing one of the HTOED categories in OntoLex, "Freedom/liberty", and
the relation to its superordinate category "Lack of subjection".
      </p>
      <p>Listing 2. HTOED category "Freedom/liberty" expressed in OntoLex
&lt;category-FreedomLiberty&gt; a ontolex:LexicalConcept ;
    skos:prefLabel "Freedom/liberty"@en ;
    skos:broader &lt;category-LackOfSubjection&gt; .</p>
      <p>
        The OntoLex vocabulary also contains terminology to express lexical senses
and the lexical entries to which they belong. In order to state that a given
lexical sense from HTOED belongs to one of its categories, the property
ontolex:isLexicalizedSenseOf can be used. This property relates a lexical
sense to a lexical concept, stating that it "lexicalizes" that concept. According
to the section on Lexical Nets in the OntoLex specification, lexical senses that
lexicalize the same concept are considered synonymous [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In other words, the
relation of synonymy is not explicitly asserted in OntoLex, but can be inferred
from the use of the ontolex:isLexicalizedSenseOf property. The resulting
RDF for the sense of freedom from the HTOED sample and its relation to the
"Freedom/liberty" category is provided in Listing 3.
      </p>
      <p>Listing 3. HTOED sense of freedom expressed in OntoLex
&lt;sense-freedom-n-3&gt; a ontolex:LexicalSense ;
    skos:prefLabel "freedom n. (sense 3)"@en ;
    ontolex:isSenseOf &lt;entry-freedom-n&gt; ;
    ontolex:isLexicalizedSenseOf &lt;category-FreedomLiberty&gt; .
&lt;entry-freedom-n&gt; a ontolex:LexicalEntry ;
    skos:prefLabel "freedom, n."@en ;
    wn:partOfSpeech wn:noun .</p>
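      <p>The inference of synonymy from shared lexicalization can be illustrated with a minimal sketch in Python. It is not part of OntoLex or of any RDF library: triples are modelled as plain tuples, the function name synonym_sets is hypothetical, and the second sense identifier (sense-liberty-n1-1b) is an assumed addition to the sample used purely for illustration.</p>
      <p>
```python
from collections import defaultdict

# Hypothetical in-memory triples (subject, predicate, object). The first and
# third mirror Listing 3; the sense of "liberty" is an assumed extra example.
TRIPLES = [
    ("sense-freedom-n-3", "ontolex:isLexicalizedSenseOf", "category-FreedomLiberty"),
    ("sense-liberty-n1-1b", "ontolex:isLexicalizedSenseOf", "category-FreedomLiberty"),
    ("sense-freedom-n-3", "ontolex:isSenseOf", "entry-freedom-n"),
]


def synonym_sets(triples):
    """Group senses by the lexical concept they lexicalize."""
    by_concept = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "ontolex:isLexicalizedSenseOf":
            by_concept[obj].add(subject)
    # Only concepts lexicalized by two or more senses yield synonyms.
    return {c: senses for c, senses in by_concept.items() if len(senses) >= 2}
```
      </p>
      <p>Under these assumptions, calling synonym_sets(TRIPLES) groups the two senses under category-FreedomLiberty, which is the synonymy that the specification leaves implicit.</p>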
      <p>As shown, capturing the onomasiological ordering of the HTOED lexicon
presents no issues with the OntoLex vocabulary. The vocabulary enables one to
express categories and their hierarchy, lexical senses and their relation to a lexical
entry, and the relation between the senses from HTOED and the categories to
which they belong.</p>
      <p>Case study: The Scots Thesaurus</p>
      <p>
        The second case study in this paper concerns The Scots Thesaurus (ScT) [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. ScT
captures the Lowland Scots lexis available throughout history, from its
twelfth-century beginnings to the present. This thesaurus, published in 1990, categorizes
its lexical items but does not indicate synonymy. Figure 2 depicts the sample
taken from ScT, encompassing five categories and four lexical senses.
      </p>
      <p>Fig. 2. Example ScT content. Categories shown: Farming, Farmers, Crops, Ploughing, Sowing. Senses listed under Sowing:
† blander (in sense 'disperse scantily')
happer (in sense 'a basket or container')
heuch (in sense 'earth up plants in drills')
miss (in sense 'fail to germinate or grow')
...</p>
      <p>Expressing categories from ScT is possible in a manner identical to that
used for HTOED. The result for the "Sowing" category from ScT, including its
relation to the superordinate category "Crops", is provided in Listing 4.</p>
      <p>Listing 4. ScT category "Sowing" expressed in OntoLex
&lt;category-Sowing&gt; a ontolex:LexicalConcept ;
    skos:prefLabel "Sowing"@en ;
    skos:broader &lt;category-Crops&gt; .</p>
      <p>As for the lexical senses from ScT, these too can be expressed in OntoLex
in a manner comparable to that used for HTOED. There is, however, a notable
difference. The property ontolex:isLexicalizedSenseOf is unsuitable for
relating the senses of ScT to the categories to which they belong. The lexical
senses in ScT are not necessarily lexicalizations of the category in question.
Moreover, senses that belong to the same category are not necessarily
considered synonymous. In fact, they rarely are. Cases in point are the senses
of happer and miss from the sample. Both of these senses are members of the
category "Sowing", and indeed belong to that semantic domain, but can hardly
be said to be synonymous or even to lexicalize the category.</p>
      <p>What is missing, then, from the OntoLex vocabulary is terminology to
express a looser manner of onomasiological ordering with categories than
ontolex:isLexicalizedSenseOf does. The RDF snippet in Listing 5 contains
the desired situation, where a tentative property isSenseIn is coined (see
the final line of the sense description) to express the relation between the
sense of blander and the category to which it belongs.</p>
      <p>Listing 5. ScT sense of blander expressed in OntoLex
&lt;sense-blander-v-disperseScantily&gt; a ontolex:LexicalSense ;
    skos:prefLabel "blander"@sco ;
    skos:definition "disperse scantily"@en ;
    ontolex:isSenseOf &lt;entry-blander-v&gt; ;
    :isSenseIn &lt;category-Sowing&gt; .
&lt;entry-blander-v&gt; a ontolex:LexicalEntry ;
    skos:prefLabel "blander, v."@sco ;
    wn:partOfSpeech wn:verb .</p>
      <p>In short, OntoLex itself does not yet provide terminology to onomasiologically
order the lexicographical content of ScT, or of other thesauri like it.</p>
    </sec>
    <sec id="sec-3">
      <title>Discussion</title>
      <p>The two case studies have shown that OntoLex is not yet expressive enough to
indicate the relation between senses and categories for all topical thesauri. In fact,
the lack of a property like the tentative isSenseIn does not just affect conveying
content from ScT and the great many existing cumulative thesauri like it. It also
affects expressing these very relations found in thesauri such as HTOED. After
all, senses in HTOED are not just lexicalizations of a category, they are also
members of a number of categories. To illustrate, the assertion that the HTOED
sense of freedom is a lexicalization of the category "Freedom/liberty" entails that
this sense is a member of not just that category but also of its superordinate
categories (see Listing 6).</p>
      <p>Listing 6. HTOED sense of freedom and its relation to the categories of HTOED
&lt;sense-freedom-n-3&gt; a ontolex:LexicalSense ;
    ontolex:isLexicalizedSenseOf &lt;category-FreedomLiberty&gt; ;
    :isSenseIn &lt;category-FreedomLiberty&gt; ,
        &lt;category-LackOfSubjection&gt; ,
        &lt;category-Authority&gt; ,
        &lt;category-Society&gt; .</p>
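      <p>The memberships in Listing 6 can be derived mechanically by following the category hierarchy upwards. The Python sketch below illustrates this under stated assumptions: only the link from "Freedom/liberty" to "Lack of subjection" is given in Listing 2, the remaining broader links are assumed from the order of categories in Listing 6, and the function name categories_of is hypothetical.</p>
      <p>
```python
# Assumed skos:broader links (child category to parent category); plain
# strings stand in for the resources between angle brackets in the listings.
BROADER = {
    "category-FreedomLiberty": "category-LackOfSubjection",
    "category-LackOfSubjection": "category-Authority",
    "category-Authority": "category-Society",
}


def categories_of(lexicalized_concept, broader):
    """Return the concept itself followed by all its superordinate categories."""
    chain = []
    seen = set()
    current = lexicalized_concept
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)  # guards against accidental cycles in the data
        current = broader.get(current)
    return chain
```
      </p>
      <p>For the sense of freedom, categories_of("category-FreedomLiberty", BROADER) yields the four categories that Listing 6 relates to the sense through the tentative isSenseIn property.</p>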
      <p>To truly express how senses are categorized according to topical
systems in thesauri, then, additional terminology is required beyond what
OntoLex currently offers. Properties from other vocabularies that might fill the
gap, such as the subject property from Dublin Core Terms [11], tend to be
too generic to allow further knowledge to be inferred from the topical systems of
thesauri. Moreover, the relation between such properties and
ontolex:isLexicalizedSenseOf is not evident. As such, the required
terminology is best captured in an update of the OntoLex vocabulary itself.
The small addition of a single property such as isSenseIn (see Listing 7),
together with an assertion of its connection to the existing OntoLex property
(see Listing 8), would enable onomasiological ordering of lexicons in topical thesauri
of all varieties, distinctive or cumulative, and regardless of whether synonymy
is indicated between senses.</p>
      <p>Listing 7. Suggested OntoLex property isSenseIn
ontolex:isSenseIn a owl:ObjectProperty ;
    rdfs:label "is sense in"@en ;
    rdfs:comment "This property relates a lexical sense to a concept that captures its meaning to some extent (that is, partially or even fully)."@en ;
    rdfs:domain ontolex:LexicalSense ;
    rdfs:range ontolex:LexicalConcept .</p>
      <p>Listing 8. Connection between existing OntoLex property and the suggested one
ontolex:isLexicalizedSenseOf
    rdfs:subPropertyOf ontolex:isSenseIn .</p>
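      <p>The practical effect of the sub-property assertion in Listing 8 is that every existing ontolex:isLexicalizedSenseOf statement also counts as an isSenseIn statement. The Python sketch below applies the standard RDFS sub-property entailment by hand to tuples. It is an illustration only: ontolex:isSenseIn is the property proposed in this paper rather than part of OntoLex, and the function name infer_super_properties is hypothetical.</p>
      <p>
```python
# The sub-property assertion from Listing 8, as a child-to-parent mapping.
SUB_PROPERTY_OF = {"ontolex:isLexicalizedSenseOf": "ontolex:isSenseIn"}


def infer_super_properties(triples, sub_property_of):
    """Add (s, q, o) for every (s, p, o) where p is a sub-property of q."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        seen = {predicate}
        parent = sub_property_of.get(predicate)
        # Follow chains of sub-properties, stopping if a cycle is detected.
        while parent is not None and parent not in seen:
            inferred.add((subject, parent, obj))
            seen.add(parent)
            parent = sub_property_of.get(parent)
    return inferred
```
      </p>
      <p>Applied to the single lexicalization triple from Listing 3, the function returns that triple plus the corresponding isSenseIn triple, so data already published with OntoLex would not need to be changed to benefit from the suggested property.</p>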
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>This paper has shown, by means of two case studies, to what extent the
OntoLex vocabulary currently supports relating lexical senses to the concepts
that facilitate an onomasiological ordering. Such an ordering is (by their very
definition) used in lexicographical works known as topical thesauri. As it
stands, the OntoLex vocabulary offers some support for those thesauri
considered to be distinctive and that capture synonymy. Such thesauri ensure
that lexical senses displayed at a certain category do not just belong to that
category, but also express (or lexicalize) that category. Those thesauri that do
not have that same level of specificity, but merely use their categories to
organize lexical senses into semantic domains, are not yet supported by the
terminology in OntoLex.</p>
      <p>The small addition of a single property, as suggested in this paper, would
have a big impact on the expressivity of OntoLex. The onomasiological ordering
of both distinctive and cumulative thesauri, regardless of whether these thesauri
indicate synonymy, could then properly be conveyed on the Semantic Web. As
a result, the variety of lexicographical resources that sit comfortably in OntoLex
would not be limited to dictionaries and lexical nets, as is presently the case, but
would also include thesauri. Increased support in OntoLex for onomasiological
ordering, then, would allow all these resources to truly shine on the Web. In short,
ordering by meaning through the new ontolex:isSenseIn is both meaningful
and sensible.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>McCrae</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aguado-de Cea</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Declerck</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gómez-Pérez</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gracia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hollink</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montiel-Ponsoda</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spohr</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wunner</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Interchanging lexical resources on the Semantic Web</article-title>
          .
          <source>Language Resources and Evaluation</source>
          <volume>46</volume>
          (
          <issue>4</issue>
          ),
          <fpage>701</fpage>
          –
          <lpage>719</lpage>
          (
          <year>2012</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2. Lexicon Model for Ontologies:
          <source>Community report, 10 May 2016</source>
          (
          <year>2016</year>
          ). URL http://www.w3.org/2016/05/ontolex/
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3. Global WordNet Association:
          <article-title>Global Wordnet formats</article-title>
          . URL http://globalwordnet.github.io/schemas/
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Kay</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Roberts</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Samuels</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wotherspoon</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          (eds.):
          <article-title>Historical thesaurus of the Oxford English Dictionary: with additional material from "A thesaurus of Old English"</article-title>
          . Oxford University Press, Oxford (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Macleod</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cairns</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Macafee</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Martin</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (eds.):
          <article-title>The Scots thesaurus</article-title>
          . Aberdeen University Press, Aberdeen (
          <year>1990</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Kay</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alexander</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Diachronic and synchronic thesauruses</article-title>
          . In: P. Durkin (ed.)
          <source>The Oxford handbook of lexicography</source>
          , pp.
          <fpage>367</fpage>
          –
          <lpage>380</lpage>
          . Oxford University Press, Oxford (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Postma</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Miltenburg</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Segers</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schoen</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vossen</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Open Dutch WordNet</article-title>
          . In: Proceedings of the Eighth Global Wordnet Conference. Bucharest, Romania (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Beckett</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Berners-Lee</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Prud'hommeaux</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carothers</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <source>RDF 1.1 Turtle: W3C recommendation 25 February 2014</source>
          (
          <year>2014</year>
          ). URL http://www.w3.org/TR/turtle/
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <source>Historical thesaurus of the Oxford English Dictionary</source>
          (
          <year>2010</year>
          ). URL http://oed.com/thesaurus
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <article-title>SKOS Simple Knowledge Organization System reference</article-title>
          :
          <source>W3C recommendation 18 August 2009</source>
          (
          <year>2009</year>
          ). URL http://www.w3.org/TR/skos-reference/
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11. DCMI metadata terms (
          <year>2012</year>
          ). URL http://purl.org/dc/terms/
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>