<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>How Terms Meet in Small-World Lexical Networks: The Case of Chemistry Terminology</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesca Ingrosso</string-name>
          <email>francesca.ingrosso@univ-lorraine.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alain Polguère</string-name>
          <email>alain.polguere@univ-lorraine.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>ATILF UMR 7118, CNRS-Université de Lorraine</institution>
          ,
          <addr-line>44 av. de la Libération, BP 30687, 54063 Nancy Cedex</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>SRSMC UMR 7565, CNRS-Université de Lorraine</institution>
          ,
          <addr-line>Boulevard des Aiguillettes, BP 70239, 54506 Vandoeuvre-lès-Nancy Cedex</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>167</fpage>
      <lpage>172</lpage>
      <abstract>
        <p>We present a new type of terminological model based on formal network structures called lexical systems. Those are nonhierarchical lexical graphs where the bulk of lexical relations is formally encoded by means of Meaning-Text lexical functions. This paper describes how this approach to lexical structuring can be applied to the modeling of terminologies, more specifically, to the French and English terminology of chemistry. The first section explains the importance of terminology in chemistry and introduces the aim of our project. Section 2 is a brief presentation of formal characteristics of lexical systems. Section 3 illustrates the type of terminological descriptions we are implementing with the specific case of the chemical term catalysis.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>Structuring the Lexicon of Chemistry</title>
      <sec id="sec-2-1">
        <title>Key Role of Terminology in Chemistry</title>
        <p>Terminology plays a key role in chemistry
research. For instance, chemical terms, by their
very morphological structure, are closely related
to the behavior and properties of the substances they
designate. As noted by R. Hoffmann and P.
Laszlo (1991), knowing the name of a
chemical compound, which strictly reflects the compound's
structure, gives the chemist “control” over the
molecule. Additionally, the terminology of
chemistry is extremely vast and fluctuating. The
importance of using proper terminology in chemistry
led to the creation, in 1919, of IUPAC, the
International Union of Pure and Applied
Chemistry. IUPAC elaborates rules for the
nomenclature of molecules, in order to avoid definitional
ambiguities and ensure harmonization of
terminological proposals when new molecules are
discovered. It has made available on-line for chemists
the so-called Gold Book
          <xref ref-type="bibr" rid="ref8">(McNaught and
Wilkinson, 1997)</xref>
          : an extensive dictionary-like
description of English chemistry terms.
        </p>
        <p>In spite of such efforts, the terminology of
chemistry is loosely normalized. There is also a
lack of multilingual perspective as most scientific
papers are written in English, which can lead to
serious problems in the context of the teaching of
this discipline in schools and universities.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Terminological Networking</title>
        <p>In the on-line IUPAC Gold Book, term
descriptions are eminently relational, as illustrated by the
entry for the nominal term bond below.<sup>2</sup></p>
        <p>(1) bond
There is a chemical bond between two atoms or
groups of atoms in the case that the forces acting
between them are such as to lead to the formation of
an aggregate with sufficient stability to make it
convenient for the chemist to consider it as an
independent ‘molecular species’.
See also: agostic, coordination, hydrogen bond,
multi-centre bond</p>
        <p>The definition (There is a chemical bond . . . )
in this description is doing its job of
establishing connections between bond and related terms
such as force, aggregate, etc. It is, however, an
unstructured and non-formalized model for such
connections. A greater applicative potential could
be achieved by explicitly encoding the linguistic
structure of a term definition. If such structure
mirrors the logical organization of concepts in the
corresponding scientific domain, the lexical
definition can be an efficient tool for the
understanding of scientific texts, for scientific writing and for
teaching chemistry.</p>
        <p><sup>2</sup> http://goldbook.iupac.org/B00697.html</p>
        <p>Additionally, as illustrated by terminological
pointers at the end of (1) – agostic, coordination,
hydrogen bond, multi-centre bond – the
mastering of a chemistry term and of the corresponding
notion depends on the ability to position this term
within the network of other terms that gravitate in
its semantic space.</p>
        <p>Polysemy is another acute problem in the
terminology of chemistry, one that is generally ignored
in existing resources. Polysemy manifests itself
in two ways.</p>
        <p>A) It can occur within the terminology itself,
when a single form is used to denote different
terminological notions – for instance, to catalyze as
‘[for a substance] to cause a certain type of
chemical reaction’ in (2) vs. ‘[for a chemist] to make this
reaction take place’ in (3):<sup>3</sup></p>
        <p>(2) These fiber catalysts can efficiently catalyze
the Knoevenagel condensation of
benzaldehyde and ethyl cyanoacetate in water (yields:
95-98%).</p>
        <p>(3) These Ta<sub>2</sub>O<sub>5</sub>-T samples were characterized
by TG / DTA, XPS, nitrogen adsorption, XRD,
and UV-Raman, and were employed to
catalyze the gas-phase dehydration of glycerol
(GL) to produce acrolein (AC) at around 315
degrees C.</p>
        <p>B) Polysemy can also spread over both chemistry
terminology (2)-(3) and general language (4).<sup>4</sup></p>
        <p>(4) Cities are always building new stadiums with
the justification that they’ll catalyze the local
economy.</p>
        <p>All these observations show that it is necessary to
organize the terminology of chemistry according
to rigorous theoretical and descriptive principles.</p>
        <p>
          The project we are presenting has a very
practical aim: the design and construction of a
terminological database in chemistry, for both the
English and French languages. It also has
theoretical implications as it explores a new approach
to the structuring of terminologies based on
nonhierarchical graph structures (see lexical systems,
section 2 below), where each term is an element
in a global lexical network in which it is
related to the rest of the domain terminology, as
well as to general language lexicon, by means of
Meaning-Text lexical functions
          <xref ref-type="bibr" rid="ref9">(Mel’čuk, 1996)</xref>
          .
Lexical functions have already proved to be an
efficient tool to model relations between terms
          <xref ref-type="bibr" rid="ref6">(L’Homme, 2002)</xref>
          . In our project, however, the
recourse to lexical functions is embedded within
a formal proposal for the graph structuring of
lexical knowledge – lexical systems – that we believe
is particularly suited to account for the
interaction between terminologies of different domains –
e.g., chemistry terms used in physics – as well as
between “purely” terminological units and units
that belong to the general language – e.g. water
as a type of molecule and water as a substance.
        </p>
        <p><sup>3</sup> Chemistry examples are borrowed from Web of Science
(http://webofscience.com/).</p>
        <p><sup>4</sup> New York Times, COCA corpus
(http://corpus.byu.edu/coca/).</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Terminologies as Lexical Systems</title>
      <p>
        The terminological models we are elaborating are
grafted on two general language lexical resources:
the English and French Lexical Networks
        <xref ref-type="bibr" rid="ref3 ref7">(Gader
et al., 2014; Lux-Pogodalla and Polguère, 2011)</xref>
        ,
respectively en-LN and fr-LN.
      </p>
      <p>
        The design of the en- and fr-LNs is based on
a new type of lexical model called lexical
system
        <xref ref-type="bibr" rid="ref10 ref3">(Polguère, 2014)</xref>
        . From a formal point of
view, a lexical system is a graph whose vertices
are lexical units of the lexicon under description
and whose edges are lexical relations of
essentially two types:
1. semantic relations – (chemical) bond is
linked to to bond, interaction, compound . . . ;
2. combinatorial relations – (chemical) bond
combines with covalent, ionic . . .
      </p>
      <p>Both types of relations are modeled by means
of lexical functions (section 1.2 above):
paradigmatic lexical functions in the first case and
syntagmatic lexical functions in the second case.
Though lexical functions provide the bulk of
graph structuring in lexical systems, other types
of relations are also implemented. For instance,
semantic embedding is implemented via links
weaved within lexical definitions: cf. the
definition of CATALYSIS I.1, section 3.3 below, that
formally links this term to two semantically
embedded terms: REACTION 1 and GIBBS ENERGY.</p>
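      <p>The graph structure described in this section can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class name LexicalSystem, the helper build_example and the three relation-kind constants are hypothetical and do not reflect the actual en-LN or fr-LN data model. The units and links are those quoted in the text; edge labels are left empty because the text does not name the individual lexical functions involved.</p>
      <preformat>
```python
from dataclasses import dataclass, field

# Hypothetical relation kinds; the names are illustrative only.
PARADIGMATIC = "paradigmatic"          # semantic relations
SYNTAGMATIC = "syntagmatic"            # combinatorial relations
EMBEDDING = "definitional_embedding"   # links woven within definitions


@dataclass
class LexicalSystem:
    """Directed multigraph: vertices are lexical units, edges are relations."""
    edges: list = field(default_factory=list)

    def link(self, source, target, kind, label=""):
        # Store one typed edge; label could hold a lexical function name.
        self.edges.append((source, target, kind, label))

    def units(self):
        # Every unit that occurs as source or target of some edge.
        return sorted({u for s, t, _, _ in self.edges for u in (s, t)})

    def outgoing(self, unit, kind=None):
        # Edges leaving `unit`, optionally restricted to one relation kind.
        return [
            (t, k, lab) for s, t, k, lab in self.edges
            if s == unit and (kind is None or k == kind)
        ]


def build_example():
    ls = LexicalSystem()
    # Semantic relations of (chemical) bond, as listed in the text.
    for target in ("to bond", "interaction", "compound"):
        ls.link("bond", target, PARADIGMATIC)
    # Combinatorial relations of (chemical) bond.
    for target in ("covalent", "ionic"):
        ls.link("bond", target, SYNTAGMATIC)
    # Definitional embedding of CATALYSIS I.1.
    for target in ("REACTION 1", "GIBBS ENERGY"):
        ls.link("CATALYSIS I.1", target, EMBEDDING)
    return ls
```
      </preformat>
      <p>Keeping the relation kind on the edge rather than in separate tables mirrors the idea that semantic and combinatorial connections live in one and the same graph; calling build_example().outgoing("bond", SYNTAGMATIC) would list only the collocational neighbours of the term.</p>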
      <p>
        Such graphs belong to the family of so-called
small-world networks
        <xref ref-type="bibr" rid="ref12">(Watts and Strogatz, 1998)</xref>
        and their topological properties allow for the
automatic identification of semantic spaces through
clustering. Figure 1 illustrates the semantic
space of BOND<sub>N</sub> I.2 – the chemistry sense of the
noun BOND<sub>N</sub> in the current version of the en-LN.<sup>5</sup>
      </p>
      <p><sup>5</sup> Graph visualizations are based on the Tmuse algorithm
(Chudy et al., 2013) and are generated with tools provided
by Kodex.Lab (http://kodexlab.com).</p>
      <p>
        Besides being computer-tractable structures,
lexical systems are equivalent to “virtual
dictionaries”, as all properties of lexical units are
encapsulated in graph vertices – lexical definitions,
grammatical information, citations from corpora
(i.e. contexts), etc. Thanks to a specially designed
lexical graph editor, it is possible to build a lexical
model and, thus, a terminology, by methodically
weaving lexical systems
        <xref ref-type="bibr" rid="ref10 ref3">(Polguère, 2014)</xref>
        . In the
specific case of the work presented here, we are
describing the terminology of chemistry by
weaving the terminological network of this discipline
directly on top of the general language en- and
fr-LNs. This will allow not only for the proper
connection of terminologies in both languages, but
also for the “interpretation” of chemical terms
relative to the non-specialized lexical stock, with
which these terms naturally interact in standard
research activity as well as in scientific texts.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Example: The Case of CATALYSIS I.1</title>
      <p>We now illustrate our approach with the en-LN
description of the noun CATALYSIS. It possesses
the same polysemic structuring as the
corresponding verb CATALYZE – see (2), (3) and (4),
section 1.2. Therefore, two chemistry senses have to
be distinguished within the nominal vocable:
• CATALYSIS I.1 [The catalysis occurred via the
formation of a chloromethylated triflate complex, and
electrophilic addition to an aromatic hydrocarbon.]
• CATALYSIS I.2 [Were they doing catalysis, and if so,
how did they recover the catalyst?]</p>
      <p>We will focus on the term CATALYSIS I.1,
which is the nominal counterpart of the basic
sense of the verb exemplified in (2).</p>
      <sec id="sec-4-1">
        <title>From Lexical Graph to Article-View</title>
        <p>
          When weaving lexical systems with the
tailor-made graph editor named Dicet
          <xref ref-type="bibr" rid="ref2">(Gader et al.,
2012)</xref>
          , lexicographers are provided with a textual
rendering of lexical information: the article-view
of the headword. We present the article-view of
CATALYSIS I.1 in Figure 2 below.
        </p>
        <p>It is essential to note that the article-view is
only the textual display of fundamentally
relational information encoded in the lexical network.
For instance, what appears as:</p>
        <p>S1 : spec catalyst I</p>
        <p>in Figure 2 is generated (i) from an S1 lexical
function link (typical name for the 1st actant of the
headword) holding between CATALYSIS I.1 and
CATALYST I and (ii) from a grammatical
characteristic link connecting this latter unit to the
linguistic usage note “spec”, which characterizes
CATALYST I as being a term.<sup>6</sup></p>
        <p><sup>6</sup> Even citations – cf. the [EX](ample) zone in Figure 2 –
are implemented as connections between individual citations
and the lexical units they contain.</p>
        <p>In order to truly apprehend the formal nature of
the lexical model in which terminologies are
embedded, it is therefore necessary to distance
oneself from textual article-views and focus on the
core structuring element of our model: the
multidimensional system of lexical function relations
that connects lexical units.<sup>7</sup></p>
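        <p>The relation between graph and article-view discussed above can be illustrated with a minimal sketch. It assumes a deliberately simplified in-memory encoding: the names lf_links, usage_notes and article_view_lines are hypothetical and are not part of Dicet. The sketch only shows how one line of text can be derived from two separate links, an S1 lexical function link and a usage-note link.</p>
        <preformat>
```python
# Hypothetical encoding of the two links described in the text.
lf_links = [
    # (source unit, lexical function, target unit)
    ("CATALYSIS I.1", "S1", "CATALYST I"),
]
usage_notes = {
    # unit: usage note attached by a grammatical characteristic link
    "CATALYST I": "spec",
}


def article_view_lines(headword):
    """Render each outgoing lexical function link as one line of text."""
    lines = []
    for source, function, target in lf_links:
        if source != headword:
            continue
        # Display the target name in lower case but keep its sense number.
        name, sense = target.rsplit(" ", 1)
        shown = name.lower() + " " + sense
        note = usage_notes.get(target)
        if note is not None:
            shown = note + " " + shown
        lines.append(function + " : " + shown)
    return lines
```
        </preformat>
        <p>Under these assumptions, article_view_lines("CATALYSIS I.1") yields the single line "S1 : spec catalyst I": the text is a derived view, while the stored information remains the pair of links.</p>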
      </sec>
      <sec id="sec-4-2">
        <title>Web of Lexical Function Relations</title>
        <p>The structurally most relevant information in
Figure 2 appears in the lexical function zone [LF].
It corresponds to the set of paradigmatic links
that originate from the CATALYSIS I.1 headword
and connect it to the rest of the lexical system. (At
present, no syntagmatic link has been encoded for
this specific term.) It is this information, together
with incoming lexical function links, that positions
the term in the global structure of the en-LN and
defines its semantic space.</p>
        <p>In our terminology, the semantic space of a
lexical unit such as CATALYSIS I.1 is much more than
just the subgraph constituted of all outgoing and
incoming lexical function links. It is the
topologically significant cluster of semantically-related
nodes that gravitate around CATALYSIS I.1, as
illustrated in Figure 3.</p>
        <p>This semantic space features not only lexical
units that are directly connected to CATALYSIS I.1
– e.g. CONTACT ACTION or CATALYST I –, but
also indirectly connected terms – e.g. GREEN
CHEMISTRY or CHEMICAL CHANGE – that
are in significant semantic proximity to this
headword.</p>
        <p><sup>7</sup> The distinction between article-view and lexical graph
perspectives on the en-LN bears some similarity with written
vs. graph information modes in Pram Nielsen (2013).</p>
        <p>
          At present, semantic space clustering is based
on Proxemy analysis
          <xref ref-type="bibr" rid="ref4">(Gaume, 2008)</xref>
          . It is
optimized by taking into consideration the semantic
weight of each individual lexical function. For
instance, a paradigmatic lexical function such as S1
possesses the maximal semantic weight “2” in the
en-LN model of lexical functions, while the Oper1
lexical function denoting support verb collocates
possesses the minimal semantic weight “0”.<sup>8</sup>
        </p>
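        <p>The effect of semantic weights can be sketched as follows. This is not the Proxemy implementation: it is a minimal illustration that assumes a random-walk style measure in which the probability of stepping along a link is proportional to the semantic weight of its lexical function. Only the two weights quoted above are used; the names SEMANTIC_WEIGHT, LINKS and transition_probabilities are hypothetical, and the Oper1 target is a made-up placeholder rather than an en-LN entry.</p>
        <preformat>
```python
# Semantic weights quoted in the text; other lexical functions are omitted.
SEMANTIC_WEIGHT = {"S1": 2, "Oper1": 0}

# Toy links: the S1 link is the one discussed for CATALYSIS I.1; the
# Oper1 target is an invented placeholder used only for illustration.
LINKS = [
    ("CATALYSIS I.1", "S1", "CATALYST I"),
    ("CATALYSIS I.1", "Oper1", "PLACEHOLDER VERB"),
]


def transition_probabilities(unit, links=LINKS):
    """One random-walk step from `unit`, biased by semantic weight."""
    weights = {}
    for source, function, target in links:
        if source == unit:
            weights[target] = weights.get(target, 0) + SEMANTIC_WEIGHT[function]
    total = sum(weights.values())
    if total == 0:
        return {}  # no semantically weighted way out of this unit
    return {target: w / total for target, w in weights.items()}
```
        </preformat>
        <p>With these toy data, a walk starting at CATALYSIS I.1 always moves to CATALYST I and never to the support verb placeholder, which is the intended consequence of giving purely combinatorial links a weight of 0.</p>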
        <p>We believe that lexical systems – small-world
graphs of lexical units connected by paradigmatic
and syntagmatic relations – are powerful
alternatives to more traditional taxonomic models for at
least two reasons: (i) they favor semantic space
connectivity over a more restricted class-based
organization and (ii) they unite both semantic and
combinatorial connections within the same
formal apparatus (lexical functions).</p>
      </sec>
      <sec id="sec-4-3">
        <title>Definitional Embedding of Notions</title>
        <p>To conclude, we wish to say a few words about
definitions and their role in the structuring of
terminological knowledge. As indicated in
section 2, lexical definitions also participate in the
weaving of lexical systems, though to a lesser
extent, by implementing semantic embedding. In
the specific case of CATALYSIS I.1, two lexical
units appear in the article-view as clickable
targets of definitional embedding links: REACTION 1
and the terminological idiom GIBBS ENERGY.
This is made possible by the formal encoding of the
definition: an XML-like tagging of the
definitional text, that we will not detail here for lack
of space.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>We are grateful to TIA 2015 anonymous
reviewers for their comments on a preliminary version
of our paper. This research is supported by the
PEPS Mirabelle 2015 program (CNRS and
Université de Lorraine) for interdisciplinary research.</p>
      <p><sup>8</sup> There is no significant semantic link, in most cases,
between a noun and its support verbs. The relationship
between flu and its support verb to have, between decision and
to make . . . is first and foremost combinatorial, not semantic.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          Yannick Chudy, Yann Desalle, Benoît Gaillard, Bruno
Gaume, Pierre Magistry and Emmanuel Navarro. 2013.
          <article-title>Tmuse: Lexical Network Exploration</article-title>
          .
          <source>In: The Companion Volume of the Proceedings of IJCNLP</source>
          <year>2013</year>
          :
          <article-title>System Demonstrations, Asian Federation of NLP</article-title>
          , Nagoya,
          <fpage>41</fpage>
          -
          <lpage>44</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Nabil</given-names>
            <surname>Gader</surname>
          </string-name>
          , Veronika Lux-Pogodalla and Alain Polguère.
          <year>2012</year>
          .
          <article-title>Hand-Crafting a Lexical Network With a Knowledge-Based Graph Editor</article-title>
          .
          <source>In: Proceedings of the Third Workshop on Cognitive Aspects of the Lexicon (CogALex III)</source>
          ,
          <source>The COLING 2012 Organizing Committee</source>
          , Mumbai,
          <fpage>109</fpage>
          -
          <lpage>125</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Nabil</given-names>
            <surname>Gader</surname>
          </string-name>
          , Sandrine Ollinger and Alain Polguère.
          <year>2014</year>
          .
          <article-title>One Lexicon, Two Structures: So What Gives?</article-title>
          In Heili Orav, Christiane Fellbaum and Piek Vossen (eds.):
          <source>Proceedings of the Seventh Global Wordnet Conference (GWC2014)</source>
          , Tartu,
          <fpage>163</fpage>
          -
          <lpage>171</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Bruno</given-names>
            <surname>Gaume</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Mapping the Forms of Meaning in Small Worlds</article-title>
          .
          <source>Journal of Intelligent Systems</source>
          ,
          <volume>23</volume>
          :
          <fpage>848</fpage>
          -
          <lpage>862</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Roald</given-names>
            <surname>Hoffmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>Pierre</given-names>
            <surname>Laszlo</surname>
          </string-name>
          .
          <year>1991</year>
          . Representation in Chemistry. Angewandte Chemie International Edition in English,
          <volume>30</volume>
          :
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Marie-Claude L'Homme</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Fonctions lexicales pour représenter les relations sémantiques entre termes</article-title>
          . Traitement automatique de la langue (T.A.L.),
          <volume>43</volume>
          (
          <issue>1</issue>
          ):
          <fpage>19</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Veronika</given-names>
            <surname>Lux-Pogodalla</surname>
          </string-name>
          and Alain Polguère.
          <year>2011</year>
          .
          <article-title>Construction of a French Lexical Network: Methodological Issues</article-title>
          .
          <source>In: Proceedings of the First International Workshop on Lexical Resources, WoLeR 2011. An ESSLLI 2011 Workshop</source>
          , Ljubljana,
          <fpage>54</fpage>
          -
          <lpage>61</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Alan D.</given-names>
            <surname>McNaught</surname>
          </string-name>
          and
          <string-name>
            <given-names>Andrew</given-names>
            <surname>Wilkinson</surname>
          </string-name>
          .
          <year>1997</year>
          . IUPAC.
          <article-title>Compendium of Chemical Terminology (the “Gold Book”), 2nd edition</article-title>
          ,
          <source>Blackwell Scientific Publications</source>
          , Oxford. On-line corrected version: http://goldbook.iupac.org (2006-) created by M. Nic, J. Jirat, B. Kosata; updates compiled by A. Jenkins.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Igor</given-names>
            <surname>Mel'čuk</surname>
          </string-name>
          .
          <year>1996</year>
          .
          <article-title>Lexical Functions: A Tool for the Description of Lexical Relations in the Lexicon</article-title>
          . In Leo Wanner (ed.):
          <article-title>Lexical Functions in Lexicography and Natural Language Processing</article-title>
          ,
          <source>Language Companion Series</source>
          <volume>31</volume>
          ,
          John Benjamins, Amsterdam/Philadelphia,
          <fpage>37</fpage>
          -
          <lpage>102</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          Alain Polguère.
          <year>2014</year>
          .
          <article-title>From Writing Dictionaries to Weaving Lexical Networks</article-title>
          .
          <source>International Journal of Lexicography</source>
          ,
          <volume>27</volume>
          (
          <issue>4</issue>
          ):
          <fpage>396</fpage>
          -
          <lpage>418</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Louise</given-names>
            <surname>Pram Nielsen</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>User experimentation with terminological ontologies</article-title>
          .
          <source>In: Proceedings of the 10th International Conference on Terminology and Artificial Intelligence TIA</source>
          <year>2013</year>
          , Paris,
          <fpage>185</fpage>
          -
          <lpage>188</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Duncan J.</given-names>
            <surname>Watts</surname>
          </string-name>
          and
          <string-name>
            <surname>Steven H. Strogatz</surname>
          </string-name>
          .
          <year>1998</year>
          .
          <article-title>Collective dynamics of 'small-world' networks</article-title>
          .
          <source>Nature</source>
          ,
          <volume>393</volume>
          :
          <fpage>440</fpage>
          -
          <lpage>442</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>