<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Subject Fields in Termbases - Their Design, Use and Representation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Kara Warburton</string-name>
          <email>karacw@illinois.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>1st International Conference on “Multilingual digital terminology today. Design, representation formats and management systems”</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Illinois at Urbana-Champaign</institution>
          ,
          <addr-line>707 S. Mathews Ave., Urbana, Illinois, 61801</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Subject fields play an essential role in terminological resources: they allow for the creation of semantically based subdivisions and act as a conceptual boundary for the principle of univocity. However, owing to the lack of guidelines and standards, their application in termbases risks being ad hoc, which reduces their effectiveness in achieving these goals. ISO/TC 37 has published a technical specification (TS) that aims to increase the rigour of subject-field use and the interoperability of the data. This paper describes some issues and challenges relating to subject fields in termbases and how the TS may resolve them.</p>
      </abstract>
      <kwd-group>
        <kwd>Terminology</kwd>
        <kwd>TBX</kwd>
        <kwd>subject fields</kwd>
        <kwd>domains</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Subject fields in Terminology</title>
      <p>
        The notion of subject fields is critical to terminology theory and practice. According to convention,
terms designate concepts that belong to a language for special purposes (LSP) (as opposed to language
for general purposes or LGP) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and an LSP is the language used by specialists in a subject field [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
For many scholars, adherence to a subject field is a requirement for a linguistic unit to be deemed a
term [
        <xref ref-type="bibr" rid="ref2 ref6">6, 2</xref>
        ]. Indeed, specifying the subject field that a term belongs to is often considered mandatory for
terminological description [
        <xref ref-type="bibr" rid="ref1 ref5 ref6 ref7">6, 1, 7, 5</xref>
        ].
      </p>
      <p>
        Univocity, a key principle in classical terminology theory, may also depend on subject fields.
According to this principle, a term should have only one meaning. We maintain, however, that univocity
is achievable only within the scope of a subject field, because "identical" lexical units (homonyms and
homographs) occur in different subject fields with different meanings: for example, "port" the strong
wine and "port" the computer connection. Consequently, univocity has been defined
with domain-specificity as its scope [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
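      <p>The effect of scoping univocity to a subject field can be illustrated with a minimal sketch. The entries, the subject-field names ("Oenology", "Computing", "Terminology") and the function name are hypothetical and serve only to mirror the "port" example; they are not taken from any actual termbase.</p>
      <preformat><![CDATA[
```python
from collections import defaultdict

# Hypothetical entries: (term, subject field, short gloss of the meaning).
entries = [
    ("port", "Oenology", "strong wine"),
    ("port", "Computing", "computer connection"),
    ("termbase", "Terminology", "database of terminological data"),
]


def univocity_violations(entries, scoped=True):
    """Return the keys that are associated with more than one meaning.

    With scoped=True the key is (term, subject field); otherwise the
    key is the bare term, i.e. univocity is checked across all fields.
    """
    meanings = defaultdict(set)
    for term, subject_field, meaning in entries:
        key = (term, subject_field) if scoped else term
        meanings[key].add(meaning)
    return sorted(key for key, found in meanings.items() if len(found) > 1)


unscoped = univocity_violations(entries, scoped=False)
scoped = univocity_violations(entries, scoped=True)
```
]]></preformat>
      <p>Under these assumptions, the unscoped check reports "port" as ambiguous, whereas the check scoped by subject field reports no violations.</p>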
      <p>
        Scholars have also noted that subject fields should be organized in a hierarchical structure that
includes sub-fields and even finer divisions [
        <xref ref-type="bibr" rid="ref1 ref2 ref5">1, 2, 5</xref>
        ]. Figure 1 provides an example of a three-level system
showing the top level, Education, followed by child levels, three of which are further divided into
subordinate values.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Challenges</title>
    </sec>
    <sec id="sec-4">
      <title>3.1.1. Lack of guidelines and of a universal subject-field classification system</title>
      <p>Guidelines, standards, and representation models for subject fields are lacking in the literature.
Given the importance of subject fields demonstrated above, this is surprising, if not troubling.
Consequently, the use of subject fields in today's termbases varies considerably. Some termbases use
none at all; others feature a flat list of values<sup>2</sup>. Each termbase that features subject fields employs a
unique set, different even from that of other termbases that cover the same or similar spheres of
knowledge. The lack of a universal subject-field classification system represents a major obstacle to the
interoperability of terminological databases.</p>
    </sec>
    <sec id="sec-5">
      <title>3.1.2. Difficulties when assigning subject field values to concepts</title>
      <p>Deciding which subject field a concept "belongs" to is another challenge. The choice is not always
obvious, and terminologists often rely purely on intuition. Under these conditions, subject-field
assignments will not be reliable, which raises questions as to the effectiveness of subject fields as a
classificatory mechanism.</p>
      <p>There is also the question of whether a concept can be assigned to more than one subject field. Here,
terminologists disagree; some say yes, others no. However, if a subject-field value sets a boundary
enabling the term to be univocal, then one would assume that the term is confined to that subject field.
This suggests that, if a terminologist feels inclined to select two subject fields, perhaps it is
their common "parent" that should be assigned instead. These are philosophical questions worthy of further
debate.</p>
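      <p>The idea of assigning the "parent" of two candidate subject fields can be sketched as a search for their nearest common ancestor in the hierarchy. The classification below is hypothetical: only the top-level value Education is mentioned in this paper, and the other values and the function names are invented for illustration.</p>
      <preformat><![CDATA[
```python
# Hypothetical classification: each value maps to its parent value
# (None marks a top-level subject field).
PARENT = {
    "Education": None,
    "Higher education": "Education",
    "Primary education": "Education",
    "University admissions": "Higher education",
    "Economics": None,
}


def ancestors(value):
    """Return the value itself followed by its ancestors, nearest first."""
    chain = []
    while value is not None:
        chain.append(value)
        value = PARENT[value]
    return chain


def common_parent(a, b):
    """Return the nearest value that subsumes both a and b, or None."""
    seen = set(ancestors(a))
    for candidate in ancestors(b):
        if candidate in seen:
            return candidate
    return None


suggestion = common_parent("University admissions", "Primary education")
no_overlap = common_parent("Primary education", "Economics")
```
]]></preformat>
      <p>In this sketch, a terminologist hesitating between "University admissions" and "Primary education" would be pointed to Education, while two values from unrelated top-level fields have no common parent to fall back on.</p>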
    </sec>
    <sec id="sec-6">
      <title>3.1.3. Lack of models for representing subject fields</title>
      <p>ISO Technical Committee 37, Sub-committee 3, has published a standard for representing
terminological resources in an XML markup format, ISO 30042: TermBase eXchange (TBX). TBX also
constitutes a model framework for designing a termbase. However, subject fields and their
representation are not addressed in any substantive manner. They are loosely modelled in plain text fields
(with, therefore, no control over permissible values), and there is no facility for establishing a taxonomic
structure. The standard merely stipulates that subject fields are to be represented in a &lt;descrip&gt; element
at the concept level.</p>
      <p><sup>2</sup> In the full version of this paper, to be submitted for publication, some examples will be provided.</p>
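      <p>As an illustration of a subject field held in a &lt;descrip&gt; element at the concept level, the following sketch parses a minimal TBX-style concept entry. It is not the example from the original paper: the fragment assumes the element names commonly used in TBX (conceptEntry, langSec, termSec, term) and a type value of "subjectField", and the entry id and the values are invented.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; the id and the values are made up.
ENTRY = """
<conceptEntry id="c1">
  <descrip type="subjectField">Computing</descrip>
  <langSec xml:lang="en">
    <termSec>
      <term>port</term>
    </termSec>
  </langSec>
</conceptEntry>
"""

entry = ET.fromstring(ENTRY)

# The subject field is a direct child of the concept entry, so it applies
# to the concept as a whole rather than to one language or one term.
subject_fields = [
    descrip.text
    for descrip in entry.findall("descrip")
    if descrip.get("type") == "subjectField"
]
terms = [term.text for term in entry.iter("term")]
```
]]></preformat>
      <p>Because the element content is plain text, nothing in this fragment prevents another entry from using "computers" or "IT" for the same field, which is the lack of control described above.</p>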
    </sec>
    <sec id="sec-7">
      <title>4. The response of ISO TC 37</title>
      <p>To address the TBX limitations, in 2021 the committee published a Technical Specification (TS)
that provides guidelines for subject fields as well as for concept relations (another important feature of
termbases for which guidelines are lacking): ISO/TS 24634 - TBX-compliant representation of concept
relations and subject fields. In the following paragraphs, we summarize the contents of this TS.</p>
    </sec>
    <sec id="sec-8">
      <title>4.1. Constraints</title>
      <p>The TS specifies the following constraints relating to subject fields. The aim is to increase
interoperability.</p>
      <list list-type="order">
        <list-item><p>The content of the subject-field data category shall be a picklist (closed list of values). These
values form the organization's subject-field classification system.</p></list-item>
        <list-item><p>Whenever possible, an existing public subject-field classification system should be adopted,
such as EuroVoc or Lenoch.</p></list-item>
        <list-item><p>The name and source of the subject-field classification must be declared in the TBX header.</p></list-item>
        <list-item><p>The full subject-field classification system should be described, either in the backmatter of the
TBX document instance or through an XML namespace. Within this description, the scope, or
meaning, of subject-field values should also be defined. This aims to facilitate a more reliable
assignment of subject-field values to concept entries.</p></list-item>
      </list>
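      <p>How a closed picklist with a declared name, a declared source and scope definitions might be enforced by a termbase tool can be sketched as follows. The class, its attribute names and the sample values are assumptions made for this illustration; they are not defined by the TS.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class Classification:
    """Hypothetical in-memory model of a subject-field classification."""

    name: str    # declared name of the classification system
    source: str  # declared source of the classification system
    # Permitted value -> scope definition; the keys form the picklist.
    scope_notes: dict = field(default_factory=dict)

    def is_permitted(self, value):
        return value in self.scope_notes

    def check(self, assignments):
        """Return the (entry id, value) pairs whose value is not permitted."""
        return [
            (entry_id, value)
            for entry_id, value in assignments
            if not self.is_permitted(value)
        ]


system = Classification(
    name="Example classification",
    source="hypothetical in-house list",
    scope_notes={
        "Computing": "Hardware, software and networking concepts.",
        "Oenology": "Concepts relating to wine and winemaking.",
    },
)

rejected = system.check(
    [("c1", "Computing"), ("c2", "computers"), ("c3", "Oenology")]
)
```
]]></preformat>
      <p>In this sketch the free-text value "computers" is rejected because it is not in the picklist, and the scope definitions give a terminologist something to consult before choosing between permitted values.</p>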
    </sec>
    <sec id="sec-9">
      <title>4.2. XML representation</title>
      <p>An XML model for representing subject-field classification systems is provided in the TS. The
model includes some markup adopted from the RDF-based SKOS.</p>
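      <p>The markup defined in the TS is not reproduced in this paper. Purely to indicate what SKOS-style description of a hierarchical classification can look like, the following sketch builds a small RDF/XML document using standard SKOS properties (prefLabel, broader, scopeNote). The URIs, labels, scope note and helper function are hypothetical, and the way the TS combines such markup with TBX may differ.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Use the conventional prefixes when serializing.
ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)


def add_concept(root, uri, label, broader=None, scope_note=None):
    """Append one subject-field value as a skos:Concept (hypothetical helper)."""
    node = ET.SubElement(root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": uri})
    ET.SubElement(node, f"{{{SKOS}}}prefLabel", {XML_LANG: "en"}).text = label
    if broader:
        # skos:broader points from a child value to its parent value.
        ET.SubElement(node, f"{{{SKOS}}}broader", {f"{{{RDF}}}resource": broader})
    if scope_note:
        ET.SubElement(node, f"{{{SKOS}}}scopeNote", {XML_LANG: "en"}).text = scope_note
    return node


root = ET.Element(f"{{{RDF}}}RDF")
add_concept(root, "#education", "Education")
add_concept(
    root,
    "#higher-education",
    "Higher education",
    broader="#education",
    scope_note="Invented scope note: post-secondary teaching and research.",
)

xml_text = ET.tostring(root, encoding="unicode")
broader_links = ET.fromstring(xml_text).findall(
    f"{{{SKOS}}}Concept/{{{SKOS}}}broader"
)
```
]]></preformat>
      <p>The broader links carry the taxonomic structure and the scope notes carry the meaning of each value, which are the two things the plain-text subject field in TBX cannot express.</p>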
    </sec>
    <sec id="sec-10">
      <title>5. Conclusion</title>
      <p>The ISO TS should help to increase the interoperability of termbases, but only if termbase
administrators adopt its provisions, and the uptake of ISO/TC 37 standards has been slow in the past.
Furthermore, full interoperability will not be achieved without a universal classification of subject
fields. Whether that is a realistic goal remains open to debate.</p>
    </sec>
    <sec id="sec-11">
      <title>6. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M. Teresa</given-names>
            <surname>Cabré</surname>
          </string-name>
          ,
          <source>Terminology - Theory, Methods, and Applications</source>
          , John Benjamins Publishing Co., Amsterdam,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Dubuc</surname>
          </string-name>
          ,
          <source>Manuel Pratique de Terminologie</source>
          , Linguatech, Montreal,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <collab>International Organization for Standardization</collab>
          ,
          <source>ISO 30042 - TermBase eXchange (TBX)</source>
          , Geneva,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <collab>International Organization for Standardization</collab>
          ,
          <source>ISO/TS 24634 - TBX-compliant representation of concept relations and subject fields</source>
          , Geneva,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Rey</surname>
          </string-name>
          ,
          <source>Essays on Terminology</source>
          , John Benjamins Publishing Co., Amsterdam
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G.</given-names>
            <surname>Rondeau</surname>
          </string-name>
          ,
          <source>Introduction à la terminologie</source>
          , Centre Educatif et Culturel Inc., Montreal,
          <year>1981</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Sager</surname>
          </string-name>
          ,
          <source>A Practical Course in Terminology Processing</source>
          , John Benjamins Publishing Co., Amsterdam,
          <year>1990</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>W.</given-names>
            <surname>Teubert</surname>
          </string-name>
          ,
          <article-title>Language as an economic factor: the importance of terminology</article-title>
          , in G. Barnbrook,
          <string-name>
            <given-names>P.</given-names>
            <surname>Danielsson</surname>
          </string-name>
          , M. Mahlberg (Eds.), Continuum, London,
          <year>2005</year>
          , pp.
          <fpage>96</fpage>
          -
          <lpage>106</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>