=Paper= {{Paper |id=Vol-3161/short1 |storemode=property |title=Representing Multilingual Terminologies with OntoLex-Lemon (short paper) |pdfUrl=https://ceur-ws.org/Vol-3161/short1.pdf |volume=Vol-3161 |authors=Patricia Martin Chozas,Thierry Declerck |dblpUrl=https://dblp.org/rec/conf/mdtt/Martin-ChozasD22 }} ==Representing Multilingual Terminologies with OntoLex-Lemon (short paper)== https://ceur-ws.org/Vol-3161/short1.pdf
Representing Multilingual Terminologies with
OntoLex-Lemon
Patricia Martín-Chozas1 , Thierry Declerck2
1
  Ontology Engineering Group, Universidad Politécnica de Madrid, Avda. Montepríncipe, s/n, Boadilla del Monte, 28660,
Spain
2
  German Research Center for Artificial Intelligence GmbH (DFKI), Multilinguality and Language Technology Lab,
Saarland Informatics Campus D3 2, Stuhlsatzenhausweg 3, 66123 Saarbrücken, Germany


                                         Abstract
                                         This paper is framed within a project to make multilingual terminologies available in a native graph
                                         representation format. We are exploring the use of the OntoLex-Lemon model, suggesting also some
                                         extensions, for achieving a declarative encoding of relations between multilingual expressions contained
                                         in terminologies. This model is not only used for encoding terms but also for their associated definitions,
                                         contexts and notes. With this effort, we aim at supporting the publication of multilingual terminologies
                                         in the Linked Open Data cloud.

                                         Keywords
                                         Terminologies, Multilingualism, Formal Representation, OntoLex-Lemon




1. Introduction
In the context of work dealing with the conversion of multilingual terminologies onto an RDF1
model, we came into modelling decisions concerning also additional language data included in
such resources. While the original purpose of the porting exercise is not to change anything
at the level of the content of the considered terminologies, their modelling in a graph-based
representation offers possibilities for their interlinking and merging with other resources, being
in the realm of terminologies or of other types of data, like for example detailed lexicographic
resources. Thus, the focus of our work is the possible improved formal representation of the
language data used in multilingual terminologies. We discuss in this short paper few decisions
points concerning our modelling strategy, also comparing our work with a directly related
former approach.




1st International Conference on ”Multilingual digital terminology today. Design, representation formats and
management systems”, June 16 – 17, Padova, Italy
Envelope-Open pmchozas@fi.upm.es (P. Martín-Chozas); declerck@dfki.de (T. Declerck)
GLOBE https://github.com/pmchozas/ (P. Martín-Chozas); https://www.dfki.de/~declerck/ (T. Declerck)
Orcid 000-0002-8922-7521 (P. Martín-Chozas); 0000-0002-9450-6648 (T. Declerck)
                                       © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
    CEUR
    Workshop
    Proceedings          CEUR Workshop Proceedings (CEUR-WS.org)
                  http://ceur-ws.org
                  ISSN 1613-0073




                  1
                      RDF stands for “Resource Description Framework”. See https://www.w3.org/TR/rdf-primer/for more details.
2. The Data Basis: Two Terminological Resources
Currently, we consider two terminological resources as the input for our transformation work:
the multilingual terminology of the Deutsche Bahn (German Railways), which is encoded
within the TBX2 standard and can be accessed online3 ; and IATE (Interactive Terminology for
Europe)4 , one of the most representative terminological database in Europe. The consideration
of the latter was motivated by a previous exercise that focused on the conversion of the data
contained in IATE, structured in TBX, into RDF. This effort is a great starting point to compare
our approach.


3. The TBX2RDF Guidelines
The past LIDER project5 was already concerned with mapping TBX to RDF, with the goal of
transforming and publishing terminologies as Linked Data [2]. LIDER developed guidelines
for this task6 in which TBX elements are converted into OWL7 and associated with other RDF
vocabularies, while the basic vocabularies chosen as the backbone of the conversion were SKOS8
and the lemon model [3], a predecessor of the OntoLex-Lemon framework [4] we are using. [5]
describe the TBX2RDF approach9 and [6] presents recent developments related to this initiative,
relying on a virtualization approach that is making use of containerization technologies.
   The LIDER TBX2RDF approach is representing the TBX terminological concepts as skos:Con-
cept and the TIG/NTIG elements of TBX as ontolex:LexicalEntry, and most of the other TBX
elements are straightforwardly mapped onto RDF, meaning that they are encoding as URIs for
representing a resource that can be associated with RDF predicates and objects. We note also
that TXB2RDF is not representing the TBX langSet data as such, but instead is creating language
specific lexicons in which all the data included in the original langSet element are encoded.


4. Our Approach
We make use of the most recent version of OntoLex-Lemon,10 which is effectively integrating
the SKOS vocabulary for representing conceptual units and their associated language data. This
was not the case with its former version, lemon, which was used in the LIDER project. We can
now use properties defined in OntoLex-Lemon for directly linking the conceptually oriented
    2
       TBX stands for ”TermBase eXchange”. See https://www.tbxinfo.net/ [accessed 2022-02-14], or [1] for more
details.
     3
       www.deutschebahn.com/dblanguageportal [accessed 2021-10-02]
     4
       See https://iate.europa.eu/ [accessed 2022-02-14]
     5
       http://lider-project.eu/lider-project.eu/index.html [accessed 2021-10-02]
     6
       The latest version of those guidelines is available at https://github.com/bpmlod/report/blob/gh-pages/
multilingual-terminologies/index.html [accessed 2022-02-14]
     7
       OWL stands for “Web Ontology Language”. See https://www.w3.org/TR/owl2-primer/ [accessed 2022-14-02]
     8
       SKOS stands for „Simple Knowledge Organization System”. See also https://www.w3.org/2009/08/
skos-reference/skos.html [last consulted: 2022-02-14]
     9
       The corresponding W3C Community Group Report is avaialable at https://www.w3.org/2015/09/
bpmlod-reports/multilingual-terminologies/[accessed2022-02-14]
    10
       See https://www.w3.org/2016/05/ontolex/ [accessed 2022-02-14] for technical details.
terms to lexical entries, while the LIDER TBX2RDF converter was using a custom property
for this purpose. We introduce a skos:ConceptScheme for encoding the whole conceptual
organisation of the original terminology, and within this scheme we allow for the definition
of specific domain subsets, a feature not supported in TBX.11 OntoLex-Lemon is foreseeing as
a subclass of skos:Concept the class ontolex:LexicalConcept for linking lexical entries to the
conceptual part described in the SKOS vocabulary. We encode all the terms as instances of this
class, and no longer as instances of the class ontolex:LexicalEntry, as it was implemented in
TBX2RDF. Another, and more significant, departure from the LIDER TBX2RDF model is the fact
that we model definitions and contexts as instances of classes, and no longer as literal values.
In doing so, we can describe specific relations between the definitions within one language
or across different languages. In the latter case, we can specify if the definitions given for
terms in two different languages are translations of each other, multilingual equivalents or just
monolingual definitions included in the multilingual terminology. Suggested additions to the
OntoLex-Lemon model are marked with the prefix “termlex”.
   Figure 1 shows how an IATE term entry is currently represented following our approach, while
also representing the synonymy of two Spanish terms. Figure 2 displays the relations between
the terms and their definitions, which as instances of a class, can link to further information,
like the provenance or the definitions for the same original term entry in another language. The
English equivalents for the Spanish terms “surco ferroviario” and “franja ferroviaria” (displayed
in Figures 1 and 2) – “train path”, “train slot” –, as well as the English definitions and their
context of use are linked to the Spanish terms and entries via the properties defined in the
Vartrans module of OntoLex-Lemon,12 supporting a declarative description of the different types
of relations that can exist between those different types of language data (terms, definitions and
contexts of use).


5. Conclusions and Future Work
We described ongoing work in porting the multilingual terminology resources onto a Linked
Data compliant representation language. This work led us to the question if it would not be
suitable to extend the modelling of TBX terminologies in RDF already proposed by the LIDER
TBX2RDF converter. One aspect consists in considering definitions, contexts and notes as
full ontological elements that can thus be put explicitly in relation to each other. This way,
definitions in different languages can be declaratively interlinked and marked as translations,
equivalents or as not having any of those relations.
   As an outcome of our work, we are currently proposing an extension module for OntoLex-
Lemon,13 that deals with the representation of terminological data that is not covered in the
core module, as the main motivation of the development of OntoLex-Lemon vocabulary was to
represent language data with references to ontologies.


   11
      See [7] for a discussion on the difference between the “subjectField” in TBX and the conceptual hierarchy in
SKOS.
   12
      https://www.w3.org/2016/05/ontolex/#variation-translation-vartrans
   13
      https://www.w3.org/community/ontolex/wiki/Terminology
  skos:ConceptScheme


                 lexinfo:NormativeAuthorization
                                                                                                                                                                          lime:Lexicon
                              http://lexinfo#preferredTerm
                                                                                                                                                                    lime:language "es"
                                                                                                              termlex:isEvokedBy
                                                                                                                                                                                                          lime:entry
    skos:inScheme     termlex:normativeAuthorization ontolex:LexicalConcept                                    ontolex:LexicalSense
                                                                                                                                                                    ontolex:LexicalEntry                                                ontolex:Form
                                                                                   termlex:lexicalizedSense
                                                             <1443648_LC1>                                       <1443648_LS1>           ontolex:isSenseOf                                                                             writtenRep
                                                                                                                                                                  <1443648_LC1_LEN>                    ontolex:lexicalForm
                          termlex:lexicalizedConcept                                                                                                                                                                               "surco ferroviario"
     skos:Concept
                                                                                                                  lexinfo:synonym

           termlex:lexicalizedConcept
                                                                                                                                                                                                          lime:entry
                                                        ontolex:LexicalConcept    termlex:lexicalizedSense      ontolex:LexicalSense                                ontolex:LexicalEntry                                                ontolex:Form
                                                                                                                                         ontolex:isSenseOf
                    termlex:normativeAuthorization                                                                                                                                                     ontolex:lexicalForm             writtenRep
                                                             <1443648_LC2>                                        <1443648_LS2>                                    <1443648_LC2_LEN>
                                                                                                                                                                                                                                   "franja ferroviaria"
                                                                                                              termlex:isEvokedBy
                            lexinfo:NormativeAuthorization

                             http://lexinfo#deprecatedTerm




Figure 1: Representing a IATE term entry in OntoLex-Lemon, showing two Spanish terms used for a
term entry. One term is marked as “preferred” while the other is marked as “deprecated”. Our suggested
extensions (“termlex”) to OntoLex-Lemon are displayed in blue colour.


                                                                                                                                              skos:note                                           termlex:Note


                                                                                                                                                                  rdf:value "Se trata de unidades cuya disponibilidad depende de
                                                                                                                                                                  factores como el número de vías disponibles, el sistema de
                                                                                                                                                                  señalización o la diferencia de velocidad entre trenes. El
                                                                                                                                                                  Administrador de Infraestructuras Ferroviarias (Adif) concede a los
                                                                                                                                                                  operadores el derecho de explotar un tramo de vía en un día, una
                                                                                                                                                                  hora y un sentido determinado."
                                                                                                                                                                  dc:source 



                                                                                                                             termlex:Definition                                                  termlex:Source

                                                                                                                        rdf:value                                                rdf:value "Directiva 2001/14/CE relativa a
skos:ConceptScheme                                                   skos:definition                                                                          dc:source
                                                                                                                        "capacidad de infraestructura                            la adjudicación de la capacidad de
                                                                                       skos:definition                  necesaria para que un tren                               infraestructura ferroviaria y la aplicación
                                                                                                                        circule entre dos puntos en                              de cánones por su utilización"
                                                                                                             un momento dado."                                        dc:identifier "CELEX:32001L0014/ES"




                                                                                                                                                                             lime:Lexicon

                                                                                                                                                                           lime:language "es"
  skos:inScheme                                                                                               termlex:isEvokedBy

                                                      ontolex:LexicalConcept                                  ontolex:LexicalSense                                                                               lime:entry
                                                                                                                                                                          ontolex:LexicalEntry                                             ontolex:Form
                                                                                 termlex:lexicalizedSense
                                                         <1443648_LC1>                                          <1443648_LS1>             ontolex:isSenseOf
                                                                                                                                                                                                           ontolex:lexicalForm               writtenRep
                                                                                                                                                                      <1443648_LC1_LEN>
                       termlex:lexicalizedConcept                                                                                                                                                                                        "surco ferroviario"
   skos:Concept
                                                                                                                 lexinfo:synonym

 
                       termlex:lexicalizedConcept                                                                                                                                                             lime:entry
                                                      ontolex:LexicalConcept     termlex:lexicalizedSense     ontolex:LexicalSense        ontolex:isSenseOf               ontolex:LexicalEntry                                             ontolex:Form

                                                                                                                                                                                                           ontolex:lexicalForm               writtenRep
                                                         <1443648_LC2>                                          <1443648_LS2>                                             <1443648_LC2_LEN>
                                                                                                                                                                                                                                         "franja ferroviaria"
                                                                                                                termlex:isEvokedBy




Figure 2: Representing the links between terms and their definitions, which are now instances of a
specific class. Our suggested extensions (“termlex”) to OntoLex-Lemon are displayed in blue colour.


Acknowledgments
This short paper is based upon work from COST Action NexusLinguarum – European network
for Web-centered linguistic data science (CA18209), supported by COST (European Cooperation
in Science and Technology). The article is also supported by the Horizon 2020 research and
innovation programme with the project Prêt-à-LLOD (grant agreement no. 825182).
References
[1] A. Lommel, A. K. Melby, N. Glenn, J. Hayes, T. Snow, TBX-Min: A Simplified TBX-
    Based Approach to Representing Bilingual Glossaries, in: Terminology and Knowledge
    Engineering 2014, Berlin, Germany, 2014, p. 10 p. URL: https://hal.archives-ouvertes.fr/
    hal-01005851.
[2] C. Bizer, T. Heath, T. Berners-Lee, Linked data: The story so far, in: Semantic services,
    interoperability and web applications: emerging concepts, IGI global, 2011, pp. 205–227.
[3] J. P. McCrae, G. Aguado de Cea, P. Buitelaar, P. Cimiano, T. Declerck, A. Gómez-Pérez,
    J. Gracia, L. Hollink, E. Montiel-Ponsoda, D. Spohr, T. Wunner, Interchanging lexical
    resources on the semantic web, Lang. Resour. Evaluation 46 (2012) 701–719. URL: https:
    //doi.org/10.1007/s10579-012-9182-3. doi:10.1007/s10579- 012- 9182- 3 .
[4] J. P. McCrae, P. Buitelaar, P. Cimiano, The OntoLex-Lemon Model: development and
    applications, in: Proc. of the 5th Biennial Conference on Electronic Lexicography (eLex),
    2017.
[5] P. Cimiano, J. P. McCrae, V. Rodríguez-Doncel, T. Gornostay, A. Gómez-Pérez, B. Siemoneit,
    A. Lagzdins, Linked terminologies: applying linked data principles to terminological
    resources, in: Proceedings of the eLex 2015 Conference, 2015.
[6] M. P. di Buono, P. Cimiano, M. F. Elahi, F. Grimm, Terme-à-LLOD: Simplifying the conversion
    and hosting of terminological resources as linked data, in: Proceedings of the 7th Workshop
    on Linked Data in Linguistics (LDL-2020), European Language Resources Association,
    Marseille, France, 2020, pp. 28–35. URL: https://aclanthology.org/2020.ldl-1.5.
[7] D. Reineke, L. Romary, Bridging the gap between SKOS and TBX, edition - Die Fachzeitschrift
    für Terminologie 19 (2019). URL: https://hal.inria.fr/hal-02398820.