=Paper= {{Paper |id=Vol-3256/tutorial1 |storemode=property |title=TermTrends: Trends in Terminology Generation and Modelling |pdfUrl=https://ceur-ws.org/Vol-3256/tutorial1.pdf |volume=Vol-3256 |authors=Patricia Martín-Chozas,Elena Montiel-Ponsoda,Sara Carvalho,Rute Costa |dblpUrl=https://dblp.org/rec/conf/ekaw/Martin-ChozasMC22 }} ==TermTrends: Trends in Terminology Generation and Modelling== https://ceur-ws.org/Vol-3256/tutorial1.pdf
TermTrends: Trends in Terminology Generation and Modelling
Patricia Martín-Chozas1,∗,†, Elena Montiel-Ponsoda1,†, Sara Carvalho2,† and Rute Costa3,†
1 Ontology Engineering Group, Universidad Politécnica de Madrid, Spain
2 University of Aveiro, Portugal
3 Universidade NOVA de Lisboa, Portugal


Abstract
This document presents the objectives, content and organisation of the TermTrends tutorial within
EKAW 2022. The tutorial gives an overview of current techniques and tools for terminology
generation, as well as of standardisation approaches for terminological data. The first part of the
tutorial is a theoretical block that includes an introduction to terminological work, current standards
for terminology modelling and two use cases on legal and medical terminology. The second part is a
hands-on block that deals with terminological resources and tools. The tutorial is linked to the conference
through a series of topics, such as knowledge acquisition and ontology engineering, and it is suitable for
both expert and non-expert audiences.

Keywords
Terminology Generation, Terminology Modelling, Linguistic Linked Data, Semantic Web




1. Introduction
Undeniably, terminology plays an essential role in knowledge representation and organisation,
as well as in natural language processing activities dealing with domain-specific knowledge
[1, 2]. In its origins, terminology work pursued the classification of terms to avoid vagueness and
ambiguity, which gave rise to initiatives for the standardisation of terminologies and other
language resources [3, 4]. These efforts evolved into ISO standards1 such as LMF2 [5] or TMF3.
More recently, terminology work has increasingly relied on W3C recommendations and
standards4 like RDF and OWL, with the aim of promoting interoperability of terminological
resources and automating terminological work.

EKAW-C 2022: Companion Proceedings of the 23rd International Conference on Knowledge Engineering and Knowledge
Management, September 26–29, 2022, Bolzano, Italy
∗ Corresponding author.
† These authors contributed equally.
Email: pmchozas@fi.upm.es (P. Martín-Chozas); emontiel@fi.upm.es (E. Montiel-Ponsoda); sara.carvalho@ua.pt
(S. Carvalho); rute.costa@fcsh.unl.pt (R. Costa)
ORCID: 0000-0002-8922-7521 (P. Martín-Chozas); 0000-0003-3263-3403 (E. Montiel-Ponsoda); 0000-0002-7501-5405
(S. Carvalho); 0000-0002-3452-7228 (R. Costa)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1 https://www.iso.org/ics/01.020/x/
2 https://www.lexicalmarkupframework.org/
3 https://www.iso.org/standard/56063.html
4 https://www.w3.org/2001/sw/wiki/Main_Page
   In the TermTrends tutorial5, we study the different standardisation approaches, ranging from
the standards initially proposed within ISO to represent terminology, to models that represent
linguistic data in the Semantic Web, including emerging vocabularies still under development.
We also give an overview of the main features of terminology work, exploring its
evolution and new methods to speed up the terminology generation process. To complement
this, we present different use cases in which both new methods to generate terminologies
and new ways to represent terminological knowledge are applied in specific domains, such as
Law or Life Sciences. Additionally, half of the tutorial is planned as a hands-on session in which
participants test several tools with different purposes, such as the extraction and enrichment
of terminologies and their representation according to different vocabularies.


2. Objectives and Content
The objectives of this tutorial can be grouped into three main ideas:
    1. Introducing terminology and terminology work: its importance for several areas,
       namely knowledge discovery, knowledge engineering and ontology engineering, and how it
       has evolved over time, including a study of today's main terminological resources,
       some of which are part of the Linguistic Linked Open Data cloud6.
    2. Offering a complete view of the standards for language data in general, and for
       terminology specifically, from initial standardisation approaches, through XML-based
       formats such as TBX7 [6], to new ISO standards related to terminology extraction
       and knowledge organisation. We will also take a closer look at those following Semantic
       Web and Linked Data principles [7], such as SKOS8 and Ontolex9, and their extensions
       still under development10.
    3. Testing different knowledge extraction and management tools applied to terminology,
       such as TermitUp11 [8] for terminology enrichment, and OpenRefine12, Protégé13 or
       VocBench14 [9] to structure and manage terminologies in Semantic Web formats.
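
The step from XML-based formats to Semantic Web vocabularies mentioned in the second objective can be illustrated with a minimal sketch. It is not part of the tutorial materials: the entry below is a simplified, TBX-like fragment with invented data, `http://example.org/term/` is a hypothetical namespace, and the choice to map the first term of each language to `skos:prefLabel` and any further term to `skos:altLabel` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, TBX-like entry with invented data; real TBX files carry more structure.
TBX_SNIPPET = """
<termEntry id="c1">
  <langSet xml:lang="en">
    <tig><term>lease agreement</term></tig>
    <tig><term>tenancy agreement</term></tig>
  </langSet>
  <langSet xml:lang="es">
    <tig><term>contrato de arrendamiento</term></tig>
  </langSet>
</termEntry>
"""

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
BASE = "http://example.org/term/"  # hypothetical namespace


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def tbx_entry_to_skos(xml_text):
    """Turn one TBX-like entry into a SKOS concept serialised as Turtle."""
    entry = ET.fromstring(xml_text.strip())
    statements = []
    for lang_set in entry.findall("langSet"):
        lang = lang_set.get(XML_LANG)
        for position, term in enumerate(lang_set.iter("term")):
            # Assumption: the first term per language is preferred, the rest are alternatives.
            prop = "skos:prefLabel" if position == 0 else "skos:altLabel"
            statements.append(f'    {prop} "{escape(term.text)}"@{lang}')
    header = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n"
    subject = f"<{BASE}{entry.get('id')}> a skos:Concept ;\n"
    return header + subject + " ;\n".join(statements) + " ."


print(tbx_entry_to_skos(TBX_SNIPPET))
```

The sketch keeps a single preferred label per language, in line with the SKOS recommendation that a concept has at most one `skos:prefLabel` per language tag. Information that TBX can attach to individual terms is lost in this mapping.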
  Consequently, the content of the tutorial is divided into two blocks: a theoretical and a
practical one.
5 https://termtrends.linkeddata.es/
6 https://linguistic-lod.org
7 https://www.tbxinfo.net/
8 https://www.w3.org/TR/skos-reference/
9 https://www.w3.org/2016/05/ontolex/
10 https://www.w3.org/community/ontolex/wiki/Terminology
11 https://www.w3.org/2009/08/skos-reference/skos.html
12 https://openrefine.org/
13 https://protege.stanford.edu/
14 http://vocbench.uniroma2.it/

  The theoretical part comprises:

    • An introduction to Terminology, which includes a brief history of the terminological
      work, from its origins to the intersection of terminology and knowledge organisation.
    • An introduction to the current and the emerging standards to organise the data contained
      in terminological resources.
    • An introduction to two different use cases: legal terminology and medical terminology.
      Both presentations include overviews of research projects, resources, repositories,
      standards, challenges, etc.
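
To make the difference between two of these standards concrete, the following minimal sketch prints the same term modelled once with SKOS and once with the OntoLex core vocabulary. It is an illustration only: the identifiers, the `http://example.org/term/` namespace and the sample term are invented, and linking the lexical entry to the concept through `ontolex:denotes` is just one of the options the vocabulary offers.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
EX = "http://example.org/term/"  # hypothetical namespace

PREFIXES = f"@prefix skos: <{SKOS}> .\n@prefix ontolex: <{ONTOLEX}> .\n"


def skos_view(concept_id, term, lang):
    """SKOS: the term is a plain label attached to the concept."""
    return (
        f"<{EX}{concept_id}> a skos:Concept ;\n"
        f'    skos:prefLabel "{term}"@{lang} .'
    )


def ontolex_view(concept_id, entry_id, term, lang):
    """OntoLex: the term is a lexical entry with its own form, linked to the concept."""
    return (
        f"<{EX}{entry_id}> a ontolex:LexicalEntry ;\n"
        f"    ontolex:canonicalForm <{EX}{entry_id}#form> ;\n"
        f"    ontolex:denotes <{EX}{concept_id}> .\n"
        f"<{EX}{entry_id}#form> a ontolex:Form ;\n"
        f'    ontolex:writtenRep "{term}"@{lang} .'
    )


print(PREFIXES)
print(skos_view("c2", "tenant", "en"))
print()
print(ontolex_view("c2", "tenant-en", "tenant", "en"))
```

In the SKOS view the term is only a string, so nothing further can be stated about it. In the OntoLex view the entry and its form are resources of their own, to which linguistic information can be attached.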

  The practical part includes the following hands-on sessions:

    • Exploration of Semantic Web vocabularies to model terminological data, such as SKOS
      and Ontolex, to identify challenges and limitations.
    • Exploration and querying (SPARQL15) of terminological resources in RDF.
    • Terminology tool testing, mainly: terminological knowledge acquisition (TermitUp) and
      terminological knowledge organisation and management (WebProtégé and VocBench).
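
As an indication of what querying a terminological resource in RDF can look like, the sketch below pairs a SPARQL query with a plain-Python stand-in that evaluates the same pattern over a toy graph. Everything in it is assumed for illustration: the data are invented, `http://example.org/term/` is a hypothetical namespace, and in practice the query would be sent to a SPARQL endpoint or run with an RDF library rather than re-implemented by hand.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/term/"  # hypothetical namespace

# The SPARQL query one could run against an RDF terminology (shown for reference only).
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {
  ?concept a skos:Concept ;
           skos:prefLabel ?label .
  FILTER (lang(?label) = "es")
}
"""

# Toy graph: (subject, predicate, object) triples; literals are (text, language) pairs.
TRIPLES = [
    (EX + "c1", RDF_TYPE, SKOS + "Concept"),
    (EX + "c1", SKOS + "prefLabel", ("lease agreement", "en")),
    (EX + "c1", SKOS + "prefLabel", ("contrato de arrendamiento", "es")),
    (EX + "c2", RDF_TYPE, SKOS + "Concept"),
    (EX + "c2", SKOS + "prefLabel", ("tenant", "en")),
    (EX + "c2", SKOS + "prefLabel", ("arrendatario", "es")),
]


def pref_labels(triples, lang):
    """Stand-in for QUERY: preferred labels of skos:Concept resources in one language."""
    concepts = {s for s, p, o in triples if p == RDF_TYPE and o == SKOS + "Concept"}
    rows = [
        (s, o[0])
        for s, p, o in triples
        if s in concepts
        and p == SKOS + "prefLabel"
        and isinstance(o, tuple)
        and o[1] == lang
    ]
    # Sorting is only for a stable display order; QUERY itself does not order results.
    return sorted(rows, key=lambda row: row[1])


for concept, label in pref_labels(TRIPLES, "es"):
    print(concept, label)
```

Changing the language tag in the `FILTER` clause, or the `lang` argument of the stand-in, retrieves the labels of the same concepts in another language.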

  Therefore, we identify three topics of the conference that can be related to the content of the
tutorial:

    1. Methods for ontology engineering, since we propose a review of existing vocabularies,
       standards and formats to model linguistic data, as well as new proposals specifically
       intended to model terminological data.
    2. Methods for knowledge acquisition and management, since the hands-on session is
       focused on testing several tools for semi-automatic terminological knowledge acquisition,
       including term extraction, term enrichment and relation acquisition. We also propose the
       use of different tools for knowledge organisation and management over terminological
       data.
    3. Applications in specific domains, since both the theoretical and the practical sessions
       present several use cases in which different terminology generation and modelling
       approaches have been applied. These use cases match three different
       conference subtopics: 1) eGovernment and public administration, 2) Life sciences, health
       and medicine, and 3) Humanities and Social sciences.


3. Conclusions
This document presents the TermTrends tutorial within EKAW 2022, which focuses on
terminology work from its origins to the present, paying special attention to current techniques
for terminology generation and management, and to state-of-the-art models to represent
terminological data, specifically in the Semantic Web. The tutorial contains a theoretical and a
practical part, and it is targeted at a general audience. With this tutorial we intend to raise
awareness of the value of terminological resources in particular, and language resources in
general, within the Semantic Web and Natural Language Processing communities, with the aim of
creating synergies among researchers in both fields.


15 https://www.w3.org/TR/rdf-sparql-query/
Acknowledgments
This work is framed within NexusLinguarum, the “European network for Web-centred linguistic
data science” COST Action (CA18209)16 of COST (European Cooperation in Science and
Technology).


References
[1] E. Hovenga, Guideline and knowledge management in a digital world, in: Roadmap to
    Successful Digital Health Ecosystems, Elsevier, 2022, pp. 239–270.
[2] K. Stefaniak, Terminology work in the European Commission: Ensuring high-quality
    translation in a multilingual environment, Quality aspects in institutional translation 8
    (2017) 109.
[3] M. T. Cabré, Terminology: Theory, methods, and applications, volume 1, John Benjamins
    Publishing, 1999.
[4] E. Wüster, Einführung in die allgemeine Terminologielehre und terminologische
    Lexikographie, Romanist. Verlag, 1991.
[5] G. Francopoulo, M. George, N. Calzolari, M. Monachini, N. Bel, M. Pet, C. Soria, Lexical
    markup framework (LMF), in: International Conference on Language Resources and
    Evaluation-LREC 2006, 2006.
[6] A. Melby, TBX: A terminology exchange format for the translation and localization industry,
    Handbook of Terminology (2015) 393–424.
[7] C. Bizer, T. Heath, T. Berners-Lee, Linked data: The story so far, in: Semantic services,
    interoperability and web applications: emerging concepts, IGI Global, 2011, pp. 205–227.
[8] P. Martín-Chozas, K. Vázquez-Flores, P. Calleja, E. Montiel-Ponsoda, V. Rodríguez-Doncel,
    TermitUp: Generation and enrichment of linked terminologies, Semantic Web (2022) 1–20.
[9] A. Stellato, M. Fiorelli, A. Turbati, T. Lorenzetti, W. Van Gemert, D. Dechandon,
    C. Laaboudi-Spoiden, A. Gerencsér, A. Waniart, E. Costetchi, et al., VocBench 3: A collaborative
    semantic web editor for ontologies, thesauri and lexicons, Semantic Web 11 (2020) 855–881.




16 https://nexuslinguarum.eu/