<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Representing Multilingual Terminologies with OntoLex-Lemon</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Patricia</forename><surname>Martín-Chozas</surname></persName>
							<affiliation key="aff0">
								<orgName type="laboratory">Ontology Engineering Group</orgName>
								<orgName type="institution">Universidad Politécnica de Madrid</orgName>
								<address>
									<addrLine>Avda. Montepríncipe, s/n, Boadilla del Monte</addrLine>
									<postCode>28660</postCode>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author role="corresp">
							<persName><forename type="first">Thierry</forename><surname>Declerck</surname></persName>
							<email>declerck@dfki.de</email>
							<affiliation key="aff1">
								<orgName type="department">Multilinguality and Language Technology Lab</orgName>
								<orgName type="laboratory">German Research Center for Artificial Intelligence GmbH (DFKI)</orgName>
								<orgName type="institution">Saarland Informatics Campus</orgName>
								<address>
									<addrLine>D3 2, Stuhlsatzenhausweg 3</addrLine>
									<postCode>66123</postCode>
									<settlement>Saarbrücken</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Representing Multilingual Terminologies with OntoLex-Lemon</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
						<meeting>International Conference on &quot;Multilingual digital terminology today. Design, representation formats and management systems&quot;<address><addrLine>June 16-17</addrLine><settlement>Padova</settlement><country key="IT">Italy</country></address></meeting>
					</monogr>
					<idno type="MD5">49DA06934AAAF63EFE880D7579DDC4CA</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T18:02+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Terminologies</term>
					<term>Multilingualism</term>
					<term>Formal Representation</term>
					<term>OntoLex-Lemon</term>
					<!-- ORCID: 000-0002-8922-7521 (P. Martín-Chozas); 0000-0002-9450-6648 (T. Declerck) -->
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper is part of a project to make multilingual terminologies available in a native graph representation format. We explore the use of the OntoLex-Lemon model, together with some suggested extensions, to achieve a declarative encoding of the relations between the multilingual expressions contained in terminologies. The model is used to encode not only terms but also their associated definitions, contexts and notes. With this effort, we aim to support the publication of multilingual terminologies in the Linked Open Data cloud.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>While converting multilingual terminologies to an RDF<ref type="foot">1</ref> model, we faced modelling decisions that also concern the additional language data included in such resources. The original purpose of the porting exercise is not to change the content of the terminologies under consideration. Nevertheless, modelling them in a graph-based representation makes it possible to interlink and merge them with other resources, whether other terminologies or other types of data such as detailed lexicographic resources. The focus of our work is therefore a possibly improved formal representation of the language data used in multilingual terminologies. In this short paper, we discuss a few decision points concerning our modelling strategy and compare our work with a directly related earlier approach.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">The Data Basis: Two Terminological Resources</head><p>We currently consider two terminological resources as input for our transformation work: the multilingual terminology of the Deutsche Bahn (German Railways), which is encoded in the TBX<ref type="foot" target="#foot_0">2</ref> standard and can be accessed online,<ref type="foot" target="#foot_1">3</ref> and IATE (Interactive Terminology for Europe),<ref type="foot" target="#foot_2">4</ref> one of the most representative terminological databases in Europe. We chose IATE because an earlier exercise focused on converting the IATE data, structured in TBX, into RDF. That effort provides a useful baseline against which to compare our approach.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">The TBX2RDF Guidelines</head><p>The earlier LIDER project<ref type="foot" target="#foot_3">5</ref> was already concerned with mapping TBX to RDF, with the goal of transforming and publishing terminologies as Linked Data <ref type="bibr" target="#b1">[2]</ref>. LIDER developed guidelines for this task<ref type="foot" target="#foot_4">6</ref> in which TBX elements are converted into OWL<ref type="foot" target="#foot_5">7</ref> and associated with other RDF vocabularies. The basic vocabularies chosen as the backbone of the conversion were SKOS<ref type="foot" target="#foot_6">8</ref> and the lemon model <ref type="bibr" target="#b2">[3]</ref>, a predecessor of the OntoLex-Lemon framework <ref type="bibr" target="#b3">[4]</ref> that we use. <ref type="bibr" target="#b4">[5]</ref> describe the TBX2RDF approach,<ref type="foot" target="#foot_7">9</ref> and <ref type="bibr" target="#b5">[6]</ref> present recent developments related to this initiative, which rely on a virtualization approach based on containerization technologies.</p><p>The LIDER TBX2RDF approach represents TBX terminological concepts as skos:Concept and the TIG/NTIG elements of TBX as ontolex:LexicalEntry. Most of the other TBX elements are mapped straightforwardly onto RDF: each is encoded as a URI representing a resource that can be associated with RDF predicates and objects. We also note that TBX2RDF does not represent the TBX langSet data as such. Instead, it creates language-specific lexicons in which all the data included in the original langSet element are encoded.</p></div>
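<div xmlns="http://www.tei-c.org/ns/1.0"><p>As a purely illustrative sketch, and not the code of the LIDER converter, the mapping described above can be mimicked in Python on a toy in-memory term entry. The input layout, all "ex:" identifiers and the linking property "ex:customLink" are hypothetical placeholders. The names lime:Lexicon and lime:entry are borrowed from the current OntoLex-Lemon lime module and may differ from what TBX2RDF actually emits.</p>
```python
# Toy sketch of the TBX-to-RDF mapping described in this section.
# Assumptions: the input layout, every "ex:" identifier and "ex:customLink"
# are hypothetical; triples are plain (subject, predicate, object) tuples.

def tbx_entry_to_triples(entry_id, lang_sets):
    """Map one TBX-like term entry to a list of triples.

    lang_sets maps a language code to the list of terms found in the
    TIG/NTIG elements of the corresponding langSet.
    """
    concept = "ex:concept_" + entry_id
    # The terminological concept becomes a skos:Concept.
    triples = [(concept, "rdf:type", "skos:Concept")]
    for lang, terms in sorted(lang_sets.items()):
        # The langSet itself gets no node: one lexicon per language instead.
        lexicon = "ex:lexicon_" + lang
        triples.append((lexicon, "rdf:type", "lime:Lexicon"))
        for index, term in enumerate(terms):
            lex_entry = "ex:entry_{}_{}_{}".format(entry_id, lang, index)
            # Each TIG/NTIG element becomes an ontolex:LexicalEntry.
            triples.append((lex_entry, "rdf:type", "ontolex:LexicalEntry"))
            triples.append((lex_entry, "rdfs:label", '"{}"@{}'.format(term, lang)))
            triples.append((lexicon, "lime:entry", lex_entry))
            # Hypothetical stand-in for the converter's custom linking property.
            triples.append((lex_entry, "ex:customLink", concept))
    return triples


example = tbx_entry_to_triples(
    "1",
    {"es": ["surco ferroviario", "franja ferroviaria"], "en": ["train path"]},
)
```
<p>In this sketch the language information survives only through the per-language lexicon and the language tag on each label, which mirrors the observation that the langSet element is not represented as such.</p></div>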
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Our Approach</head><p>We make use of the most recent version of OntoLex-Lemon,<ref type="foot" target="#foot_8">10</ref> which effectively integrates the SKOS vocabulary for representing conceptual units and their associated language data. This was not the case for its predecessor, lemon, which was used in the LIDER project. We can now use properties defined in OntoLex-Lemon to link the conceptually oriented terms directly to lexical entries, whereas the LIDER TBX2RDF converter used a custom property for this purpose. We introduce a skos:ConceptScheme to encode the whole conceptual organisation of the original terminology, and within this scheme we allow specific domain subsets to be defined, a feature not supported in TBX.<ref type="foot" target="#foot_9">11</ref> OntoLex-Lemon provides the class ontolex:LexicalConcept, a subclass of skos:Concept, for linking lexical entries to the conceptual part described in the SKOS vocabulary. We encode all terms as instances of this class, and no longer as instances of the class ontolex:LexicalEntry, as was done in TBX2RDF. Another, more significant, departure from the LIDER TBX2RDF model is that we model definitions and contexts as instances of classes, and no longer as literal values.</p><p>In doing so, we can describe specific relations between definitions within one language or across languages. In the latter case, we can specify whether the definitions given for terms in two different languages are translations of each other, multilingual equivalents, or just monolingual definitions included in the multilingual terminology. Suggested additions to the OntoLex-Lemon model are marked with the prefix "termlex".</p><p>Figure <ref type="figure" target="#fig_1">1</ref> shows how an IATE term entry is currently represented following our approach, including the synonymy of two Spanish terms. 
Figure <ref type="figure" target="#fig_2">2</ref> displays the relations between the terms and their definitions, which, as instances of a class, can link to further information such as their provenance or the definitions given for the same original term entry in another language. The English equivalents of the Spanish terms "surco ferroviario" and "franja ferroviaria" (displayed in Figures <ref type="figure" target="#fig_1">1</ref> and <ref type="figure" target="#fig_2">2</ref>), namely "train path" and "train slot", as well as the English definitions and their contexts of use, are linked to the Spanish terms and entries via the properties defined in the Vartrans module of OntoLex-Lemon,<ref type="foot" target="#foot_10">12</ref> which supports a declarative description of the different types of relations that can exist between these types of language data (terms, definitions and contexts of use).</p></div>
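<div xmlns="http://www.tei-c.org/ns/1.0"><p>The benefit of modelling definitions as resources rather than literals can be illustrated with a minimal Python sketch over plain triples. It is an assumption-laden toy, not the model shown in the figures: all "ex:" identifiers and the names termlex:Definition, termlex:hasDefinition and termlex:translationOf are hypothetical placeholders for the suggested extensions, and the pairing of the Spanish and English labels is chosen only for illustration.</p>
```python
# Toy sketch: definitions are nodes, so relations between them can be stated.
# Assumptions: every "ex:" identifier and every "termlex:" name below is a
# hypothetical placeholder, not the vocabulary actually proposed.

triples = [
    ("ex:scheme", "rdf:type", "skos:ConceptScheme"),
    # Terms are encoded as instances of ontolex:LexicalConcept.
    ("ex:term_es_1", "rdf:type", "ontolex:LexicalConcept"),
    ("ex:term_es_1", "skos:inScheme", "ex:scheme"),
    ("ex:term_es_1", "rdfs:label", '"surco ferroviario"@es'),
    ("ex:term_en_1", "rdf:type", "ontolex:LexicalConcept"),
    ("ex:term_en_1", "skos:inScheme", "ex:scheme"),
    ("ex:term_en_1", "rdfs:label", '"train path"@en'),
    # Definitions are instances of a class, not literal values.
    ("ex:def_es_1", "rdf:type", "termlex:Definition"),
    ("ex:def_en_1", "rdf:type", "termlex:Definition"),
    ("ex:term_es_1", "termlex:hasDefinition", "ex:def_es_1"),
    ("ex:term_en_1", "termlex:hasDefinition", "ex:def_en_1"),
    # Because both definitions are nodes, their relation can be declared.
    ("ex:def_es_1", "termlex:translationOf", "ex:def_en_1"),
]


def objects(subject, predicate):
    """Return all objects of triples with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def definition_relations(term_a, term_b):
    """Return the predicates stated from a definition of term_a to one of term_b."""
    defs_a = objects(term_a, "termlex:hasDefinition")
    defs_b = set(objects(term_b, "termlex:hasDefinition"))
    return [p for s, p, o in triples if s in defs_a and o in defs_b]


relations = definition_relations("ex:term_es_1", "ex:term_en_1")
```
<p>With literal-valued definitions the last triple in the list could not be expressed, since a literal cannot be the subject of a statement in RDF.</p></div>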
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusions and Future Work</head><p>We described ongoing work on porting multilingual terminology resources to a Linked Data compliant representation language. This work led us to ask whether the modelling of TBX terminologies in RDF already proposed by the LIDER TBX2RDF converter should be extended. One such extension is to treat definitions, contexts and notes as full ontological elements that can be put explicitly in relation to each other. In this way, definitions in different languages can be declaratively interlinked and marked as translations, as equivalents, or as having neither of those relations.</p><p>As an outcome of our work, we are currently proposing an extension module for OntoLex-Lemon<ref type="foot" target="#foot_11">13</ref> that deals with the representation of terminological data not covered by the core module, since the main motivation for developing the OntoLex-Lemon vocabulary was to represent language data with references to ontologies.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: Representing an IATE term entry in OntoLex-Lemon, showing two Spanish terms used for a term entry. One term is marked as "preferred" while the other is marked as "deprecated". Our suggested extensions ("termlex") to OntoLex-Lemon are displayed in blue.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2:</head><label>2</label><figDesc>Figure 2: Representing the links between terms and their definitions, which are now instances of a specific class. Our suggested extensions ("termlex") to OntoLex-Lemon are displayed in blue.</figDesc></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_0">TBX stands for "TermBase eXchange". See https://www.tbxinfo.net/ [accessed 2022-02-14] or <ref type="bibr" target="#b0">[1]</ref> for more details.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_1">www.deutschebahn.com/dblanguageportal [accessed 2021-10-02]</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_2">See https://iate.europa.eu/ [accessed 2022-02-14]</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="5" xml:id="foot_3">http://lider-project.eu/lider-project.eu/index.html [accessed 2021-10-02]</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="6" xml:id="foot_4">The latest version of those guidelines is available at https://github.com/bpmlod/report/blob/gh-pages/multilingual-terminologies/index.html [accessed 2022-02-14]</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="7" xml:id="foot_5">OWL stands for "Web Ontology Language". See https://www.w3.org/TR/owl2-primer/ [accessed 2022-02-14]</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="8" xml:id="foot_6">SKOS stands for "Simple Knowledge Organization System". See also https://www.w3.org/2009/08/skos-reference/skos.html [accessed 2022-02-14]</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="9" xml:id="foot_7">The corresponding W3C Community Group Report is available at https://www.w3.org/2015/09/bpmlod-reports/multilingual-terminologies/ [accessed 2022-02-14]</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="10" xml:id="foot_8">See https://www.w3.org/2016/05/ontolex/ [accessed 2022-02-14] for technical details.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="11" xml:id="foot_9">See <ref type="bibr" target="#b6">[7]</ref> for a discussion of the difference between the "subjectField" in TBX and the conceptual hierarchy in SKOS.</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="12" xml:id="foot_10">https://www.w3.org/2016/05/ontolex/#variation-translation-vartrans</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="13" xml:id="foot_11">https://www.w3.org/community/ontolex/wiki/Terminology</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This short paper is based upon work from COST Action NexusLinguarum - European network for Web-centered linguistic data science (CA18209), supported by COST (European Cooperation in Science and Technology). The article is also supported by the Horizon 2020 research and innovation programme with the project Prêt-à-LLOD (grant agreement no. 825182).</p></div>
			</div>


			<div type="funding">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>CEUR Workshop Proceedings (CEUR-WS.org)</p><note place="foot" n="1">RDF stands for "Resource Description Framework". See https://www.w3.org/TR/rdf-primer/ for more details.</note></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">TBX-Min: A Simplified TBX-Based Approach to Representing Bilingual Glossaries</title>
		<author>
			<persName><forename type="first">A</forename><surname>Lommel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">K</forename><surname>Melby</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Glenn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Hayes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Snow</surname></persName>
		</author>
		<ptr target="https://hal.archives-ouvertes.fr/hal-01005851" />
	</analytic>
	<monogr>
		<title level="m">Terminology and Knowledge Engineering 2014</title>
				<meeting><address><addrLine>Berlin, Germany</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page">10</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Linked data: The story so far</title>
		<author>
			<persName><forename type="first">C</forename><surname>Bizer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Heath</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Berners-Lee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Semantic services, interoperability and web applications: emerging concepts</title>
				<imprint>
			<publisher>IGI global</publisher>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="205" to="227" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Interchanging lexical resources on the semantic web</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>McCrae</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Aguado De Cea</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Buitelaar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cimiano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Declerck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gómez-Pérez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gracia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hollink</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Montiel-Ponsoda</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Spohr</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Wunner</surname></persName>
		</author>
		<idno type="DOI">10.1007/s10579-012-9182-3</idno>
		<ptr target="https://doi.org/10.1007/s10579-012-9182-3" />
	</analytic>
	<monogr>
		<title level="j">Lang. Resour. Evaluation</title>
		<imprint>
			<biblScope unit="volume">46</biblScope>
			<biblScope unit="page" from="701" to="719" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">The OntoLex-Lemon Model: development and applications</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>McCrae</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Buitelaar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cimiano</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proc. of the 5th Biennial Conference on Electronic Lexicography (eLex)</title>
				<meeting>the 5th Biennial Conference on Electronic Lexicography (eLex)</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Linked terminologies: applying linked data principles to terminological resources</title>
		<author>
			<persName><forename type="first">P</forename><surname>Cimiano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">P</forename><surname>McCrae</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Rodríguez-Doncel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Gornostay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gómez-Pérez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Siemoneit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Lagzdins</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the eLex 2015 Conference</title>
				<meeting>the eLex 2015 Conference</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Terme-à-LLOD: Simplifying the conversion and hosting of terminological resources as linked data</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">P</forename><surname>Di Buono</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cimiano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">F</forename><surname>Elahi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Grimm</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2020.ldl-1.5" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 7th Workshop on Linked Data in Linguistics (LDL-2020), European Language Resources Association</title>
				<meeting>the 7th Workshop on Linked Data in Linguistics (LDL-2020), European Language Resources Association<address><addrLine>Marseille, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="28" to="35" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Bridging the gap between SKOS and TBX</title>
		<author>
			<persName><forename type="first">D</forename><surname>Reineke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Romary</surname></persName>
		</author>
		<ptr target="https://hal.inria.fr/hal-02398820" />
	</analytic>
	<monogr>
		<title level="j">Die Fachzeitschrift für Terminologie</title>
		<imprint>
			<biblScope unit="volume">19</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
