<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Dare to Be Different: How User Needs Determine Termbase Design</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Michal</forename><surname>Měchura</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Fiontar &amp; Scoil na Gaeilge</orgName>
								<orgName type="institution">Dublin City University</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
							<affiliation key="aff1">
								<orgName type="department" key="dep1">Natural Language Processing Centre</orgName>
								<orgName type="department" key="dep2">Faculty of Informatics</orgName>
								<orgName type="institution">Masaryk University</orgName>
								<address>
									<settlement>Brno</settlement>
									<country key="CZ">Czech Republic</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Brian</forename><surname>Ó Raghallaigh</surname></persName>
							<email>brian.oraghallaigh@dcu.ie</email>
							<affiliation key="aff0">
								<orgName type="department">Fiontar &amp; Scoil na Gaeilge</orgName>
								<orgName type="institution">Dublin City University</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Úna</forename><surname>Bhreathnach</surname></persName>
							<email>una.bhreathnach@dcu.ie</email>
							<affiliation key="aff0">
								<orgName type="department">Fiontar &amp; Scoil na Gaeilge</orgName>
								<orgName type="institution">Dublin City University</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Gearóid</forename><surname>Ó Cleircín</surname></persName>
							<email>gearoid.ocleircin@dcu.ie</email>
							<affiliation key="aff0">
								<orgName type="department">Fiontar &amp; Scoil na Gaeilge</orgName>
								<orgName type="institution">Dublin City University</orgName>
								<address>
									<settlement>Dublin</settlement>
									<country key="IE">Ireland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Dare to Be Different: How User Needs Determine Termbase Design</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
						<meeting>International Conference on &quot;Multilingual Digital Terminology Today: Design, representation formats and management systems&quot;, <date from="2022-06-16" to="2022-06-17">16-17 June 2022</date><address><settlement>Padova</settlement><country key="IT">Italy</country></address></meeting>
					</monogr>
					<idno type="MD5">DCF8790DC5BED32ED8D05CA60098AA11</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T18:02+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>online termbases</term>
					<term>terminology in minority languages</term>
					<term>terminology in bilingual countries</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper describes and discusses how the design of the National Terminology Database for Irish (téarma.ie) has been influenced by two factors: the assumed information needs of the intended users, and the data governance needs of the publisher. In particular, we will highlight how these factors have sometimes caused our termbase design to diverge from established practices in the terminology industry and from standards such as TBX.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The National Terminology Database for Irish (NTDI) <ref type="bibr" target="#b0">[1,</ref><ref type="bibr" target="#b1">2]</ref> serves the speakers of a minority language (Irish) in a country (Ireland) where it co-exists with a majority language (English). The users are typically translators, bilingual journalists, educators and officials in public administration who are looking for translations of specialised terms from English into Irish, in fields such as public administration, sport, public health and information technology as well as various school subjects. The termbase has a public website (https://www.tearma.ie, formerly focal.ie) which handles over half a million search requests every month. The termbase is edited by a small number of terminologists through the open-source terminology management system Terminologue (https://www.terminologue.org) <ref type="bibr" target="#b2">[3]</ref>. NTDI contains approximately 200,000 entries, each with terms in two languages.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.1.">The information needs of the end-user</head><p>When NTDI's public website was launched in 2006, it was intended as an LSP (Language for Specialised Purposes) resource as defined e.g. by <ref type="bibr" target="#b3">[4]</ref>. However, online lexical resources for the Irish language were scarce at that time, and many users have come to use the termbase as if it were an LGP (Language for General Purposes) dictionary, searching for general-purpose vocabulary and looking for the kind of information one would normally expect to find in a general-purpose dictionary. NTDI has evolved to satisfy this unique mixture of the users' information needs (a concept originally defined by <ref type="bibr" target="#b4">[5]</ref>), both in its content (it contains some general-language vocabulary) and in its structure.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.2.">The data-governance needs of the terminologist</head><p>While the users' information needs are what drives the design of a termbase, the needs of the terminologists (the editors and maintainers behind the scenes) must be taken into account as well. These are concerned mainly with data governance: quality control, keeping the termbase well organised and well maintained in the long run, avoiding duplicates and so on. The design of the NTDI reflects some of these needs, as we will show in the rest of this paper.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Some features of NTDI</head><p>We will now review some of NTDI's structural features that have been influenced by the requirements introduced above, covering both the users' information needs and the terminologist's data-governance needs.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Grammatical annotations</head><p>Most termbases in the translation industry or in knowledge engineering contain only sparse grammatical information: it is expected that the user will be a (near-)native speaker and will need no help in determining the gender of nouns or the plural of noun phrases. In NTDI this assumption does not apply: NTDI is a public-service termbase, targeting the general public and serving a user community with a high percentage of learners and non-native speakers. The consequence is that terms in NTDI come with relatively rich grammatical annotations, both as labels attached to terms (part of speech, gender, inflection paradigm) and as inflected forms added to terms (plurals, genitive case). A speciality is that the termbase allows inline grammatical annotations: it is possible to attach labels not just to the entire term but also to a single word inside it, for example to the head noun of a noun phrase.</p></div>
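<div xmlns="http://www.tei-c.org/ns/1.0"><p>The distinction between term-level and inline annotation can be illustrated with a minimal sketch. It assumes, purely for illustration, that an inline label is stored as a character range over the term's wording; the class names, the field names and the label value "noun" are hypothetical and are not taken from Terminologue's actual data model.</p>
<p><code lang="python">
```python
from dataclasses import dataclass, field


@dataclass
class InlineLabel:
    start: int   # index of the first character the label covers
    stop: int    # index one past the last covered character
    label: str


@dataclass
class AnnotatedTerm:
    wording: str
    labels: list = field(default_factory=list)         # apply to the whole term
    inline_labels: list = field(default_factory=list)  # apply to a substring
    inflected: dict = field(default_factory=dict)      # e.g. genitive, plural

    def labelled_words(self):
        """Pair each inline label with the stretch of wording it covers."""
        return [(self.wording[a.start:a.stop], a.label) for a in self.inline_labels]


term = AnnotatedTerm(
    wording="gléas freagartha",
    inline_labels=[InlineLabel(0, 5, "noun")],  # hypothetical label on the head noun
)
pairs = term.labelled_words()
```
</code></p>
<p>In this sketch the label is attached to the head noun only, while the labels and inflected forms of the term as a whole are kept in separate fields.</p></div>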
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Term sharing</head><p>Because terms in NTDI contain a lot of grammatical annotation and because many terms are polysemous (one term designates multiple concepts), issues of duplication and inconsistency have arisen: entering the same term into several entries requires duplicate effort and can result in inconsistencies (for example, when a mistaken grammatical label is corrected in one entry but not in another). To prevent duplication and to enforce consistency, NTDI (and Terminologue) has a feature which allows a term to be shared among several entries. Any changes made to the term in one entry (including changes to its grammatical annotation or to its inflected forms) automatically become visible in the other entries too. Approximately 15% of terms in the termbase are shared like this. This is an example of a database design feature which is motivated not by the user's information needs (the end-users are probably not even aware of it) but by the data-governance needs of the terminologists: the need to eliminate duplicate labour and inconsistency.</p></div>
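<div xmlns="http://www.tei-c.org/ns/1.0"><p>The idea of term sharing can be sketched as follows. This is a simplified illustration under our own assumptions, not a description of how Terminologue stores its data: the class names, field names and label values are hypothetical. The point is only that a term is stored once and referenced from several entries, so a correction made through one entry is seen through all of them.</p>
<p><code lang="python">
```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A term stored once, together with its grammatical annotation."""
    wording: str
    lang: str
    labels: list = field(default_factory=list)     # e.g. part of speech, gender
    inflected: dict = field(default_factory=dict)  # e.g. genitive, plural


@dataclass
class Entry:
    """An entry (one concept) that references shared Term objects."""
    entry_id: int
    terms: list = field(default_factory=list)


# One Term object referenced from three entries (illustrative data only;
# the second label is entered wrongly on purpose).
shared = Term("airgead", "ga", labels=["noun", "feminine"])
entries = [Entry(i, terms=[shared]) for i in (1, 2, 3)]

# Correcting the label through any one entry corrects it in all of them.
entries[0].terms[0].labels[1] = "masculine"
genders = [e.terms[0].labels[1] for e in entries]
```
</code></p>
<p>Had each entry held its own copy of the term, the same correction would have had to be repeated three times, which is exactly the duplicate labour the feature is meant to avoid.</p></div>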
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">No ontologies</head><p>It is popular in terminology to organise entries into networks of is-a, has-a and other relations, thus building entire ontologies <ref type="bibr" target="#b5">[6]</ref>. Ontologies are useful in a knowledge-engineering context where the goal is to enable the user to explore and understand an entire domain. In NTDI, however, this goal is almost absent. Our website traffic statistics show that most users consult NTDI not to explore an entire domain but simply to obtain translations of individual terms. NTDI users typically consult the termbase while they are doing something else: translating or writing. Because of this, the software behind NTDI (Terminologue) has no ontology-building features. The only explicit entry-to-entry relation available is a simple "see also" relation; other relations are implicit in our relatively rich scheme of hierarchical domain labels. We find that this is sufficient for the information needs of NTDI's users.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.4.">Optional hiding of information</head><p>It is a truism that when publishing lexical resources online as opposed to on paper, one does not need to worry about space constraints, as computer memory is practically unlimited. But this does not mean that terminological entries can be arbitrarily long: we still need to take the user's cognitive capacity into account and avoid creating a situation of information overload <ref type="bibr" target="#b6">[7]</ref>. For this reason NTDI (and Terminologue) has a feature which allows the terminologists to label certain parts of an entry as non-essential, such as protracted citations from sources, deprecated terms or certain usage examples. Such parts are hidden by default in the public user interface, while users who want to view them can reveal them by clicking a 'plus' icon.</p></div>
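<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal sketch of this behaviour is given below. It assumes that each part of an entry carries a boolean flag set by the terminologist; the flag name, the function name and the sample texts are hypothetical placeholders rather than NTDI data or Terminologue code.</p>
<p><code lang="python">
```python
def visible_parts(parts, reveal=False):
    """Return the parts to display; non-essential ones only when revealed."""
    return [p for p in parts if reveal or not p["non_essential"]]


# Hypothetical entry; the texts are placeholders, not NTDI data.
entry = [
    {"kind": "term", "text": "preferred term", "non_essential": False},
    {"kind": "term", "text": "deprecated term", "non_essential": True},
    {"kind": "example", "text": "usage example", "non_essential": True},
]

default_view = [p["text"] for p in visible_parts(entry)]
expanded_view = [p["text"] for p in visible_parts(entry, reveal=True)]
```
</code></p>
<p>The default view corresponds to what the public website shows initially, and the expanded view to what the user sees after clicking the 'plus' icon.</p></div>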
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Conclusion</head><p>The termbase described in this paper departs from established practice in terminology. Many of NTDI's structural features are difficult to map onto structural categories common in other terminological software and in interchange standards such as TBX (for example, TBX has no notion of term sharing). We have attempted to explain in this paper that this divergence is not arbitrary but motivated: motivated by the genre of the termbase (it is a public-service termbase), motivated by the information needs of the end-users, and last but not least, motivated by the data-governance needs of the terminologists.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The Irish multiword term gléas freagartha with a part-of-speech label attached to the head noun and two inflected forms (genitive case and plural). Editorial interface on the left, public website on the right.</figDesc><graphic coords="3,89.29,84.19,416.70,166.33" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: The Irish term airgead, with all its grammatical annotation, is shared by three entries.</figDesc><graphic coords="3,89.29,308.65,416.70,159.77" type="bitmap" /></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>The NTDI is managed by the Gaois research group in Fiontar &amp; Scoil na Gaeilge, Dublin City University in partnership with the Irish Terminology Committee, Foras na Gaeilge.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">The Focal.ie National Terminology Database for Irish: software demonstration</title>
		<author>
			<persName><forename type="first">M</forename><surname>Měchura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ó Raghallaigh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 14th EURALEX International Congress, Fryske Akademy</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Dykstra</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Schoonheim</surname></persName>
		</editor>
		<meeting>the 14th EURALEX International Congress, Fryske Akademy<address><addrLine>Leeuwarden/Ljouwert, The Netherlands</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="937" to="948" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Building on a terminology resource - the Irish experience</title>
		<author>
			<persName><forename type="first">C</forename><surname>Nic Pháidín</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Ó Cleircín</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ú</forename><surname>Bhreathnach</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 14th EURALEX International Congress, Fryske Akademy</title>
				<editor>
			<persName><forename type="first">A</forename><surname>Dykstra</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Schoonheim</surname></persName>
		</editor>
		<meeting>the 14th EURALEX International Congress, Fryske Akademy<address><addrLine>Leeuwarden/Ljouwert, The Netherlands</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2010">2010</date>
			<biblScope unit="page" from="954" to="965" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Introducing Terminologue: a cloud-based, open-source terminology management tool</title>
		<author>
			<persName><forename type="first">M</forename><surname>Měchura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ó Raghallaigh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Presented at XIX EURALEX International Congress</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Manual of Specialised Lexicography: The preparation of specialised dictionaries</title>
		<idno type="DOI">10.1075/btl.12</idno>
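		<editor>
			<persName><forename type="first">H</forename><surname>Bergenholtz</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Tarp</surname></persName>
		</editor>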
		<imprint>
			<date type="published" when="1995">1995</date>
			<publisher>John Benjamins Publishing Company</publisher>
			<biblScope unit="volume">12</biblScope>
			<pubPlace>Amsterdam</pubPlace>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">The process of asking questions</title>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">S</forename><surname>Taylor</surname></persName>
		</author>
		<idno type="DOI">10.1002/asi.5090130405</idno>
	</analytic>
	<monogr>
		<title level="j">American Documentation</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page" from="391" to="396" />
			<date type="published" when="1962">1962</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Applying ontologies to terminology: Advantages and disadvantages</title>
		<author>
			<persName><forename type="first">I</forename><surname>Muñoz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R</forename><surname>Zambrana</surname></persName>
		</author>
		<idno type="DOI">10.7146/hjlcb.v26i51.97438</idno>
	</analytic>
	<monogr>
		<title level="j">Hermes: Journal of Language and Communication in Business</title>
		<imprint>
			<biblScope unit="volume">51</biblScope>
			<biblScope unit="page" from="65" to="77" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Dictionary users in the digital revolution</title>
		<author>
			<persName><forename type="first">R</forename><surname>Lew</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G.-M</forename><surname>De Schryver</surname></persName>
		</author>
		<idno type="DOI">10.1093/ijl/ecu011</idno>
	</analytic>
	<monogr>
		<title level="j">International Journal of Lexicography</title>
		<imprint>
			<biblScope unit="volume">27</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
