<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Contemporary Voices in Ancient Tongue: Integrating Papal Encyclicals into the LiLa KB</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aurora Alagni</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Federica Iurescia</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eleonora Litta</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Università Cattolica del Sacro Cuore, CIRCSE Research Centre</institution>
          ,
          <addr-line>Largo Gemelli, 1, 20123 Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>This paper presents the integration of a new textual resource, the Papal Encyclicals corpus, into the LiLa: Linking Latin Knowledge Base. The inclusion of three recent Encyclicals authored by Pope Francis (Lumen Fidei, Laudato si', and Fratres omnes) significantly enriches the LiLa Knowledge Base by extending its chronological coverage and introducing contemporary Latin vocabulary. The linking process involved automatic tokenisation, part-of-speech tagging, and lemmatisation using the LiLa Text Linker, followed by manual validation and disambiguation. The newly added lemmas fall into three categories: Latinized anthroponyms and toponyms, ethnic adjectives, and neologisms. These lexical additions reflect both a modernising trend in Vatican Latin and diverse morphological and semantic processes, including borrowing, calquing, and analogy-based reconstruction. The resource also opens avenues for analysing the stylistic and rhetorical features of Papal Encyclicals as a genre.</p>
      </abstract>
      <kwd-group>
        <kwd>Linked Open Data</kwd>
        <kwd>Latin</kwd>
        <kwd>textual resources</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>1.1. LiLa</title>
      <p>
        LiLa (Linking Latin) is a Linked Open Data (LOD) Knowledge Base (KB).<sup>1</sup> LiLa has been built to foster interoperability across textual and lexical resources for Latin [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
The LiLa KB relies on two primary components:
      </p>
    </sec>
    <sec id="sec-3">
      <p>Lexical resources are linked to the Lemma Bank by connecting their lexical entries to their canonical forms. The single word occurrences (tokens) in textual resources are connected to the corresponding lemma in the LiLa Lemma Bank.</p>
      <p>1.2. Papal Encyclicals</p>
    </sec>
    <sec id="sec-4">
      <p>† This paper is the result of the collaboration between the authors. For the specific concerns of the Italian academic attribution system, Federica Iurescia is responsible for section 1; Eleonora Litta for section 2; Aurora Alagni for section 3. Section 4 was collaboratively written by all authors.</p>
      <p>aurora.alagni@outlook.it (A. Alagni); federica.iurescia@unicatt.it (F. Iurescia); EleonoraMaria.Litta@unicatt.it (E. Litta)</p>
      <p>ORCID: 0000-0001-5100-5539 (F. Iurescia); 0000-0002-0499-997X (E. Litta)</p>
      <p>© 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p><sup>1</sup> http://lila-erc.eu</p>
      <p><sup>2</sup> http://lila-erc.eu/data/id/lemma/LemmaBank</p>
      <p>
        <sup>3</sup> The collection of lemmas in the Lemma Bank originates from LEMLAT 3.0, a morphological analyzer [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p><sup>4</sup> The list of resources interlinked in LiLa is available at https://lila-erc.eu/data-page/.</p>
      <p><sup>5</sup> http://lila-erc.eu/data/corpora/PapalEncyclicals/id/corpus.</p>
      <p><sup>6</sup> At the moment of writing, the encyclical Dilexit nos, published in 2024, was not available.</p>
      <p><sup>7</sup> https://www.vatican.va/content/francesco/la/encyclicals.index.html#encyclicals.</p>
      <p>
genres represented. Moreover, the addition of this corpus not only expands the Lemma Bank with new lemmas but also enables the study of lexical innovation strategies employed to express modern concepts in Latin.</p>
      <p>2. Linking</p>
      <p>2.1. Linking</p>
      <p>The initial phase of the linking process involved the acquisition of plain-text versions of the three texts, retrieved from the official Vatican website. Collectively, these texts comprise 89,463 tokens, including punctuation and numerical elements associated with verse numbering and biblical references.</p>
      <p>
        Tokenisation, sentence segmentation, part-of-speech (PoS) tagging, and lemmatisation were carried out automatically using the LiLa Text Linker, an NLP tool specifically designed for Latin. Table 1 displays the number of tokens per letter, excluding punctuation and numbering. Developed as part of the user-oriented services provided by LiLa [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], the Text Linker not only performs linguistic annotation but also establishes links between the annotated output and corresponding entries in the Lemma Bank. For PoS tagging and lemmatisation, the system relies on a UDPipe model trained on customised data. The linking procedure operates as follows: whenever the lemmatisation of a token yields a lemma–PoS pair that exactly matches a corresponding entry in the LiLa Lemma Bank, the system returns the URI of the matched lemma. These cases are referred to as 1:1 matches. In instances where the same lemma–PoS combination corresponds to multiple entries in LiLa, the system returns all relevant URIs, constituting 1:N matches. Conversely, when no entry in the Lemma Bank corresponds to the lemma–PoS pair produced during lemmatisation, the system returns no URI. These instances are classified as 1:0 matches. The output of this task is in Table 2.
      </p>
      <p>Inevitably, the output of the lemmatisation process was not definitive. The accuracy of the 1:1 matches amounts to around 97%. However, in certain cases, incorrect URIs were assigned.<sup>8</sup> One common source of error was the lemmatiser’s assumption that any word beginning with a capital letter should be classified as a proper noun (PROPN). As a result, nouns occurring at the beginning of a sentence were sometimes misclassified, leading to erroneous matches when a proper noun homograph exists for a regular noun (e.g., Amor, the Roman god of love, versus amor, the common noun for ‘love’). Another frequent error involved the lemmatisation of quod, which was uniformly tagged as a pronoun (PRON), despite its potential to function as a subordinating conjunction (SCONJ) or determiner (DET), depending on its syntactic role in the sentence. Similarly, quam was consistently tagged as a subordinating conjunction (SCONJ), although it could also serve as a pronoun (PRON) or a determiner (DET).</p>
      <p><sup>8</sup> Manual intervention was required in approximately 3% of the 1:1 matches subset.</p>
      <p><sup>9</sup> For an explanation of why 1:N matches regularly arise in the process of linking a textual resource to the LiLa LB, see the detailed report on how homography was handled during the integration of the LASLA corpus into the LiLa Knowledge Base [7, pp. 30–31].</p>
    </sec>
    <sec id="sec-5">
      <p>Errors can arise for various reasons. As a result, the lemmatisation output was subjected to systematic manual review and correction by trained annotators, as well as disambiguation of 1:N matches.<sup>9</sup> Some of the one-to-zero matches also resulted from errors in lemmatisation or tokenisation. In particular, it was necessary in all instances to segment tokens containing enclitics, such as -que, -ne, and -ue, in order to enable accurate matching. For example, in tokens like socialemque ‘and (something/someone) social’, eritne ‘will it be’, licetne ‘is it allowed’, and practicumue ‘or (something) practical’, proper token splitting was required so that appropriate URIs could be assigned to both the first token (noun, verb, or adjective)
and to the enclitic.</p>
      <p>3. Papal Encyclicals in LiLa: Adding New Lemmas</p>
      <p>Following the disambiguation process, several lemmas remain unlinked to LiLa, as they are not yet represented in the Knowledge Base. A thorough analysis of the 1:0 match types is necessary before considering their inclusion in the Lemma Bank. A subset of these unmatched lemmas corresponds to non-Latin words, which are not intended to be integrated into the Knowledge Base. These include: non-Latinized anthroponyms, such as Nietzsche, Dostoevskij (LF), King, and Al-Tayyeb (FO); words transliterated into the Latin alphabet from other languages, e.g., emûnah from Hebrew or didachés from Greek (LF); acronyms such as DNA, OGM (LS); and compound words joined by a hyphen or other special characters, such as Deo-Amen (LF) or Rio+20 (LS).</p>
      <p>In addition, a specific subset of the 1:0 matches (consisting of orthographic variants, dialectal forms, or alternative spellings of standardized forms) required targeted handling. In accordance with the OntoLex model used in LiLa, these cases have been incorporated as written representations (ontolex:writtenRep) of existing lemmas already present in the Lemma Bank [8, p. 69]. Specifically, these cases result from greater accuracy in transliteration from Hebrew (Bethlehem, LF; Hillel, FO), from the gemination of the sibilant in the toponym Assisium (FO; present in the Lemma Bank as Asisium), from the abandonment of a more Hellenising or archaic spelling of Babilonia (LS; listed in the Lemma Bank as Babylonia), and from a different graphical representation of the consonant cluster [ks] in exstraneus (FO; found in the Lemma Bank as Extraneus). These examples may reflect a modernising tendency in Latin spelling practices adopted by the Vatican, possibly aimed at aligning Latin orthography more closely with modern Italian spelling conventions (cf. Assisi, Babilonia, Estraneo). The same tendency will be noted again in later parts of the analysis.</p>
      <p>The lemmas that have been added to the LiLa Knowledge Base, on the other hand, can be classified into three main categories.</p>
      <p>The first category of lemmas added to the Lemma Bank includes Latinised anthroponyms and toponyms. Among the anthroponyms are Desmondus, Martinus Luterus, Irenaeus (FO), Ludouicus (LF), the patronymic Aligherius (LS), Teresia, and Bonauentura (LS, LF). These figures, cited in the Encyclicals, can play one of two roles: that of auctoritas or exemplum. In the case of Dante Alighieri, Saint Bonaventure, Saint Irenaeus, and Ludwig Wittgenstein, Pope Francis primarily refers to their words and works to support his arguments. For example, he cites Canto XXXIII of Dante’s Paradiso, particularly the verse “l’amor che move il sole e l’altre stelle”, to emphasise that “God’s love is the fundamental moving force in all created things” (LS, 77).<sup>10</sup> Similarly, he references Wittgenstein’s Vermischte Bemerkungen, where the philosopher discusses the “connection between faith and certainty” (LF, 27), and Irenaeus of Lyon’s Adversus haereses, particularly the passage that uses the metaphor of melody to explain how different sounds can come from the same composer, just as each of us comes from the same Creator (FO, 58). By contrast, Desmond Tutu, Martin Luther King Jr., Mother Teresa of Calcutta, and Saint Thérèse of Lisieux serve as exempla to be emulated: for their acts of universal brotherhood despite religious differences, their faith in suffering, and their daily gestures of love and peace. As for toponyms, the category includes Australia, Columbia, Corea, Croatia (FO), and Zelandia (LS), all appearing in the genitive case following episcopi, as well as Congus (LS) and Hiroshima (FO), cited respectively as examples of the importance of preserving land and biodiversity, and of the moral imperative not to forget historical tragedies to which “we must never grow accustomed or inured” (FO, 248).</p>
      <p><sup>10</sup> The text of this and other encyclicals is available in several languages at: https://www.vatican.va/content/francesco/it/encyclicals.index.html.</p>
      <p>The second category consists of ethnic adjectives. Of the 15 instances found, 12 appear for the first time in the encyclical Laudato si’, two in Fratres omnes, and only one in Lumen fidei. From a derivational morphological perspective, these adjectives can be divided into three main types. The largest group (10 lemmas) consists of denominal adjectives derived from a toponym with the suffix -ensis (Basileensis, LS), including its extended form -iensis (Canadiensis, LS), a suffix typically used in Latin for forming ethnic adjectives [9, p. 439]. The second group includes Apparitiopolitanus, Boliuianus, Paraguaianus (LS), Nazarethanus (LS, FO), and Bonaeropolitanus (FO), formed with the equally canonical suffix -anus [9, p. 410]. A further distinction, intersecting with the previously discussed category of Latinized toponyms, concerns the nature of the geographical names from which these adjectives are derived. Some are adapted borrowings (*Basilea from Basileensis), while others seem to be structural calques [10, pp. 118, 122], such as *Flumenianuarius (from Flumenianuariensis, “of the city of Rio de Janeiro”, LS). Some of these calques may undergo an additional morphological process, i.e. compounding with the Greek lexeme polis, resulting in forms like *Apparitiopolis (from Apparitiopolitanus, “of the city of Aparecida”, LS) and *Bonaeropolis (from Bonaeropolitanus, “of the city of Buenos Aires”, FO). Morphologically, the adjectives belong either to the second declension with two endings (first group) or to the first declension (second group), depending on the suffixation process. Semantically, the
adjectives occur in different contexts: some appear in the genitive plural linked to episcopi (6); others refer to cities where documents, declarations, or environmental agreements were signed (4); two are characteristic attributes of female saints. Bonaeropolitanus refers to the positive influence of Jewish culture in Rio de Janeiro, while Nazarethanus, in both instances, occurs in the feminine form, dependent on familia.</p>
      <p>The final category of new lemmas linked to the Lemma Bank consists of neologisms. The introduction of new lexical units into the inventory of a language can occur not only through internal resources and mechanisms, but also by drawing on elements from other languages, either through borrowing or calquing [11, p. 281]. In the present case, the linguistic influence is unidirectional, from Italian to Latin, which is unsurprising, given that Italian, although descended from Latin, is a living language with an active speaker community, unlike Latin. However, what is particularly noteworthy is that some of the Italian terms themselves are the result of interference from other languages. These layers of influence have contributed significantly to the enrichment of the Latin lexicon recorded in the LiLa Knowledge Base. Across the three encyclicals of Pope Francis under consideration, 234 neologisms have been identified, though they are not evenly distributed. In the first and shortest encyclical (see Table 1), Lumen Fidei, 32 neologisms appear for the first time. In the second, Laudato si’, 126 new formations are attested. Finally, in the third and longest encyclical, Fratres omnes, 76 neologisms are recorded.</p>
      <p>Before proceeding with the analysis of this final category, a preliminary methodological clarification is required. In 1992, the Libreria Editrice Vaticana published the Lexicon Recentis Latinitatis (hereafter LRL), a lexicon that translates into Latin “many new words introduced by this era”, generated “while preserving the norms of philology and the character of the Latin language”.<sup>11</sup> This lexicon was fundamental for aligning word forms in the Encyclicals with the corresponding correct lemma. However, its application has also revealed the need for updates. Of the 234 lemmas analyzed, 145 are attested in the LRL. The remaining 89 were manually reconstructed by observing the word forms in their textual context. In some cases, reconstruction was straightforward; in others, it was not possible to determine the lemma with certainty. In these cases, the principle of analogy was applied. For instance, among the 36 neologisms formed with the suffix -ismus, half are found in the LRL. Of the remaining 18, only four appear in the nominative case. For the other 14, given the absence of modifying adjectives that could disambiguate gender (and therefore the case, which might otherwise suggest a nominative in -ismum), the lemmas were assigned masculine gender and classified as second-declension nouns with nominative in -us, based on analogy with the attested forms. For instance, the tokens dynamismum (LF, accusative) or deconstructionismi (FO, genitive), are entered into the KB as dynamismus and deconstructionismus, respectively. These reconstructions follow the model of lemmata such as fatalismus and determinismus, both attested in the LRL, or anthropocentrismus (LS), which is already found in the nominative form within the corpus. Another example involves nine lemmas pertaining to the semantic field of “Chemistry and Mineralogy” (see below). Among these, six such as carbonium (LS) and fermentum (FO) are present in the LRL as Latin equivalents of ‘carbon’ and ‘enzyme’ respectively, and are clearly neuter nouns of the second declension. One more, dioxydum (LS), appears in the nominative. By analogy, the word forms cyanido and nitrogeni were reconstructed as cyanidum and nitrogenum and added to the KB as neuter second-declension nouns. Moreover, the LRL further reflects a modernizing tendency in the lexical choices of the Latin used in the Encyclicals. A number of Italian terms that the LRL renders using periphrasis, in accordance with its assertion that “Latin is less suited (than Greek) to compounding words into one”,<sup>12</sup> reappear in the Encyclicals as single new lexical items. These are often modeled directly on Italian, incorporating morphological adaptations. For example, the Italian noun totalitarismo is translated in the LRL as “absolutum civitatis regimen”, but appears in both Lumen Fidei and Fratres omnes as totalitarismus. Similarly, the adjective mammifero, which is translated in the LRL as “belua mammans”, appears in Laudato si’ as mammiferum, clearly modeled on the Italian form. Having established the necessary methodological premises, we can now proceed with an analysis of the neologisms. These may be classified into adjectives, nouns, and verbs.</p>
      <p>As for adjectives, there are a total of 99. From a morphological perspective, 68 are first-class adjectives; 3 are first-class adjectives ending in -ius (communitarius, consumptorius, fragmentarius); 27 are second-class adjectives with two endings; 1 is a second-class adjective with a single ending (globalizans, present participle of *globalizo). From a derivational standpoint, first-class adjectives are typically denominal, formed using the suffixes -icus (atomicus) and -osus (gasiosus), which are commonly employed in Latin for this type of morphological construction [9, p. 1125]. The nouns from which these adjectives are derived originate from Ancient Greek (agnosticus),<sup>13</sup> Classical Latin (Prometheicus), Medieval Latin
(inclusiuus), Scientific Latin (electricus), Modern Latin (aestheticus), and Ecclesiastical Latin (encyclicus). They also result from interlinguistic influence between Italian and modern languages such as French (acusticus), English (romanticus), Czech (roboticus), German (nazistus). The second-class adjectives with two endings are formed either through suffixation with -alis/-aris (structuralis, polaris) or with -bilis (renouabilis). These are derived from nouns of various origins: Greek (theologalis), Classical Latin (optionalis), Medieval Latin (interdisciplinaris), Late Latin (exsistentialis), Scientific Latin (molecularis), Legal Latin (solidalis), and modern languages, such as English (internationalis). Remaining within the scope of derivation, it is particularly noteworthy that many of the neologisms exhibit prefixal or compositional structures prior to suffixation. These include prefixoids such as inter- (as in interdisciplinaris, internationalis), multi- (multilateralis, multinationalis, multipolaris), and trans- (transgeneticus, transnationalis). Other frequent compositional elements include bases such as anthropo- (anthropocentricus, anthropologicus), auto- (autonomus, autotestimonialis), and techno- (technocraticus, technologicus). Particularly prominent is the suffixoid -logicus (methodologicus, oecologicus, technologicus), which highlights how these adjectival neologisms respond to the growing need for terminology that addresses the study of the human being, its place within an increasingly interconnected world, the technologies it produces, and the discourse surrounding it.</p>
      <p><sup>11</sup> Author’s translation from Latin: “multa verba nova, quae haec aetas induxit” and “servatis normis disciplinae philologae et indole linguae Latinae” [12, p. 7].</p>
      <p><sup>12</sup> Author’s translation from Latin: “linguam Latinam minus aptam esse (quam Graecam) ad componenda verba ita ut in unum coalescant” [12, p. 7].</p>
      <p><sup>13</sup> From this point on, only one example per source language or variety is cited. The list is not intended to be exhaustive; this editorial choice was made for space reasons. The linguistic analysis conducted in this study is primarily based on the Grande dizionario della lingua italiana, available at https://www.gdli.it, which served as the main reference for determining the historical and etymological origins of the lemmas.</p>
      <p>As for new nouns, there are 131 in total. Morphologically, the majority belong to the second declension (64, of which 22 are neuter), followed, at a significant distance, by the third declension (33, of which 4 are neuter and 1 masculine), the first declension (32, with only 1 masculine noun, asceta), and finally, just 2 nouns belong to the fourth declension. Particularly interesting data emerge from the derivational morphological analysis of these nouns: 36 are denominal nouns formed with the suffix -ismus, which is used to create abstract nouns referring to religious, political, social, philosophical, literary, or artistic doctrines and movements (dualismus, ascetismus, absolutismus, populismus, materialismus, romanticismus), as well as attitudes, trends, collective or individual traits (fanatismus, localismus, globalismus), behaviors or actions (fatalismus), and even conditions or qualities, including moral or physical defects and harmful habits (egoismus, narcissismus). The high number of neologisms formed with this suffix clearly demonstrates not only the increasing need for its use but also its overuse in contemporary language.<sup>14</sup> There are also 11 nouns ending in -tas, all abstract and conveying a positive meaning, such as actuositas, biodiversitas, solidarietas, as well as nouns related to the sphere of the individual, such as sacralitas, responsalitas, intimitas, and sexualitas. Another noteworthy suffix is -tio, used to form deverbal nouns denoting actions, such as dissentio, immigratio, and globalizatio. Among the most common combining forms is -logia (from the Greek logos, and also the basis for the suffixoid -logicus, see above), which forms nouns such as ideologia and oecologia. Also worth noting is that, in the case of nouns as well, some of foreign origin have entered the Latin lexicon via Italian. Examples include imanus from Arabic (imam); three chemistry-related terms from French: methanum, nitrogenum, dioxydum; mangrouia from English; three terms with the combining form gen-, genetica, genoma, genum, from German. There are also nouns derived from Classical Latin (uniuersalismus), Late Latin (reciprocitas), Legal Latin (solidalitas), Medieval Latin (represalia), and Scientific Latin (gasium). Finally, particularly interesting from a derivational point of view are several structural calques from other languages: tromocratia, with its derivative tromocratus, from French terrorisme (from terreur + -isme); autocinetum or autoraeda from French automobile; caeliscalpium and interrete from English skyscraper and internet; and ferriuia from German Eisenbahn.</p>
      <p>Finally, there are only four verbal neologisms. Of these, two belong to the first conjugation (obstaculo, subordino), one to the third conjugation (interconecto), while the remaining verb, secumfert, is classified as anomalous. This is due to its composition: it is formed by the enclitic attachment of the reflexive pronoun se to the preposition cum, followed by the verb fero, which itself is classified as an anomalous verb. From a derivational morphological perspective, three of these new verbs are the result of compounding, having been created by adding a prefix (sub-, inter-) or a prefixoid (secum-) to an already existing Latin verb. In contrast, obstaculo has undergone a derivational process, being a denominal verb derived from obstaculum, ‘obstacle’.</p>
      <p><sup>14</sup> See, for example, the corresponding entry in the Treccani online dictionary at https://www.treccani.it/vocabolario/ismo/.</p>
      <p>
        <sup>15</sup> For the process of identifying and refining domains, see BabelDomains: Large-Scale Domain Labeling of Lexical Resources [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>From a semantic perspective, the classification of neologisms pertaining to the three parts of speech was conducted by mapping them to the 41 domains, defined as “spheres of activity or knowledge”, established by BabelNet, a multilingual semantic network that integrates diverse resources, including WordNet, Wikipedia, the Italian WordNet and Wiktionary [13, p. 4560].<sup>15</sup> Across the three Encyclicals, and counting the occurrences of individual word forms, the neologisms most frequently attested (167 tokens) belong to the domain “Environment and meteorology”, even though this domain comprises only nine lemmas. This result is unsurprising, considering that Pope Francis is widely regarded as one of the Popes most committed to environmental and climate-related issues. Notably, the adjective ambitalis alone appears 47 times. Ecology, represented through terms such as oecologia, oecologicus, and oecosystema, is a central theme of his pontificate. Throughout the texts, the Pope repeatedly reminds both global leaders and all people (geosystema) of their responsibility to protect and preserve biodiversity (biodiuersitas, biosphaera). This is followed by neologisms belonging to the domain “Philosophy, psychology and behavior” (58 lemmas, 159 tokens),
“Culture, anthropology and society” (33 lemmas, 135 tokens), and “Politics, government and nobility” (21 lemmas, 103 tokens). As previously mentioned, philosophical reflection on the human condition is central to the Encyclicals, and is addressed from psychological (actuositas, creatiuitas, egoismus, exsistentialis, infrahumanus, responsalitas, uulnerabilitas), social (communitarius, discriminatorius, ethicisticus, phyleticus, xenophobus), and political (absolutismus, demagogicus, nazistus, sinistrorsus, technocraticus) angles. There is a noticeable drop in the number of occurrences for neologisms in the domain “Craft, engineering and technology” (8 lemmas, 39 tokens), which nevertheless reflect the idea of humanity as the primary agent of progress (biotechnologia, nanotechnologia, technica) and technological innovation (roboticus, telegraphum). At this point, and with the same number of occurrences (38) as those in the domain “Chemistry and mineralogy”, appear the neologisms of the domain “Religion, mysticism and mythology” (22 lemmas). This is particularly significant, as one might have expected this to be among the most represented domains. The data instead confirm that the Encyclicals are not intended solely for Christian audiences, but are addressed to people of all faiths, promoting values intrinsic to the notion of humanity, not exclusively of Christianity. In fact, among the lemmas within this domain, only a few are explicitly tied to the Christian faith (catechumenatus, christifidelis, christologicus, encyclicus, liturgia, trinitarius), while others testify to the variety of world religions and belief systems (agnosticus, ascetismus, dualismus, sacralitas, syncretismus, theologalis). For the distribution of domains, see Figure 1.</p>
      <p>Figure 1: Distribution of neologism occurrences in Papal Encyclicals by Semantic Domain</p>
      <p>The incorporation of new lemmas of modern and contemporary origin into the Lemma Bank, using the corpus of the three Encyclicals promulgated by Pope Francis between 2013 and 2020, has proven to be highly fruitful from both a quantitative and a qualitative standpoint.</p>
      <p>Undoubtedly, the efforts involved in the development and maintenance of a project such as LiLa, which was conceived as a network of interconnected language resources specifically for Latin, intersect with those of the Catholic Church, which continues to employ Latin as a universal language of communication. Both share a common goal: “to support the commitment to a greater knowledge and more competent use of Latin”.<sup>16</sup></p>
      <p><sup>16</sup> Citation from the English version of the Apostolic Letter Latina Lingua, promulgated by Pope Benedict XVI on November 10, 2012. The full text is available online in eight languages at https://www.vatican.va/content/benedict-xvi/la/motu_proprio/documents/hf_ben-xvi_motu-proprio_20121110_latina-lingua.html.</p>
      <p>4. Conclusions and Future Works</p>
      <p>
        This paper has presented the integration of a new textual resource, the Papal Encyclicals corpus, into the LiLa Knowledge Base (KB). Although this is not the first instance of integrating a new corpus into LiLa (recent additions include Augustine of Hippo’s Confessiones,<sup>17</sup> de Ciuitate Dei,<sup>18</sup> de Trinitate,<sup>19</sup> and Ovid’s Tristia and Epistulae ex Ponto [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]), this first release of the Papal Encyclicals corpus is the result of a fine-grained manual revision of the automatic output. It constitutes a gold standard, whereas other textual resources linked to LiLa did not benefit from such an accurate manual revision, as in the case of the Biblioteca Digitale di Testi Latini Tardoantichi, where the considerably larger size of the corpus posed a limiting factor.<sup>20</sup>
      </p>
      <p><sup>17</sup> https://github.com/CIRCSE/AugustiniConfessiones.</p>
      <p><sup>18</sup> https://github.com/CIRCSE/AugustiniDeCiuitateDei.</p>
      <p><sup>19</sup> https://github.com/CIRCSE/AugustiniDeTrinitate.</p>
      <p><sup>20</sup> https://github.com/CIRCSE/digilibLT.</p>
      <p>Furthermore, the inclusion of the Papal Encyclicals corpus is significant on a more fundamental level. A core assumption about Latin corpora is that they are static, since Latin is no longer a spoken language with native speakers. As a result, existing texts have been the subject of intense and ongoing scholarly investigation. For example, Confessiones, de Ciuitate Dei and de Trinitate, now linked to LiLa, have been studied for centuries from a variety of perspectives, ranging from psychological to strictly philological. Ovid’s exilic writings have a long tradition of linguistic, historical and thematic analysis. In contrast, the Latin texts of Papal Encyclicals have not yet been the focus of consistent scholarly study. This means that the work presented in this paper is not built upon an
existing body of research, but is instead pioneering and
foundational. It lays the groundwork for future studies
and opens the door to a renewed consideration of Latin
as a living language in specific, ongoing institutional
contexts. Within the LiLa framework, the inclusion of
a corpus that engages with contemporary concepts and
referents significantly enriches the KB along several
dimensions. First, the Lemma Bank has been expanded
with new lexical items, enabling the study of linguistic
strategies employed to create lemmas for concepts that
did not exist in antiquity. This opens avenues for
investigating the mechanisms of lexical innovation in Latin,
particularly in the context of modern discourse. Second,
the addition of the Encyclicals corpus offers a valuable
opportunity to explore the distinctive linguistic and stylistic
features of Papal Encyclicals as a genre. This resource
allows for a more nuanced understanding of its rhetorical
structures, specialised vocabulary, and register-specific
phenomena. Third, the corpus contributes to extending
the diachronic coverage of texts represented in the LiLa
KB, facilitating longitudinal studies of Latin usage and
lexical evolution across time. Future work will focus on
expanding this initial integration to include the complete
set of Latin Encyclicals authored by all Popes. This will
support in-genre, cross-temporal comparisons, enabling
scholars to trace linguistic trends and shifts within a
consistent textual domain. Additionally, further analysis
of unmatched lemmas and their potential inclusion will
continue to refine the coverage and connectivity of the
KB.</p>
      <p>Declaration on Generative AI</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Passarotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mambrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Franzini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Cecchini</surname>
          </string-name>
          , E. Litta, G. Moretti,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ruffolo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          ,
          <article-title>Interlinking through Lemmas. The Lexical Collection of the LiLa Knowledge Base of Linguistic Resources for Latin</article-title>
          ,
          <source>Studi e Saggi Linguistici</source>
          <volume>LVIII</volume>
          (
          <year>2020</year>
          )
          <fpage>177</fpage>
          -
          <lpage>212</lpage>
          . URL: https://www.studiesaggilinguistici.it/index.php/ssl/article/view/277. doi:10.4454/ssl.v58i1.277.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chiarcos</surname>
          </string-name>
          , POWLA: Modeling Linguistic Corpora in OWL/DL, in: E. Simperl,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Polleres</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Corcho</surname>
          </string-name>
          , V. Presutti (Eds.),
          <source>The Semantic Web: Research and Applications. 9th Extended Semantic Web Conference</source>
          , ESWC 2012, Heraklion, Crete, Greece, May 27-31, 2012, Proceedings,
          <source>number 7295 in Lecture Notes in Computer Science</source>
          , Springer, Berlin/Heidelberg, Germany,
          <year>2012</year>
          , pp.
          <fpage>225</fpage>
          -
          <lpage>239</lpage>
          . doi:10.1007/978-3-642-30284-8_22.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chiarcos</surname>
          </string-name>
          , M. Sukhareva, OLiA - Ontologies of Linguistic Annotation,
          <source>Semantic Web</source>
          <volume>6</volume>
          (
          <year>2015</year>
          )
          <fpage>379</fpage>
          -
          <lpage>386</lpage>
          . URL: https://www.semantic-web-journal.net/system/files/swj518_0.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bosque-Gil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gracia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Buitelaar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cimiano</surname>
          </string-name>
          ,
          <article-title>The OntoLex-Lemon Model: Development and Applications</article-title>
          , in:
          <source>Electronic lexicography in the 21st century. Proceedings of eLex 2017 conference</source>
          , Lexical Computing CZ s.r.o., Brno, Czech Republic,
          <year>2017</year>
          , pp.
          <fpage>587</fpage>
          -
          <lpage>597</lpage>
          . URL: https://elex.link/elex2017/wp-content/uploads/2017/09/paper36.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Passarotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Budassi</surname>
          </string-name>
          , E. Litta,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ruffolo</surname>
          </string-name>
          ,
          <article-title>The lemlat 3.0 package for morphological analysis of Latin</article-title>
          , in: G. Bouma, Y. Adesam (Eds.),
          <source>Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language</source>
          , Linköping University Electronic Press, Gothenburg,
          <year>2017</year>
          , pp.
          <fpage>24</fpage>
          -
          <lpage>31</lpage>
          . URL: https://aclanthology.org/W17-0506/.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Passarotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mambrini</surname>
          </string-name>
          ,
          G. Moretti,
          <article-title>The services of the LiLa knowledge base of interoperable linguistic resources for Latin</article-title>
          , in: C. Chiarcos, K. Gkirtzou, M. Ionov, F. Khan, J. P. McCrae, E. M. Ponsoda, P. M. Chozas (Eds.),
          <source>Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024</source>
          , ELRA and ICCL, Torino, Italia,
          <year>2024</year>
          , pp.
          <fpage>75</fpage>
          -
          <lpage>83</lpage>
          . URL: https://aclanthology.org/2024.ldl-1.10.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fantoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Passarotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mambrini</surname>
          </string-name>
          , G. Moretti,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ruffolo</surname>
          </string-name>
          ,
          <article-title>Linking the LASLA Corpus in the LiLa Knowledge Base of Interoperable Linguistic Resources for Latin</article-title>
          , in: T. Declerck,
          <string-name>
            <given-names>J. P.</given-names>
            <surname>McCrae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Montiel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chiarcos</surname>
          </string-name>
          , M. Ionov (Eds.),
          <source>Proceedings of the 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference</source>
          , European Language Resources Association, Marseille, France,
          <year>2022</year>
          , pp.
          <fpage>26</fpage>
          -
          <lpage>34</lpage>
          . URL: https://aclanthology.org/2022.ldl-1.4.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Mambrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. M.</given-names>
            <surname>Cecchini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Franzini</surname>
          </string-name>
          , E. Litta,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Passarotti</surname>
          </string-name>
          , P. Ruffolo,
          <article-title>LiLa: Linking Latin. Risorse linguistiche per il latino nel Semantic Web</article-title>
          (AIUCD 2019),
          <source>Umanistica Digitale</source>
          (
          <year>2020</year>
          ). URL: https://umanisticadigitale.unibo.it/article/view/9975. doi:10.6092/issn.2532-8816/9975, number: 8.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>G.</given-names>
            <surname>Rohlfs</surname>
          </string-name>
          ,
          <article-title>Grammatica storica della lingua italiana e dei suoi dialetti. Sintassi e formazione delle parole</article-title>
          , volume
          <volume>3</volume>
          , Giulio Einaudi editore, Torino,
          <year>1969</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>G.</given-names>
            <surname>Gobber</surname>
          </string-name>
          ,
          <article-title>Argomenti di linguistica</article-title>
          ,
          <source>ISU Università Cattolica, Milano</source>
          ,
          <year>2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Berruto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Cerruti</surname>
          </string-name>
          ,
          <article-title>La linguistica</article-title>
          .
          <source>Un corso introduttivo</source>
          , 2° ed.,
          <source>UTET Università, Torino</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>F.</given-names>
            <surname>Latinitas</surname>
          </string-name>
          , Lexicon recentis latinitatis,
          <source>Libreria Editrice Vaticana, Urbs Vaticana</source>
          ,
          <year>1992</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bevilacqua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Conia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Montagnini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cecconi</surname>
          </string-name>
          ,
          <article-title>Ten Years of BabelNet: A Survey</article-title>
          , volume
          <volume>5</volume>
          ,
          <year>2021</year>
          , pp.
          <fpage>4559</fpage>
          -
          <lpage>4567</lpage>
          . URL: https://www.ijcai.org/proceedings/2021/620. doi:10.24963/ijcai.2021/620, ISSN: 1045-0823.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>J.</given-names>
            <surname>Camacho-Collados</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Navigli</surname>
          </string-name>
          ,
          <article-title>BabelDomains: Large-Scale Domain Labeling of Lexical Resources</article-title>
          , in: M. Lapata, P. Blunsom, A. Koller (Eds.),
          <source>Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers</source>
          , Association for Computational Linguistics, Valencia, Spain,
          <year>2017</year>
          , pp.
          <fpage>223</fpage>
          -
          <lpage>228</lpage>
          . URL: https://aclanthology.org/E17-2036/.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Alagni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mambrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Passarotti</surname>
          </string-name>
          ,
          <article-title>Lifeless Winter without Break: Ovid's Exile Works and the LiLa Knowledge Base</article-title>
          , in: F. Dell'Orletta, A. Lenci, S. Montemagni, R. Sprugnoli (Eds.),
          <source>Proceedings of the 10th Italian Conference on Computational Linguistics (CLiC-it 2024)</source>
          , CEUR Workshop Proceedings, Pisa, Italy,
          <year>2024</year>
          , pp.
          <fpage>4</fpage>
          -
          <lpage>12</lpage>
          . URL: https://aclanthology.org/2024.clicit-1.2/.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>