Linking the Lewis & Short Dictionary to the LiLa Knowledge Base of Interoperable Linguistic Resources for Latin

Francesco Mambrini, Eleonora Litta, Marco Passarotti, Paolo Ruffolo
CIRCSE Research Centre
Università Cattolica del Sacro Cuore
Largo Gemelli, 1 - 20123 Milan, Italy
francesco.mambrini@unicatt.it, eleonoramaria.litta@unicatt.it, marco.passarotti@unicatt.it, paolo.ruffolo@posteo.eu

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

This paper describes the steps taken to include data from the Lewis & Short bilingual Latin-English dictionary in LiLa, the Knowledge Base of linguistic resources for Latin. First, data were extracted from the original XML and matched with entries in LiLa, overcoming ambiguities and structural inconsistencies in the source. Subsequently, senses were modelled using the Ontolex Lemon Lexicographic module (lexicog), so that they could be included in the LiLa Knowledge Base and thus made interoperable with the (meta)data of the linguistic resources for Latin interlinked therein.

1 Introduction

Since the pioneering times of 1949, when the Jesuit Roberto Busa persuaded Thomas Watson Sr., CEO of IBM, to fund his project aimed at processing the Latin texts of Thomas Aquinas with computers (Jones, 2016), scholars in the areas of Computational Linguistics, Literary Computing and Digital Humanities have built a plethora of linguistic resources for both modern and historical languages.

Particularly over the last two decades, many and diverse linguistic resources have been made available for Latin. These consist of corpora of texts spanning different eras and genres¹, dependency treebanks² and lexica³. These digital resources join the large set of textual and lexical resources that were created over the centuries for Latin: textual collections, thesauri, lexica, glossaries and mono/bilingual dictionaries. Among the latter, we could mention, for instance, the Oxford Latin Dictionary (Glare, 1968), the Dictionary of medieval Latin from British sources (Ashdowne et al., 1975), the Forcellini lexicon (Forcellini and Facciolati, 1871) and the Thesaurus Linguae Latinae, which is still under construction (Ehlers, 1968). Many of these are today also accessible in digital format.

However, the impact of these digital resources on the everyday work of classicists is still limited. On the one hand, this is due to the persisting divide between “traditional” Humanities and computational approaches. On the other, classicists are not yet in a position to fully exploit all the available resources for ancient languages, as these are currently scattered across the web in isolated blocks that use different query languages, data formats, annotation criteria and tagsets.

The last decade has seen a number of exploratory solutions to tackle the scattered state of linguistic resources. Among them, the European infrastructure CLARIN⁴ represents a common hub where data and metadata of resources collected in single repositories (at national level) can be searched (through the so-called Virtual Language Observatory) and processed with different tools (through the CLARIN Language Resource Switchboard). As for Classical languages, Logeion⁵ is a meta-dictionary that allows users to query together the lexical entries of several dictionaries for Ancient Greek and Latin, while Corpus Corporum⁶ is a meta-collection that allows searches across more than twenty different corpora for Latin. However, what such initiatives still lack is real interoperability between distributed resources, which would result in interaction at both the syntactic (structural) and the semantic (conceptual) level.

Syntactic interoperability is defined as ‘the ability of different systems to process (read) exchanged data either directly or via trivial conversion’, using a common data model consisting of shared protocols and data formats. Semantic interoperability, on the other hand, is ‘the ability to automatically interpret exchanged information meaningfully and accurately in order to produce useful results’, by using a set of common linguistic data categories defined in ad-hoc ontologies (Ide and Pustejovsky, 2010).

Attaining syntactic and semantic interoperability between distributed linguistic resources is the objective of the Linguistic Linked Open Data (LLOD) community, which applies the principles of the Linked Data paradigm (Bizer et al., 2008) to the (meta)data contained in linguistic resources. As for Classical languages, the LiLa Knowledge Base (KB)⁷ (Passarotti et al., 2020) makes textual and lexical resources for Latin interact through a commonly used data model, the Resource Description Framework (RDF) (Lassila et al., 1998), and through ontologies developed and shared by the LLOD community. In this way, the linked resources become interoperable with each other as well as with those for other languages described following the same structural and conceptual principles.

Based on a large collection of “canonical forms” (lemmas), the so-called “Lemma Bank”, LiLa achieves interoperability between resources by linking all those entries in lexical resources and tokens in corpora that point to the same lemma in the LiLa collection.

The lexical resources for Latin linked so far to LiLa include a word formation lexicon (Pellegrini et al., 2021), a polarity lexicon (Sprugnoli et al., 2020), an etymological dictionary (Mambrini and Passarotti, 2020) and a joint resource providing a manually checked subset of the Latin WordNet and a valency lexicon (Mambrini et al., 2021). The most recent among the LiLa connections is the bilingual Latin-English dictionary by Charlton Lewis and Charles Short (1879). The inclusion of this type of lexicon in LiLa was much needed, as no resource providing semantic information in the form of translations and definitions was previously available in the network of connected resources. Since Lewis & Short is the first lexical resource of its kind included in LiLa, the process of linking it to the KB raised a number of LLOD-related challenges.

This paper describes how such challenges have been tackled and is organised as follows: Section 2 describes the main characteristics of the Lewis & Short dictionary. Section 3 discusses the ontologies involved in the modelling phase, the challenges that need to be overcome in the representation of the linguistic data as LLOD (3.1), and the strategies adopted to represent the dictionary entries using the chosen vocabularies (3.2). Finally, Section 4 presents conclusions and highlights directions for future work.

¹ See, for example, Musisque deoque for Classical Latin poetry (Manca et al., 2011), CLaSSES, containing epigraphic material (De Felice et al., 2015), the large corpus of Classical Latin prose and poetic texts by LASLA (Denooz, 2007) and CroALa, which brings together writings by Croatian authors produced between the 10th and 20th centuries (Jovanović, 2012).
² Index Thomisticus Treebank (Passarotti, 2019), Late Latin Charter Treebank (Cecchini et al., 2020a), UDante (Cecchini et al., 2020b), PROIEL (Eckhoff et al., 2018) and Latin Dependency Treebank (Bamman and Crane, 2011).
³ Such as, for instance, valency and subcategorisation lexica (Passarotti et al., 2016; McGillivray and Vatri, 2015), the Latin WordNet (Minozzi, 2017) and word lists (Tombeur, 1998; Ramminger, 2008).
⁴ https://www.clarin.eu.
⁵ https://logeion.uchicago.edu/lexidium.
⁶ http://www.mlat.uzh.ch/MLS/.
⁷ https://lila-erc.eu.

2 The “Lewis & Short” Dictionary

2.1 The Printed and Digital Dictionary

The Latin Dictionary, curated by Ch. T. Lewis and Ch. Short and commonly referred to as the “Lewis & Short” (L&S), was published by Harper and Oxford University Press in 1879 (Lewis and Short, 1879). Though based on previous work by German scholars, it remained a standard in Latin lexicography in the English-speaking world until it was superseded by the Oxford Latin Dictionary (Glare, 1968).

In the digital age, its importance rests on two grounds. On the one hand, its relevance for the history of Classical Scholarship is undeniable. On the other hand, partly on account of its copyright status (the dictionary now belongs to the public domain), the L&S has quickly become one of the most used and best curated digital Latin dictionaries on the web. Following the same workflow used for the Greek-English Lexicon (Liddell et al., 1940), the Perseus Project has developed a widely used digital edition of the dictionary based on the standards of the Text Encoding Initiative (TEI) (Rydberg-Cox, 2002). The digital L&S has been incorporated in the word-search tools available on the Perseus website and in a series of other desktop and web applications.⁸

Perseus’ TEI edition is the point of departure of our work.⁹ Though its publication was a remarkable achievement, this electronic text is not exempt from occasional flaws and inconsistencies, which had to be taken into account.

In the digital edition, entries from the L&S are based on an XML encoding of the whole dictionary. The XML structure, albeit not always consistent, offers the following information about each word:

1. Entry: the headword. Entries are encoded within the TEI element <entryFree> and are 51,596 in total.¹⁰

2. Information about inflection, encoded as attributes in the XML and visualised in the output following the customary descriptions of Latin dictionaries: a masculine noun of the second declension (e.g. gallus ‘cock’) is followed by the genitive singular ending of the word (‘i’) and the abbreviation for gender ‘m.’ (e.g. gallus, i, m.).

3. Etymological or derivational information, encoded within the same element.

4. Sense(s): these act as containers where the meaning of the word is matched with a number of representative citations from Classical Latin sources. Each citation is accompanied by its canonical reference (e.g. “Cic. Sen. 8, 26” for a reference to Cicero, De Senectute, chapter 8, paragraph 26).

Entries can contain what we call “sub-entries”, words that are not given a record of their own, but are discussed within another entry. Usually, these sub-entries consist of lexicalised present and past participles, such as adolescens ‘young man’ (sub-entry of adolesco ‘to grow up’); another instance is the substantivised forms of adjectives, such as verum ‘the truth’ (sub-entry of verus ‘true’). Sub-entries are encoded within a specific element and are followed by the same type of inflectional information as the main entries.

2.2 Linking the L&S to LiLa

The LiLa KB includes about 200,000 canonical forms, each of which is described by a series of properties that record the part of speech (PoS), the full morphological description and the inflectional category. Also, the data property “written representation”, defined in the Ontolex ontology (see Section 3.1), registers all the attested spellings of any lemma. Publishing a lexical resource as LLOD within LiLa means both representing its information using the appropriate standards and vocabularies (Section 3.1) and linking the dictionary entries to the right form in LiLa, by matching the lemmas used to index the records to the appropriate form in the KB.

In order to achieve the latter goal, we first had to normalise the spelling of the L&S lemmas by removing upper case initials and substituting j with i and v with u, so as to mirror LiLa’s conventions. Then, after mapping part-of-speech and inflectional information between the two resources, we extracted 31,142 1:1 matches, 2,998 1:N matches and 4,553 1:0 matches, on the basis of the tuple written representation - PoS. The latter group was subsequently matched on the basis of the graphical representation only, at which point we obtained 946 1:1 matches and 50 1:N matches. Of the remaining 3,557 unmatched entries, 1,289 were successfully analysed by the morphological analyser Lemlat (Passarotti et al., 2017), leaving 2,239 definitely unmatched entries. After resolving multi-word spellings and graphical variants, the unmatched entries were all added to the LiLa Lemma Bank, while 1:N matches were manually disambiguated and matched to the relevant lemmas.

3 Modelling Lexical Entries

3.1 LiLa, Ontolex and lexicog

As said, the LiLa KB for Latin resources is built around a collection of canonical forms that can be used both as head words of dictionaries and as “targets” for the lemmatisation of corpora (Passarotti et al., 2020).
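To make the link between dictionary headwords and canonical forms more concrete, the normalisation and the first two matching passes described in Section 2.2 can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the toy lemma bank, its identifiers and its PoS tags are all hypothetical and are not taken from the LiLa data or code, and the third step (analysis with Lemlat) is not reproduced.

```python
from collections import defaultdict


def normalise(headword: str) -> str:
    """Lower-case the initial letter, then write j as i and v as u."""
    word = headword[:1].lower() + headword[1:]
    return word.replace("j", "i").replace("v", "u")


def build_indexes(lemma_bank):
    """Index lemma ids by (written representation, PoS) and by spelling alone."""
    by_wr_pos, by_wr = defaultdict(set), defaultdict(set)
    for lemma_id, pos, written_reps in lemma_bank:
        for wr in written_reps:
            by_wr_pos[(wr, pos)].add(lemma_id)
            by_wr[wr].add(lemma_id)
    return by_wr_pos, by_wr


def match(headword, pos, by_wr_pos, by_wr):
    """Return (match kind, pass that produced it, sorted candidate lemma ids)."""
    wr = normalise(headword)
    # First pass: spelling and part of speech must both agree.
    candidates, stage = by_wr_pos.get((wr, pos), set()), "wr+pos"
    if not candidates:
        # Second pass: fall back to the spelling alone.
        candidates, stage = by_wr.get(wr, set()), "wr-only"
    kind = {0: "1:0", 1: "1:1"}.get(len(candidates), "1:N")
    return kind, stage, sorted(candidates)


# Toy lemma bank: invented ids and PoS tags, not LiLa data.
LEMMA_BANK = [
    ("lemma/1", "ADJ", ["uerus"]),
    ("lemma/2", "NOUN", ["gallus"]),
    ("lemma/3", "NOUN", ["gallus"]),  # homograph with the same PoS
    ("lemma/4", "ADV", ["hodie"]),
]
BY_WR_POS, BY_WR = build_indexes(LEMMA_BANK)

if __name__ == "__main__":
    queries = [("Verus", "ADJ"), ("gallus", "NOUN"), ("hodie", "NOUN"), ("xyz", "NOUN")]
    for headword, pos in queries:
        print(headword, pos, match(headword, pos, BY_WR_POS, BY_WR))
```

In this sketch a headword that finds exactly one candidate is a 1:1 match, one that finds several is a 1:N match left for manual disambiguation, and one that finds none after both passes is a 1:0 match, that is, a candidate for morphological analysis or for addition to the Lemma Bank.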
The lemmas of the LiLa KB are modelled using the Ontolex ontology, now a de facto standard in the LLOD community (Cimiano et al., 2020; McCrae et al., 2017). In particular, they are defined as forms of words that are linked (or are ready to be linked) to lexical entries via the property “canonical form” of the Ontolex ontology.¹¹

Ontolex provides several classes and properties to describe the relationships that lexical entries have with, on the one hand, the grammatical forms attested in language and, on the other, the senses and the meanings of words. The core Ontolex module, however, imposes a series of restrictions that make its classes and properties ill-suited to represent the information in most standard dictionaries. The class Lexical Entry from the core Ontolex module, for instance, is inadequate to represent entries that license multiple syntactic interpretations, such as words that are registered in a dictionary as both adverb and conjunction. Subentries like the noun verum from the adjective verus, formed by a process of substantivisation from the word in the main entry, would also produce a mismatch between the dictionary and the lexical entry. Finally, the L&S, like most dictionaries, defines the senses of all but the simplest words by grouping them in sense clusters; those clusters are generally organised into hierarchies with multiple levels of nesting, from the most general to the most specific sense, a structure for which Ontolex has no suitable representation.

In order to overcome these issues, the Ontolex community has developed a specific extension of the ontology called the “OntoLex lexicography module” or lexicog (Bosque-Gil and Gracia, 2019).¹² The module is explicitly designed to capture the structural information expressed in a lexicographic resource and is primarily intended to support the conversion of lexicographic data that are not native to Ontolex. Retro-digitised dictionaries like the L&S are thus a perfect use case.

As noted, lexicog focuses on the structural properties of dictionaries and does not attempt to convey any lexical, or indeed linguistic, information, which is left to the classes and properties of Ontolex. The most important structural element introduced in the vocabulary is the Lexicographic Entry. In lexicog, an entry is a container that represents a lexicographic article or record as it is arranged in the source (Bosque-Gil and Gracia, 2019). Thus, while a lexical entry (as defined in Ontolex) is an item in the lexicon of a given language, a lexicographic entry is a record in a linguistic resource that documents or discusses some properties of a given lexical item.

Lexicographic entries are a special subclass of a larger class called Lexicographic Component. Apart from whole dictionary articles (the entries), components can be used to represent senses, sense groups or subentries (like the substantivised verum) within lexicographic entries.

It is important to stress once again that components represent only structural units; all linguistic information that is conveyed within these units must be expressed using Ontolex. The property lexicog:describes provides a link between the two dimensions, so that a lexicographic entry can be said to describe a lexical entry (as defined in Ontolex). In the same way, the lexicographic components that discuss a sense of a word or introduce a subentry describe that specific lexical sense (as defined in Ontolex) or another lexical entry.

3.2 Lexicographic and Lexical Entries in the L&S

The LLOD version of the L&S linked to LiLa is now available online in the LiLa KB.¹³ The entries can also be searched using LiLa’s query interface and SPARQL endpoint.¹⁴

Figure 1 shows a visualisation of how the information from a sample entry, the adjective hosticus in the L&S dictionary, is represented in LiLa. In particular, the interplay between the linguistic and structural information is reflected in the complex relation between the lexical and lexicographic entries.

Figure 1: An entry in LiLa’s representation of the L&S.

The L&S distinguishes two senses for the word: “belonging to an enemy, hostile” and “belonging to a stranger, foreign”. Following the Ontolex approach, these meanings are represented by the two ‘triangles’ between the lexical entry (the light green node on the left), the concepts evoked by the word (gray-blue nodes), and the senses, labeled 0 and 1, that mediate between them (greenish-yellow nodes).

The lexical entry is described by a lexicographic entry, identified by the id n21014 (inherited from the TEI XML file of the Perseus DL), while a specific lexicographic component describes each of the two senses (n21014 0 and n21014 1, respectively). What is particularly relevant is that the component n21014 0, which corresponds to the sense “hostile”, is linked to a sub-component that describes the lexical entry of the noun hosticum, a substantivised usage of the neuter adjective that means “the enemy’s territory”. The section of the entry that discusses the subentry “hosticum”, which is itself a section of the paragraph dedicated to the first sense, is thus linked (via the “describes” property) to a different lexical entry.

⁸ One example is the app Diogenes for querying corpora of Greek and Latin texts: https://d.iogen.es/.
⁹ The digital edition is available from the repository of the Perseus DL and is distributed under a CC BY SA 4.0 license: https://github.com/PerseusDL/lexica.
¹⁰ See https://tei-c.org/release/doc/tei-p5-doc/en/html/ref-entryFree.html.
¹¹ http://www.w3.org/ns/lemon/ontolex#canonicalForm.
¹² https://www.w3.org/ns/lemon/lexicog#.
¹³ http://lila-erc.eu/data/lexicalResources/LewisShort/Lexicon.
¹⁴ https://lila-erc.eu/query/ and https://lila-erc.eu/sparql/.

4 Conclusions and Future Work

Perhaps even more than for any modern language, a great number of lexical resources, either bi- or monolingual, are available for Latin, many of which have already been digitised and disseminated on the web. In this paper, we described a model of how this huge wealth of information can be published using the modern standards of the Semantic Web. The greatest advantage of this approach is that all the lexical resources published according to the same data model can be integrated in a wider network of linguistic information, along with the other digital resources that are connected to it. In the case of the L&S in LiLa, the Latin lexical entries of the bilingual dictionary can be queried together with the information about the same words provided by the other linguistic resources linked to the lemmas in the KB.

One example of the fruitful interactions between resources is the possibility of investigating the polysemy of words in relation to their derivation, as recorded in the Word Formation Latin resource, which is also linked to LiLa (Litta et al., 2020). The adjective hosticus of Figure 1, for instance, clearly inherits its two main senses (‘hostile’ and ‘foreign’) from the same polysemy of the noun hostis ‘stranger’ or ‘enemy’, from which it is derived. At the same time, while other resources in LiLa describe the senses of words, such as the Latin WordNet (Franzini et al., 2019; Mambrini et al., 2021), the complex relations between those senses (whether, for instance, one sense is interpreted as a specialised derivation from another) are generally available only in traditional lexical resources like the L&S.

The solutions we found to address the challenges raised by the representation of the L&S in LLOD will be reused when we link further bilingual, as well as monolingual, dictionaries of Latin to the KB. Including such lexical resources in LiLa is an important achievement, as it makes it possible for the KB to interact with linguistic (meta)data for languages other than Latin. Undoubtedly, such an inter-linguistic (re)use of distributed resources is one of the objectives of the LLOD community, to which LiLa contributes by steadily providing new (kinds of) linguistic resources represented in LLOD.

Acknowledgments

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant Agreement No. 769994).

References

Richard Ashdowne, David R Howlett, and Ronald Edward Latham. 1975. Dictionary of medieval Latin from British sources. Oxford University Press.

David Bamman and Gregory Crane. 2011. The Ancient Greek and Latin dependency treebanks. In Language technology for cultural heritage, pages 79–98. Springer.

Christian Bizer, Tom Heath, Kingsley Idehen, and Tim Berners-Lee. 2008. Linked data on the web (LDOW2008). In Proceedings of the 17th international conference on World Wide Web, pages 1265–1266.

Julia Bosque-Gil and Jorge Gracia. 2019. The OntoLex lemon lexicography module. https://ontolex.github.io/lexicog/.

Flavio Massimiliano Cecchini, Timo Korkiakangas, and Marco Passarotti. 2020a. A new Latin treebank for Universal Dependencies: Charters between Ancient Latin and Romance languages. In Proceedings of The 12th Language Resources and Evaluation Conference, pages 933–942.

Flavio Massimiliano Cecchini, Rachele Sprugnoli, Giovanni Moretti, and Marco Passarotti. 2020b. UDante: First steps towards the Universal Dependencies treebank of Dante’s Latin works. In CLiC-it.

Philipp Cimiano, Christian Chiarcos, John P. McCrae, and Jorge Gracia. 2020. Linguistic Linked Data: Representation, Generation and Applications. Springer, Cham.

Irene De Felice, Giovanna Marotta, and Margherita Donati. 2015. CLaSSES: A new digital resource for Latin epigraphy. IJCoL. Italian Journal of Computational Linguistics, 1(1-1):125–136.

Joseph Denooz. 2007. Opera latina: le nouveau site internet du LASLA. Journal of Latin Linguistics, 9(3):21–34.

Hanne Eckhoff, Kristin Bech, Gerlof Bouma, Kristine Eide, Dag Haug, Odd Einar Haugen, and Marius Jøhndal. 2018. The PROIEL treebank family: a standard for early attestations of Indo-European languages. Language Resources and Evaluation, 52(1):29–65.

Wilhelm Ehlers. 1968. Der Thesaurus linguae Latinae. Prinzipien und Erfahrungen. Antike und Abendland, 14(1):172–184.

Egidio Forcellini and Jacobo Facciolati. 1871. Lexicon totius latinitatis, volume 3. Typis seminarii.

Greta Franzini, Andrea Peverelli, Paolo Ruffolo, Marco Passarotti, Helena Sanna, Edoardo Signoroni, Viviana Ventura, and Federica Zampedri. 2019. Nunc Est Aestimandum. Towards an evaluation of the Latin WordNet. In Raffaella Bernardi, Roberto Navigli, and Giovanni Semeraro, editors, Sixth Italian Conference on Computational Linguistics (CLiC-it 2019), pages 1–8, Bari, Italy. CEUR-WS.org.

Peter GW Glare. 1968. Oxford Latin Dictionary. Clarendon Press, Oxford.

Nancy Ide and James Pustejovsky. 2010. What does interoperability mean, anyway? Toward an operational definition of interoperability for language technology. In Proceedings of the Second International Conference on Global Interoperability for Language Resources. Hong Kong, China.

Steven E Jones. 2016. Roberto Busa, SJ, and the emergence of humanities computing: the priest and the punched cards. Routledge.

Neven Jovanović. 2012. CroALa. Enhancing a TEI-encoded text collection. Journal of the Text Encoding Initiative, (2).

Ora Lassila, Ralph R. Swick, and World Wide Web Consortium. 1998. Resource Description Framework (RDF) model and syntax specification.

Charlton T. Lewis and Charles Short. 1879. A Latin Dictionary. Founded on Andrews’ edition of Freund’s Latin dictionary. Clarendon Press, Oxford.

Henry Liddell, Robert Scott, and Henry Stuart Jones. 1940. A Greek-English Lexicon. Clarendon Press, Oxford, 9th edition.

Eleonora Litta, Marco Passarotti, and Francesco Mambrini. 2020. Derivations and Connections: Word Formation in the LiLa Knowledge Base of Linguistic Resources for Latin. The Prague Bulletin of Mathematical Linguistics, 115:163–186.

Francesco Mambrini and Marco Passarotti. 2020. Representing etymology in the LiLa Knowledge Base of linguistic resources for Latin. In Proceedings of the 2020 Globalex Workshop on Linked Lexicography, pages 20–28.

Francesco Mambrini, Marco Passarotti, Eleonora Litta, and Giovanni Moretti. 2021. Interlinking valency frames and WordNet synsets in the LiLa Knowledge Base of linguistic resources for Latin. In Further with Knowledge Graphs, pages 16–28. IOS Press.

Massimo Manca, Linda Spinazzè, Paolo Mastandrea, Luigi Tessarolo, and Federico Boschetti. 2011. Musisque deoque: Text retrieval on critical editions. J. Lang. Technol. Comput. Linguistics, 26(2):127–138.

John P. McCrae, Julia Bosque-Gil, Jorge Gracia, Paul Buitelaar, and Philipp Cimiano. 2017. The OntoLex-Lemon Model: development and applications. In Proceedings of eLex 2017, pages 587–597.

Barbara McGillivray and Alessandro Vatri. 2015. Computational valency lexica for Latin and Greek in use: a case study of syntactic ambiguity. Journal of Latin Linguistics, 14(1):101–126.

Stefano Minozzi. 2017. Latin WordNet, una rete di conoscenza semantica per il latino e alcune ipotesi di utilizzo nel campo dell’information retrieval. Strumenti digitali e collaborativi per le Scienze dell’Antichità, (14):123–134.

Marco Passarotti, Berta González Saavedra, and Christophe Onambele. 2016. Latin Vallex. A treebank-based semantic valency lexicon for Latin. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 2599–2606.

Marco Passarotti, Marco Budassi, Eleonora Litta, and Paolo Ruffolo. 2017. The Lemlat 3.0 Package for Morphological Analysis of Latin. In Gerlof Bouma and Yvonne Adesam, editors, Proceedings of the NoDaLiDa 2017 Workshop on Processing Historical Language, volume 133, pages 24–31, Gothenburg. Linköping University Electronic Press.

Marco Passarotti, Francesco Mambrini, Greta Franzini, Flavio Massimiliano Cecchini, Eleonora Litta, Giovanni Moretti, Paolo Ruffolo, and Rachele Sprugnoli. 2020. Interlinking through lemmas. The lexical collection of the LiLa Knowledge Base of linguistic resources for Latin. Studi e Saggi Linguistici, 58(1):177–212.

Marco Passarotti. 2019. The project of the Index Thomisticus Treebank. In Digital Classical Philology, pages 299–320. De Gruyter Saur.

Matteo Pellegrini, Eleonora Litta, Marco Passarotti, Francesco Mambrini, and Giovanni Moretti. 2021. The two approaches to word formation in the LiLa Knowledge Base of Latin resources. In Proceedings of the Third International Workshop on Resources and Tools for Derivational Morphology (DeriMo 2021), pages 101–109.

Johann Ramminger. 2008. Neulateinische Wortliste. Ein Wörterbuch des Lateinischen von Petrarca bis 1700. Thesaurus Linguae Latinae.

Jeffrey A Rydberg-Cox. 2002. Mining Data from an Electronic Greek Lexicon. Classical Journal, 98(2):183–188.

Rachele Sprugnoli, Francesco Mambrini, Giovanni Moretti, and Marco Passarotti. 2020. Towards the modeling of polarity in a Latin knowledge base. In WHiSe@ESWC, pages 59–70.

Paul Tombeur. 1998. Thesaurus formarum totius Latinitatis: a Plauto usque ad saeculum XXum; TF.[2]. CETEDOC Index of Latin forms: database for the study of the vocabulary of the entire Latin world; base de données pour l’étude du vocabulaire de toute la latinité. Brepols.