Towards the Modeling of Polarity in a Latin Knowledge Base*

Rachele Sprugnoli [0000-0001-6861-5595], Francesco Mambrini [0000-0003-0834-7562], Giovanni Moretti, and Marco Passarotti [0000-0002-9806-7187]

CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Largo Agostino Gemelli 1, 20123 Milano
{rachele.sprugnoli,francesco.mambrini,giovanni.moretti,marco.passarotti}@unicatt.it

Abstract. In this paper, we describe the process of inclusion of a prior polarity lexicon of Latin lemmas, called LatinAffectus, in a knowledge base of interoperable linguistic resources developed within the LiLa: Linking Latin project. More specifically, a manually-curated list of lemma-sentiment pairs is linked to a comprehensive collection of Latin lemmas by using Semantic Web and Linked Data standards and practices. LatinAffectus is modeled relying on three formal representation frameworks: Lemon and Ontolex to describe the lexicon, and the Marl ontology to describe the sentiment properties of each of its lexical entries. We present the lexicon, the methodology and the results of the linking process, as well as a use case and the planned future work.¹

Keywords: Linguistic Linked Open Data · Sentiment Analysis · Latin.

1 Introduction

In recent years, linguistic resources and tools have been created for many languages to support sentiment analysis, i.e. the task of automatically classifying a piece of text according to the sentiment it conveys. Although the main applications of such resources and tools fall into categories like social media and customer experience monitoring, there is a growing interest in the research community in developing resources and tools to perform sentiment analysis of texts written in ancient languages.
Such interest mirrors the substantial growth of the area dedicated to building and using linguistic resources for ancient and historical languages, which has primarily concerned Latin and Ancient Greek as essential media for accessing and understanding the so-called Classical tradition. In particular, Latin plays a central role in this context, as texts written in Latin are spread all over Europe, cover a time span of almost two millennia and bear witness to the common, but still diverse, past that contributed to shaping the cultural heritage of Europe. Exploiting the most advanced techniques for preserving, investigating and sharing the heritage assets that have survived from past times is at the same time a challenge and an obligation for the research area that develops linguistic resources and tools. Given the wide variety of Latin texts in terms of era, place and literary genre, the achievements of this field of research promise to impact a large and heterogeneous community made up of historians, philologists, archaeologists and literary scholars, all dealing, in different ways, with textual and lexical data written in Latin. The recent launch of projects aimed at automatically extracting structured knowledge from ancient sources provided by linguistic resources, such as eAqua², Logeion³ and Corpus Corporum⁴, shows that so many linguistic resources for ancient languages, and particularly for Latin, are now available that there is a strong need to make them interact.

* This work is supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme via the “LiLa: Linking Latin” project - Grant Agreement No. 769994.
¹ Copyright ©2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
To address the issue of interoperability between lexical and textual resources for Latin, the LiLa: Linking Latin project (2018-2023)⁵ was launched with the objective of building a Knowledge Base (KB) of linguistic resources for Latin based on the Linked Data paradigm, i.e. a collection of several data sets represented using the same vocabulary of knowledge description and linked together [25]. Within the LiLa project, aside from interlinking the already available resources for Latin, we are also building a number of new ones, among which is LatinAffectus, a lexicon that assigns a prior sentiment score to a selection of Latin adjectives and nouns [28]. This paper describes the process of inclusion of LatinAffectus into the LiLa KB and presents a simple use case showing how the interaction of the linguistic resources currently linked through LiLa can be exploited to address a specific research question.

The core component of LiLa is a large collection of Latin lemmas, whose role is to connect the different (and possibly distributed) linguistic resources that interact in LiLa [17]. In particular, textual resources are included in LiLa by linking the occurrences of the words in their texts to the lemmas of LiLa, while lexical resources connect to LiLa by linking the contents of their lexical entries to the lemmas of the KB. The result is an interlinked ecosystem where textual and lexical (meta)data provided by several resources become interoperable. Including LatinAffectus in LiLa enhances a subset of the lemmas provided by the LiLa collection with a prior positive/negative polarity. Such a black-or-white approach is at the same time a limitation and an advantage of LatinAffectus.
As for the former, the lexicon does not account for the different meanings that words may have, some of which can show different polarity values, thus failing to represent the span of possible sentiments of a word. As for the latter, assigning one prior, prototypical polarity value to the lexical entries helps the application of the (meta)data from LatinAffectus to real texts. Indeed, no sufficiently accurate tools for word sense disambiguation are currently available for Latin, which prevents the analysis of texts with sentiment lexicons that provide different polarity values for the same word, as this would require considering all the possible values while computing the overall polarity of a sentence or a text.⁶ Instead, by relying on a single prior polarity value, it becomes possible to apply LatinAffectus to Latin texts without first pre-processing the data with a layer of word sense disambiguation. This aspect becomes an added value when LatinAffectus interacts with all the other resources included in LiLa, because its (meta)data are no longer available in isolation but are interoperable with those of other resources, thus making the most of the contribution provided by each of them in applications that address research questions.

² http://www.eaqua.net/
³ https://logeion.uchicago.edu/lexidium
⁴ http://www.mlat.uzh.ch/MLS/
⁵ https://lila-erc.eu

The paper is organized as follows. Section 2 provides a brief overview of the related work on polarity lexicons and on the strategies to represent linguistic resources and services for sentiment analysis in the Linked Data framework. Section 3 describes LatinAffectus and Section 4 details the process of modeling it and including it in the LiLa KB. Section 5 presents a simple use case to show how the interoperability between the resources connected in LiLa, and particularly LatinAffectus, can support research in the Humanities.
Finally, Section 6 concludes the paper with a discussion about the need to make the linguistic resources for Latin interact and sketches our future work.

2 Related Work

Sentiment Analysis and related tasks [16], such as emotion analysis, subjectivity detection and opinion mining, are very popular both in academic research and in business applications, where the focus is mostly on the analysis of contemporary texts like product or service reviews [11] and social media posts [24]. In these tasks, the creation of polarity lexicons, that is, lists of words associated with their out-of-context sentiment orientation, is of fundamental importance but also a very time-consuming process. Several approaches have been developed to automate this process and build multilingual lexicons that also cover less-resourced and ancient languages. For example, Mohammad [22] adopts crowdsourcing techniques to generate an English valence, arousal, dominance (VAD) lexicon and then automatically translates it into 103 other languages, including Latin. Chen and Skiena [5] instead use a knowledge graph propagation algorithm starting from Wikipedia to build lexicons of positive and negative words in 136 languages, including Latin.

⁶ An example of such an approach is provided by the API of the Latin WordNet project of the University of Exeter, which can perform sentiment analysis of individual strings via HTTP POST requests to https://latinwordnet.exeter.ac.uk/sentiment/.

These two resources, although of undoubted value, have two main drawbacks: they are noisy due to the presence of English words, such as microchip and reliable,⁷ and their content was not checked by a Latin expert or evaluated against a gold standard. Another limitation is that these lexicons have not been published in the Linked Data framework, thus limiting their usability and semantic interoperability.
However, various efforts have been made to develop a formal representation of linguistic resources and services for sentiment analysis. More specifically, the Marl ontology is designed for the publication of data about opinions and the sentiments expressed in them [30], and the EuroSentiment project has proposed a model that integrates it with the lexicon model for ontologies (lemon) [3,19], so as to represent lexical resources for sentiment and emotion analysis such as lexicons and annotated corpora [2]. This approach has been applied, for example, to represent polarity information of German compound words, relying, in particular, on Ontolex [20], that is, the core module of lemon [7].

In this paper we aim to overcome the aforementioned limitations of the currently available polarity lexicons for Latin by publishing, using Linked Data principles, a list of lemmas with their prior sentiment orientation created by experts of Latin language and culture, and then expanded by exploiting a set of already available manually curated linguistic resources. Each entry in the resources we created is linked to the collection of Latin lemmas provided by the LiLa KB, so as to achieve interoperability.

3 Latin Prior Polarity Lexicons

A Gold Standard (GS) lexicon was manually developed by two Latin language and culture experts, who assigned a sentiment score to out-of-context lemmas using a five-value classification: 1 (fully positive), 0.5 (somewhat positive), 0 (neutral), -0.5 (somewhat negative), -1 (fully negative). We chose to consider only nouns and adjectives because their polarity is easier to define at a lexical level, i.e. out of context, than that of verbs, whose semantics is more strictly connected to that of their arguments [12].
Lemmas were taken from William Whitaker’s Words morphological analyzer and digital dictionary⁸, Cassell’s Latin dictionary⁹ [27] and the lemmatized version of Opera Latina [9], a corpus of Classical authors manually annotated with lemmas and Part-of-Speech (PoS) tags. In addition, a Silver Standard (SS) lexicon was built by deriving new entries in two ways: i) by exploiting derivational, synonym and antonym relations with the lemmas in the GS; ii) by adding graphical variants of lemmas present in the GS. Original polarity scores were propagated or reversed onto the newly derived lemmas: for example, scores were preserved in the case of synonyms and graphical variants, whereas they were reversed for antonyms (see Table 1). Details on the composition of the GS and the SS are reported in Table 2: the resources are freely available online at https://github.com/CIRCSE/Latin_Sentiment_Lexicons and their detailed description is given in [28]. The GS and the SS have been merged into a single resource called LatinAffectus.

⁷ By manually revising the two lexicons, we calculated the percentage of English words: 14% in the VAD lexicon and 9% in the other.
⁸ https://mk270.github.io/whitakers-words/
⁹ https://github.com/nikita-moor/latin-dictionary

Table 1. Examples of extensions starting from lemmas in the GS.

| Lemma-GS | Score-GS | Extension Type | Lemma-SS | Score-SS |
|---|---|---|---|---|
| purus ‘pure’ | +1 | derivational | inpurus ‘impure’ | -1 |
| innoxius ‘harmless’ | +0.5 | synonym | innocens ‘innocent’ | +0.5 |
| aqua ‘water’ | 0 | derivational | aquarius ‘of/for water’ | 0 |
| apsentia ‘absence’ | -0.5 | variant | absentia ‘absence’ | -0.5 |
| scelus ‘crime’ | -1 | antonym | beneficium ‘benefit’ | +1 |

Table 2. Composition of the GS and the SS prior polarity lexicons together with the total number of adjectives and nouns included in the merged resource LatinAffectus.

| Lexicon | PoS: ADJ | PoS: NOUN | TOT |
|---|---|---|---|
| Gold Standard | 454 (39.7%) | 690 (60.3%) | 1,144 |
| Silver Standard | 512 (39.6%) | 781 (60.4%) | 1,293 |
| LatinAffectus | 966 (39.6%) | 1,471 (60.4%) | 2,437 |

4 Modeling and Linking Polarity

For modeling LatinAffectus, we rely on three formal representation frameworks: Lemon¹⁰ and Ontolex¹¹ to describe the lexical resource, and the Marl ontology¹² to describe the sentiment properties of each entry. In our approach the polarity lexicon is defined as an instance of the class E31 Document¹³ of the CIDOC Conceptual Reference Model (CRM), an ontology formally describing concepts and relations in the cultural heritage domain [10]. Moreover, LatinAffectus is also defined as an object of type lexicon following the LInguistic MEtadata (lime) module of Ontolex. We link the lexicon to its entries, which are defined as instances of the Ontolex class LexicalEntry, through the property called entry belonging to the lime module.

¹⁰ https://lemon-model.net/
¹¹ https://www.w3.org/2016/05/ontolex/
¹² http://www.gsi.dit.upm.es/ontologies/marl/1.1/
¹³ The class E31 “comprises identifiable immaterial items that make propositions about reality”, http://www.cidoc-crm.org/Entity/e31-document/version-6.2

Each lexical entry has a label, an ontolex:canonicalForm property connecting it to the corresponding lemma in the LiLa KB, and an ontolex:sense property corresponding to the lexical meaning of the lexical entry. Given that LatinAffectus deals with prior polarities, each lexical entry has only one sense, modeled as an instance of the class ontolex:LexicalSense. Each sense is characterized by a label, the relation marl:hasPolarity and the property marl:polarityValue. More specifically, the relation marl:hasPolarity connects the sense to the class marl:Polarity, indicating whether the sentiment is positive, negative or neutral.
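The entry/sense structure described above can be sketched with a few lines of plain Python that print Turtle-style triples. This is only an illustration under stated assumptions: the `ex:` identifiers, the lemma IRI, the use of `rdfs:label` for labels and the score of -1.0 for malus are hypothetical placeholders, not the identifiers or values actually used in the LiLa KB; only the Ontolex, lime and Marl terms named in the text are taken from the paper.

```python
# Sketch only: entry/sense identifiers, the lemma IRI and the use of
# rdfs:label are illustrative assumptions, not LiLa's actual data.

def polarity_class(value: float) -> str:
    """Map a prior polarity score to a Marl polarity individual."""
    if value > 0:
        return "marl:Positive"
    if value < 0:
        return "marl:Negative"
    return "marl:Neutral"


def entry_triples(lemma: str, value: float, lemma_iri: str) -> list[tuple[str, str, str]]:
    """Build the triples for one lexical entry with exactly one sense."""
    entry = f"ex:entry_{lemma}"  # hypothetical entry identifier
    sense = f"ex:sense_{lemma}"  # hypothetical sense identifier
    return [
        # The lexicon is linked to its entries through lime:entry.
        ("ex:LatinAffectus", "lime:entry", entry),
        (entry, "a", "ontolex:LexicalEntry"),
        (entry, "rdfs:label", f'"{lemma}"'),
        # The canonical form points to the lemma in the lemma collection.
        (entry, "ontolex:canonicalForm", f"<{lemma_iri}>"),
        (entry, "ontolex:sense", sense),
        (sense, "a", "ontolex:LexicalSense"),
        (sense, "rdfs:label", f'"{lemma} (prior polarity)"'),
        (sense, "marl:hasPolarity", polarity_class(value)),
        (sense, "marl:polarityValue", f"{value:.1f}"),
    ]


def to_turtle(triples: list[tuple[str, str, str]]) -> str:
    """Serialize triples as one Turtle-style statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Hypothetical example: the lemma IRI and the score are placeholders.
triples = entry_triples("malus", -1.0, "http://example.org/lila/lemma/malus")
print(to_turtle(triples))
```

Keeping the numeric score and the polarity class as two separate statements mirrors the distinction the model draws between marl:polarityValue and marl:hasPolarity; prefix declarations are omitted here for brevity and would be needed in a real Turtle file.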
The property marl:polarityValue, on the other hand, specifies the numeric decimal value of the sentiment, which in our case can be 1.0, 0.5, 0.0, -0.5, or -1.0.

The lexical entries of LatinAffectus were linked to their corresponding lemmas in the LiLa KB in a semi-automatic way. First, we performed an automatic matching between the two resources: this revealed the presence of 246 ambiguous lemmas, that is, lemmas having the same written representation and the same PoS tag. For example, the entry fides, having polarity value 1, can be linked either to the lemma of the fifth declension meaning ‘trust’ or to the lemma of the third declension meaning ‘lyre’. These lemmas were manually disambiguated; thus fides was linked to the first of the aforementioned lemmas in the LiLa KB. A further 107 entries, such as Medieval or New Latin words like praesuppositio ‘assumption’ and radioactiuus ‘radioactive’, were not present in the KB and were therefore added. Figure 1 illustrates the modeling of the lexical entry malus ‘evil’ and of the negative prior polarity of its lexical sense as recorded in LatinAffectus.

Fig. 1. Triples in Notation3 format related to the lexical entry malus ‘evil’.

Thanks to the linking between entries coming from different resources, among which are LatinAffectus and the collection of lemmas of the LiLa KB, it is possible to get a rich set of lexical information. An example is given in Figure 2¹⁴: a number of morphological features, including PoS, degree and inflectional category, are assigned to the node for the lemma malus in the KB. This node is also connected to a Base node that connects all the lemmas belonging to the same derivational family: in this figure the Base2302 node interlinks the lemmas malus and maleficus ‘wicked’. The Lemma node malus is also connected to its etymology, taken from the Etymological Dictionary of Latin and the other Italic Languages [29] and modeled by relying on the Ontolex-lemon ontology and the lemonEty extension [14]. In particular, the Lemma node is linked to the canonical form present in the etymological dictionary, which in turn is linked to the Proto-Italic and/or Proto-Indo-European reconstructed forms of the word (e.g. *malo-) [18]. The left-hand side of the image shows that both malus and maleficus are included in LatinAffectus and that they both have a negative polarity.

Fig. 2. The lemma malus in the LiLa KB including its etymology and its prior polarity.

¹⁴ The LiLa KB can be explored using a query interface: https://lila-erc.eu/query/.

5 Use Case

The only corpus that is presently connected to the LiLa KB is the Index Thomisticus Treebank (ITTB) [26], which provides a complete morpho-syntactic annotation, based on a form of dependency grammar, of the Latin works of the philosopher Thomas Aquinas (13th century). At the moment, the LiLa KB stores the connections between the 277,547 tokens, taken from the first four books of the treatise Summa contra Gentiles (SCG), and the corresponding lemma under which each token is lemmatized. For every token we also report the basic morpho-syntactic information derived from the treebank, such as the link between head and dependent in the dependency tree and the label of their syntactic relation (e.g. “Subject” or “Predicate”). Although the original annotation stored in the ITTB already allows users to perform complex queries on the language of the SCG, the inclusion of the corpus in the LiLa KB expands the range of possible research to integrate new crucial dimensions for researchers; polarity provides an outstanding example.

One of the philosophical problems that Thomas Aquinas engaged with in his teaching and writing is the nature of evil.¹⁵ From the linguistic point of view, one of the prototypical constructions for providing definitions of concepts is the nominal predicate, where a subject is associated with a predicate via a copula verb, like “to be” in English.¹⁶ One example of this form of definition is the sentence: “evil is a deficiency of good”. While the ITTB already allows users to retrieve the occurrences of copular constructions, it is only by crossing this information with LatinAffectus that we can specify a constraint on the polarity of either of the terms (the subject or the predicate nominal). In this way, we can explore the definitions of the negative pole across the work of Thomas Aquinas, and consider its relation to the problem of the nature of evil.

Table 3. The 5 most frequent negative subjects of copular constructions in the ITTB.

| Lemma | English translation | Occurrences |
|---|---|---|
| malum | ‘evil’ | 37 |
| corruptio | ‘destruction’, ‘decay’, ‘corruption’ | 7 |
| fornicatio | ‘fornication’ | 4 |
| labor | ‘labour’, ‘toil’, ‘exertion’ | 3 |
| occisio | ‘killing’, ‘murder’ | 3 |

With a series of federated queries across the three endpoints of LiLa¹⁷ (the collection of lemmas, called Lemma Bank, the corpora, and the lexical resources), it is possible to obtain such results. On the negative pole, the ITTB includes 67 tokens labeled with a negative polarity that are the subjects of copular constructions; the 5 most frequently attested lemmas are reported in Table 3. Not surprisingly, by far the most numerous attestations are those of the technical word for the concept of evil itself, malum. As Davies observes [6, 14], the term malum is, in its philosophical sense, broader than English ‘evil’.
The latter carries very strong connotations and is usually not applied to what is perceived as generically unpleasant or troublesome; on the other hand, in the philosophical literature, malum covers all the entities that can be said to fall short of the opposite pole of bonum (‘good’).

It is also informative to extract all the subject-predicate nominal pairs where the subject is negative. The word that is constantly associated with fornicatio ‘fornication’ as predicate nominal in the ITTB is peccatum ‘sin’, as all four occurrences come from a section of the SCG discussing ‘why simple fornication is a sin according to divine law’ (qua ratione fornicatio simplex secundum legem divinam sit peccatum, SCG 3.120).

¹⁵ Thomas Aquinas authored a full treatise On Evil in the form of a disputatio, a formalized treatment of a subject articulated into questions, arguments and counterarguments, issuing from public debates or school seminars [6, 3-53].
¹⁶ On copular sentences and on the history of the notion of copula, which is also closely tied to the history of Western philosophy, see Moro [23, 248-261].
¹⁷ https://lila-erc.eu/sparql/index.html

The words labor ‘labour, toil, exertion’ and especially corruptio ‘destruction, corruption’ are interesting as well. The former is a complex term that embraces the notion of physical effort, then that of economic production, but also of pain and fatigue. While its main association with exertion justifies the negative polarity, it is a word that could also carry positive associations, in the moral and economic sphere, in both Pagan and Christian cultures [15]. Indeed, Thomas Aquinas also uses it with the adjective bonus, to refer (in the plural) to the ‘good words [...] by which we satisfy God for our sins’ [8, 622].
In the ITTB, all three occurrences of labor are coupled with the adjective necessarius ‘necessary’, to point to the economic prerequisite of manual work for survival (SCG 3.140) and to rebut the claim that it is equally necessary on moral grounds (SCG 3.140.15 and 16).

The definitions of corruptio point to a more complex picture. With the basic meaning of ‘destruction’, ‘decay’, the word has a clear negative orientation. At the same time, the nuances in its use reflect the complexities in the question of ‘evil’. Destruction can in itself be a good thing, if it implies the destruction of evil, and indeed this is exactly one of the first points raised by the philosopher in a sentence that provides a striking example of the association of a negative subject with a positive predicate nominal: cum corruptio mali sit bona ‘for the destruction of evil is good’ (SCG 3.11.4). Again, there are particular goods for which the corruption (corruptio) of the one means the generation (generatio) of the other (SCG 1.11.12). In fact, the opposition between the two terms (corruptio and generatio) [8, 252] is also well reflected in another passage (SCG 3.140.4), where the concept that an evil is subordinated to the creation of a good is exemplified by the fact that the corruption (corruptio) of air is the generation (generatio) of fire.¹⁸

This case study has presented only a very preliminary and partial analysis, as we have focused only on one of the many syntactic constructions that may be worth investigating.¹⁹ In addition, it is important to note that one crucial issue that could limit the value of the results presented above is the coverage of our polarity lexicon. Given the limited number of lemmas included in the lexicon, it is possible that other negative terms, which do not have a polarity annotation, are used in copular constructions and are not detected by our query.
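The interplay between the polarity constraint and the coverage problem can be illustrated on toy data. The following is a minimal sketch, not the federated SPARQL queries actually run against LiLa: the subject-predicate pairs and the polarity scores are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter

# Toy prior-polarity lexicon (lemma -> score); values are illustrative only.
POLARITY = {"malum": -1.0, "corruptio": -1.0, "bonus": 1.0}

# Invented copular constructions: (subject lemma, predicate nominal lemma).
COPULAR = [
    ("malum", "privatio"),
    ("malum", "defectus"),
    ("corruptio", "bonus"),
    ("privatio", "malum"),
    ("privatio", "defectus"),
    ("deus", "bonus"),
]


def negative_subjects(pairs, lexicon):
    """Count subject lemmas whose prior polarity in the lexicon is negative."""
    return Counter(s for s, _ in pairs if lexicon.get(s, 0.0) < 0)


def uncovered_subjects(pairs, lexicon):
    """Count subject lemmas that have no entry in the lexicon at all."""
    return Counter(s for s, _ in pairs if s not in lexicon)


# Subjects found by the polarity-constrained search.
print(negative_subjects(COPULAR, POLARITY).most_common())
# Subjects the search cannot see because the lexicon does not cover them;
# these need manual inspection to decide whether they are negative terms.
print(uncovered_subjects(COPULAR, POLARITY).most_common())
```

In this toy setting privatio is never returned by the first function even though it is a frequent subject, which is exactly the kind of gap that only a frequency list of uncovered lemmas makes visible.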
In order to verify this, we extracted the list of the 150 most frequent lemmas that are used as subject of copular constructions. In the list we identified 4 lemmas (out of a total of 5, including malum, which ranked 37th) that should have had a negative polarity but are not included in LatinAffectus: privatio ‘deprivation’ (25 occurrences), paupertas ‘poverty’ (9), peccatum ‘sin’ (7), defectus ‘defect’ (6). As these lemmas should have ranked between the second and the third position of Table 3, it is clear that this form of evaluation is required before using the corpus data. Nevertheless, it is already possible to see how fruitful the crossing of lexical data on polarity with linguistic annotation is for a broad range of corpus-based research.

¹⁸ It is worth noticing that, although in Thomas Aquinas the two terms are opposite and corruptio is assigned a negative polarity, generatio does not have a polarity label in LatinAffectus.
¹⁹ Another interesting example could be instances of coordination, used to extract all terms that are associated with positive and/or negative concepts in ‘and’ or ‘or’ constructions.

6 Conclusion and Future Work

In this paper, we have described the process of inclusion of a sentiment lexicon for Latin (LatinAffectus) into the LiLa KB, an infrastructure of interoperable linguistic resources based on the Linked Data paradigm. Instead of developing from scratch a new model for representing the lexicon in Linked Data, we selected three already available and widely used models, following the general recommendation of the Linked Data world to re-use existing ontologies and vocabularies as much as possible, in order to enhance interoperability with other resources. Interoperability is the key word here.
To show the benefit of working with linguistic resources that interact with each other, we presented a simple use case, where (meta)data from different resources for Latin (a dependency treebank, the LatinAffectus lexicon and the collection of lemmas of the LiLa KB) interact to investigate a basic topic of Thomistic philosophy. Although the work of philosophical investigation on the texts of Thomas Aquinas is centuries old, it was never possible until now to automatically join (and thus make the most of) the information provided by separate resources, like lexicons, dictionaries and corpora, to find the answers to fundamental research questions. As for the specific case of Thomas Aquinas, this goes back to the very history of linguistic computing, since his texts were among the first to be automatically processed with computers, when in the 1950s the Jesuit Roberto Busa started to use IBM machines to build the large corpus of the Index Thomisticus [4].

Today we can make the data of the Index Thomisticus (now partly treebanked) speak the same language as several other linguistic resources for Latin that were created over the last decades. It is from the synergy of the (meta)data provided by such resources that we can draw the overall picture that is the necessary condition to grasp the textual data, which in turn makes it possible to better understand their content and, ultimately, to yield new knowledge. We are convinced that now is the time for the research area dealing with the development and distribution of linguistic resources for Latin to find a way to harmonize such differences in data and metadata, as a requirement arising both from data providers and from data users.
Indeed, the lack of interoperability between resources prevents them from benefiting the large research community working in the broad area of the Humanities, which often deals with Latin texts but lacks the expertise needed to make distributed resources using different annotation schemes, tag sets and data formats interact. The result of this situation is that a large set of valuable linguistic resources for Latin, built with remarkable effort by data providers in long-lasting projects, still remains unused by (and sometimes unknown to) their reference community.

LiLa was launched precisely to overcome this state of affairs. The first result of the project was to provide the LiLa KB with its very core component, i.e. the collection of Latin lemmas, which is used as the connecting point between the resources that LiLa wants to make interact. Once the lemma collection was ready, we started to include in the KB the first linguistic resources for Latin, among which is LatinAffectus. In the near future, we plan to include the Latin WordNet [13,21] so that, besides the prior polarity sentiment score provided by LatinAffectus, we will be able to assign a specific score to the single meanings of the words (assigned to different synsets of WordNet), by relying on previous work done for the SentiWordNet resource [1].

References

1. Baccianella, S., Esuli, A., Sebastiani, F.: SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: LREC. vol. 10, pp. 2200–2204 (2010)
2. Buitelaar, P., Arcan, M., Iglesias, C.A., Sánchez-Rada, J.F., Strapparava, C.: Linguistic Linked Data for Sentiment Analysis. In: Proceedings of the 2nd Workshop on Linked Data in Linguistics (LDL-2013): Representing and linking lexicons, terminologies and other language data. pp. 1–8 (2013)
3. Buitelaar, P., Cimiano, P., Haase, P., Sintek, M.: Towards linguistically grounded ontologies. In: European Semantic Web Conference. pp. 111–125. Springer (2009)
4. Busa, R.: Index Thomisticus: Sancti Thomae Aquinatis operum omnium indices et concordantiae in quibus verborum omnium et singulorum formae et lemmata cum suis frequentiis et contextibus variis modis referuntur. Index thomisticus: Sancti Thomae Aquinatis operum omnium indices et concordantiae, Frommann-Holzboog (1974)
5. Chen, Y., Skiena, S.: Building sentiment lexicons for all major languages. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). pp. 383–389 (2014)
6. Davies, B. (ed.): Thomas Aquinas. On Evil. Oxford University Press, Oxford (2003)
7. Declerck, T.: Representation of Polarity Information of Elements of German Compound Words. In: LDL 2016 5th Workshop on Linked Data in Linguistics: Managing, Building and Using Linked Language Resources. p. 46 (2016)
8. Deferrari, R., Barry, I.: A lexicon of St. Thomas Aquinas based on the Summa theologica and selected passages of his other works. Catholic University of America Press, Washington (1948)
9. Denooz, J.: Opera Latina: une base de données sur internet. Euphrosyne 32, 79–88 (2004)
10. Doerr, M.: The CIDOC conceptual reference module: an ontological approach to semantic interoperability of metadata. AI magazine 24(3), 75–75 (2003)
11. Fang, X., Zhan, J.: Sentiment analysis using product review data. Journal of Big Data 2(1), 5 (2015)
12. Fillmore, C.J., et al.: Frame Semantics. In: Linguistics in the morning calm. Linguistics Society of Korea. Seoul: Hanshin (1982)
13. Franzini, G., Peverelli, A., Ruffolo, P., Passarotti, M., Sanna, H., Signoroni, E., Ventura, V., Zampedri, F.: Nunc Est Aestimandum. Towards an Evaluation of the Latin WordNet. In: Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019). CEUR-WS.org (2019)
14. Khan, A.F.: Towards the Representation of Etymological Data on the Semantic Web. Information 9(12), 304 (Dec 2018)
15. Lau, D.: Der lateinische Begriff Labor. Fink, Munich (1975)
16. Liu, B.: Sentiment analysis: Mining opinions, sentiments, and emotions. Cambridge University Press (2015)
17. Mambrini, F., Passarotti, M.: Harmonizing Different Lemmatization Strategies for Building a Knowledge Base of Linguistic Resources for Latin. In: Proceedings of the 13th Linguistic Annotation Workshop. pp. 71–80. Association for Computational Linguistics, Florence, Italy (Aug 2019)
18. Mambrini, F., Passarotti, M.: Representing Etymology in the LiLa Knowledge Base of Linguistic Resources for Latin. In: Kernerman, I., Krek, S. (eds.) Proceedings of Globalex Workshop on Linked Lexicography (GLOBALEX 2020). European Language Resources Association (ELRA), Paris, France (May 2020)
19. McCrae, J., Spohr, D., Cimiano, P.: Linking lexical resources and ontologies on the semantic web with lemon. In: Extended Semantic Web Conference. pp. 245–259. Springer (2011)
20. McCrae, J.P., Bosque-Gil, J., Gracia, J., Buitelaar, P., Cimiano, P.: The Ontolex-Lemon model: development and applications. In: Proceedings of eLex 2017 conference. pp. 19–21 (2017)
21. Minozzi, S.: Latin WordNet, una rete di conoscenza semantica per il latino e alcune ipotesi di utilizzo nel campo dell’Information Retrieval. In: Mastandrea, P. (ed.) Strumenti digitali e collaborativi per le Scienze dell’Antichità, pp. 123–134. No. 14 in Antichistica (2017), http://doi.org/10.14277/6969-182-9/ANT-14-10
22. Mohammad, S.: Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 174–184 (2018)
23. Moro, A.: The Raising of Predicates: Predicative Noun Phrases and the Theory of Clause Structure. Cambridge University Press, Cambridge (1997)
24. Nakov, P., Rosenthal, S., Kiritchenko, S., Mohammad, S.M., Kozareva, Z., Ritter, A., Stoyanov, V., Zhu, X.: Developing a successful SemEval task in sentiment analysis of Twitter and other social media texts. Language Resources and Evaluation 50(1), 35–65 (2016)
25. Passarotti, M.C., Cecchini, F.M., Franzini, G., Litta, E., Mambrini, F., Ruffolo, P.: The LiLa Knowledge Base of Linguistic Resources and NLP Tools for Latin. In: 2nd Conference on Language, Data and Knowledge (LDK 2019). pp. 6–11. CEUR-WS.org (2019)
26. Passarotti, M.C.: The Project of the Index Thomisticus Treebank. In: Berti, M. (ed.) Classical Philology. Ancient Greek and Latin in the Digital Revolution, pp. 299–320. Berlin, Boston: De Gruyter (2019)
27. Simpson, D.P.: Cassell’s Latin dictionary. Simon & Schuster Macmillan Company (1959)
28. Sprugnoli, R., Passarotti, M., Corbetta, D., Peverelli, A.: Odi et Amo. Creating, Evaluating and Extending Sentiment Lexicons for Latin. In: Proceedings of LREC 2020 (2020)
29. de Vaan, M.: Etymological Dictionary of Latin: and the other Italic Languages. Brill, Amsterdam (2008), https://brill.com/view/title/12612
30. Westerski, A., Sánchez-Rada, J.F.: Marl Ontology Specification, V1.0 May 2013 (2013)