Lifeless Winter without Break: Ovid’s Exile Works and the LiLa Knowledge Base

Aurora Alagni¹,*, Francesco Mambrini¹ and Marco Passarotti¹

¹ Università Cattolica del Sacro Cuore, Largo Gemelli 1, Milano, 20123, Italy

Abstract

In this paper we describe the process of semi-automatic annotation and linking performed to connect two works by the Latin poet Ovid to the LiLa Knowledge Base. Written after Ovid’s exile from Rome, the Tristia and the Epistulae ex Ponto mark the beginning of the “literature of exile”. In spite of their importance, no lemmatized version existed and the two collections were not part of the major annotated corpora linked to LiLa. The paper discusses the workflow used to annotate and publish the works as Linked Open Data connected to the LiLa Knowledge Base. On account of their subject and the emotional tone attached to the theme of exile, the two works are particularly relevant for sentiment analysis. We discuss some results of a lexicon-based analysis that is enabled by the interlinking with LiLa. We use LatinAffectus, a manually generated polarity lexicon for Latin nouns and adjectives, to perform sentiment analysis on the two works, and we interpret the (replicable) results in dialogue with the available literary scholarship, which they in turn enrich with new information.

Keywords

Linked Open Data, Lemmatization, Latin, Sentiment Analysis, Humanities Computing

CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04–06, 2024, Pisa, Italy
* Corresponding author.
† These authors contributed equally.
aurora.alagni01@icatt.it (A. Alagni); francesco.mambrini@unicatt.it (F. Mambrini); marco.passarotti@unicatt.it (M. Passarotti)
ORCID: 0000-0003-0834-7562 (F. Mambrini); 0000-0002-9806-7187 (M. Passarotti)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

The World Wide Web provides Latin scholars with a plethora of free, high-quality resources, issued from a long tradition of linguistic and philological study; many digital libraries, such as the Perseus Digital Library [1] or the Digital Latin Library [2], supply electronic and often machine-actionable versions of some of the most studied texts in world literature. In recent years, the CIRCSE Research Center has developed the LiLa Knowledge Base with the objective of making the distributed knowledge about Latin texts interoperable through the application of the principles of the Linked Data paradigm [3]. LiLa (presented below in Sec. 3) now includes a number of lexicons and annotated corpora. In particular, the Opera Latina LASLA corpus, a manually lemmatized and morphosyntactically annotated corpus of more than 1.5 million words mainly belonging to Classical Latin literature, was recently added to LiLa [4]. It has significantly expanded the textual heritage within the LiLa Knowledge Base, which now provides a Linked Open Data (LOD) compliant edition of many widely studied literary works.

Publius Ovidius Naso (anglicized as Ovid, 43 BCE – 17 CE) is arguably one of the most influential writers in the history of Western literature. His mythological poem in 15 books (the Metamorphoses, written between 2 and 8 CE) has been a crucial source of inspiration for artists like Dante, Shakespeare, or Titian. His body of elegiac poetry of erotic subject won him immense popularity during his life and afterwards.

In spite of his importance, the work of Ovid is fully represented neither in the LiLa network nor in any other annotated corpus. The LASLA corpus provides only his earlier works (Ars Amatoria, Remedia Amoris, Medicamina, Amores, Heroides) and other poems (Fasti, Halieutica, Ibis), while the annotation of the Metamorphoses is listed as “in progress”.

Among the works that are entirely missing are two of the last books of Ovid’s career, the Tristia (“Sorrows” or “Lamentations”, written between 9 and 13 CE) and the Epistulae ex Ponto (“Letters from the Black Sea”, 12–17 CE, henceforth Epistulae), which were partly published after the poet’s death. These two poetic collections center around Ovid’s forced departure from Rome and exile to the town of Tomis (modern-day Constanța in Romania), at the furthest ends of the Roman empire. Despite his many attempts, Ovid would never come back from this “utmost part of an unknown world” (extremis ignoti partibus orbis, Tr. 3.3.3)¹ nor was he ever restored to his previous status. The two works are a fundamental source for the biography of the poet. Moreover, they are a foundational archetype of a peculiar sub-genre that is still influential in modern days, the “exile literature” [6].

Ovid’s exilic works were banished from libraries and, although they survived, were often judged unfavorably by the critics [7, xxxvi]. The present study aims, in part, at revoking the ban that still seems to weigh on these Ovidian poetic collections, allowing them to enter the LiLa network. In what follows we describe how we prepared a lemmatized and part-of-speech (POS) tagged version of the two poems and how we linked this edition to the network of textual and lexical resources for Latin connected to LiLa. Our work fills the significant gap created by the absence of the exilic works of Ovid from the available annotated corpora. In addition, it links to LiLa two collections of poems that, on account of their subject, foreground the emotional tone, and that were successful in shaping the conventions of exilic literature; these works established the literary codification of the psychological reactions to banishment, within a veritable poetics of exile. Their content and historical relevance make them ideal candidates for a computationally based study on the sentiment analysis of literary texts.

The paper is structured as follows. Section 2 reviews related work, with a specific focus on sentiment analysis within the field of Computational Literary Studies. Section 3 introduces the LiLa Knowledge Base and the language resources connected to it. Section 4 describes the workflow followed for the annotation, publication and linking of the works. Section 5 discusses the type of knowledge that can be gained by combining the data from LatinAffectus, a prior polarity lexicon of Latin included in LiLa, and the newly prepared edition of the works, for a lexicon-based approach to their sentiment. Section 6 presents the conclusions and discusses plans for future work.

¹ All English translations are by Wheeler [5].

2. Related Work

Sentiment analysis (SA) is the field of study that analyses people’s opinions, sentiments, appraisals, attitudes, and emotions toward entities and their attributes expressed in written text [8]. Considering that opinions now play a fundamental role in everyday life, SA is an object of research not just in the field of NLP, but also in business, economic, political, and even medical domains. Indeed, sentiment analysis has numerous applications², ranging from investigating product reviews to enhance product development [10], to analysing news related to the stock market to predict price trends [11], monitoring social media to forecast election outcomes [12], and evaluating public health through tweets about patient experiences [13].

Furthermore, sentiment analysis has recently emerged as one of the most discussed topics within the realm of Computational Literary Studies³. This rise in prominence coincides with the so-called “affective turn” in the humanities and social sciences, which has fostered renewed engagement with emotion [15]. However, there remain significant limitations in the application of sentiment analysis within Computational Literary Studies, two of which are addressed in this paper.

First, while the World Wide Web and social media represent an ostensibly infinite repository of emotions, annotated corpora of literary texts are still rarely available. This is especially true for classical languages. As mentioned above and further illustrated in this paper, this limitation can be mitigated through the development and dissemination of interoperable resources.

To our knowledge, only a few experiments have been conducted on classical languages. Sprugnoli et al. [16] evaluated two distinct approaches to automatic polarity classification of eight odes by the Latin author Horace: a lexicon-based approach, grounded in the first version of LatinAffectus, and a zero-shot classification method. Sprugnoli et al. [17] present an example of how to use interoperable resources to analyse the sentiment value of the Latin epistles by Dante Alighieri, employing SPARQL queries that access an extended version of LatinAffectus, the LiLa Knowledge Base, and UDante. Pavlopoulos et al. [18] annotated the sentiment of a modern Greek translation of the first book of the Iliad and demonstrated that a fine-tuned version of GreekBERT can achieve a low error rate. Zhao et al. [19] proposed a model based on transfer learning to classify a dataset of Tang Dynasty Chinese poems and compared the sentiment analysis results with social history analysis. After constructing a sentiment lexicon for Classical Chinese poetry, Hou and Frank [20] evaluated it both intrinsically and extrinsically, highlighting that their results align with the main findings established in Classical Chinese literary studies.

Second, although sentiment analysis in the field of Computational Literary Studies is employed to address questions related to literary theory, the results often lack connection to a rigorous literary analysis and focus solely on performance metrics. The aforementioned studies exemplify this tendency, since only those conducted on Classical Chinese take literary studies into account. Rarely do they contribute to advancements in literary criticism, an area that typically relies on the intuition of critics and could therefore greatly benefit from clear and reproducible results. This issue has been highlighted by Rebora [21], who notes that while the strongest connection between literary theory and sentiment analysis occurs in the field of narratology, the actual points of intersection prove to be problematic and based on questionable assumptions. This paper also addresses these concerns, as the results of the sentiment analysis conducted on Ovid’s exilic works are closely intertwined with the literary scholarship surrounding those texts. Although our findings may not be generalisable, as they rest on a small, yet highly controlled dataset, our method is clearly reproducible and shareable.

² See Wankhade et al. [9] for an in-depth overview of the applications of sentiment analysis, as well as the methods for conducting this task.
³ For an extensive survey on sentiment and emotion analysis applied to literature, see the paper by Kim and Klinger [14].

3. Latin resources in LiLa

LiLa is a network of interconnected language resources for Latin aimed at ensuring interoperability between corpora, lexicons and natural language processing (NLP) tools. To pursue its goal, it adopts the Linked Data paradigm. At the heart of the project, the interlinking between the different components is ensured by the Lemma Bank [22], a collection of canonical forms (lemmas) that can be used to lemmatize texts and index entries in dictionaries. Each lemma of the Lemma Bank is provided with a unique identifier, in the form of a URL resolvable on the World Wide Web, and described by a series of properties modeled with the help of OWL ontologies for Linguistic LOD, such as Ontolex [23, 45-59].

Currently, the Lemma Bank includes 226,775 canonical forms, which are used to link 14 lexical resources and 7 corpora. The latter include collections of texts from different times and genres (from the works of Medieval authors like the mathematician Fibonacci [24], Thomas Aquinas [25] or Dante Alighieri [26], to inscriptions from various areas of the Roman Empire [27]). The largest collection of Classical literary texts is provided by the Opera Latina, a manually crafted corpus with morphological annotation and lemmatization developed since the 1960s by the LASLA laboratory of the University of Liège. The LASLA corpus (which is still in development) includes 131 Latin works by 19 authors, ranging chronologically from Plautus (c. 254 – 184 BCE) to Juvenal (55 – 128 CE). As noted above, however, even such a comprehensive collection does not cover the whole extant production, not even for some of the major authors within that time span; Ovid’s exilic works are a prominent example of missing texts. To fill the gaps in LASLA, and to widen the chronological span of ancient authors to the end of the Roman era in the 6th century CE, the CIRCSE has launched a new collection (natively linked to LiLa) called the “CIRCSE Latin Library”⁴.

Among the lexical resources produced within LiLa⁵, LatinAffectus [28] is a manually generated polarity lexicon of Latin adjectives and nouns. The lexicon was designed to support research in Sentiment Analysis (SA) [8], an approach to the linguistic and literary study of ancient texts that, although still in its infancy, is gaining growing recognition [16, 18].

In its latest version, LatinAffectus contains 6,018 lemmas (2,216 adjectives and 3,802 nouns), each associated with a numerical value expressing its prior polarity, that is, its sentiment orientation regardless of the context of use [8]. The classification adopts five numeric values: -1.0 (fully negative, e.g. uulnus, “wound”), -0.5 (negative, grauis, “serious”), 0 (neutral, ianua, “door”), +0.5 (positive, ius, “justice”), +1.0 (fully positive, pietas, “devotion”).

In the second part of this paper (Sec. 5) we use data from LatinAffectus to perform lexicon-based sentiment analysis of Ovid’s exilic works. The results obtained on the Tristia and the Epistulae are clear and reproducible; interpreted in light of previous literary criticism on the subject, they allowed us to investigate the evolution of Ovid’s poetic journey (Sec. 5.1) and the decline of his relationships with those left behind in Rome (Sec. 5.2).

⁴ http://lila-erc.eu/data/corpora/CIRCSELatinLibrary/id/corpus.
⁵ For a complete list of the resources currently linked to LiLa, see: https://lila-erc.eu/data-page/. Please note that all LiLa’s resources are assigned DOIs registered through Zenodo and are also available in CLARIN.

4. Ovid’s exile works as LOD

The Tristia are a collection of 50 poems in elegiac meter (i.e. couplets made of a hexameter followed by a pentameter) divided into 5 books. The Epistulae include 46 letters in elegiac couplets divided into 4 books. The poetry in both works mixes the themes of lamentation over the exile and the desperate plea (peroratio) directed towards the loved ones and potential allies in Rome.

The starting point of our edition was a plain-text version of the two works, which we obtained from The Latin Library⁶. The two works consist of a total of 43,438 tokens (without punctuation) and 3,061 sentences. A few preprocessing operations were performed on the texts, namely:

- the addition of three missing lines, which were omitted by mistake in the original source (Tr. 3.10.44 and 52, Tr. 5.12.50);
- the correction of evident transcription errors (most likely due to OCR issues, e.g. virunique for virumque, Tr. 2.372);
- the standardization of capitalization;
- the adoption of the “u” character even for the voiced labiodental fricative [v], following the convention adopted in the LiLa Lemma Bank.

Tokenization, sentence splitting, lemmatization and POS tagging were performed automatically by the LiLa Text Linker, a POS-tagger and lemmatizer for the Latin language developed as one of the user-dedicated services of LiLa, which also links the output of the NLP operations to the entries in the Lemma Bank [29]. For POS tagging and lemmatization the Text Linker uses a custom-trained UDPipe model (as documented in [29]). The output of the automatic tasks was systematically reviewed and manually corrected by one annotator adopting a scholarly annotation approach [30]. 42 tokenization errors were identified (on average between 4 and 5 per book), often due to a failure to segment punctuation (e.g. the sequence legent? in Tr. 5.1.94).

Table 1
Accuracy of POS tagging and lemmatization per book of Epistulae and Tristia as performed by the LiLa Text Linker

Book     Nr. of tokens   POS tagging   Lemmatization
Ep. 1    5,923           0.95          0.93
Ep. 2    5,770           0.97          0.94
Ep. 3    5,671           0.97          0.95
Ep. 4    7,099           0.97          0.94
Tr. 1    5,805           0.96          0.94
Tr. 2    4,427           0.96          0.93
Tr. 3    6,214           0.96          0.95
Tr. 4    5,311           0.97          0.95
Tr. 5    5,980           0.96          0.94
TOT      52,200          0.96          0.94

Table 2
Evaluation of POS tagging for the 11 tags with support > 1,000 tokens

POS tag   Precision   Recall   F1-score   Support
VERB      0.98        0.97     0.97       10,960
NOUN      0.96        0.97     0.96       10,626
PUNCT     1.00        1.00     1.00       8,667
ADJ       0.95        0.90     0.92       4,702
ADV       0.96        0.95     0.95       3,955
DET       0.95        0.99     0.97       3,836
PRON      0.99        0.93     0.96       3,276
CCONJ     0.99        0.99     0.99       1,698
ADP       0.96        0.99     0.98       1,625
PROPN     0.79        0.90     0.84       1,353
SCONJ     0.88        0.94     0.91       1,304

The accuracy scores reached by the model of the LiLa Text Linker are reported in Table 1⁷. As can be seen, the tool performed quite satisfactorily in both tasks, reaching an average accuracy across the different books of the two works of 96% for POS tagging and 94% for lemmatization. Accurate lemmatization also led to good scores for the linking process, with approximately 87% of the word forms uniquely associated with one lemma. A further 10% were ambiguous, as they were associated with two or more potential candidates in LiLa, mainly due to homography (e.g. the lemma string volo can be linked to both the first-conjugation verb volare, “to fly”, and the irregular verb velle, “to want”), and required manual disambiguation.

Of the 3% of no-matches, most were proper names. Ovid mentions barbarian tribes and figures belonging to Roman cultural circles that are rarely or never cited elsewhere. In the fourth book of the Epistulae, out of a total of 42 tokens not linked to any lemma, 32 are proper names (e.g. the Thracian tribe of the “Corallis”, Ep. 4.2.37, or the unknown poet “Marius”, mentioned in Ep. 4.16.24).

Table 2 shows the performance of the POS tagger for the 11 out of 17 tags that were used more than 1,000 times⁸. With an F1-score well under 90%, proper nouns (PROPN) are the most challenging class for the model to predict.

All tasks (tokenization, POS tagging, lemmatization and linking) are closely interconnected: an error in tokenization inevitably leads to an error in lemmatization and POS tagging, which then causes a wrong or missing link. For example, 18 forms of the verb addo, “to add”, in the second person singular imperative, adde, “add”, were mislabeled as proper nouns (PROPN), and thus assigned to a nonexistent lemma “Ads”. Once disambiguations and corrections were performed, the digital editions of the Tristia and the Epistulae were prepared and published as Linked Data, as part of the “CIRCSE Latin Library”⁹.

⁶ http://www.m.thelatinlibrary.com/ovid.html.
⁷ Note that, in the evaluation, we omitted the 3 missing lines that were added in the revision stage. For this reason Table 1 has slightly fewer tokens than Table 3.
⁸ The model uses the Universal POS tagset of Universal Dependencies; see: https://universaldependencies.org/u/pos/index.html.
⁹ http://lila-erc.eu/data/corpora/CIRCSELatinLibrary/id/corpus/P.%20Ovidii%20Tristia and http://lila-erc.eu/data/corpora/CIRCSELatinLibrary/id/corpus/P.%20Ovidii%20Epistulae%20ex%20Ponto.

5. Sentiment analysis and Ovid’s exile works

Thanks to the work performed in the linking process, each token of the two exilic poems is now connected to the respective lemma within the Lemma Bank via a dedicated property (hasLemma)¹⁰ defined in the OWL ontology of the LiLa project [3]. As the lemma’s URI is the same one that is used as canonical form for the entries of LatinAffectus, this step effectively enables users to cross-check the textual information within the two works against the scores recorded in the prior polarity lexicon.

Following the same methodology discussed in Sprugnoli et al. for Horace [16, 61-2], we proceeded to match each token of Tristia and Epistulae to the polarity score recorded in LatinAffectus for its lemma. The sentiment scores are obtained by automatically assigning the score found in LatinAffectus to the tokens that are lemmatized under lemmas that also have an entry in the polarity lexicon. For instance, the adjective malus, “bad”, is found with a polarity value of -1.0 in LatinAffectus. All tokens lemmatized as malus (adj.) are thus given a score of -1.0. A score of 0.0 is assigned both to words expressly annotated as neutral in LatinAffectus and to those that do not have an entry in the lexicon. The coverage of polarity-laden tokens (both adjectives and nouns) is reported in Table 3.

Table 3
Token coverage of polarity-laden nouns and adjectives in the books of Epistulae and Tristia. For each book, the total number of adjectives and nouns is reported, as well as the number of adjectives/nouns with polarity score ≠ 0 (pos/neg)

Book           Nouns tot   Nouns pos/neg   Adjectives tot   Adjectives pos/neg   Tot tokens
Epistulae.b1   1,195       1,061           545              360                  5,923
Epistulae.b2   1,214       1,088           561              425                  5,770
Epistulae.b3   1,135       1,013           452              335                  5,671
Epistulae.b4   1,447       1,282           676              464                  7,099
Tristia.b1     1,153       995             513              358                  5,805
Tristia.b2     922         817             386              268                  4,427
Tristia.b3     1,272       1,131           555              386                  6,227
Tristia.b4     1,140       1,020           493              353                  5,311
Tristia.b5     1,152       1,037           523              386                  5,989
TOT            10,630      9,444           4,704            3,335                52,222

In what follows, due to space constraints, only some of the results obtained from the sentiment analysis conducted on Ovid’s exilic works will be discussed. In analysing these results, we focus on the distribution of sentiment-laden words and on what this reveals about Ovid’s emotional state during his exile.

¹⁰ http://lila-erc.eu/ontologies/lila/hasLemma.

5.1. Ovid’s “last metamorphosis”

To investigate how Ovid’s attitude evolves throughout his exilic works, we calculated the overall sentiment for each book (fig. 1). Specifically, we summed the polarity scores and divided the total by the number of sentences, to mitigate the skewness resulting from the varying lengths of the books [17]¹¹. This book-level score reveals a negative emotional state persisting until the first book of the Epistulae. From the second book onward, however, the sentiment undergoes a polarity shift, becoming positive and remaining so until the last book. The reasons behind such a radical change in the poet’s emotional state are worth investigating.

Ovid’s polarity lexicon, that is, the set of the most frequently used sentiment-laden words in the exilic works, does not show any particular change in the 9 books considered here. An interesting change that we do observe in the last books concerns the distribution of the personal pronouns. In Epistulae 1, the relative frequency of the first person singular pronoun, ego, is 0.018 (93 occurrences over 4,983 lemmas), while for the second person singular pronoun, tu, it is 0.010 (52 occurrences). In Epistulae 2, the former has an identical relative frequency (0.018, or 89 occurrences over 4,920), while the latter increases significantly, reaching 0.020 (99 occurrences). The focus of the Ovidian epistles seems to split, with the once uncontested domain of the “I” beginning to be accompanied by the equally large realm of the “you”. The solipsism of the sender starts to give way to the celebration of the recipient, transmuting the once famous and now banished elegiac poet into a potential celebratory poet, who could glorify his future patron exceptionally well if only he were given the chance (and, of course, after being recalled home).

Commentators have never doubted that Ovid, after some attempts in the third book (e.g. Ep. 3.4-5), dedicates himself to panegyric poetry in the fourth book, no doubt in order to win powerful allies who could intercede for his return [31, 120-121] [32]. However, this intention has never been noted, or even imagined, for the second book of the Epistulae.

It is undeniable that we witness the last metamorphosis in the poetic trajectory of Ovidian elegy. Our results suggest that this metamorphosis, still so incipient that it has not been detected by critics, is clearly recorded by the sentiment analysis already in the second book of the Epistulae. Indeed, when the sentiment analysis is conducted at a finer grain, that is, at the level of individual compositions, it reveals an increase in positivity precisely in the verse-epistles sent to new and powerful recipients. This reflects a new purpose for Ovid’s poetry.

¹¹ Ovid’s sentences tend to correspond with the elegiac couplet. The two works have 3,044 sentences with an average length of 17.16 tokens (stdev = 11.43). The books tend to have a rather similar number of sentences, ranging from 261 (Tr. 2) to 388 (Ep. 4), with a mean of 338.22 sentences per book (stdev = 37.74). Note, however, that we relied on the sentence splitter of TextLinker and the results were not corrected manually.

5.2. Facing the abandonment

Another advantage of lexicon-based SA is the possibility of engaging directly with the list of sentiment words most used by an author in their entire production or in specific works of interest. A close observation of this specialized lexicon can lead to interesting outcomes too.

The sentiment words used in the exilic works are relatively stable in quality and quantity. Five distinct semantic spheres [33, 203] can be identified: friendship, politics, justice, intellect, and sadness (fig. 2). Among these, the semantic sphere of friendship and love contains abstract qualities and feelings (amor “love”, fides “trust, faith”, honor “honor”, nobilitas “nobility”, pietas “devotion”, virtus “virtue”), as well as nouns and qualifying adjectives typical of friendly and romantic relationships (bonus “good”, carus “dear”, dignus “worthy”, pius “dutiful, affectionate”). Although this sphere recurs frequently throughout the exilic production, the words composing it do not appear with the same consistency. Between the third and fourth book of the Tristia, new lemmas become part of this semantic sphere, indicating a change in Ovid’s relationship with the loved ones left behind in Rome.

In Tristia 3, the only epithets fitting for his friends (lemma amicus, 10 occurrences) were “dear” (carus, 10) and “good” (bonus, 6). These friends, along with his wife, represented Ovid’s only hope of salvation. In Tristia 4, Ovid reaches the fourth year of exile and sees the possibility of relying on them slipping further out of his grasp. The poet begins to perceive that the friendship and love shown to him in Rome, at the height of his success, might have been more superficial than he believed. His friends fail to write (Tr. 4.7.3-5) and Ovid catches himself wondering if his wife still thinks of him (Tr. 4.3.10). However, the bonds of friendship and marriage could still be exploited.

In book 4 of the Tristia, as the occurrences of the words “friend” (amicus, 3) and “dear” (carus, 3) decrease, the use of words such as “devoted”, “virtuous”, “worthy”, and “husband” increases. This lexicon suggests a form of conditional praise: only by proving themselves worthy of the friend and spouse in need can those left in Rome earn their title. Thus, if his friends are truly “virtuous” (bonus, 8) and “devoted” (pius, 6), and wish to enjoy such “fame” (fama, 7) among contemporaries and posterity, they must show themselves worthy of such a reputation. His wife must, similarly, prove herself worthy of being her husband’s (vir, 13) wife, even though he is exiled. Consequently, Ovid would sooner credit a ten-verse series of adynata than believe that his friend decided to abandon him (Tr. 4.7.10-20). At the same time, his wife, dutiful as she is (Tr. 4.3.71), must surely exist solely to work for and diligently lament her absent husband (Tr. 4.3.17-38). Moreover, his misfortune gives her a unique chance for fame, for her loyalty to be forever remembered (Tr. 4.3.81-84). This logic of coercion begins to be employed in book 4 of the Tristia, and is fully deployed in the Epistulae. It consists of imposing the fundamental moral models and values of the Roman citizen on his recipients through targeted praise, so that the recipients feel obliged to comply with his requests. Here too, sentiment analysis reveals in its embryonic state what the critical eye has only caught later, in full development.

6. Conclusion and future work

The work presented in this paper had two outcomes. Firstly, our LOD edition of Ovid demonstrates the benefits of interoperability among resources for Latin. Interoperability greatly facilitates the work of scholars, allowing them to benefit, through a single point of access, from lexicons, corpora, and NLP tools useful for every stage of their research. The LiLa project already provides a paradigm of this model, but to continue doing so, it requires constant integration. This is true not only for corpora, whose enrichment this paper testifies to. Despite the important results that SA conducted with LatinAffectus already provides for Ovid, there remain several ways of enhancing its performance. The coverage of LatinAffectus is extensive with regard to nouns and adjectives, as clearly demonstrated by its performance on the dataset discussed in this paper (see Table 3). However, a current limitation is evidently its failure to account for the sentiment of verbs. This is why LatinAffectus, like the other linguistic resources available in LiLa, should not be regarded as a static resource, but rather as one that is continually evolving and being updated. Additionally, improvements could be made by accounting for syntactic phenomena such as polarity shifters [16] and by taking into consideration the poetic nature of the text (e.g. by providing access to metrical information¹²). In a broader sense, the context in which sentiment words occur is not yet sufficiently taken into account. However, context-sensitive sentiment analysis is still in its early stages within NLP¹³, and much work clearly remains to be done to incorporate context effectively into sentiment analysis.

The second outcome is to suggest the undeniable potential of a hybrid approach, such as the one employed in this study, which combines literary criticism with quantitative methods and computational resources. The theories developed within literary criticism and the investigative tools provided by computational linguistics can and should effectively collaborate, mutually enriching each other. In this specific context, the reflections developed within literary criticism regarding Ovid’s exile works were crucial for interpreting the data derived from sentiment analysis. In turn, sentiment analysis was fundamental for confirming and deepening these observations, providing interpretable and reproducible data.

If a classic is a book which has never exhausted all it has to say to its readers (as Calvino wrote [35, 5]), it is also because scholars are capable of interrogating it with new methods to address longstanding and unresolved questions.

¹² For instance, this can be achieved by linking existing resources, such as Musisque Deoque, to LiLa.
¹³ See the paper by Teng et al. [34] for an overview of state-of-the-art studies on context-sensitive sentiment analysis.

[Figure 1: bar chart titled “Book-level sentiment score”; x-axis: Book (Tristia 1–5, Epistulae 1–4); y-axis: Sentiment Score.]

Figure 1: Ovid’s overall sentiment (i.e. sum of all polarity words in each book divided by the number of sentences in each book) across the 5 books of the Tristia and the 4 books of the Epistulae.
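The score described in the caption of Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which, per the paper, relies on the LiLa Linked Data resources); it only restates the arithmetic under stated assumptions: tokens are given as hypothetical `(book, sentence_id, lemma, upos)` tuples, the lexicon is a small dictionary holding only the example values quoted in the paper (uulnus, grauis, ianua, ius, pietas, malus), and it is keyed by lemma string and POS tag rather than by lemma URI. The book names and token sequences are invented toy data.

```python
from collections import defaultdict

# Toy excerpt of a prior-polarity lexicon. The values are the examples quoted
# in the paper; keying by (lemma string, UPOS) instead of a LiLa lemma URI is
# an assumption made for this sketch.
TOY_LEXICON = {
    ("uulnus", "NOUN"): -1.0,
    ("grauis", "ADJ"): -0.5,
    ("ianua", "NOUN"): 0.0,
    ("ius", "NOUN"): 0.5,
    ("pietas", "NOUN"): 1.0,
    ("malus", "ADJ"): -1.0,
}


def token_score(lemma, upos, lexicon):
    """Return the prior polarity of a token; 0.0 if the lemma has no entry."""
    return lexicon.get((lemma, upos), 0.0)


def book_level_scores(tokens, lexicon):
    """Sum the polarity scores per book and divide by the number of sentences.

    `tokens` is an iterable of (book, sentence_id, lemma, upos) tuples.
    """
    totals = defaultdict(float)   # book -> sum of polarity scores
    sentences = defaultdict(set)  # book -> distinct sentence ids
    for book, sentence_id, lemma, upos in tokens:
        totals[book] += token_score(lemma, upos, lexicon)
        sentences[book].add(sentence_id)
    return {book: totals[book] / len(sentences[book]) for book in totals}


# Hypothetical toy input: two "books" with invented token sequences.
TOY_TOKENS = [
    ("Book A", 1, "malus", "ADJ"),
    ("Book A", 1, "ianua", "NOUN"),
    ("Book A", 2, "pietas", "NOUN"),
    ("Book A", 2, "uulnus", "NOUN"),
    ("Book B", 1, "pietas", "NOUN"),
    ("Book B", 1, "amo", "VERB"),  # no lexicon entry, so it scores 0.0
]

scores = book_level_scores(TOY_TOKENS, TOY_LEXICON)
print(scores)
```

Dividing by the number of sentences rather than by the number of tokens follows the normalization stated in the caption; any other choice (for example, per polarity-laden token) would be a different measure.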
[Figure 2 plot: “Polarity lexicon by semantic classes across the books”; x-axis: Book (Tristia 1–5, Epistulae 1–4); y-axis: Count; legend (Category): Friendship, Politics, Justice, Intellectual Life, Sadness]
Figure 2: Distribution of polarized words according to semantic class across the 5 books of the Tristia and the 4 books of the Epistulae.

Appendix

The appendix contains the figures cited in section 5.

References

[1] G. Crane, The Perseus Digital Library and the future of libraries, International Journal of Digital Libraries 24 (2024) 117–128. URL: https://doi.org/10.1007/s00799-022-00333-2.
[2] S. J. Huskey, The Digital Latin Library: Cataloging and publishing critical editions of Latin texts, in: M. Berti (Ed.), Digital Classical Philology. Ancient Greek and Latin in the Digital Revolution, De Gruyter, Berlin, Boston, 2019, pp. 19–34. doi:10.1515/9783110599572-003.
[3] M. Passarotti, F. Mambrini, G. Franzini, F. M. Cecchini, E. Litta, G. Moretti, P. Ruffolo, R. Sprugnoli, Interlinking through Lemmas. The Lexical Collection of the LiLa Knowledge Base of Linguistic Resources for Latin, Studi e Saggi Linguistici 58 (2020) 177–212. doi:10.4454/ssl.v58i1.277, number: 1.
[4] M. Fantoli, M. Passarotti, F. Mambrini, G. Moretti, P. Ruffolo, Linking the LASLA Corpus in the LiLa Knowledge Base of Interoperable Linguistic Resources for Latin, in: Proceedings of the 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2022, pp. 26–34.
[5] A. L. Wheeler, Publius Ovidius Naso. Tristia. Ex Ponto, Harvard University Press, Cambridge, MA, 1959.
[6] J.-M. Claassen, Ovid revisited: The poet in exile, Bloomsbury Academic, London, 2008.
[7] P. Green, Ovid. The Poems of Exile. Tristia and the Black Sea Letters, University of California Press, Berkeley, 2005.
[8] B. Liu, Sentiment Analysis: Mining Opinions, Sentiments, and Emotions, 2nd ed., Cambridge University Press, Cambridge; New York, 2020.
[9] M. Wankhade, A. C. S. Rao, C. Kulkarni, A survey on sentiment analysis methods, applications, and challenges, Artif. Intell. Rev. 55 (2022) 5731–5780. URL: https://doi.org/10.1007/s10462-022-10144-1. doi:10.1007/s10462-022-10144-1.
[10] R. Bose, R. Dey, S. Roy, D. Sarddar, Sentiment Analysis on Online Product Reviews, 2018.
[11] F. Xing, E. Cambria, R. Welsch, Natural language based financial forecasting: a survey, Artificial Intelligence Review 50 (2018). doi:10.1007/s10462-017-9588-9.
[12] B. O’Connor, R. Balasubramanyan, B. Routledge, N. Smith, From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series, Proceedings of the International AAAI Conference on Web and Social Media 4 (2010) 122–129. URL: https://ojs.aaai.org/index.php/ICWSM/article/view/14031. doi:10.1609/icwsm.v4i1.14031, number: 1.
[13] E. M. Clark, T. A. James, C. A. Jones, A. Alapati, P. Ukandu, C. M. Danforth, P. S. Dodds, A sentiment analysis of breast cancer treatment experiences and healthcare perceptions across Twitter, ArXiv abs/1805.09959 (2018). URL: https://api.semanticscholar.org/CorpusID:44063573.
[14] E. Kim, R. Klinger, A Survey on Sentiment and Emotion Analysis for Computational Literary Studies, Zeitschrift für digitale Geisteswissenschaften (2019). URL: http://arxiv.org/abs/1808.03137. doi:10.17175/2019_008, arXiv:1808.03137 [cs].
[15] P. C. Hogan, B. J. Irish, L. P. Hogan (Eds.), The Routledge Companion to Literature and Emotion, Routledge, London, 2022. doi:10.4324/9780367809843.
[16] R. Sprugnoli, F. Mambrini, M. Passarotti, G. Moretti, The Sentiment of Latin Poetry. Annotation and Automatic Analysis of the Odes of Horace, IJCoL. Italian Journal of Computational Linguistics 9 (2023). doi:10.4000/ijcol.1125.
[17] R. Sprugnoli, M. C. Passarotti, M. Testori, G. Moretti, Extending and using a sentiment lexicon for Latin in a linked data framework, 2021. URL: https://api.semanticscholar.org/CorpusID:248149526.
[18] J. Pavlopoulos, A. Xenos, D. Picca, Sentiment Analysis of Homeric Text: The 1st Book of Iliad, in: N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, S. Piperidis (Eds.), Proceedings of the Thirteenth Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2022, pp. 7071–7077. URL: https://aclanthology.org/2022.lrec-1.765.
[19] H. Zhao, B. Wu, H. Wang, C. Shi, Sentiment analysis based on transfer learning for Chinese ancient literature, in: 2014 International Conference on Behavioral, Economic, and Socio-Cultural Computing (BESC2014), 2014, pp. 1–7. URL: https://ieeexplore.ieee.org/document/7059510. doi:10.1109/BESC.2014.7059510.
[20] Y. Hou, A. Frank, Analyzing Sentiment in Classical Chinese Poetry, in: K. Zervanou, M. van Erp, B. Alex (Eds.), Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), Association for Computational Linguistics, Beijing, China, 2015, pp. 15–24. URL: https://aclanthology.org/W15-3703. doi:10.18653/v1/W15-3703.
[21] S. Rebora, Sentiment Analysis in Literary Studies. A Critical Survey, Digital Humanities Quarterly 17 (2023). URL: https://www.proquest.com/scholarly-journals/sentiment-analysis-literary-studies-critical/docview/2842908301/se-2?accountid=9941, place: Providence.
[22] F. Mambrini, M. C. Passarotti, The LiLa Lemma Bank: A knowledge base of Latin canonical forms, Journal of Open Humanities Data (2023). doi:10.5334/johd.145.
[23] P. Cimiano, C. Chiarcos, J. P. McCrae, J. Gracia, Linguistic Linked Data: Representation, Generation and Applications, Springer International Publishing, Cham, 2020. doi:10.1007/978-3-030-30225-2.
[24] F. Grotto, R. Sprugnoli, M. Fantoli, M. Simi, F. M. Cecchini, M. C. Passarotti, The annotation of Liber Abbaci, a domain-specific Latin resource, in: Proceedings of the Eighth Italian Conference on Computational Linguistics (CLiC-it 2021), aAccademia University Press, Milan, 2021, pp. 176–183. URL: https://doi.org/10.4000/books.aaccademia.10659.
[25] F. Mambrini, M. Passarotti, G. Moretti, M. Pellegrini, The Index Thomisticus Treebank as Linked Data in the LiLa Knowledge Base, in: N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, S. Piperidis (Eds.), Proceedings of the Thirteenth Language Resources and Evaluation Conference (LREC 2022), European Language Resources Association (ELRA), Marseille, France, 2022, pp. 4022–4029. URL: https://aclanthology.org/2022.lrec-1.428.
[26] F. M. Cecchini, R. Sprugnoli, G. Moretti, M. Passarotti, UDante: First Steps Towards the Universal Dependencies Treebank of Dante’s Latin Works, in: J. Monti, F. Dell’Orletta, F. Tamburini (Eds.), Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020, Bologna, Italy, March 1–3 2021), Associazione italiana di linguistica computazionale (AILC), Accademia University Press, Turin, Italy, 2020, pp. 99–105. URL: http://ceur-ws.org/Vol-2769/paper_14.pdf.
[27] I. De Felice, L. Tamponi, F. Iurescia, M. Passarotti, Linking the Corpus CLaSSES to the LiLa Knowledge Base of Interoperable Linguistic Resources for Latin, in: Proceedings of CLiC-it 2023: 9th Italian Conference on Computational Linguistics, Nov 30 – Dec 02, 2023, CEUR Workshop Proceedings, Venice, 2023, pp. 1–7. URL: https://ceur-ws.org/Vol-3596/paper20.pdf.
[28] R. Sprugnoli, M. Passarotti, D. Corbetta, A. Peverelli, Odi et Amo. Creating, Evaluating and Extending Sentiment Lexicons for Latin, in: Proceedings of the Twelfth Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2020, pp. 3078–3086.
[29] M. Passarotti, F. Mambrini, G. Moretti, The services of the LiLa knowledge base of interoperable linguistic resources for Latin, in: Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, ELRA and ICCL, Torino, Italia, 2024, pp. 75–83.
[30] D. Bamman, F. Mambrini, G. Crane, An Ownership Model of Annotation: The Ancient Greek Dependency Treebank, in: Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories, EDUCatt, Milan, Italy, 2009, pp. 5–15.
[31] J.-M. Claassen, Displaced Persons: The Literature of Exile from Cicero to Boethius, Duckworth, 1999. Google-Books-ID: 1FkXAQAAIAAJ.
[32] M. Labate, Elegia triste ed elegia lieta. Un caso di riconversione letteraria, Materiali e discussioni per l’analisi dei testi classici (1987) 91–129. doi:10.2307/40235896.
[33] G. Berruto, M. Cerruti, La linguistica. Un corso introduttivo, 3rd ed., UTET Università, [Grugliasco], 2022.
[34] Z. Teng, D. T. Vo, Y. Zhang, Context-Sensitive Lexicon Features for Neural Sentiment Analysis, in: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Austin, Texas, 2016, pp. 1629–1638. URL: http://aclweb.org/anthology/D16-1169. doi:10.18653/v1/D16-1169.
[35] I. Calvino, Why read the classics? Perchè leggere i classici?, Penguin, London, 2009.