Sentiment Analysis of Latin Poetry: First Experiments on the Odes of Horace

Rachele Sprugnoli, Francesco Mambrini, Marco Passarotti, Giovanni Moretti
CIRCSE Research Centre, Università Cattolica del Sacro Cuore
Largo Agostino Gemelli 1, 20123 Milano
{rachele.sprugnoli, francesco.mambrini, marco.passarotti, giovanni.moretti}@unicatt.it

Abstract

In this paper we present a set of annotated data and the results of a number of unsupervised experiments for the analysis of sentiment in Latin poetry. More specifically, we describe a small gold standard made of eight poems by Horace, in which each sentence is labeled manually for sentiment using a four-value classification (positive, negative, neutral and mixed). We then report on how this gold standard has been used to evaluate two automatic approaches to sentiment classification: one is lexicon-based and the other adopts a zero-shot transfer approach.[1]

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
[1] This paper is the result of the collaboration between the four authors. For the specific concerns of the Italian academic attribution system, Rachele Sprugnoli is responsible for Sections 2, 3, 4.2 and 5; Marco Passarotti is responsible for Section 1; Francesco Mambrini is responsible for Section 4.1. Giovanni Moretti developed the zero-shot classification script.

1 Introduction

The task of automatically classifying a (piece of) text according to the sentiment it conveys, known as Sentiment Analysis (SA), is usually performed for purposes such as monitoring social media content or evaluating customer experience, by analysing texts like tweets, comments and micro-blogs.

A still under-investigated yet promising research area for developing and applying SA resources and techniques is the study of literary texts written in historical and, particularly, Classical languages (e.g. Ancient Greek and Latin). Investigating the lexical properties of Classical literary texts is, in fact, a centuries-long common practice. Nowadays, however, such investigation can (1) lead to replicable results, (2) benefit from techniques developed for analysing the sentiment conveyed by any type of text, and (3) be performed with freely available lexical and textual resources.

As for the latter, the research area dedicated to building and using linguistic resources for Classical languages has seen substantial growth over the last two decades (Sprugnoli and Passarotti, 2020). As concerns SA, we recently built a polarity lexicon for Latin nouns and adjectives, called LatinAffectus. The current version of the lexicon includes 4,125 Latin lemmas with their corresponding prior polarity value (Sprugnoli et al., 2020b). LatinAffectus was developed in the context of the LiLa: Linking Latin project (2018-2023)[2] (Passarotti et al., 2020), which aims at building a Knowledge Base of linguistic resources for Latin based on the Linked Data paradigm, i.e. a collection of several data sets described using the same vocabulary of knowledge description and linked together. LatinAffectus is connected to the Knowledge Base, thus making it interoperable with the other linguistic resources linked so far to LiLa (Sprugnoli et al., 2020a).

[2] https://lila-erc.eu

In this paper we describe the use of LatinAffectus to perform SA of the Odes (Carmina) by Horace (65-8 BCE). Written between 35 and 13 BCE, the Odes are a collection of lyric poems in four books. Following the models of Greek lyric poets like Alcaeus, Sappho and Pindar, the Odes cover a wide range of topics related to individual and social life in Rome during the age of Augustus, such as love, friendship, religion, morality, patriotism, the uncertainty of life, the cultivation of tranquility and the observance of moderation. In spite of a rather lukewarm initial reception, the Odes quickly became a capital source of influence, in particular as a model of authorial voice and identity.[3] Considering not only the importance of the Odes in the history of Latin and European literature, but also the diversity of the contents and tones of the poems collected therein, we argue that performing SA on such a work can lead to interesting results and might represent a use case to open a discussion about the pros and cons of applying SA techniques and resources to literary texts written in ancient languages.

All data presented in this paper are publicly released: https://github.com/CIRCSE/Latin_Sentiment_Analysis

[3] For an orientation on the vast subject of the fortune and reception of the Odes see Baldo (2012).

2 Related Work

The majority of linguistic resources and applications in the field of SA involve non-literary and non-poetic texts, such as news and user-generated content on the web (Medhat et al., 2014). However, affective information plays a crucial role in literature and, in particular, in poetry, where authors try to provoke an emotional response in the reader (Johnson-Laird and Oatley, 2016). Annotated corpora of poems and SA systems specifically designed for poetry are not as numerous as those in other areas of research, first of all that of social media, but work has been carried out for several languages,[4] including Arabic (Alsharif et al., 2013), Spanish (Barros et al., 2013), Odia (Mohanty et al., 2018), German (Haider et al., 2020), Classical Chinese (Hou and Frank, 2015) and, of course, English (Sheng and Uthus, 2020; Sreeja and Mahalakshmi, 2019).

[4] For a recent survey on sentiment and emotion analysis applied to literature, see Kim and Klinger (2018).

Available annotated corpora of poems differ from each other in at least four respects: annotation procedure (either involving experts or using crowdsourcing techniques), unit of analysis (verse, stanza, whole poem), granularity of classification (from binary classes, such as positive and negative, to wide sets of emotions), and focus of the emotions (annotation of the emotions as depicted in the text by the author or as felt by the reader). With respect to previous work, in this paper we chose to involve experts, to perform annotation at the sentence level (as an intermediate degree of granularity between verse and stanza), to assign four generic classes without defining the specific emotion conveyed by the text, and to focus on the sentiment as depicted by the author.

As for automatic classification systems, the literature reports both lexicon-based (Bonta and Janardhan, 2019) and machine learning approaches, with a constantly increasing use of deep learning techniques (Zhang et al., 2018). For example, Mohanty et al. (2018) experiment with Linear-SVM, Naive-Bayes and Logistic Regression classifiers on Odia poems, while Haider et al. (2020) perform multi-label classification on German stanzas with BERT. Given the lack of training data for Latin poetry, in this paper we instead test unsupervised approaches.

3 Gold Standard Creation

3.1 Annotation

The Gold Standard (GS) consists of eight randomly selected odes,[5] two from each of the four books that make up the work, for a total of 955 tokens, without punctuation, and 44 sentences (average sentence length: 21, standard deviation: 11). Texts were taken from the corpus prepared by the LASLA laboratory in Liège.[6] We performed a single-label annotation of the original Latin text by Horace at sentence level. We chose the sentence as the unit of annotation because it represents an intermediate degree of granularity between that of the verse and that of the stanza. In fact, the limited length of a verse can hinder the full understanding of the sentiment it conveys, while a stanza, being longer, risks containing very different content and thus, potentially, even opposite sentiments. Furthermore, not all poems can be divided into stanzas, as this depends on the metric scheme of the poem. Sentences, instead, can be detected in every poem regardless of its metric scheme, and represent a unit of meaning in their own right.

[5] Book I: odes 10 and 17; Book II: odes 7 and 13; Book III: odes 13 and 23; Book IV: odes 7 and 11.
[6] http://web.philo.ulg.ac.be/lasla/opera-latina/

In the annotation phase, we involved two experts in Latin language and literature (A1 and A2) and another annotator with basic knowledge of Latin but with previous experience in sentiment annotation (A3). Annotators were asked to identify the sentiment conveyed by each sentence in the GS, taking into consideration both the vocabulary used by the author and the images that are evoked in the ode. More specifically, annotators were asked to answer the following question: which of the following classes best describes the emotions conveyed by the poet in the sentence under analysis?

• positive: the only emotions that are conveyed at the lexical level and the only images that are evoked are positive, or positive emotions are clearly prevalent;
• negative: the only emotions that are conveyed at the lexical level and the only images that are evoked are negative, or negative emotions are clearly prevalent;
• neutral: there are no emotions conveyed by the text;
• mixed: lexicon and evoked images produce opposite emotions; it is not possible to find a clearly prevailing emotion.

The annotation of the GS was organized in four phases. In the first phase, annotators worked together, collaboratively assigning the sentiment class to four of the eight odes (21 sentences): the task was discussed and a common procedure was defined. In the second phase, annotators worked independently on the other four odes (23 sentences): A1 and A2 annotated the original Latin text, while A3 annotated the same odes using an Italian translation (Horace and Nuzzo, 2009), in order to understand how the use of texts not in the original language can alter the annotation of the sentiment. In the third phase, we calculated the Inter-Annotator Agreement, and in the last phase disagreements were discussed and reconciled.

3.2 Inter-Annotator Agreement

Cohen's k between A1 and A2 was 0.5, while Fleiss's k among the three annotators (A1-A2-A3) was 0.48 (both values indicate moderate agreement). In particular, the negative class proved to be the easiest to annotate (with a Fleiss's k of 0.64), followed by neutral (0.57) and positive (0.45), whereas mixed was the most problematic class (0.23).

We noticed that the Italian translation was sometimes misleading, resulting in cases of disagreement: e.g., the sentence inmortalia ne speres monet annus et almum quae rapit hora diem (ode IV, 7) is translated as 'speranze di eterno ti vietano gli anni e le ore che involano il giorno radioso' (literal translation of the Italian sentence into English: 'hopes of eternity forbid you the years and the hours that steal the radiant day'). A3 marked this sentence as mixed, considering that it is impossible to identify a prevailing emotion between the negativity expressed by the verb 'vietare' ('to forbid') and the positivity of 'giorno radioso' ('radiant day'). However, the translation of the Latin verb rapio is not appropriate: the Italian verb 'involare' ('to steal') does not convey the idea of the violent force inherent in rapio, which is more correctly translated with the verb 'to plunder'.[7]

[7] See for instance the English translation by Kaimowitz et al. (2008): "Do not hope for what's immortal, the year warns, and the hour which plunders the day".

3.3 Reconciliation

Disagreements were discussed and reconciled by the three annotators: Table 1 presents the number of sentences and tokens per sentiment class. Our GS includes a majority of positive sentences (45.4%). Positive (average length: 21, standard deviation: 11), negative (average length: 24, standard deviation: 14) and mixed (average length: 25, standard deviation: 9) sentences are considerably longer than neutral ones (average length: 8, standard deviation: 3). Annotated examples are given in Table 2: English translations by Kaimowitz et al. (2008) are included for clarity.

            Sentences   Tokens
positive        20        411
negative        12        292
neutral          3         23
mixed            9        229
TOTAL           44        955

Table 1: Gold Standard statistics.

4 Experiments

4.1 Lexicon-Based Sentiment Analysis

The dataset for this experiment is obtained by means of a simple dictionary lookup of the lemmas in the LatinAffectus sentiment lexicon. Entries in the lexicon are assigned a score of -1.0 or -0.5 (negative polarity), 0 (neutral polarity), or +0.5 or +1.0 (positive polarity). Tokens in the Odes that are lemmatized under lemmas that also have an entry in LatinAffectus are assigned the score found in the lexicon. For instance, the adjective malus 'bad' is found with a polarity value of -1.0 in LatinAffectus; all tokens lemmatized as malus (adj.) are thus given a score of -1.0.
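The lookup and the sentence-level labelling used in this experiment can be sketched as follows. This is a minimal illustration, not the actual implementation: the mini-lexicon below is a hypothetical stand-in for LatinAffectus, and the PoS tags are invented for the example; the aggregation rule follows the sentence score S described in Section 4.1.

```python
# Sketch of a LatinAffectus-style lookup; the (lemma, PoS) entries and
# their scores below are illustrative stand-ins, not real lexicon data.
LEXICON = {
    ("malus", "ADJ"): -1.0,   # 'bad'
    ("almus", "ADJ"): 1.0,    # 'nourishing, kind'
    # homograph disambiguated by PoS, as with ales 'winged' (adj.) / 'bird' (n.)
    ("ales", "ADJ"): 0.5,
    ("ales", "NOUN"): 0.0,
}

def token_score(lemma, pos):
    # Words absent from the lexicon count as 0.0, like expressly neutral ones.
    return LEXICON.get((lemma, pos), 0.0)

def classify(tokens):
    """tokens: list of (lemma, pos) pairs for one sentence."""
    scores = [token_score(lemma, pos) for lemma, pos in tokens]
    s = sum(scores)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    # S == 0: all-zero words -> neutral; balanced polarities -> mixed
    return "neutral" if all(v == 0 for v in scores) else "mixed"
```

A sentence containing only malus (adj.) would thus be labelled negative, while one containing both malus and almus would sum to zero with non-zero terms and be labelled mixed.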
Ode   Sent.  Text                                         Translation                                    Class
1.17  103    hic tibi copia manabit ad plenum             Here for you will flow abundance from          positive
             benigno ruris honorum opulenta cornu         the horn that spills the country's splendors
4.7   549    cuncta manus auidas fugient heredis          All that you bestow upon your heart            negative
             amico quae dederis animo                     escapes the greedy hands of an heir
2.13  265    frigora mitescunt Zephyris uer proterit      With the Zephyrs cold grows mild, summer       mixed
             aestas interitura simul pomifer autumnus     tramples springtime, soon to die, once
             fruges effuderit et mox bruma recurrit       productive autumn pours forth its fruits,
             iners                                        and shortly lifeless winter is back
2.7   235    quem Venus arbitrum dicet bibendi            Who will Venus name as master of the wine?     neutral

Table 2: Annotated examples taken from the Gold Standard.

Note that a score of 0.0 is assigned both to words expressly annotated as neutral in LatinAffectus and to those that do not have an entry in the lexicon.

The dictionary lookup required some manual disambiguation in cases of ambiguity due to homography. For 18 lemmas (corresponding to 49 tokens in the Odes), the sentiment lexicon provides multiple values; in most cases, as with ales 'winged' (adj.) but also 'bird' (n.), the variation is due to a different polarity attributed to the syntactic uses of the word (in the example, to the adjective and the noun). In such cases, the PoS annotation in the LASLA corpus was used to disambiguate and assign the correct score. We also reviewed those words that, although not tagged as nouns or adjectives in LASLA, still yield a match in LatinAffectus. After revision, we decided to keep the scores for a series of lemmas annotated as numerals in the corpus (simplex 'simple, plain', primus and primum 'first', prius 'former, prior') and for the indefinite pronoun solus 'alone, only', which in LatinAffectus are marked as adjectives.

A sentence score (S) was computed by summing the values of all words. We attributed the label positive to all sentences with score S > 0 and negative where S < 0. For S = 0, we attributed neutral to sentences where all words had a score of 0 and mixed where positive and negative words were equivalent. The overall accuracy of this method is 48% (macro-average F1 37, weighted macro-average F1 44), with unbalanced scores among the four classes: 70% for positive, 42% for negative, 67% for neutral, while no correct predictions were given for mixed.

4.2 Zero-Shot Classification

We trained a language model for SA on English and tested it on our GS, relying on two state-of-the-art multilingual models. More specifically, we fine-tuned Multilingual BERT (mBERT) (Pires et al., 2019) and XLM-RoBERTa (Conneau et al., 2020) on the GoEmotions corpus (Demszky et al., 2020) using Hugging Face's PyTorch implementation.[8] GoEmotions is a dataset of comments posted on Reddit, manually annotated with 27 emotion categories or Neutral. In order to adapt this dataset to our needs, we mapped the emotions onto sentiment categories as suggested by the authors themselves. For example, joy and love were merged into a unique positive class, whereas fear and grief were merged under the same negative class. The neutral category remained intact, and comments annotated with emotions belonging to opposite sentiments were marked as mixed. Comments labeled with ambiguous emotions (i.e. realization, surprise, curiosity, confusion) were instead left out.[9] With this procedure, we built a training set made of 18,617 positive, 10,133 negative, 1,965 neutral and 1,581 mixed comments.

[8] https://huggingface.co/transformers/index.html
[9] For the full mapping, please see: https://github.com/google-research/google-research/blob/master/goemotions/data/sentiment_mapping.json
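The collapsing of GoEmotions labels into the four sentiment classes can be sketched as below. Only a handful of emotions are listed for illustration; the full sets come from the mapping released by the GoEmotions authors, and the function name is ours.

```python
# Sketch of mapping GoEmotions emotion labels to the four sentiment
# classes. The emotion sets are abbreviated examples, not the full mapping.
POSITIVE = {"joy", "love", "admiration", "gratitude"}
NEGATIVE = {"fear", "grief", "anger", "sadness"}
AMBIGUOUS = {"realization", "surprise", "curiosity", "confusion"}

def to_sentiment(emotions):
    """Map the emotion labels of one comment to a sentiment class,
    or None if the comment is discarded from the training set."""
    labels = set(emotions)
    if labels & AMBIGUOUS:
        return None                 # ambiguous emotions are left out
    if labels == {"neutral"}:
        return "neutral"
    has_pos = bool(labels & POSITIVE)
    has_neg = bool(labels & NEGATIVE)
    if has_pos and has_neg:
        return "mixed"              # opposite sentiments co-occur
    if has_pos:
        return "positive"
    if has_neg:
        return "negative"
    return None                     # emotion not covered by this sketch
```

A comment tagged with both joy and fear, for example, would land in the mixed class, while one tagged only with surprise would be dropped.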
For fine-tuning, we chose the following hyperparameters: batch size of 32, learning rate of 2e-5, 6 epochs, AdamW optimizer.[10]

[10] We adapted the following implementation: https://gist.github.com/sayakmisra/b0cd67f406b4e4d5972f339eb20e64a5

We evaluated the trained models on different datasets, including our GS. For each of the following test sets, we randomly selected 44 texts, so as to have the same number of input data as in our GS:

• GoEmotions: test set taken from the same corpus used for training the English model.
• Poem Sentiment: collection of English verses annotated with the same sentiment classes as in our GS (Sheng and Uthus, 2020).
• AIT-2018: English data of the emotion classification task of SemEval-2018 Task 1: Affect in Tweets (Mohammad et al., 2018). Each tweet is annotated as neutral or with one or more of eleven emotions. The original annotation was mapped onto our four sentiment classes, leaving out ambiguous emotions.
• AriEmozione: verses taken from 18th-century Italian opera texts, annotated with one or two emotions and the confidence level of the annotators (Fernicola et al., 2020). We randomly selected our test set from verses with high confidence scores, mapping emotions onto our four sentiment classes. Since the dataset does not contain verses annotated with opposite emotions, the class mixed is not present in the test set we built.
• MultiEmotions-It: a multi-labeled emotion dataset made of Italian comments posted on YouTube and Facebook (Sprugnoli, 2020). The original emotion labels were converted into our four classes.

Table 3 reports the results of mono-lingual and cross-lingual classification for the different datasets briefly described above and for the two pre-trained multilingual models. There is no clear prevalence of one model over the other: results vary greatly from one dataset to another.

Language   Test Set           Genre              mBERT   XLM-RoBERTa
English    GoEmotions         social media       86%     73%
English    AIT-2018           social media       64%     59%
English    Poem Sentiment     literary - poetry  50%     70%
Italian    MultiEmotions-It   social media       70%     75%
Italian    AriEmozione        literary - opera   50%     52%
Latin      Horace GS          literary - poetry  32%     30%

Table 3: Accuracy of the mono-lingual and cross-lingual (zero-shot) classification method.

On the same language (thus without zero-shot transfer), we notice a drop in performance for both mBERT and XLM-RoBERTa when moving from Reddit comments, i.e. the same type of text as the training data, to tweets, and even more so when they are evaluated on poems. As for zero-shot classification, results on Italian YouTube and Facebook comments are better than those registered on English tweets, but accuracy drops when the models are applied to opera verses. However, the worst results are recorded for Latin, with an accuracy equal to, or slightly above, 30% (for mBERT: macro-average F1 29, weighted macro-average F1 35; for XLM-RoBERTa: macro-average F1 24, weighted macro-average F1 26). For both mBERT and XLM-RoBERTa, we register the same trend at class level: perfect accuracy for neutral, good accuracy for negative (50% with mBERT and 67% with XLM-RoBERTa), low accuracy for positive (25% with mBERT and 10% with XLM-RoBERTa) and no correct predictions for mixed.

           Lexicon-Based SA     Zero-Shot mBERT      Zero-Shot XLM-RoBERTa
           P     R     F1       P     R     F1       P     R     F1
positive   0.56  0.70  0.62     0.83  0.25  0.38     1.00  0.10  0.18
negative   0.62  0.42  0.50     0.75  0.50  0.60     0.53  0.67  0.59
neutral    0.25  0.67  0.36     0.10  1.00  0.18     0.11  1.00  0.20
mixed      0.00  0.00  0.00     0.00  0.00  0.00     0.00  0.00  0.00

Table 4: Precision (P), recall (R) and F1-score (F1) for the lexicon-based method and for the zero-shot classification experiments.
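The per-class figures in Table 4 are ordinary precision, recall and F1. As a minimal sketch of how such scores are computed from gold and predicted labels (the function name and the convention of 0.0 for undefined scores are ours):

```python
def prf(gold, pred, cls):
    """Per-class precision, recall and F1, with 0.0 when undefined."""
    tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Macro-average F1 is then the unweighted mean of the four per-class F1 values, while the weighted variant weights each class by its number of gold sentences.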
This problem could be mitigated by modifying the script with 5 Conclusions and Future Work rules that take into account negations and their fo- cus. In this paper we have presented a new GS, made Regarding the zero-shot classification approach, of odes written by Horace, for the annotation of the very low performances on Latin deserve fur- sentiment in Latin poetry. The extension of the ther investigation. It is possible that the problem manually annotated dataset is one of our future lies in the data used to build the pre-trained mod- work: the goal is to have a sufficient amount of els: i.e., Wikipedia for mBERT and Common- data to test supervised systems. We have also ex- crawl for XML-RoBERTa. Both resources were perimented two different SA approaches that do developed by relying on automatic language de- not require training data: both of them are not able tection engines and are highly noisy due to the to correctly identify sentences with mixed senti- presence of languages other than Latin and of ments, which, in any case, are the most problem- terms related to modern times. An additional im- atic also for human annotators. Table 4 reports a provement may also come from using for fine- comparison in terms of precision, recall and F1- tuning an annotated in-domain corpus in a well- score among the lexicon-based approach and the resource language, that is a corpus of annotated zero-shot classification experiments with both the poems: unfortunately, the currently available cor- mBERT and the XML-RoBERTa models. The pora are not big enough for such purpose. former performs better on the positive class whereas the zero-shot method achieves a higher Acknowledgments F1-score on the negative one even if this class This project has received funding from the Eu- is not the most frequent in the training data. 
Both ropean Research Council (ERC) under the Euro- mBERT and XML-RoBERTa obtain a very high pean Union’s Horizon 2020 research and innova- precision on the sentences marked as positive tion programme – Grant Agreement No. 769994. (0.83 and 1.00 respectively) but the recall is ex- tremely low (0.25 and 0.10 respectively). On the contrary, for the neutral class, the recall is per- References fect (1.00 for both models) but the precision is Ouais Alsharif, Deema Alshamaa, and Nada Ghneim. very low (0.10 and 0.11 respectively). 2013. Emotion classification in arabic poetry using A manual inspection of the output of the machine learning. International Journal of Com- lexicon-based method revealed two main prob- puter Applications, 65(16). lems of that approach: i) the limited coverage Gianluigi Baldo. 2012. Horace (Quintus Horatius of LatinAffectus and ii) sentiment shifters are not Flaccus), Carmina. In Christine Walde and Brigitte properly taken into consideration. As for the first Egger, editors, Brill’s New Pauly Supplements I - point, LatinAffectus covers the 43% of nominal Volume 5 : The Reception of Classical Literature. and adjectival lemmas in the GS, leaving out lem- Brill, Amsterdam, October. Publisher: Brill. mas with a clear sentiment orientation. To over- Linda Barros, Pilar Rodriguez, and Alvaro Ortigosa. come this issue, we are currently working on the 2013. Automatic Classification of Literature Pieces extension of the lexicon with additional 10,000 by Emotion Detection: A Study on Quevedo’s Po- lemmas. Regarding the sentiment shifters, their etry. In 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, impact is exemplified by the following sentence: pages 141–146. IEEE. cum semel occideris et de te splendida Minos fe- cerit arbitria non Torquate genus non te facun- Venkateswarlu Bonta and Nandhini Kumaresh2and N Janardhan. 2019. 
A comprehensive study on dia non te restituet pietas (‘When you at last have lexicon based approaches for sentiment analysis. died and Minos renders brillant judgement on your Asian Journal of Computer Science and Technology, life, no Torquatus, not birth, not eloquence, not 8(S2):1–6. your devotion will bring you back.’ - ode IV, 7). Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Here, the sentiment score calculated by the script Vishrav Chaudhary, Guillaume Wenzek, Francisco is very positive (3) because it does not handle Guzmán, Edouard Grave, Myle Ott, Luke Zettle- the frequent negations: however, the particle non moyer, and Veselin Stoyanov. 2020. Unsupervised cross-lingual representation learning at scale. In Proceedings of the Eleventh International Confer- Proceedings of the 58th Annual Meeting of the Asso- ence on Language Resources and Evaluation (LREC ciation for Computational Linguistics, pages 8440– 2018), Paris, France, may. European Language Re- 8451, Online, July. Association for Computational sources Association (ELRA). Linguistics. Marco Passarotti, Francesco Mambrini, Greta Franzini, Dorottya Demszky, Dana Movshovitz-Attias, Jeong- Flavio Massimiliano Cecchini, Eleonora Litta, Gio- woo Ko, Alan Cowen, Gaurav Nemade, and Su- vanni Moretti, Paolo Ruffolo, and Rachele Sprug- jith Ravi. 2020. GoEmotions: A Dataset of Fine- noli. 2020. Interlinking through lemmas. the lexical Grained Emotions. In 58th Annual Meeting of the collection of the LiLa knowledge base of linguis- Association for Computational Linguistics (ACL). tic resources for Latin. Studi e Saggi Linguistici, 58(1):177–212. Francesco Fernicola, Shibingfeng Zhang, Federico Garcea, Paolo Bonora, and Alberto Barrón-Cedeño. Telmo Pires, Eva Schlinger, and Dan Garrette. 2019. 2020. AriEmozione: Identifying Emotions in Opera How Multilingual is Multilingual BERT? In Pro- Verses. 
In Proceedings of the Seventh Italian ceedings of the 57th Annual Meeting of the Asso- Conference on Computational Linguistics (CLiC-it ciation for Computational Linguistics, pages 4996– 2020). Accademia University Press. 5001. Thomas Haider, Steffen Eger, Evgeny Kim, Roman Emily Sheng and David C Uthus. 2020. Investigat- Klinger, and Winfried Menninghaus. 2020. PO- ing societal biases in a poetry composition system. EMO: Conceptualization, annotation, and model- In Proceedings of the Second Workshop on Gender ing of aesthetic emotions in German and English Bias in Natural Language Processing, pages 93– poetry. In Proceedings of the 12th Language Re- 106. sources and Evaluation Conference, pages 1652– 1663, Marseille, France, May. European Language Rachele Sprugnoli and Marco Passarotti, editors. 2020. Resources Association. Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Horace and Gianfranco Nuzzo. 2009. I quattro libri Languages, Marseille, France, May. European Lan- delle Odi e l’Inno secolare di Quinto Orazio Flacco. guage Resources Association (ELRA). Flaccovio. Rachele Sprugnoli, Francesco Mambrini, Giovanni Yufang Hou and Anette Frank. 2015. Analyzing sen- Moretti, and Marco Passarotti. 2020a. Towards the timent in classical chinese poetry. In Proceedings Modeling of Polarity in a Latin Knowledge Base. In of the 9th SIGHUM Workshop on Language Tech- Proceedings of the Third Workshop on Humanities nology for Cultural Heritage, Social Sciences, and in the Semantic Web (WHiSe 2020), pages 59–70. Humanities (LaTeCH), pages 15–24. Rachele Sprugnoli, Marco Passarotti, Daniela Cor- betta, and Andrea Peverelli. 2020b. Odi et Amo. Philip N. Johnson-Laird and Keith Oatley. 2016. Creating, Evaluating and Extending Sentiment Lexi- Emotions in music, literature, and film. In Lisa cons for Latin. In Proceedings of the 12th Language Feldman Barrett, Michael Lewis, and Jeannette M. 
Resources and Evaluation Conference, pages 3078– Haviland-Jones, editors, Handbook of emotions, 3086. chapter 3, pages 82–97. The Guildford Press. Rachele Sprugnoli. 2020. MultiEmotions-it: A new Jeffrey H Kaimowitz, Ronnie Ancona, et al. 2008. The dataset for opinion polarity and emotion analysis odes of Horace. Johns Hopkins University Press. for Italian. In Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it Evgeny Kim and Roman Klinger. 2018. A survey on 2020), pages 402–408. Accademia University Press. sentiment and emotion analysis for computational literary studies. arXiv preprint arXiv:1808.03137. PS Sreeja and GS Mahalakshmi. 2019. Perc-an emo- tion recognition corpus for cognitive poems. In Walaa Medhat, Ahmed Hassan, and Hoda Korashy. 2019 International Conference on Communication 2014. Sentiment analysis algorithms and applica- and Signal Processing (ICCSP), pages 0200–0207. tions: A survey. Ain Shams engineering journal, IEEE. 5(4):1093–1113. Lei Zhang, Shuai Wang, and Bing Liu. 2018. Deep Saif M. Mohammad, Felipe Bravo-Marquez, Moham- learning for sentiment analysis: A survey. Wiley In- mad Salameh, and Svetlana Kiritchenko. 2018. terdisciplinary Reviews: Data Mining and Knowl- Semeval-2018 Task 1: Affect in tweets. In Proceed- edge Discovery, 8(4):e1253. ings of International Workshop on Semantic Evalu- ation (SemEval-2018), New Orleans, LA, USA. Gaurav Mohanty, Pruthwik Mishra, and Radhika Mamidi. 2018. Kabithaa: An annotated corpus of odia poems with sentiment polarity information. In