=Paper=
{{Paper
|id=Vol-3878/2_main_long
|storemode=property
|title=Lifeless Winter without Break: Ovid's Exile Works and the LiLa Knowledge Base
|pdfUrl=https://ceur-ws.org/Vol-3878/2_main_long.pdf
|volume=Vol-3878
|authors=Aurora Alagni,Francesco Mambrini,Marco Passarotti
|dblpUrl=https://dblp.org/rec/conf/clic-it/AlagniMP24
}}
==Lifeless Winter without Break: Ovid's Exile Works and the LiLa Knowledge Base==
Aurora Alagni¹,*, Francesco Mambrini¹ and Marco Passarotti¹
¹ Università Cattolica del Sacro Cuore, Largo Gemelli 1, Milano, 20123, Italy
Abstract
In this paper we describe the process of semi-automatic annotation and linking performed to connect two works by the
Latin poet Ovid to the LiLa Knowledge Base. Written after Ovid’s exile from Rome, the Tristia and the Epistulae ex Ponto
mark the beginning of the “literature of exile”. In spite of their importance, no lemmatized version existed and the two
collections were not part of the major annotated corpora linked to LiLa. The paper discusses the workflow used to annotate
and publish the works as Linked Open Data connected to the LiLa Knowledge Base. On account of their subject and the
emotional tone attached to the theme of exile, the two works are particularly relevant for sentiment analysis. We discuss some
results of a lexicon-based analysis that is enabled by the interlinking with LiLa. We use LatinAffectus, a manually-generated
polarity lexicon for Latin nouns and adjectives, to perform Sentiment Analysis on the aforementioned works and interpret
the (replicable) results by consulting and simultaneously enriching the available literary scholarship with new information.
Keywords
Linked Open Data, Lemmatization, Latin, Sentiment Analysis, Humanities Computing
CLiC-it 2024: Tenth Italian Conference on Computational Linguistics, Dec 04–06, 2024, Pisa, Italy
* Corresponding author.
† These authors contributed equally.
Email: aurora.alagni01@icatt.it (A. Alagni); francesco.mambrini@unicatt.it (F. Mambrini); marco.passarotti@unicatt.it (M. Passarotti)
ORCID: 0000-0003-0834-7562 (F. Mambrini); 0000-0002-9806-7187 (M. Passarotti)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

The World Wide Web provides Latin scholars with a plethora of free, high-quality resources, issued from a long tradition of linguistic and philological study; many digital libraries, such as the Perseus Digital Library [1] or the Digital Latin Library [2], supply electronic and often machine-actionable versions of some of the most studied texts in world literature. In recent years, the CIRCSE Research Center has developed the LiLa Knowledge Base with the objective of making the distributed knowledge about Latin texts interoperable through the application of the principles of the Linked Data paradigm [3]. LiLa (presented below in sec. 3) now includes a number of lexicons and annotated corpora. In particular, the Opera Latina LASLA corpus, a manually lemmatized and morphosyntactically annotated corpus of more than 1.5 million words mainly belonging to Classical Latin literature that was recently added to LiLa [4], has significantly expanded the textual heritage within the LiLa Knowledge Base, which now provides a Linked Open Data (LOD) compliant edition of many widely studied literary works.

Publius Ovidius Naso (anglicized as Ovid, 43 BCE – 17 CE) is arguably one of the most influential writers in the history of Western literature. His mythological poem in 15 books (the Metamorphoses, written between 2 and 8 CE) has been a crucial source of inspiration for artists like Dante, Shakespeare, or Titian. His body of elegiac poetry on erotic subjects won him immense popularity during his life and afterwards. In spite of his importance, the work of Ovid is not represented in full either in the LiLa network or in any other annotated corpus. The LASLA corpus provides only his earlier works (Ars Amatoria, Remedia Amoris, Medicamina, Amores, Heroides) and other poems (Fasti, Halieutica, Ibis), while the annotation of the Metamorphoses is listed as “in progress”.

Among the works that are entirely missing are two of the last works of Ovid’s career, the Tristia (“Sorrows” or “Lamentations”, written between 9 and 13 CE) and the Epistulae ex Ponto (“Letters from the Black Sea”, 12–17 CE, henceforth Epistulae), which were partly published after the poet’s death. These two poetic collections center around Ovid’s forced departure from Rome and exile to the town of Tomis (modern-day Constanța in Romania), at the furthest ends of the Roman empire. Despite his many attempts, Ovid would never come back from this “utmost part of an unknown world” (extremis ignoti partibus orbis, Tr. 3.3.3)¹ nor was he ever restored to his previous status. The two works are a fundamental source for the biography of the poet. Moreover, they are a foundational archetype of a peculiar sub-genre that is still influential in modern days, the “exile literature” [6].

¹ All English translations are by Wheeler [5].

Ovid’s exilic works were banished from libraries, and although they survived, were often judged unfavorably by the critics [7, xxxvi]. The present study aims, in part, at revoking the ban that still seems to weigh on these
Ovidian poetic collections, allowing them to enter the LiLa network. In what follows we describe how we prepared a lemmatized and part-of-speech (POS) tagged version of the two poems and how we linked this edition to the network of textual and lexical resources for Latin connected to LiLa. Our work fills the significant gap created by the absence of the exilic works of Ovid from the available annotated corpora. In addition, it links to LiLa two collections of poems that, on account of their subject, foreground the emotional tone, and were successful in shaping the conventions of exilic literature; these works established the literary codification of the psychological reactions to banishment, within a veritable poetics of exile. Their content and historical relevance make them ideal candidates for a computationally based study on the sentiment analysis of literary texts.

The paper is structured as follows. Section 2 reviews related work, with a specific focus on sentiment analysis within the field of Computational Literary Studies. Section 3 introduces the LiLa Knowledge Base and the language resources connected to it. Section 4 describes the workflow followed for the annotation, publication and linking of the works. Section 5 discusses the type of knowledge that can be gained by combining the data from LatinAffectus, a prior polarity lexicon of Latin included in LiLa, and the newly prepared edition of the works, for a lexicon-based approach to their sentiment. Section 6 presents the conclusions and discusses plans for future work.

2. Related Work

Sentiment analysis (SA) is the field of study that analyses people’s opinions, sentiments, appraisals, attitudes, and emotions toward entities and their attributes expressed in written text [8]. Considering that opinions now have a fundamental role in everyday life, SA is an object of research not just in the field of NLP, but also in business, economic, political, and even medical domains. Indeed, sentiment analysis has numerous applications², ranging from investigating product reviews to enhance product development [10], to analysing news related to the stock market to predict price trends [11], monitoring social media to forecast election outcomes [12], and evaluating public health through tweets about patient experiences [13].

² See Wankhade et al. [9] for an in-depth overview of the applications of sentiment analysis, as well as the methods for conducting this task.

Furthermore, sentiment analysis has recently emerged as one of the most discussed topics within the realm of Computational Literary Studies³. This rise in prominence coincides with the so-called "affective turn" in the humanities and social sciences, which has fostered renewed engagement with emotion [15]. However, there remain significant limitations in the application of sentiment analysis within Computational Literary Studies, two of which are addressed in this paper.

³ For an extensive survey on sentiment and emotion analysis applied to literature, see the paper by Kim and Klinger [14].

First, while the World Wide Web and social media represent an ostensibly infinite repository of emotions, annotated corpora of literary texts are still rarely available. This is especially true for classical languages. As previously mentioned and as will be further illustrated in this paper, this limitation can be mitigated through the development and dissemination of interoperable resources.

To our knowledge, only a few experiments have been conducted on classical languages. Sprugnoli et al. [16] evaluated two distinct approaches to automatic polarity classification of eight odes by the Latin author Horace: a lexicon-based approach, grounded in the first version of LatinAffectus, and a zero-shot classification method. Sprugnoli et al. [17] present an example of how to use interoperable resources to analyse the sentiment value of the Latin epistles by Dante Alighieri, employing SPARQL queries that access an extended version of LatinAffectus, the LiLa Knowledge Base, and UDante. Pavlopoulos et al. [18] annotated the sentiment of a modern Greek translation of the first book of the Iliad and demonstrated that a fine-tuned version of GreekBERT can achieve a low error rate. Zhao et al. [19] proposed a model based on transfer learning to classify a dataset of Tang Dynasty Chinese poems and compared the sentiment analysis results with social history analysis. After constructing a sentiment lexicon for Classical Chinese poetry, Hou et al. [20] evaluated it both intrinsically and extrinsically, highlighting that their analysis results align with the main findings established in Classical Chinese literary studies.

Second, although sentiment analysis in the field of Computational Literary Studies is employed to address questions related to literary theory, the results often lack connection to a rigorous analysis, focusing solely on performance metrics. The aforementioned studies exemplify this tendency, particularly since only those conducted on Classical Chinese take literary studies into account. Rarely do they contribute to advancements in literary criticism, an area that could greatly benefit from clear and reproducible results, considering that it typically relies on the intuition of critics. This issue has been highlighted by Rebora [21], who notes that while the strongest connection between literary theory and sentiment analysis occurs in the field of narratology, the actual points of intersection reveal themselves to be problematic and based on questionable assumptions. This paper will also address these concerns, as the results of sentiment analysis conducted on Ovid’s exilic works are closely intertwined with the literary scholarship surrounding those texts. Although our findings may not be generalisable due to their
basis in a small, yet highly controlled dataset, our method is clearly reproducible and shareable.

3. Latin resources in LiLa

LiLa is a network of interconnected language resources for Latin aimed at ensuring interoperability between corpora, lexicons and natural language processing (NLP) tools. To pursue its goal, it adopts the Linked Data paradigm. At the heart of the project, the interlinking between the different components is ensured by the Lemma Bank [22], a collection of canonical forms (lemmas) that can be used to lemmatize texts and index entries in dictionaries. Each lemma of the Lemma Bank is provided with a unique identifier, in the form of a URL resolvable on the World Wide Web, and described by a series of properties modeled with the help of OWL ontologies for Linguistic LOD, such as Ontolex [23, 45-59].

Currently, the Lemma Bank includes 226,775 canonical forms, which are used to link 14 lexical resources and 7 corpora. The latter include collections of texts from different times and genres (from the works of Medieval authors like the mathematician Fibonacci [24], Thomas Aquinas [25] or Dante Alighieri [26], to inscriptions from various areas of the Roman Empire [27]). The largest collection of Classical literary texts is provided by the Opera Latina, a manually crafted corpus with morphological annotation and lemmatization developed since the 1960s by the LASLA laboratory of the University of Liège. The LASLA corpus (which is still in development) includes 131 Latin works by 19 authors, ranging chronologically from Plautus (c. 254 – 184 BCE) to Juvenal (55 – 128 CE). As said, however, even such a comprehensive collection does not cover the whole extant production, not even for some of the major authors within that time span; Ovid’s exilic works are a prominent example of missing texts. To fill the gaps in LASLA, and widen the chronological span of ancient authors to the end of the Roman era in the 6th century CE, the CIRCSE has launched a new collection (natively linked to LiLa) called the “CIRCSE Latin Library”⁴.

⁴ http://lila-erc.eu/data/corpora/CIRCSELatinLibrary/id/corpus.

Among the lexical resources produced within LiLa⁵, LatinAffectus [28] is a manually generated polarity lexicon of Latin adjectives and nouns. The lexicon was designed to support research in Sentiment Analysis (SA) [8], an approach to the linguistic and literary studies of ancient texts that, although still in its infancy, is gaining growing recognition [18][16].

⁵ For a complete list of the resources currently linked to LiLa, see: https://lila-erc.eu/data-page/. Please note that all LiLa’s resources are assigned DOIs registered through Zenodo and are also available in CLARIN.

In its latest version, LatinAffectus contains 6,018 lemmas (2,216 adjectives and 3,802 nouns), each associated with a numerical value expressing its prior polarity, that is, its sentiment orientation regardless of the context of use [8]. The classification adopts five numeric values: -1.0 (fully negative, e.g. uulnus, “wound”), -0.5 (negative, grauis, “serious”), 0 (neutral, ianua, “door”), +0.5 (positive, ius, “justice”), +1.0 (fully positive, pietas, “devotion”).

In the second part of this paper (Sec. 5) we will make use of data from LatinAffectus to perform lexicon-based Sentiment Analysis of Ovid’s exilic works. The results of the SA conducted on the Tristia and the Epistulae, which are clear and reproducible, together with their interpretation in light of previous literary criticism on the subject, allowed us to investigate the evolution of Ovid’s poetic journey (Sec. 5.1) and the decline of relationships with those left behind in Rome (Sec. 5.2).

4. Ovid’s exile works as LOD

The Tristia are a collection of 50 poems in elegiac meter (i.e. couplets consisting of a hexameter followed by a pentameter) divided into 5 books. The Epistulae include 46 letters in elegiac couplets divided into 4 books. The poetry in both works mixes the themes of lamentation over the exile and the desperate plea (peroratio) directed towards the loved ones and potential allies in Rome.

The starting point of our edition was a plain-text version of the two works, which we obtained from The Latin Library⁶. The two works consist of a total of 43,438 tokens (without punctuation) and 3,061 sentences. A few preprocessing operations were performed on the texts, namely the addition of three missing lines, which were omitted by mistake in the original source (Tr. 3.10.44 and 52, Tr. 5.12.50), the correction of evident transcription errors (most likely due to OCR issues, e.g. virunique for virumque, Tr. 2.372), the standardization of capitalization usage, and the adoption of the "u" character even for the voiced labiodental fricative [v], following the convention adopted in the LiLa Lemma Bank.

Tokenization, sentence splitting, lemmatization and POS tagging were performed automatically by the LiLa Text Linker, a POS-tagger and lemmatizer for the Latin language developed as one of the user-dedicated services of LiLa that also links the output of the NLP operations to the entries in the Lemma Bank [29]. For POS-tagging and lemmatization the Text Linker uses a custom-trained UDPipe model (as documented in [29]). The output of the tasks performed automatically was systematically reviewed and manually corrected by one annotator adopting a scholarly annotation approach [30]. In total, 42 tokenization errors were identified (on average between 4 and 5 per book), often due to a failure to segment punctuation (e.g. the sequence legent? in Tr. 5.1.94).

⁶ http://www.m.thelatinlibrary.com/ovid.html.
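Two of the preprocessing and correction steps described in this section, the adoption of "u" for [v] and the segmentation of punctuation that the tokenizer left attached to a word, can be illustrated with a minimal Python sketch. The function names are hypothetical, the decision to leave uppercase "V" untouched is an assumption made here for simplicity, and the regular expression is only one possible way of splitting punctuation; none of this reproduces the actual scripts or the behaviour of the LiLa Text Linker.

```python
import re


def normalize_uv(text: str) -> str:
    """Replace lowercase 'v' with 'u' (e.g. 'virumque' -> 'uirumque').

    Assumption: uppercase 'V' is left as it is; the paper does not state
    how capital letters are handled.
    """
    return text.replace("v", "u")


def split_punctuation(text: str) -> list[str]:
    """Split a string into word tokens and single punctuation marks.

    A sequence such as 'legent?' becomes ['legent', '?'], which is the
    kind of segmentation that had to be restored by hand in the review.
    """
    return re.findall(r"\w+|[^\w\s]", text)


if __name__ == "__main__":
    print(normalize_uv("virumque"))
    print(split_punctuation("legent?"))
```

Applying the normalization before lemmatization keeps the word forms aligned with the spelling used for the canonical forms in the Lemma Bank.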
Table 1
Accuracy of POS tagging and lemmatization per book of Epistulae and Tristia as performed by the LiLa Text Linker

Book     Nr. of tokens   POS Tagging (accuracy)   Lemmatization (accuracy)
Ep. 1    5,923           0.95                     0.93
Ep. 2    5,770           0.97                     0.94
Ep. 3    5,671           0.97                     0.95
Ep. 4    7,099           0.97                     0.94
Tr. 1    5,805           0.96                     0.94
Tr. 2    4,427           0.96                     0.93
Tr. 3    6,214           0.96                     0.95
Tr. 4    5,311           0.97                     0.95
Tr. 5    5,980           0.96                     0.94
TOT      52,200          0.94                     0.96

Table 2
Evaluation of POS tagging for the 11 tags with support > 1,000 tokens

POS-Tag   Precision   Recall   F1-score   Support
VERB      0.98        0.97     0.97       10,960
NOUN      0.96        0.97     0.96       10,626
PUNCT     1.00        1.00     1.00       8,667
ADJ       0.95        0.90     0.92       4,702
ADV       0.96        0.95     0.95       3,955
DET       0.95        0.99     0.97       3,836
PRON      0.99        0.93     0.96       3,276
CCONJ     0.99        0.99     0.99       1,698
ADP       0.96        0.99     0.98       1,625
PROPN     0.79        0.90     0.84       1,353
SCONJ     0.88        0.94     0.91       1,304

The accuracy scores reached by the model of the LiLa Text Linker are reported in table 1⁷. As can be seen, the tool performed quite satisfactorily in both tasks, reaching an average accuracy across the different books of the two works of 96% for POS tagging and 94% for lemmatization. Accurate lemmatization also led to good scores for the linking process, with approximately 87% of the word forms uniquely associated with one lemma. A further 10% were ambiguous, as they were associated with two or more potential candidates in LiLa, mainly due to homography (e.g. the lemma string volo can be linked to both the first-conjugation verb volare, “to fly”, and the irregular verb velle, “to want”), and required manual disambiguation.

⁷ Note that, in the evaluation, we omitted the 3 missing lines that were added in the revision stage. For this reason table 1 has slightly fewer tokens than table 3.

Of the remaining 3% of no-matches, most were proper names. Ovid mentions barbarian tribes and figures belonging to Roman cultural circles rarely or never cited elsewhere. In the fourth book of the Epistulae, out of a total of 42 tokens not linked to any lemma, 32 are proper names (e.g. the Thracian tribe of the “Corallis”, Ep. 4.2.37, or the unknown poet “Marius”, mentioned in Ep. 4.16.24). Table 2 shows the performance of the POS-tagger for the 11 out of 17 tags that were used more than 1,000 times⁸. With an F1-score noticeably under 90%, proper nouns (PROPN) are the most challenging class for the model to predict.

⁸ The model uses the Universal POS tagset of Universal Dependencies; see: https://universaldependencies.org/u/pos/index.html.

All tasks (tokenization, POS-tagging, lemmatization and linking) are closely interconnected: an error in tokenization inevitably leads to an error in lemmatization and POS tagging, which then causes a wrong or missing link. For example, 18 forms of the verb addo, “to add”, in the second person singular imperative, adde, “add”, were mislabeled as proper nouns (PROPN), and thus assigned to a nonexistent lemma "Ads". Once disambiguations and corrections were performed, the digital editions of the Tristia and the Epistulae were prepared and published as Linked Data, as part of the “CIRCSE Latin Library”⁹.

⁹ http://lila-erc.eu/data/corpora/CIRCSELatinLibrary/id/corpus/P.%20Ovidii%20Tristia and http://lila-erc.eu/data/corpora/CIRCSELatinLibrary/id/corpus/P.%20Ovidii%20Epistulae%20ex%20Ponto.

5. Sentiment analysis and Ovid’s exile works

Thanks to the work performed in the linking process, each token of the two exilic poems is now connected to the respective lemma within the Lemma Bank via a dedicated property (hasLemma)¹⁰ defined in the OWL ontology of the LiLa project [3]. As the lemma’s URI is the same that is used as canonical form for the entries of LatinAffectus, this step effectively enables users to cross-check the textual information within the two works and the scores recorded in the prior polarity lexicon.

¹⁰ http://lila-erc.eu/ontologies/lila/hasLemma.

Following the same methodology discussed in Sprugnoli et al. for Horace [16, 61-2], we proceeded to match each token of Tristia and Epistulae to the polarity score recorded in LatinAffectus for its respective lemma. The sentiment scores are obtained by automatically assigning the score found in LatinAffectus to the tokens that are lemmatized under lemmas that also have an entry in the polarity lexicon. For instance, the adjective malus “bad” is found with a polarity value of -1.0 in LatinAffectus. All tokens lemmatized as malus (adj.) are thus given a score of -1.0. A score of 0.0 is assigned both to words expressly annotated as neutral in LatinAffectus and to those that do not have an entry in the lexicon. The coverage of polarity-laden tokens (both adjectives and nouns) is reported in table 3.
Table 3
Token coverage of polarity-laden nouns and adjectives in the books of Epistulae and Tristia. For each book, the total nr. of adj. and nouns are reported, as well as the nr. of adj./nouns with polarity score ≠ 0 (pos/neg)

Book           Nouns (tot)   Nouns (pos/neg)   Adjectives (tot)   Adjectives (pos/neg)   Tot Tokens
Epistulae.b1   1,195         1,061             545                360                    5,923
Epistulae.b2   1,214         1,088             561                425                    5,770
Epistulae.b3   1,135         1,013             452                335                    5,671
Epistulae.b4   1,447         1,282             676                464                    7,099
Tristia.b1     1,153         995               513                358                    5,805
Tristia.b2     922           817               386                268                    4,427
Tristia.b3     1,272         1,131             555                386                    6,227
Tristia.b4     1,140         1,020             493                353                    5,311
Tristia.b5     1,152         1,037             523                386                    5,989
TOT            10,630        9,444             4,704              3,335                  52,222
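The matching of tokens to prior polarity scores, and the coverage counts of the kind summarized in Table 3, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the small dictionary stands in for LatinAffectus and uses the example lemmas and values quoted in this paper, the (lemma, POS) keys replace the lemma URIs that actually connect texts and lexicon in LiLa, and the function names are hypothetical.

```python
# Hypothetical excerpt standing in for LatinAffectus: (lemma, POS) -> prior polarity.
# In LiLa the join is made on lemma URIs; plain strings are used here for brevity.
LEXICON = {
    ("malus", "ADJ"): -1.0,
    ("uulnus", "NOUN"): -1.0,
    ("grauis", "ADJ"): -0.5,
    ("ianua", "NOUN"): 0.0,
    ("ius", "NOUN"): 0.5,
    ("pietas", "NOUN"): 1.0,
}


def token_score(lemma: str, pos: str) -> float:
    """Return the prior polarity of a token; 0.0 if the lemma has no entry."""
    return LEXICON.get((lemma, pos), 0.0)


def coverage(tokens: list[tuple[str, str]]) -> dict[str, tuple[int, int]]:
    """For nouns and adjectives, count all tokens and those with polarity != 0."""
    counts = {"NOUN": [0, 0], "ADJ": [0, 0]}
    for lemma, pos in tokens:
        if pos in counts:
            counts[pos][0] += 1
            if token_score(lemma, pos) != 0.0:
                counts[pos][1] += 1
    return {pos: (total, polar) for pos, (total, polar) in counts.items()}


if __name__ == "__main__":
    sample = [("malus", "ADJ"), ("ianua", "NOUN"), ("pietas", "NOUN"), ("sum", "VERB")]
    print([token_score(lemma, pos) for lemma, pos in sample])
    print(coverage(sample))
```

Treating missing entries and expressly neutral entries alike, as the default value of 0.0 does, mirrors the rule stated in the text; it also means that the two cases cannot be told apart from the score alone.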
In what follows, due to space constraints, only some of the results obtained from the sentiment analysis conducted on Ovid’s exilic works will be discussed. In analysing these results, we will focus on the distribution of sentiment-laden words and what this reveals about Ovid’s emotional state during his exile.

5.1. Ovid’s “last metamorphosis”

To investigate how Ovid’s attitude evolves throughout his exilic works, we calculated the overall sentiment for each book (fig. 1). Specifically, we summed the polarity scores and divided the total by the number of sentences to mitigate skewness resulting from the varying lengths of the books [17]¹¹. This book-level score reveals a negative emotional state persisting until the first book of the Epistulae. From the second book onward, however, the sentiment undergoes a polarity shift, becoming positive and remaining so until the last book. The reasons behind such a radical change in the poet’s emotional state are worth investigating. Ovid’s polarity lexicon, that is, the most frequently used sentiment-laden words in the exilic works, does not show any particular change in the 9 books considered here. An interesting change that we do observe in the last books concerns the distribution of the personal pronouns. In Epistulae 1, the relative frequency of the first person singular pronoun, ego, is 0.018 (93 over 4,983 lemmas), while for the second person singular pronoun, tu, it is 0.010 (52 occurrences). In Epistulae 2, the former has an identical relative frequency (0.018, or 89 occurrences over 4,920), while the latter increases significantly, reaching 0.020 (99 occurrences). The focus of Ovidian epistles seems to split, with the once uncontested domain of the "I" beginning to be accompanied by the equally large realm of the "you". The solipsism of the sender starts to give way to the celebration of the recipient, transmuting the once famous and now banished elegiac poet into a potential celebratory poet, who could exceptionally glorify his future patron if only he were given the chance to (and after, of course, being recalled back home).

¹¹ Ovid’s sentences tend to correspond with the elegiac couplet. The two works have 3,044 sentences with an average length of 17.16 tokens (stdev = 11.43). The books tend to have a rather similar number of sentences, ranging from 261 (Tr. 2) to 388 (Ep. 4), with a mean of 338.22 sentences per book (stdev = 37.74). Note, however, that we relied on the sentence splitter of TextLinker and the results were not corrected manually.

Commentators have never doubted that Ovid, after some attempts in the third book (e.g. Ep. 3.4-5), dedicates himself to panegyric poetry in the fourth book, no doubt in order to win powerful allies who could intercede for his return [31, 120-121] [32]. However, this intention has never been noted, or even imagined, for the second book of the Epistulae.

It is undeniable that we witness the last metamorphosis in the poetic trajectory of Ovidian elegy. Our results suggest that this metamorphosis, still at so early a stage that it has not been detected by critics, is clearly recorded by the sentiment analysis already in the second book of the Epistulae. Indeed, when the sentiment analysis is conducted at a finer grain, that is, at the level of individual compositions, it reveals an increase in positivity precisely in the verse-epistles sent to new and powerful recipients. This reflects a new poetic purpose for Ovid’s poetry.

5.2. Facing the abandonment

Another advantage of lexicon-based SA is the possibility to directly engage with a list of the sentiment words most used by an author in their entire production or in specific works of interest. A close observation of this specialized lexicon can lead to interesting outcomes too.

The sentiment words used in the exilic works are relatively stable in quality and quantity. Five distinct semantic spheres [33, 203] can be identified: friendship, politics, justice, intellect, and sadness (fig. 2). Among
these, the semantic sphere of friendship and love contains abstract qualities and feelings (amor “love”, fides “trust, faith”, honor “honor”, nobilitas “nobility”, pietas “devotion”, virtus “virtue”), as well as nouns and qualifying adjectives typical of friendly and romantic relationships (bonus “good”, carus “dear”, dignus “worthy”, pius “dutiful, affectionate”). Although this sphere recurs frequently throughout the exilic production, the words composing it do not appear with the same consistency. Between the third and fourth book of the Tristia, new lemmas become part of this semantic sphere, indicating a change in Ovid’s relationship with the affections left behind in Rome.

In Tristia 3, the only epithets fitting for his friends (lemma amicus, 10) were “dear” (carus, 10) and “good” (bonus, 6). These friends, along with his wife, represented Ovid’s only hope of salvation. In Tristia 4, Ovid reaches the fourth year of exile and sees the possibility of relying on them slipping further out of his grasp. The poet begins to perceive that the friendship and love shown to him in Rome and at the height of his success might have been more superficial than he believed. His friends fail to write (Tr. 4.7.3-5) and Ovid catches himself wondering if his wife still thinks of him (Tr. 4.3.10). However, the bonds of friendship and marriage could still be exploited.

In book 4 of the Tristia, as the occurrences of the words “friend” (amicus, 3) and “dear” (carus, 3) decrease, the use of words such as “devoted”, “virtuous”, “worthy”, and “husband” increases. This lexicon suggests a form of conditional praise: only by proving themselves worthy of the friend and spouse in need can those left in Rome earn their title. Thus, if his friends are truly “virtuous” (bonus, 8) and “devoted” (pius, 6) and wish to have “fame” (fama, 7) as such among contemporaries and posterity, they must show themselves worthy of such a connotation. His wife must, similarly, prove herself worthy of being her husband’s (vir, 13) wife, even though he is exiled. Consequently, Ovid would sooner credit a ten-verse-long series of adynata than believe that his friend decided to abandon him (Tr. 4.7.10-20). At the same time, his wife, dutiful as she is (Tr. 4.3.71), surely must exist solely to work for and diligently lament her absent husband (Tr. 4.3.17-38). Moreover, his misfortune gives her a unique chance for fame, for her loyalty to be forever remembered (Tr. 4.3.81-84). This logic of coercion begins to be employed in book 4 of the Tristia, and finds full employment in the Epistulae. It consists of imposing fundamental moral models and values of the Roman citizen on his recipients through targeted praises, so that the recipients feel obliged to comply with the requests. Here too, sentiment analysis reveals in its embryonic state what the critical eye has only caught later in full development.

6. Conclusion and future work

The work that we presented in the paper had two outcomes. Firstly, our LOD edition of Ovid demonstrates the benefits of interoperability among resources for Latin. Interoperability greatly facilitates the work of scholars, allowing them to benefit from lexicons, corpora, and NLP tools useful for every stage of their research through a single point of access. The LiLa project already provides a paradigm of this model, but to continue doing so, it requires constant integration. This is true not only for corpora, whose enrichment this paper testifies to. Despite the important results that SA conducted with LatinAffectus already provides for Ovid, there remain several ways of enhancing its performance. The coverage of LatinAffectus is extensive with regard to nouns and adjectives, as clearly demonstrated by its performance on the dataset discussed in this paper (see table 3). However, it is evident that a current limitation is its failure to account for the sentiment of verbs. This is why LatinAffectus, like the other linguistic resources available in LiLa, should not be regarded as a static resource, but rather as one that is continually evolving and being updated. Additionally, improvements could be made by accounting for syntactic phenomena such as polarity shifters [16] and by taking into consideration the poetic nature of the text (e.g. by providing access to metrical information¹²). In a broader sense, there is a lack of sufficient consideration for the context in which sentiment words occur. However, context-sensitive sentiment analysis is still in its early stages within NLP¹³, and clearly, much work remains to be done to effectively incorporate context into sentiment analysis.

The second outcome is in suggesting the undeniable potential of a hybrid approach, such as the one employed in this study, crossing literary criticism with the use of quantitative methods and computational resources. The theories developed within literary criticism and the investigative tools provided by computational linguistics can and should effectively collaborate, mutually enriching each other. In this specific context, the reflections developed within literary criticism regarding Ovid’s exile works were crucial for interpreting the data derived from sentiment analysis. In turn, sentiment analysis was fundamental for confirming and deepening these observations, providing interpretable and reproducible data.

If a classic is a book which has never exhausted all it has to say to its readers (as Calvino wrote [35, 5]), it is also because scholars are capable of interrogating it with new methods to address longstanding and unresolved questions.

¹² For instance, this can be achieved by linking existing resources, such as Musisque Deoque, to LiLa.
¹³ See the paper by Teng et al. [34] for an overview of state-of-the-art studies on context-sensitive sentiment analysis.
[Bar chart: "Book-level sentiment score"; x-axis: Book (Tristia 1-5, Epistulae 1-4); y-axis: Sentiment Score]
Figure 1: Ovid’s overall sentiment (i.e. sum of all polarity words in each book divided by the number of sentences in each
book) across the 5 books of the Tristia and the 4 books of the Epistulae.
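The score defined in the caption of Figure 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it reads "sum of all polarity words" as the sum of the lexicon polarity values of the lemmas in a book, and the lexicon entries, their values, the function name `book_sentiment` and the toy corpus are all assumptions made for the example (they are not taken from LatinAffectus or from the annotated texts).

```python
# Hypothetical polarity lexicon: lemma -> polarity value.
# Entries and values are illustrative only, not taken from LatinAffectus.
LEXICON = {"amicus": 1.0, "fidus": 1.0, "exilium": -1.0, "tristis": -1.0}


def book_sentiment(sentences_by_book, lexicon):
    """Return {book: sum of lexicon polarities / number of sentences}.

    sentences_by_book maps a book label to a list of sentences,
    each sentence being a list of lemmas.
    """
    scores = {}
    for book, sentences in sentences_by_book.items():
        if not sentences:
            continue  # skip empty books to avoid division by zero
        # Lemmas missing from the lexicon contribute 0 to the sum.
        total = sum(lexicon.get(lemma, 0.0)
                    for sentence in sentences
                    for lemma in sentence)
        scores[book] = total / len(sentences)
    return scores


# Toy lemmatized corpus (invented for the example).
corpus = {
    "Tristia 1": [["amicus", "fidus"], ["exilium", "mare"]],
    "Epistulae 1": [["tristis", "exilium"], ["hiems"]],
}
scores = book_sentiment(corpus, LEXICON)
```

Dividing by the number of sentences rather than by the number of tokens keeps the score comparable across books of different length, as the caption describes.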
[Chart: "Polarity lexicon by semantic classes across the books"; x-axis: Book (Tristia 1-5, Epistulae 1-4); y-axis: Count; legend (Category): Friendship, Politics, Justice, Intellectual Life, Sadness]
Figure 2: Distribution of polarized words according to semantic class across the 5 books of the Tristia and the 4 books of the
Epistulae.
Appendix
The appendix contains the figures cited in section 5.

References
[1] G. Crane, The perseus digital library and the future of libraries, International Journal of Digital Libraries 24 (2024) 117–128. URL: https://doi.org/10.1007/s00799-022-00333-2.
[2] S. J. Huskey, The digital latin library: Cataloging and publishing critical editions of latin texts, in: M. Berti (Ed.), Digital Classical Philology. Ancient Greek and Latin in the Digital Revolution, De Gruyter, Berlin, Boston, 2019, pp. 19–34. doi:10.1515/9783110599572-003.
[3] M. Passarotti, F. Mambrini, G. Franzini, F. M. Cecchini, E. Litta, G. Moretti, P. Ruffolo, R. Sprugnoli, Interlinking through Lemmas. The Lexical Collection of the LiLa Knowledge Base of Linguistic Resources for Latin, Studi e Saggi Linguistici 58 (2020) 177–212. doi:10.4454/ssl.v58i1.277, number: 1.
[4] M. Fantoli, M. Passarotti, F. Mambrini, G. Moretti, P. Ruffolo, Linking the LASLA Corpus in the LiLa Knowledge Base of Interoperable Linguistic Resources for Latin, in: Proceedings of the 8th Workshop on Linked Data in Linguistics within the 13th Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2022, pp. 26–34.
[5] A. L. Wheeler, Publius Ovidius Naso. Tristia. Ex Ponto, Harvard University Press, Cambridge, MA, 1959.
[6] J.-M. Claassen, Ovid revisited: The poet in exile, Bloomsbury Academic, London, 2008.
[7] P. Green, Ovid. The Poems of Exile. Tristia and the Black Sea Letters, University of California Press, Berkeley, 2005.
[8] B. Liu, Sentiment Analysis: Mining Opinions, Sentiments, and Emotions, 2nd ed., Cambridge University Press, Cambridge; New York, 2020.
[9] M. Wankhade, A. C. S. Rao, C. Kulkarni, A survey on sentiment analysis methods, applications, and challenges, Artif. Intell. Rev. 55 (2022) 5731–5780. URL: https://doi.org/10.1007/s10462-022-10144-1. doi:10.1007/s10462-022-10144-1.
[10] R. Bose, R. Dey, S. Roy, D. Sarddar, Sentiment Analysis on Online Product Reviews, 2018.
[11] F. Xing, E. Cambria, R. Welsch, Natural language based financial forecasting: a survey, Artificial Intelligence Review 50 (2018). doi:10.1007/s10462-017-9588-9.
[12] B. O'Connor, R. Balasubramanyan, B. Routledge, N. Smith, From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series, Proceedings of the International AAAI Conference on Web and Social Media 4 (2010) 122–129. URL: https://ojs.aaai.org/index.php/ICWSM/article/view/14031. doi:10.1609/icwsm.v4i1.14031, number: 1.
[13] E. M. Clark, T. A. James, C. A. Jones, A. Alapati, P. Ukandu, C. M. Danforth, P. S. Dodds, A sentiment analysis of breast cancer treatment experiences and healthcare perceptions across twitter, ArXiv abs/1805.09959 (2018). URL: https://api.semanticscholar.org/CorpusID:44063573.
[14] E. Kim, R. Klinger, A Survey on Sentiment and Emotion Analysis for Computational Literary Studies, Zeitschrift für digitale Geisteswissenschaften (2019). URL: http://arxiv.org/abs/1808.03137. doi:10.17175/2019_008, arXiv:1808.03137 [cs].
[15] P. C. Hogan, B. J. Irish, L. P. Hogan (Eds.), The Routledge Companion to Literature and Emotion, Routledge, London, 2022. doi:10.4324/9780367809843.
[16] R. Sprugnoli, F. Mambrini, M. Passarotti, G. Moretti, The Sentiment of Latin Poetry. Annotation and Automatic Analysis of the Odes of Horace, IJCoL. Italian Journal of Computational Linguistics 9 (2023). doi:10.4000/ijcol.1125.
[17] R. Sprugnoli, M. C. Passarotti, M. Testori, G. Moretti, Extending and using a sentiment lexicon for latin in a linked data framework, 2021. URL: https://api.semanticscholar.org/CorpusID:248149526.
[18] J. Pavlopoulos, A. Xenos, D. Picca, Sentiment Analysis of Homeric Text: The 1st Book of Iliad, in: N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, S. Piperidis (Eds.), Proceedings of the Thirteenth Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2022, pp. 7071–7077. URL: https://aclanthology.org/2022.lrec-1.765.
[19] H. Zhao, B. Wu, H. Wang, C. Shi, Sentiment analysis based on transfer learning for Chinese ancient literature, in: 2014 International Conference on Behavioral, Economic, and Socio-Cultural Computing (BESC2014), 2014, pp. 1–7. URL: https://ieeexplore.ieee.org/document/7059510. doi:10.1109/BESC.2014.7059510.
[20] Y. Hou, A. Frank, Analyzing Sentiment in Classical Chinese Poetry, in: K. Zervanou, M. van Erp, B. Alex (Eds.), Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), Association for Computational Linguistics, Beijing, China, 2015, pp. 15–24. URL: https://aclanthology.org/W15-3703. doi:10.18653/v1/W15-3703.
[21] S. Rebora, Sentiment Analysis in Literary Studies. A Critical Survey, Digital Humanities Quarterly 17 (2023). URL: https://www.proquest.com/scholarly-journals/sentiment-analysis-literary-studies-critical/docview/2842908301/se-2?accountid=9941, place: Providence.
[22] F. Mambrini, M. C. Passarotti, The lila lemma bank: A knowledge base of latin canonical forms, Journal of Open Humanities Data (2023). doi:10.5334/johd.145.
[23] P. Cimiano, C. Chiarcos, J. P. McCrae, J. Gracia, Linguistic Linked Data: Representation, Generation and Applications, Springer International Publishing, Cham, 2020. doi:10.1007/978-3-030-30225-2.
[24] F. Grotto, R. Sprugnoli, M. Fantoli, M. Simi, F. M. Cecchini, M. C. Passarotti, The annotation of Liber Abbaci, a domain-specific latin resource, in: Proceedings of the Eighth Italian Conference on Computational Linguistics (CLiC-it 2021), aAccademia University Press, Milan, 2021, pp. 176–183. URL: https://doi.org/10.4000/books.aaccademia.10659.
[25] F. Mambrini, M. Passarotti, G. Moretti, M. Pellegrini, The Index Thomisticus Treebank as Linked Data in the LiLa Knowledge Base, in: N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, S. Piperidis (Eds.), Proceedings of the Thirteenth Language Resources and Evaluation Conference (LREC 2022), European Language Resources Association (ELRA), Marseille, France, 2022, pp. 4022–4029. URL: https://aclanthology.org/2022.lrec-1.428.
[26] F. M. Cecchini, R. Sprugnoli, G. Moretti, M. Passarotti, UDante: First Steps Towards the Universal Dependencies Treebank of Dante's Latin Works, in: J. Monti, F. Dell'Orletta, F. Tamburini (Eds.), Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020, Bologna, Italy, March 1–3 2021), Associazione italiana di linguistica computazionale (AILC), Accademia University Press, Turin, Italy, 2020, pp. 99–105. URL: http://ceur-ws.org/Vol-2769/paper_14.pdf.
[27] I. De Felice, L. Tamponi, F. Iurescia, M. Passarotti, Linking the corpus classes to the lila knowledge base of interoperable linguistic resources for latin, in: Proceedings of CLiC-it 2023: 9th Italian Conference on Computational Linguistics, Nov 30 – Dec 02, 2023, CEUR Workshop Proceedings, Venice, 2023, pp. 1–7. URL: https://ceur-ws.org/Vol-3596/paper20.pdf.
[28] R. Sprugnoli, M. Passarotti, D. Corbetta, A. Peverelli, Odi et Amo. Creating, Evaluating and Extending Sentiment Lexicons for Latin, in: Proceedings of the Twelfth Language Resources and Evaluation Conference, European Language Resources Association, Marseille, France, 2020, pp. 3078–3086.
[29] M. Passarotti, F. Mambrini, G. Moretti, The services of the LiLa knowledge base of interoperable linguistic resources for Latin, in: Proceedings of the 9th Workshop on Linked Data in Linguistics @ LREC-COLING 2024, ELRA and ICCL, Torino, Italia, 2024, pp. 75–83.
[30] D. Bamman, F. Mambrini, G. Crane, An Ownership Model of Annotation: The Ancient Greek Dependency Treebank, in: Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories, EDUCatt, Milan, Italy, 2009, pp. 5–15.
[31] J.-M. Claassen, Displaced Persons: The Literature of Exile from Cicero to Boethius, Duckworth, 1999. Google-Books-ID: 1FkXAQAAIAAJ.
[32] M. Labate, Elegia triste ed elegia lieta. Un caso di riconversione letteraria, Materiali e discussioni per l'analisi dei testi classici (1987) 91–129. doi:10.2307/40235896.
[33] M. C. Gaetano Berruto, La linguistica. Un corso introduttivo, 3rd ed., UTET Università, [Grugliasco], 2022.
[34] Z. Teng, D. T. Vo, Y. Zhang, Context-Sensitive Lexicon Features for Neural Sentiment Analysis, in: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Austin, Texas, 2016, pp. 1629–1638. URL: http://aclweb.org/anthology/D16-1169. doi:10.18653/v1/D16-1169.
[35] I. Calvino, Why read the classics? Perchè leggere i classici?, Penguin, London, 2009.