<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards the Modeling of Polarity in a Latin Knowledge Base</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rachele Sprugnoli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Mambrini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanni Moretti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>rotti</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CIRCSE Research Centre, Università Cattolica del Sacro Cuore, Largo Agostino Gemelli 1</institution>
          ,
          <addr-line>20123 Milano</addr-line>
        </aff>
      </contrib-group>
      <fpage>59</fpage>
      <lpage>70</lpage>
      <abstract>
        <p>In this paper, we describe the process of inclusion of a prior polarity lexicon of Latin lemmas, called LatinAffectus, in a knowledge base of interoperable linguistic resources developed within the LiLa: Linking Latin project. More specifically, a manually-curated list of lemma-sentiment pairs is linked to a comprehensive collection of Latin lemmas by using Semantic Web and Linked Data standards and practices. LatinAffectus is modeled relying on three formal representation frameworks: Lemon and Ontolex to describe the lexicon, and the Marl ontology to describe the sentiment properties of each of its lexical entries. We present the lexicon, the methodology and the results of the linking process, as well as a use case and the planned future work.</p>
      </abstract>
      <kwd-group>
        <kwd>Linguistic Linked Open Data</kwd>
        <kwd>Sentiment Analysis</kwd>
        <kwd>Latin</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>In recent years, several linguistic resources and tools have been created
for many languages to support sentiment analysis, i.e. the task of automatically
classifying a piece of text according to the sentiment conveyed by it. Although the
main applications of such resources and tools fall into categories like social media
and customer experience monitoring, there is a growing interest in the research
community in developing resources and tools to perform sentiment analysis of texts
written in ancient languages. Such interest mirrors the substantial growth of
the area dedicated to building and using linguistic resources for ancient and
historical languages, which has primarily concerned Latin and Ancient Greek as
essential media for accessing and understanding the so-called Classical tradition.</p>
      <p>This work is supported by the European Research Council (ERC) under the
European Union's Horizon 2020 research and innovation programme via the "LiLa:
Linking Latin" project - Grant Agreement No. 769994.</p>
      <p>Copyright ©2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>In particular, Latin plays a central role in this context, as texts written in
Latin are spread all over Europe, covering a time span of almost two millennia
and being testimonials of the common, but still diverse, past that contributed to
shape the cultural heritage of Europe. Exploiting the most advanced techniques
for preserving, investigating and sharing such heritage assets that have survived
from the past times is at the same time a challenge and an obligation for the
research area dealing with developing linguistic resources and tools. Given the
wide variety of the Latin texts in terms of their era, place and literary genre, the
achievements of this field of research promise to impact a large and heterogeneous
community made of historians, philologists, archaeologists and literary scholars,
who in different ways all deal with textual and lexical data written in Latin.</p>
      <p>The recent launch of projects aimed at automatically extracting structured
knowledge from ancient sources provided by linguistic resources, such as
eAqua<sup>2</sup>, Logeion<sup>3</sup> and Corpus Corporum<sup>4</sup>, shows that the current availability of
linguistic resources for ancient languages, and particularly Latin, is such that
there is a strong need to make them interact.</p>
      <p>
        To address the issue of interoperability between lexical and textual resources
for Latin, the LiLa: Linking Latin project (2018-2023)<sup>5</sup> was launched with the
objective of building a Knowledge Base (KB) of linguistic resources for Latin
based on the Linked Data paradigm, i.e. a collection of several data sets
represented using the same vocabulary of knowledge description and linked together
[
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]. Within the LiLa project, aside from interlinking the already available
resources for Latin, we are also building a number of new ones, among which is
LatinAffectus, a lexicon that assigns a prior sentiment score to a selection of
Latin adjectives and nouns [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ].
      </p>
      <p>This paper describes the process of inclusion of LatinAffectus into the LiLa
KB and presents a simple use case showing how the interaction of the linguistic
resources currently linked through LiLa can be exploited to address a specific
research question.</p>
      <p>
        The core component of LiLa is a large collection of Latin lemmas, whose role
is to connect the different (and possibly distributed) linguistic resources that
interact in LiLa [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. In particular, textual resources are included in LiLa
by linking the occurrences of the words in their texts to the lemmas of LiLa, while
lexical resources connect to LiLa by linking the contents of their lexical entries to
the lemmas of the KB. The result is an interlinked ecosystem where textual and
lexical (meta)data provided by several resources become interoperable. Including
LatinAffectus in LiLa enriches a subset of the lemmas provided by the LiLa
collection with a prior positive/negative polarity. Such a black-or-white approach
is at the same time a limitation and an advantage of LatinAffectus. As for the
former, the lexicon does not account for the different meanings that words may
have, some of which can show different polarity values, thus failing to represent
the span of possible sentiments of a word. As for the latter, assigning one prior,
prototypical polarity value to the lexical entries helps the application of the
(meta)data from LatinAffectus to real texts. Indeed, no sufficiently accurate tools
for word sense disambiguation are currently available for Latin. This makes it
impractical to analyze texts with sentiment lexicons that provide different
polarity values for the same word, as all the possible values would have to be
considered while computing the overall polarity of a sentence or a text.<sup>6</sup> Instead, by
relying on a single prior polarity value, it becomes possible to apply LatinAffectus
to Latin texts without first pre-processing the data with a layer of word
sense disambiguation. This aspect becomes an added value when LatinAffectus
interacts with all the other resources included in LiLa, because its (meta)data
are no longer available in isolation but are interoperable with those of
other resources, thus making the most of the contribution provided by each of
them in applications that address research questions.
      </p>
      <p><sup>2</sup> http://www.eaqua.net/</p>
      <p><sup>3</sup> https://logeion.uchicago.edu/lexidium</p>
      <p><sup>4</sup> http://www.mlat.uzh.ch/MLS/</p>
      <p><sup>5</sup> https://lila-erc.eu</p>
      <p>The paper is organized as follows. Section 2 provides a brief overview of the
related work on polarity lexicons and on the strategies to represent linguistic
resources and services for sentiment analysis in the Linked Data framework.
Section 3 describes LatinAffectus and Section 4 details the process of modeling
it and including it into the LiLa KB. Section 5 presents a simple use case to show
how the interoperability between the resources connected in LiLa, and
particularly LatinAffectus, can support research in the Humanities. Finally, Section
6 concludes the paper with a discussion about the need to make the linguistic
resources for Latin interact and sketches our future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Sentiment Analysis and related tasks [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], such as emotion analysis, subjectivity
detection and opinion mining, are very popular both in academic research and
in business applications where the focus is mostly given to the analysis of
contemporary texts like product or service reviews [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and social media posts [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ].
In these tasks, the creation of polarity lexicons, that is, lists of words associated
with their out-of-context sentiment orientation, is of fundamental importance but
also a very time-consuming process. Several approaches have been developed
to automate this process and build multilingual lexicons that also cover
less-resourced and ancient languages. For example, Mohammad [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] adopts
crowdsourcing techniques to generate an English valence, arousal, dominance (VAD)
lexicon and then automatically translates it into 103 other languages, including
Latin. Chan and Skiena [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] use instead a knowledge graph propagation algorithm
starting from Wikipedia to build lexicons of positive and negative words in 136
languages inclusive of Latin.
      </p>
      <p>
        These two resources, although of undoubted value, have two main drawbacks:
they are noisy due to the presence of English words, such as microchip and
reliable,<sup>7</sup> and their content was not checked by a Latin expert or evaluated against
a gold standard. Another limitation is that these lexicons have not been
published in the Linked Data framework, which limits their usability and semantic
interoperability. However, various efforts have been made to develop a formal
representation of linguistic resources and services for sentiment analysis. More
specifically, the Marl ontology is designed for the publication of data about
opinions and the sentiments expressed in them [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] and the EuroSentiment project
has proposed a model that integrates it with the lexicon model for ontologies
(lemon) [
        <xref ref-type="bibr" rid="ref19 ref3">3,19</xref>
        ], so as to represent lexical resources for sentiment and emotion
analysis such as lexicons and annotated corpora [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. This approach has been applied,
for example, to represent polarity information of German compound words,
relying, in particular, on Ontolex [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], that is, the core module of lemon [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p><sup>6</sup> An example of such an approach is provided by the API of the Latin WordNet project of
the University of Exeter, which can perform sentiment analysis of individual strings
via HTTP POST requests to https://latinwordnet.exeter.ac.uk/sentiment/.</p>
      <p>In this paper we aim to overcome the aforementioned limitations of the
currently available polarity lexicons for Latin by publishing, using Linked Data
principles, a list of lemmas with their prior sentiment orientation created by
experts of Latin language and culture, and then expanded by exploiting a set
of already available, manually curated linguistic resources. Each entry in the
resources we created is linked to the collection of Latin lemmas provided by the
LiLa KB, so as to achieve interoperability.</p>
    </sec>
    <sec id="sec-3">
      <title>Latin Prior Polarity Lexicons</title>
      <p>
        A Gold Standard (GS) lexicon was manually developed by two experts in Latin language
and culture, who assigned a sentiment score to out-of-context lemmas
using a five-value classification: 1 (fully positive), 0.5 (somewhat positive), 0
(neutral), -0.5 (somewhat negative), -1 (fully negative). We chose to take into
consideration nouns and adjectives only, because their polarity is easier to
define at the lexical level, i.e. out of context, than that of verbs, whose semantics is
more strictly connected to that of their arguments [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Lemmas were taken from
William Whitaker's Words morphological analyzer and digital dictionary<sup>8</sup>,
Cassell's Latin dictionary<sup>9</sup> [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] and the lemmatized version of Opera Latina
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a corpus of Classical authors manually annotated with lemmas and
Part-of-Speech (PoS) tags. In addition, a Silver Standard (SS) lexicon was built by
deriving new entries in two ways: i) by exploiting derivational, synonym and
antonym relations with the lemmas in the GS; ii) by adding graphical variants
of lemmas present in the GS. The original polarity scores were either propagated
to the newly derived lemmas or reversed: for example, scores were preserved in the case of
synonyms and graphical variants, whereas they were reversed for antonyms (see
Table 1). Details on the composition of the GS and the SS are reported in Table
2. The resources are freely available online at
https://github.com/CIRCSE/Latin_Sentiment_Lexicons and their detailed description is given in [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. The
GS and the SS have been merged into a single resource called LatinAffectus.
      </p>
      <p><sup>7</sup> By manually revising the two lexicons, we calculated the percentage of English words:
14% in the VAD lexicon and 9% in the other.</p>
      <p><sup>8</sup> https://mk270.github.io/whitakers-words/</p>
      <p><sup>9</sup> https://github.com/nikita-moor/latin-dictionary</p>
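      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Lemma-GS</th><th>Score-GS</th><th>Extension Type</th><th>Lemma-SS</th><th>Score-SS</th></tr>
          </thead>
          <tbody>
            <tr><td>purus ‘pure’</td><td>+1</td><td>derivational</td><td>inpurus ‘impure’</td><td>-1</td></tr>
            <tr><td>innoxius ‘harmless’</td><td>+0.5</td><td>synonym</td><td>innocens ‘innocent’</td><td>+0.5</td></tr>
            <tr><td>aqua ‘water’</td><td>0</td><td>derivational</td><td>aquarius ‘of/for water’</td><td>0</td></tr>
            <tr><td>apsentia ‘absence’</td><td>-0.5</td><td>variant</td><td>absentia ‘absence’</td><td>-0.5</td></tr>
            <tr><td>scelus ‘crime’</td><td>-1</td><td>antonym</td><td>beneficium ‘benefit’</td><td>+1</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <table-wrap>
        <label>Table 2</label>
        <table>
          <thead>
            <tr><th>Lexicon</th><th colspan="2">PoS</th><th>TOT</th></tr>
            <tr><th></th><th>ADJ</th><th>NOUN</th><th></th></tr>
          </thead>
          <tbody>
            <tr><td>Gold Standard</td><td>454 (39.7%)</td><td>690 (60.3%)</td><td>1,144</td></tr>
            <tr><td>Silver Standard</td><td>512 (39.6%)</td><td>781 (60.4%)</td><td>1,293</td></tr>
            <tr><td>LatinAffectus</td><td>966 (39.6%)</td><td>1,471 (60.4%)</td><td>2,437</td></tr>
          </tbody>
        </table>
      </table-wrap>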
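      <p>The propagation rule used for the Silver Standard can be sketched in a few lines of Python. This is only an illustration of the rule as described in this section, not the procedure actually used to build the lexicon: the function name derive_entry, the SilverEntry record and the negates flag are hypothetical. The flag is needed because a derivational relation may either reverse the score (purus, inpurus) or preserve it (aqua, aquarius).</p>
      <preformat>
```python
from dataclasses import dataclass

# Relation types named in the text. "derivational" can go either way
# (purus -> inpurus flips the sign, aqua -> aquarius keeps it), so the
# caller must say explicitly whether the derived lemma negates the base.
PRESERVING = {"synonym", "variant"}
REVERSING = {"antonym"}


@dataclass(frozen=True)
class SilverEntry:
    """Hypothetical record for one derived Silver Standard entry."""
    lemma: str
    score: float
    source_lemma: str
    relation: str


def derive_entry(gs_lemma, gs_score, new_lemma, relation, negates=False):
    """Return a Silver Standard entry derived from one Gold Standard entry."""
    if relation in PRESERVING:
        score = gs_score
    elif relation in REVERSING:
        score = -gs_score
    elif relation == "derivational":
        score = -gs_score if negates else gs_score
    else:
        raise ValueError(f"unknown relation type: {relation}")
    # Adding 0.0 turns a negative zero into a plain zero for neutral lemmas.
    return SilverEntry(new_lemma, score + 0.0, gs_lemma, relation)


# The five rows of Table 1, re-derived with the rule above.
examples = [
    derive_entry("purus", 1.0, "inpurus", "derivational", negates=True),
    derive_entry("innoxius", 0.5, "innocens", "synonym"),
    derive_entry("aqua", 0.0, "aquarius", "derivational"),
    derive_entry("apsentia", -0.5, "absentia", "variant"),
    derive_entry("scelus", -1.0, "beneficium", "antonym"),
]
for entry in examples:
    print(f"{entry.source_lemma} -> {entry.lemma}: {entry.score:+.1f}")
```
      </preformat>
      <p>Keeping the relation type and the source lemma on each derived entry makes it possible to trace every Silver Standard score back to the Gold Standard entry it comes from.</p>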
    </sec>
    <sec id="sec-4">
      <title>Modeling and Linking Polarity</title>
      <p>For modeling LatinAffectus, we rely on three formal representation frameworks:
Lemon<sup>10</sup> and Ontolex<sup>11</sup> to describe the lexical resource and the Marl ontology<sup>12</sup>
to describe the sentiment properties of each entry.</p>
      <p>
        In our approach the polarity lexicon is defined as an instance of the class E31
Document<sup>13</sup> of the CIDOC Conceptual Reference Model (CRM), an ontology
formally describing concepts and relations in the cultural heritage domain [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
Moreover, LatinAffectus is also defined as an object of type lexicon following the
LInguistic MEtadata (lime) module of Ontolex. We link the lexicon to its entries,
which are defined as instances of the Ontolex class LexicalEntry, through the
property called entry belonging to the lime module. Each lexical entry has a
label, an ontolex:canonicalForm property connecting it to the corresponding
lemma in the LiLa KB, and an ontolex:sense property pointing to the
lexical meaning of the entry. Given that LatinAffectus deals with prior
polarities, each lexical entry has only one sense, modeled as an instance of
the class ontolex:LexicalSense. Each sense is characterized by a
label, the relation marl:hasPolarity and the property marl:polarityValue.
More specifically, the relation marl:hasPolarity connects the sense to the class
marl:Polarity, indicating whether the sentiment is positive, negative or neutral.
In turn, marl:polarityValue specifies the numeric decimal value of the
sentiment that, in our case, can be 1.0, 0.5, 0.0, -0.5, or -1.0.
      </p>
      <p><sup>10</sup> https://lemon-model.net/</p>
      <p><sup>11</sup> https://www.w3.org/2016/05/ontolex/</p>
      <p><sup>12</sup> http://www.gsi.dit.upm.es/ontologies/marl/1.1/</p>
      <p><sup>13</sup> The class E31 "comprises identifiable immaterial items that make propositions about
reality", http://www.cidoc-crm.org/Entity/e31-document/version-6.2</p>
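      <p>As a rough illustration of the model just described, the following Python sketch builds the set of triples for one lexical entry, written as prefixed names. Several things here are assumptions made for the example rather than facts about the LiLa KB: the ex: and lila_lemma: prefixes and all identifiers are placeholders, the class names crm:E31_Document, marl:Positive, marl:Neutral and marl:Negative are the names we assume those ontologies use, and the score of -1.0 for malus is an illustrative value (the text only states that its polarity is negative).</p>
      <preformat>
```python
# Prefixed names are plain strings here; "ex:" and "lila_lemma:" are
# placeholder prefixes for this sketch, not real LiLa namespaces.
ALLOWED_SCORES = (1.0, 0.5, 0.0, -0.5, -1.0)
POLARITY_CLASS = {1: "marl:Positive", 0: "marl:Neutral", -1: "marl:Negative"}


def sign(value):
    """Return 1, 0 or -1 according to the sign of value."""
    return (value > 0) - (0 > value)


def entry_triples(lexicon, entry_id, label, lemma_id, score):
    """Build the triples for one lexical entry with a single sense."""
    if score not in ALLOWED_SCORES:
        raise ValueError(f"unexpected polarity value: {score}")
    entry = f"ex:{entry_id}"
    sense = f"ex:{entry_id}_sense"
    return [
        # The lexicon is both a CIDOC CRM document and a lime lexicon.
        (lexicon, "rdf:type", "crm:E31_Document"),
        (lexicon, "rdf:type", "lime:Lexicon"),
        (lexicon, "lime:entry", entry),
        # The entry points to its lemma and to its single sense.
        (entry, "rdf:type", "ontolex:LexicalEntry"),
        (entry, "rdfs:label", f'"{label}"'),
        (entry, "ontolex:canonicalForm", f"lila_lemma:{lemma_id}"),
        (entry, "ontolex:sense", sense),
        # The sense carries the polarity class and the numeric value.
        (sense, "rdf:type", "ontolex:LexicalSense"),
        (sense, "rdfs:label", f'"{label}"'),
        (sense, "marl:hasPolarity", POLARITY_CLASS[sign(score)]),
        (sense, "marl:polarityValue", f'"{score}"^^xsd:decimal'),
    ]


# Illustrative entry: the identifiers and the score are assumptions.
for s, p, o in entry_triples("ex:LatinAffectus", "malus", "malus",
                             "lemma_malus", -1.0):
    print(f"{s} {p} {o} .")
```
      </preformat>
      <p>Deriving the polarity class from the sign of the score keeps marl:hasPolarity and marl:polarityValue consistent by construction, which seems a reasonable safeguard when entries are generated in bulk.</p>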
      <p>The lexical entries of LatinAffectus were linked to their corresponding lemmas
in the LiLa KB in a semi-automatic way. First, we performed an automatic
matching between the two resources: this revealed the presence of 246 ambiguous
lemmas, that is, lemmas having the same written representation and the same
PoS tag. For example, the entry fides, having polarity value 1, can be linked either
to the lemma of the fifth declension meaning ‘trust’ or to the lemma of the third
declension meaning ‘lyre’. These lemmas were manually disambiguated; thus
fides was linked to the first of the aforementioned lemmas in the LiLa KB.
A further 107 entries, such as Medieval or New Latin words like praesuppositio
‘assumption’ and radioactiuus ‘radioactive’, were not present in the KB and
were therefore added.</p>
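      <p>The automatic matching step can be pictured as a lookup on the pair formed by written representation and PoS tag. The sketch below is a simplified illustration under that assumption and is not the code used in the project: the function name match_entries and the lemma identifiers are invented, and the toy lemma bank only mirrors the fides example given above.</p>
      <preformat>
```python
from collections import defaultdict


def match_entries(entries, lemma_bank):
    """Split lexicon entries into linked, ambiguous and missing ones.

    entries: iterable of (written_form, pos) pairs from the lexicon.
    lemma_bank: iterable of (lemma_id, written_form, pos) triples.
    """
    index = defaultdict(list)
    for lemma_id, form, pos in lemma_bank:
        index[(form, pos)].append(lemma_id)

    linked, ambiguous, missing = {}, {}, []
    for key in entries:
        candidates = index.get(key, [])
        if len(candidates) == 1:
            linked[key] = candidates[0]
        elif candidates:
            ambiguous[key] = candidates  # left for manual disambiguation
        else:
            missing.append(key)  # to be added to the lemma collection
    return linked, ambiguous, missing


# Toy lemma bank with invented identifiers; the two fides homographs
# mirror the example discussed in the text.
lemma_bank = [
    ("lemma_a", "fides", "NOUN"),  # fifth declension, 'trust'
    ("lemma_b", "fides", "NOUN"),  # third declension, 'lyre'
    ("lemma_c", "malus", "ADJ"),
]
entries = [("fides", "NOUN"), ("malus", "ADJ"), ("radioactiuus", "ADJ")]
linked, ambiguous, missing = match_entries(entries, lemma_bank)
print(linked)
print(ambiguous)
print(missing)
```
      </preformat>
      <p>In this toy run malus is linked directly, fides is set aside for manual disambiguation, and radioactiuus is reported as missing, which corresponds to the three outcomes described above.</p>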
      <p>Figure 1 illustrates the modeling of the lexical entry malus ‘evil’ and of the
negative prior polarity of its lexical sense as recorded in LatinAffectus.</p>
      <p>
        Thanks to the linking between entries coming from different resources, among
which are LatinAffectus and the collection of lemmas of the LiLa KB, it is
possible to get a rich set of lexical information. An example is given in Figure 2<sup>14</sup>:
a number of morphological features, including PoS, degree and inflectional
category, are assigned to the node for the lemma malus in the KB. This node is also
connected to a Base node that plays the role of connecting together all the
lemmas belonging to the same derivational family: in this Figure the Base2302 node
interlinks the lemmas malus and maleficus ‘wicked’. The Lemma node malus is
also connected to its etymology taken from the Etymological Dictionary of Latin
and the other Italic Languages [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] and modeled by relying on the
Ontolex-lemon ontology and the lemonEty extension [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In particular, the Lemma node
is linked to the canonical form present in the etymological dictionary, which
in turn is linked to the Proto-Italic and/or Proto-Indo-European reconstructed
forms of the word (e.g. *malo-) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The left-hand side of the image shows that
both malus and maleficus are included in LatinAffectus and they both have a
negative polarity.
      </p>
      <p><sup>14</sup> The LiLa KB can be explored using a query interface: https://lila-erc.eu/query/.</p>
      <p>
        The only corpus that is presently connected to the LiLa KB is the Index
Thomisticus Treebank (ITTB) [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ], which provides a complete morpho-syntactic
annotation, based on a form of dependency grammar, to the Latin works of the
philosopher Thomas Aquinas (13th century). At the moment, the LiLa KB
stores the connections between the 277,547 tokens, taken from the first four
books of the treatise Summa contra Gentiles (SCG), and the corresponding
lemma under which each token is lemmatized. For every token we also report
the basic morpho-syntactic information derived from the treebank, such as the
link between head and dependent in the dependency tree and the label of their
syntactic relation (e.g. "Subject" or "Predicate").
      </p>
      <p>Although the original annotation stored in the ITTB already allows users to
perform complex queries on the language of the SCG, the inclusion of the corpus
into the LiLa KB expands the range of possible research by integrating new
dimensions that are crucial for researchers; polarity provides an outstanding example.</p>
      <p>One of the philosophical problems that Thomas Aquinas engaged with in his
teaching and writing is the nature of evil.<sup>15</sup> From the linguistic point of view, one of
the prototypical constructions for providing definitions of concepts is the
nominal predicate, where a subject is associated with a predicate via the copula verb,
like "to be" in English.<sup>16</sup> One example of this form of definition is the sentence:
"evil is a deficiency of good". While the ITTB already allows users to retrieve
the occurrences of copular constructions, it is only by crossing this information
with LatinAffectus that we can specify a constraint on the polarity of either
of the terms (the subject or the predicate nominal). In this way, we can explore
the definitions of the negative pole across the work of Thomas Aquinas, and
consider its relation to the problem of the nature of evil.</p>
      <p>
        With a series of federated queries across the three endpoints of LiLa<sup>17</sup> (the
collection of lemmas, called Lemma Bank, the corpora, and the lexical resources),
it is possible to obtain such results. On the negative pole, the ITTB includes
67 tokens labeled with a negative polarity that are the subjects of copular
constructions; the 5 most frequently attested lemmas are reported in Table 3. Not
surprisingly, by far the most numerous attestations are those of the technical
word for the concept of evil itself, malum. As Davis observes [
        <xref ref-type="bibr" rid="ref14 ref6">6, 14</xref>
        ], the term
malum is, in its philosophical sense, broader than English ‘evil’. The latter
carries very strong connotations and is usually not applied to what is perceived as
generically unpleasant or troublesome; on the other hand, in the philosophical
literature, malum covers all the entities that can be said to fall short of the
opposite pole of bonum (‘good’).
      </p>
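      <p>The logic of such a query can be sketched over in-memory data in Python; the actual results come from federated queries over the LiLa endpoints, whose vocabulary is not reproduced here. In this sketch the token tuples, the dependency labels Sb, Pnom, Atr and Pred, and the function name copular_subjects are assumptions made for illustration. The single toy clause is taken from a sentence of the SCG quoted in this paper, and the polarity classes are those stated in the text (numeric scores are left out because they are not given).</p>
      <preformat>
```python
from collections import Counter

# One toy clause from SCG 3.11.4: "corruptio mali sit bona".
# Each token: (id, form, lemma, head id, dependency label). The labels
# "Sb", "Pnom", "Atr" and "Pred" are assumed names for this sketch.
tokens = [
    (1, "corruptio", "corruptio", 3, "Sb"),
    (2, "mali", "malum", 1, "Atr"),
    (3, "sit", "sum", 0, "Pred"),
    (4, "bona", "bonus", 3, "Pnom"),
]
# Polarity classes only; numeric scores are deliberately omitted.
polarity = {"corruptio": "negative", "malum": "negative", "bonus": "positive"}


def copular_subjects(tokens, polarity, wanted="negative"):
    """Count lemmas with the wanted polarity that are subjects of a copula."""
    # A head that governs a predicate nominal is treated as a copula.
    copulas = {head for (_, _, _, head, rel) in tokens if rel == "Pnom"}
    hits = Counter()
    for _, _, lemma, head, rel in tokens:
        if rel == "Sb" and head in copulas and polarity.get(lemma) == wanted:
            hits[lemma] += 1
    return hits


print(copular_subjects(tokens, polarity).most_common(5))
```
      </preformat>
      <p>Note that malum is negative in the lexicon but is not counted here, because in this clause it is an attribute of the subject rather than the subject itself; the constraint combines syntactic role and polarity.</p>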
      <p>
        It is also informative to extract all the subject-predicate nominal pairs
where the subject is negative. The word that is constantly associated with
fornicatio ‘fornication’ as predicate nominal in the ITTB is peccatum ‘sin’, as all
four occurrences come from a section of the SCG discussing ‘why simple
fornication is a sin according to divine law’ (qua ratione fornicatio simplex secundum
legem divinam sit peccatum, SCG 3.120).
      </p>
      <p><sup>15</sup> Thomas Aquinas authored a full treatise On Evil in the form of a disputatio, a
formalized treatment of a subject articulated into questions, arguments and
counterarguments, issuing from public debates or school seminars [
        <xref ref-type="bibr" rid="ref10 ref11 ref12 ref13 ref14 ref15 ref16 ref17 ref18 ref19 ref20 ref21 ref22 ref23 ref24 ref25 ref26 ref27 ref28 ref29 ref3 ref30 ref4 ref5 ref6 ref6 ref7 ref8 ref9">6, 3-53</xref>
        ].</p>
      <p><sup>16</sup> On copular sentences and on the history of the notion of copula, which is also
closely tied to the history of Western philosophy, see Moro [
        <xref ref-type="bibr" rid="ref23">23, 248-261</xref>
        ].</p>
      <p><sup>17</sup> https://lila-erc.eu/sparql/index.html
      </p>
      <p>
        The words labor ‘labour, toil, exertion’ and especially corruptio ‘destruction,
corruption’ are interesting as well. The former is a complex term that embraces
the notion of physical effort, then that of economic production, but also of pain
and fatigue. While its main association with extortion justifies the negative
polarity, it is a word that could also carry positive associations, in the moral and
economic sphere, both in the Pagan and Christian cultures [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. Indeed, Thomas
Aquinas also uses it with the adjective bonus, to refer (in the plural) to the ‘good
words [...] by which we satisfy God for our sins’ [
        <xref ref-type="bibr" rid="ref8">8, 622</xref>
        ]. In the ITTB, all
three occurrences of labor are coupled with the adjective necessarius ‘necessary’,
to point to the economic prerequisite of manual work for survival (SCG 3.140)
and to rebut the claim that it is equally necessary on moral grounds (SCG
3.140.15 and 16).
      </p>
      <p>
        The definitions of corruptio point to a more complex picture. With the basic
meaning of ‘destruction’, ‘decay’ the word has a clear negative orientation. At
the same time, the nuances in its use reflect the complexities in the question of
‘evil’. Destruction can in itself be a good thing, if it implies the destruction of
evil, and indeed this is exactly one of the first points raised by the philosopher in
a sentence that provides a striking example of association of a negative subject
with a positive predicate nominal: cum corruptio mali sit bona ‘for the
destruction of evil is good’ (SCG 3.11.4). Again, there are particular goods for which
the corruption (corruptio) of the one means the generation (generatio) of the
other (SCG 1.11.12). In fact, the opposition between the two terms (corruptio
and generatio) [
        <xref ref-type="bibr" rid="ref8">8, 252</xref>
        ] is well reflected also in another passage (SCG 3.140.4)
where the concept that an evil is subordinated to the creation of a good is
exemplified by the fact that the corruption (corruptio) of air is the generation
(generatio) of fire.<sup>18</sup>
      </p>
      <p>This case study has presented only a very preliminary and partial analysis,
as we have focused our attention on only one of the many syntactic
constructions that may be worth investigating.<sup>19</sup> In addition, it is important to note that
one crucial issue that could limit the value of the results presented above is the
coverage of our polarity lexicon. Given the limited number of lemmas included
in the lexicon, it is possible that other negative terms, which do not have a
polarity annotation, are used in copular constructions and are not detected by
our query. In order to verify this, we extracted the list of the 150 most
frequent lemmas that are used as subjects of copular constructions. In the list we
identified 4 lemmas (out of a total of 5, including malum, which ranked 37th)
that should have had a negative polarity but are not included in LatinAffectus:
privatio ‘deprivation’ (25 occurrences), paupertas ‘poverty’ (9), peccatum ‘sin’
(7), defectus ‘defect’ (6). As these lemmas should have ranked between the
second and the third position of Table 3, it is clear that this form of evaluation
is required before using the corpus data. Nevertheless, it is already possible to
see how fruitful the crossing of lexical data on polarity with linguistic
annotation can be for a broad range of corpus-based research.</p>
      <p><sup>18</sup> It is worth noting that, although in Thomas Aquinas the two terms are opposite
and corruptio is assigned a negative polarity, generatio does not have a polarity label
in LatinAffectus.</p>
      <p><sup>19</sup> Another interesting example could be the instances of coordination, to extract all
terms that are associated with positive and/or negative concepts in ‘and’ or ‘or’
constructions.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Future Work</title>
      <p>In this paper, we have described the process of inclusion of a sentiment lexicon
for Latin (LatinAffectus) into the LiLa KB, an infrastructure of interoperable
linguistic resources based on the Linked Data paradigm. Instead of developing
from scratch a new model for representing the lexicon in Linked Data, we selected
three already available and widely used models, following the general
recommendation of the Linked Data world to re-use existing ontologies and vocabularies
as much as possible, in order to enhance interoperability with other resources.</p>
      <p>
        Interoperability is the key word here. To show the benefit of working with
linguistic resources that interact with each other, we presented a simple use
case, where (meta)data from different resources for Latin (a dependency
treebank, the LatinAffectus lexicon and the collection of lemmas of the LiLa KB)
interact to investigate a basic topic of Thomistic philosophy. Although the work
of philosophical investigation on the texts of Thomas Aquinas is centuries-old, it
was never possible until now to automatically join (and, thus, to make the
most of) the information provided by separate resources, like lexicons, dictionaries
and corpora, to find the answers to fundamental research questions. As for the
specific case of Thomas Aquinas, this goes back to the very history of linguistic
computing, since his texts were among the first to be automatically processed
with computers, when in the 1950s the Jesuit Roberto Busa started to use IBM
machines to build the large corpus of the Index Thomisticus [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>Today we can make the data of the Index Thomisticus (now partly
treebanked) speak the same language as several other linguistic resources for Latin
that were created over the last decades. It is from the synergy of the (meta)data
provided by such resources that we can draw the overall picture as the necessary
condition to grasp the textual data, which in turn makes it possible to better
understand their content and, ultimately, to yield new knowledge.</p>
      <p>We are convinced that now is the time for the research area dealing with the
development and distribution of linguistic resources for Latin to find a way to
harmonize the differences in data and metadata across resources, a requirement arising both
from data providers and from data users. Indeed, the lack of interoperability
between resources prevents them from benefiting the large research community
working in the broad area of the Humanities, which often deals with Latin texts
but often lacks the expertise needed to make distributed resources that use
different annotation schemes, tag sets and data formats interact. The result of
such a situation is that a large set of valuable linguistic resources for Latin, built
with remarkable effort by data providers in long-lasting projects, still remains
unused by (and sometimes unknown to) their reference community.</p>
      <p>
        LiLa was launched precisely to overcome this state of affairs. The first result of
the project was to provide the LiLa KB with its very core component, i.e. the
collection of Latin lemmas, which is used as the connecting point between the
resources that LiLa wants to make interact. Once the lemma collection was
ready, we started to include in the KB the first linguistic resources for Latin,
among which is LatinAffectus. In the near future, we plan to include the Latin
Wordnet ([
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]) so that, besides the prior polarity score provided
by LatinAffectus, we will be able to assign a specific score to the single meanings
of the words (assigned to different synsets of WordNet), by relying on previous
work done for the SentiWordNet resource [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Baccianella</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Esuli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sebastiani</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining</article-title>
          .
          <source>In: LREC</source>
          . vol.
          <volume>10</volume>
          , pp.
          <fpage>2200</fpage>
          –
          <lpage>2204</lpage>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arcan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iglesias</surname>
            ,
            <given-names>C.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanchez-Rada</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Strapparava</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Linguistic Linked Data for Sentiment Analysis</article-title>
          .
          <source>In: Proceedings of the 2nd Workshop on Linked Data in Linguistics (LDL-2013): Representing and linking lexicons, terminologies and other language data</source>
          . pp.
          <fpage>1</fpage>
          –
          <lpage>8</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haase</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sintek</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Towards linguistically grounded ontologies</article-title>
          .
          <source>In: European Semantic Web Conference</source>
          . pp.
          <fpage>111</fpage>
          –
          <lpage>125</lpage>
          . Springer (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Busa</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <article-title>Index Thomisticus: Sancti Thomae Aquinatis operum omnium indices et concordantiae in quibus verborum omnium et singulorum formae et lemmata cum suis frequentiis et contextibus variis modis referuntur. Index thomisticus: Sancti Thomae Aquinatis operum omnium indices et concordantiae</article-title>
          , Frommann-Holzboog (
          <year>1974</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skiena</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Building sentiment lexicons for all major languages</article-title>
          .
          <source>In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)</source>
          . pp.
          <fpage>383</fpage>
          –
          <lpage>389</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Davis</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (ed.): Thomas Aquinas. On Evil. Oxford University Press, Oxford (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Declerck</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Representation of Polarity Information of Elements of German Compound Words</article-title>
          .
          <source>In: LDL 2016 5th Workshop on Linked Data in Linguistics: Managing, Building and Using Linked Language Resources</source>
          . p.
          <fpage>46</fpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Deferrari</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Barry</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>A lexicon of St. Thomas Aquinas based on the Summa theologica and selected passages of his other works</article-title>
          . Catholic University of America Press, Washington (
          <year>1948</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Denooz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Opera Latina: une base de données sur internet</article-title>
          .
          <source>Euphrosyne</source>
          <volume>32</volume>
          ,
          <fpage>79</fpage>
          –
          <lpage>88</lpage>
          (
          <year>2004</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Doerr</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>The CIDOC conceptual reference module: an ontological approach to semantic interoperability of metadata</article-title>
          .
          <source>AI Magazine</source>
          <volume>24</volume>
          (
          <issue>3</issue>
          ),
          <fpage>75</fpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Fang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Sentiment analysis using product review data</article-title>
          .
          <source>Journal of Big Data</source>
          <volume>2</volume>
          (
          <issue>1</issue>
          ),
          <fpage>5</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Fillmore</surname>
            ,
            <given-names>C.J., et al.</given-names>
          </string-name>
          :
          <article-title>Linguistics in the morning calm</article-title>
          .
          <source>Linguistics Society of Korea. Frame Semantics. Seoul: Hanshin</source>
          (
          <year>1982</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Franzini</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peverelli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruffolo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passarotti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanna</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Signoroni</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ventura</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zampedri</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Nunc Est Aestimandum. Towards an Evaluation of the Latin WordNet</article-title>
          .
          <source>In: Proceedings of the Sixth Italian Conference on Computational Linguistics (CLiC-it 2019)</source>
          . CEUR-WS.org (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Khan</surname>
            ,
            <given-names>A.F.</given-names>
          </string-name>
          :
          <article-title>Towards the Representation of Etymological Data on the Semantic Web</article-title>
          .
          <source>Information</source>
          <volume>9</volume>
          (
          <issue>12</issue>
          ),
          <fpage>304</fpage>
          (Dec
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Lau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Der lateinische Begriff Labor</article-title>
          . Fink, Munich
          (
          <year>1975</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Sentiment analysis: Mining opinions, sentiments, and emotions</article-title>
          . Cambridge University Press (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Mambrini</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passarotti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Harmonizing Different Lemmatization Strategies for Building a Knowledge Base of Linguistic Resources for Latin</article-title>
          .
          <source>In: Proceedings of the 13th Linguistic Annotation Workshop</source>
          . pp.
          <fpage>71</fpage>
          –
          <lpage>80</lpage>
          . Association for Computational Linguistics, Florence, Italy (Aug
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Mambrini</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passarotti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Representing Etymology in the LiLa Knowledge Base of Linguistic Resources for Latin</article-title>
          . In: Kernerman, I., Krek, S. (eds.)
          <source>Proceedings of Globalex Workshop on Linked Lexicography (GLOBALEX 2020)</source>
          . European Language Resources Association (ELRA), Paris, France (May
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>McCrae</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Spohr</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Linking lexical resources and ontologies on the semantic web with lemon</article-title>
          .
          <source>In: Extended Semantic Web Conference</source>
          . pp.
          <fpage>245</fpage>
          –
          <lpage>259</lpage>
          . Springer (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>McCrae</surname>
            ,
            <given-names>J.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bosque-Gil</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gracia</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buitelaar</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cimiano</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>The OntoLex-Lemon model: development and applications</article-title>
          .
          <source>In: Proceedings of eLex 2017 conference</source>
          . pp.
          <fpage>19</fpage>
          –
          <lpage>21</lpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>Minozzi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Latin WordNet, una rete di conoscenza semantica per il latino e alcune ipotesi di utilizzo nel campo dell'Information Retrieval</article-title>
          . In: Mastandrea, P. (ed.)
          <source>Strumenti digitali e collaborativi per le Scienze dell'Antichità</source>
          , pp.
          <fpage>123</fpage>
          –
          <lpage>134</lpage>
          . No. 14 in Antichistica (
          <year>2017</year>
          ), http://doi.org/10.14277/6969-182-9/ANT-14-10
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Obtaining reliable human ratings of valence, arousal, and dominance for 20,000 English words</article-title>
          .
          <source>In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          . pp.
          <fpage>174</fpage>
          –
          <lpage>184</lpage>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Moro</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>The Raising of Predicates: Predicative Noun Phrases and the Theory of Clause Structure</article-title>
          . Cambridge University Press, Cambridge (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Nakov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosenthal</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kiritchenko</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mohammad</surname>
            ,
            <given-names>S.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kozareva</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ritter</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stoyanov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          :
          <article-title>Developing a successful SemEval task in sentiment analysis of Twitter and other social media texts</article-title>
          .
          <source>Language Resources and Evaluation</source>
          <volume>50</volume>
          (
          <issue>1</issue>
          ),
          <fpage>35</fpage>
          –
          <lpage>65</lpage>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Passarotti</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cecchini</surname>
            ,
            <given-names>F.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Franzini</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Litta</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mambrini</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ruffolo</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>The LiLa Knowledge Base of Linguistic Resources and NLP Tools for Latin</article-title>
          .
          <source>In: 2nd Conference on Language, Data and Knowledge (LDK 2019)</source>
          . pp.
          <fpage>6</fpage>
          –
          <lpage>11</lpage>
          . CEUR-WS.org
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Passarotti</surname>
            ,
            <given-names>M.C.</given-names>
          </string-name>
          :
          <article-title>The Project of the Index Thomisticus Treebank</article-title>
          . In: Berti, M. (ed.)
          <source>Classical Philology. Ancient Greek and Latin in the Digital Revolution</source>
          , pp.
          <fpage>299</fpage>
          –
          <lpage>320</lpage>
          . Berlin, Boston: De Gruyter (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Simpson</surname>
            ,
            <given-names>D.P.</given-names>
          </string-name>
          :
          <article-title>Cassell's Latin dictionary</article-title>
          . Simon &amp; Schuster Macmillan Company (
          <year>1959</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Sprugnoli</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Passarotti</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corbetta</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Peverelli</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Odi et Amo. Creating, Evaluating and Extending Sentiment Lexicons for Latin</article-title>
          .
          <source>In: Proceedings of LREC 2020</source>
          (
          <year>2020</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29. de Vaan, M.:
          <article-title>Etymological Dictionary of Latin: and the other Italic Languages</article-title>
          . Brill, Amsterdam (
          <year>2008</year>
          ), https://brill.com/view/title/12612
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <surname>Westerski</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanchez-Rada</surname>
            ,
            <given-names>J.F.</given-names>
          </string-name>
          :
          <article-title>Marl Ontology Specification</article-title>
          ,
          <source>V1.0, May 2013</source>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>