=Paper=
{{Paper
|id=Vol-2481/paper11
|storemode=property
|title=Annotating Shakespeare’s Sonnets with Appraisal Theory to Detect Irony
|pdfUrl=https://ceur-ws.org/Vol-2481/paper11.pdf
|volume=Vol-2481
|authors=Nicolò Busetto,Rodolfo Delmonte
|dblpUrl=https://dblp.org/rec/conf/clic-it/BusettoD19
}}
==Annotating Shakespeare’s Sonnets with Appraisal Theory to Detect Irony==
Nicolò Busetto, Rodolfo Delmonte
Department of Linguistic Studies, Ca Foscari University, Ca Bembo - Venezia
830070@stud.unive.it, delmont@unive.it

Abstract

English. In this paper we propose an approach to irony detection based on Appraisal Theory (Martin and White(2005)) in Shakespeare’s Sonnets, a well-known data set that is statistically valuable. In order to produce meaningful experiments, we created a gold standard by collecting opinions from famous literary critics on Shakespeare’s Sonnets, focusing on irony. We started by manually annotating the data using Appraisal Theory as a reference theory. This choice is motivated by the fact that Appraisal annotation schemes allow smooth evaluation of highly elaborated texts like political commentaries. The annotation is then automatically compiled and checked against the gold standard in order to verify the persistence of certain schemes that can be identified as ironic, satiric or sarcastic. Upon observation, irony detection reaches a final match of 80%.

Italiano. In questo articolo si propone un approccio basato sulla Appraisal Theory per l’individuazione dell’ironia nei Sonetti di Shakespeare, un dataset che è statisticamente valido. Allo scopo di produrre esperimenti significativi, abbiamo creato un gold standard raccogliendo le opinioni di famosi critici letterari sullo stesso corpus, con l’ironia come tema. Abbiamo poi annotato manualmente i sonetti utilizzando gli strumenti e i tratti della Appraisal Theory che permettono di ottenere una valutazione di testi altamente elaborati come gli articoli di politica. L’annotazione è stata poi raccolta automaticamente e confrontata con il gold standard per verificare la persistenza di certi schemi che possono essere identificati come ironici, satirici o sarcastici, raggiungendo una corrispondenza finale dell’80%.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1 Introduction

Shakespeare’s Sonnets are a collection of 154 poems renowned for being full of ironic content (Weiser(1983)), (Weiser(1987)) and for their ambiguity, which sometimes reverses the overall interpretation of a sonnet. Lexical ambiguity, i.e. a word with several meanings, emanates from the way in which the author uses words that can be interpreted in more than one way, not only because they are inherently polysemous, but because the additional meaning they evoke can sometimes be derived on the basis of sound, i.e. homophony (see “eye”, “I” in sonnet 152). The sonnets are also full of metaphors, which often require contextualising the content within historical Elizabethan life and society. Furthermore, there is an abundance of words related to specific language domains in the sonnets. For instance, there are words related to the language of economy, war, nature and to the discoveries of the modern age, and each of these words may be used as a metaphor of love. Many of the sonnets are organized around a conceptual contrast, an opposition that runs parallel and then diverges, sometimes with the use of the rhetorical figure of the chiasmus. It is just this contrast that generates irony, sometimes satire, sarcasm, and even parody. Irony may be considered in turn as: expressing what one means using language that normally signifies the opposite, typically for humorous or emphatic effect; a state of affairs or an event that seems contrary to what one expects and is amusing as a result. As to sarcasm, this may be regarded as the
use of irony to mock or convey contempt. Parody is obtained by using the words or thoughts of a person but adapting them to a ridiculously inappropriate subject. There are several types of irony, though we select verbal irony which, in the strict sense, is saying the opposite of what you mean for outcome, and it depends on the extra-linguistic context (Attardo(1994)). As a result, Satire and Irony are slightly overlapping but constitute two separate techniques; Sarcasm, in turn, can be regarded as a specialization or a subset of Irony. It is important to remark that in many cases these linguistic structures may require the use of nonliteral or figurative language, i.e. the use of metaphors. This has been carefully taken into account when annotating the sonnets by means of Appraisal Theory Framework (hence ATF). In our approach we will follow the so-called incongruity presumption or incongruity-resolution presumption. Theories connected to the incongruity presumption are mostly cognitive-based and related to concepts highlighted, for instance, in (Attardo(2000)). The focus of theorization under this presumption is that in humorous texts, or broadly speaking in any humorous situation, there is an opposition between two alternative dimensions. As a result, we will look for contrast in our study of the sonnets, produced by the contents of manual classification. The purpose of this study is to show how ATF can be useful for detecting irony, considering its ambiguity and its elusive traits.

2 Producing the Gold Standard

In order to produce a gold standard that may encompass strong hints to classification in terms of humour as explained above, we collected literary critics’ reviews of the sonnets. We used criticism from a set of authors including (Frye(1957)) (Calimani(2009)) (Melchiori(1971)) (Eagle(1916)) (Marelli(2015)) (Schoenfeldt(2010)) (Weiser(1987)) (Serpieri(2002)), all listed in the reference section. The gold standard classification has been produced by second author and checked by first author. It is organized into a number of separate fields in a sequence to allow the reader to get a better picture of the sonnet in the collection. All classifications are reported in a supplementary file in the Appendix. Here below are the classifications for two sonnets:

• SONNET 8
SEQUENCE: 1-17 Procreation
MAIN THEME: One against many
ACTION: Young man urged to reproduce
METAPHOR: Through progeny the young man will not be alone
NEG.EVAL: The young man seems to be disinterested
POS.EVAL: Young man positive aesthetic evaluation
CONTRAST: Between one and many

• SONNET 21
SEQUENCE: 18-86 Time and Immortality
MAIN THEME: Love
ACTION: The Young man must understand the sincerity of poet’s love
METAPHOR: True love is sincere
NEG.EVAL: The young man listens to the false praise made by others
POS.EVAL: Young Man positive aesthetic evaluation
CONTRAST: Between true and fictitious love

As can be seen, we indicate SEQUENCE for the thematic sequence into which the sonnet is included; this is followed by MAIN THEME, which is the theme the sonnet deals with; ACTION reports the possible action proposed by the poet to the protagonist of the poem; METAPHOR is the main metaphor introduced in the poem, sometimes using words from a specialized domain; NEG.EVAL and POS.EVAL stand for Negative Evaluation and Positive Evaluation contained in the poem in relation to the theme and the protagonist(s); finally, CONTRAST is the key to signal the presence of opposing concrete or abstract concepts used by Shakespeare to reinforce the arguments put forward in the poem. Many sonnets have received more than one possible pragmatic category. This is due to the difficulty in choosing one category over another. In particular, it has been particularly hard to distinguish Irony from Satire, and Irony from Sarcasm. Overall, we ended up with 54 sonnets receiving a double marking out of 98, the total number of sonnets given some kind of pragmatic label by the literary critics; the ratio 98/154 corresponds to a percentage of 63.64%. The resulting counts of annotated sonnets are reported in Table 1.

Eventually, as commented in the section below, the introduction of annotations based on Appraisal Theory has helped in choosing the best pragmatic classification. In fact, literary critics were simply hinting at "irony" or "satire", but the annotation gave us a precise measure of the level of contrast present in each of the sonnets regarded generically as "ironic".
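The gold-standard fields described above can be pictured as a simple record. The following is a minimal sketch, not the authors' actual file format: the class name GoldRecord, its attribute names and the helper percentage are assumptions introduced here for illustration, while the field contents for Sonnet 8 and the counts 154, 98 and 54 are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldRecord:
    """One gold-standard entry (hypothetical layout mirroring the paper's field labels)."""
    sonnet: int
    sequence: str
    main_theme: str
    action: str
    metaphor: str
    neg_eval: str
    pos_eval: str
    contrast: str


# Field values copied from the classification of Sonnet 8 given in the text.
SONNET_8 = GoldRecord(
    sonnet=8,
    sequence="1-17 Procreation",
    main_theme="One against many",
    action="Young man urged to reproduce",
    metaphor="Through progeny the young man will not be alone",
    neg_eval="The young man seems to be disinterested",
    pos_eval="Young man positive aesthetic evaluation",
    contrast="Between one and many",
)

# Counts reported in the text.
TOTAL_SONNETS = 154   # size of the collection
LABELLED = 98         # sonnets with some pragmatic label from the critics
DOUBLE_MARKED = 54    # sonnets that received two labels


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


coverage = percentage(LABELLED, TOTAL_SONNETS)       # share of labelled sonnets
double_share = percentage(DOUBLE_MARKED, LABELLED)   # derived here, not stated in the paper

if __name__ == "__main__":
    print(f"Labelled sonnets: {coverage}% of the collection")
    print(f"Double-marked among labelled: {double_share}%")
```

The coverage value reproduces the 63.64% quoted above; the share of double-marked sonnets is only derived from the reported counts and is not a figure given by the authors.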
Table 1: Final distribution of sonnets in the 5 pragmatic categories

Type        Quantity
Blank       57
Irony       73
Satire      20
Parody      4
Sarcasm     47
Duplicated  54

2.1 Appraisal Theory for Poetry and Literary Texts

The experiment we have been working on is an attempt to describe irony, parody and sarcasm in terms of a strict scientifically viable linguistic theory, the Appraisal Framework Theory (Martin and White(2005)), as has already been done in the past by other authors (see (Taboada and Grieve(2004)) (Read and Carrol(2012)) but also (Stingo and Delmonte(2016)) (Delmonte and Marchesini(2017))). The idea is as follows: produce a complete annotation of the sonnets using the tools made available by the theory and then verify how well it fits into the gold standard produced. The primary purpose of the Appraisal Framework Theory (hence AFT) is to delineate the interpersonal dimension of communication, supplying schemes by which it is possible to recognize evaluative sequences within texts and information about the positioning of the author in relation to evaluated targets.2

2 Further information can be found on the website dedicated to the Appraisal Framework Theory: http://www.languageofevaluation.info/appraisal/

The annotation has been organized around only one category, Attitude, and its direct subcategories, in order to keep the annotation at a more workable level, and to optimize time and space in the XML annotation. Attitude includes different options for expressing positive or negative evaluation, and expresses the author’s feelings. The main category is divided into three primary fields with their relative positive or negative polarity, namely:

• Affect is every emotional evaluation of things, processes or states of affairs (e.g. like/dislike); it describes proper feelings and any emotional reaction within the text aimed towards human behaviour/process and phenomena.

• Judgement is any kind of ethical evaluation of human behaviour (e.g. good/bad), and considers the ethical evaluation of people and their behaviours.

• Appreciation is every aesthetic or functional evaluation of things, processes and states of affairs (e.g. beautiful/ugly; useful/useless), and represents any aesthetic evaluation of things, both man-made and natural phenomena.

In the end, we have six different classes: Affect Positive, Affect Negative, Judgement Positive, Judgement Negative, Appreciation Positive, Appreciation Negative. Overall in the annotation there is an overall majority of positive polarities, with a ratio of 0.511, in comparison to negative annotations with a ratio of 0.488. In short, the positive poles total 607 and the negative poles total 579, for a total number of 1186 annotations. Judgement is the most interesting category because it allows social moral sanction, in that it refers to two subfields, Social Esteem and Social Sanction - which however we decided not to mark. In particular, whereas the positive polarity annotation of Judgement extends to Admiration and Praise, the negative polarity annotation deals with Criticism and Condemnation or Social Esteem and Social Sanction (see (Martin and White(2005)), p.52). In particular, Judgement is found mainly in the final couplet of the sonnets.

The annotation work on the texts has been accomplished by first author and checked by second author. Given the level of objective difficulty in understanding the semantic content of the sonnets, we have decided not to resort to additional annotators - second author produced the annotation as part of his Master thesis work. So far, we have not been able to produce a measure for interannotator agreement: however, since 35% of all annotations had to be corrected, that measure could be approximated by 65% of agreement. The tags we used for the annotation include a tag [...] that contains the whole text of the sonnet, a tag [...] to mark stanzas, and a tag [...] to mark lines. Focusing on the annotation of the evaluative sequences instead, every time we found an evaluative word (or sequence of words), we delimited the item/phrase within the tags [...]. Subsequently, following the general indications mentioned above provided by
(Martin and White(2005)), we assigned one of the three subcategories – affect, judgement and appreciation – as an attribute of the tag [...], also providing the positive/negative sentiment orientation as a value of the attribute. Here below we show the annotation for Sonnet 40 which is highly contrasted:

Take all my loves, my love, yea take them all,
What hast thou then more than thou hadst before?
No love, my love, that thou mayst true love call,
All mine was thine, before thou hadst this more:
Then if for my love, thou my love receivest,
I cannot blame thee, for my love thou usest,
But yet be blamed, if thou thy self deceivest
By wilful taste of what thy self refusest
I do forgive thy robbery gentle thief
Although thou steal thee all my poverty:
And yet love knows it is a greater grief
To bear love’s wrong, than hate’s known injury.
Lascivious grace, in whom all ill well shows,
Kill me with spites yet we must not be foes.

In the choice of which and how many items to annotate, we adopted the following linguistic criteria to enhance the notational analysis.

• Semantic criteria:
Anytime one or more verb/noun modifiers are found, when they do not represent meaningful evaluation by themselves, they are anno[...] they contribute to modify. Any instance of evaluation of a multiword expression is annotated as a single appraisal unit. Any instance of evaluation of rhetorical or figurative language is annotated as a single appraisal unit. When possible, the evaluations are embedded so as to include appraisal units into a [...]goges, rhetorical questions, interjections and [...]

• Syntactic Criteria:
Without exceeding the length of the propo[...] single appraisal unit up until a clause-level, [...]tions. Additionally, for those cases where [...] limited ourselves to the annotation of the [...]quence, so as to avoid overproduction of long annotation. Again, when possible, [...] the evaluation on a clause-level in greater de[...]quences on a clause level even beyond the punctuation marks limits. However, these an[...] of items, whenever they share the same attribute and the same polarity orientation, they [...] case of more than three items in a row that [...] orientation, they were annotated separately.

As to interpretation criteria, we assumed that [...] justified by the fact that a high level of Negative Judgements accompanied by Positive Appreciations or Affect is by itself interpretable as the intention to provoke a sarcastic mood. As a final result, there are 44 sonnets that present the highest
contrast and are specifically classified according to the six classes above (see Figure 1 in the Appendix). There is also a group that contains ambiguous sonnets which have been classified with a double class, mainly by Irony and Sarcasm. As a first remark, in all these sonnets negative polarity is higher than positive polarity, with the exception of sonnet 106. In other words, if we consider this annotation as the one containing the highest levels of Judgement, we come to the conclusion that a possible Sarcasm reading is mostly associated with the presence of Judgement Negative and in general with high Negative polarity annotations (see Table 2 below). As a first result, we may notice a very high convergence existing between critics’ opinions as classified by us with the label highest contrast and the output of manual annotation by Appraisal classes.

Table 2: Quantitative data for six appraisal classes for sonnets with highest contrast

Classes    Sum  Mean   St.Dev.
Appr.Pos   56   2.534  8.199
Appr.Neg   25   1.134  3.691
Affct.Pos  53   2.4    7.733
Affct.Neg  77   3.467  11.202
Judgm.Pos  32   1.445  4.721
Judgm.Neg  122  5.467  17.611

In the group of 50 sonnets classified, mainly or exclusively, with Irony, the presence of Judgement Negative is much lower than in the previous table for Sarcasm (see Figure 2 in the Appendix). In fact only half of them – 25 – have annotations for that class; the remaining half introduces two other negative classes: mainly Affect Negative, but also Appreciation Negative - see Table 3 below. As to the main Positive class, we can see that it is no longer Judgement Positive, but Appreciation Positive, which is present in 33 sonnets. This is followed by Affect Positive, which is better distributed.

Table 3: Quantitative data for six appraisal classes for sonnets with lowest contrast

Classes    Sum  Mean   St.Dev.
Appr.Pos   139  5.346  18.821
Appr.Neg   65   2.5    8.844
Affct.Pos  64   2.462  8.708
Affct.Neg  81   3.115  11.009
Judgm.Pos  59   2.269  8.029
Judgm.Neg  37   1.423  5.047

In other words we can now consider that Sarcasm is characterized by a majority of negative evaluations, 224 over 141, while Irony is characterized by a majority of Positive evaluations, 262 over 183, and that the values are sparse and unequally distributed. The final table concerns the number of sonnets with blank evaluation by critics, which amount to 60. As a rule, this group of sonnets looks different from the two groups we already analysed. The prevailing trait is Affect Negative; Judgement Negative is only occasionally present; the second prominent trait is Affect Positive. In order to know how much the difference is, we can judge from the quantities shown in Table 3 above (but see also Figure 3 in the Appendix).

Table 4: Quantitative data for six appraisal classes for sonnets with no contrast

Classes    Sum  Mean   St.Dev.
Appr.Pos   88   3.034  1.269
Appr.Neg   59   2.034  7.638
Affct.Pos  89   3.069  11.483
Affct.Neg  109  3.759  14.052
Judgm.Pos  49   1.689  6.367
Judgm.Neg  8    0.276  1.079

In particular, in this case the ratio Negative/Positive is more balanced, 226 over 176, with a majority of Positive annotations as happened with Irony but with a lower gap. The appraisal category with the highest number of annotations is now Affect, whereas in the case of Irony it was Appreciation, and in Sarcasm it was Judgement. So eventually we have been able to differentiate the three main and more frequent pragmatic categories by means of Appraisal Framework features: they are characterized by a different distribution of positive vs. negative evaluations and also by a prominent presence of one of the three main subcategories into which Appraisal has been subdivided, that is Appreciation for Irony, Judgement for Sarcasm and Affect where no evaluation has been expressed.

3 Conclusion

In this paper we have presented work carried out to annotate and experiment with the theme of irony in Shakespeare’s Sonnets. The gold standard for the
experiment has been created by collecting comments produced by literary critics on the presence of some kind of thematic, semantic and syntactic opposition in the sonnets such as to produce some sort of irony. The sonnets were first annotated using the framework of Appraisal Theory and then we checked the results: we obtained a very high level of matching with the critics’ opinions, at 80%. Eventually, Appraisal framework has shown its ability to classify and diversify different levels of irony effectively.

References

Salvatore Attardo. 1994. Linguistic Theories of Humor. Mouton de Gruyter, Berlin – New York.
Salvatore Attardo. 2000. Irony as relevant inappropriateness. Journal of Pragmatics, 84(32).
Dario Calimani. 2009. William Shakespeare, I sonetti della menzogna. Carocci, Roma.
Rodolfo Delmonte and Giulia Marchesini. 2017. A semantically-based approach to the annotation of narrative style. In Proceedings of the 13th Joint ISO-ACL Workshop on Interoperable Semantic Annotation (ISA-13), pages 14–25, Stroudsburg, PA, USA. ACL.
R.L. Eagle. 1916. New light on the enigmas of Shakespeare’s Sonnets. John Long Limited, London.
Northrop Frye. 1957. Anatomy of Criticism: Four Essays. Princeton University Press.
Maria Antonietta Marelli. 2015. William Shakespeare, I Sonetti – con testo a fronte. Garzanti.
J. Martin and P.R. White. 2005. Language of Evaluation, Appraisal in English. Palgrave Macmillan, London and New York.
Giorgio Melchiori. 1971. Shakespeare’s Sonnets. Adriatica Editrice, Bari.
J. Read and J. Carrol. 2012. Annotating expressions of appraisal in English. Language Resources and Evaluation, 46:421–447.
Michael Schoenfeldt. 2010. Cambridge introduction to Shakespeare’s poetry. Cambridge University Press, Cambridge.
Alessandro Serpieri. 2002. Polifonia Shakespeariana. Bulzoni, Roma.
Michele Stingo and Rodolfo Delmonte. 2016. Annotating satire in Italian political commentaries with appraisal theory. In Natural Language Processing meets Journalism - Proceedings of the Workshop, NLPMJ-2016, pages 74–79, Stroudsburg, PA, USA. ACL.
M. Taboada and J. Grieve. 2004. Analyzing appraisal automatically. In Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text: Theories and Applications, pages 158–161. AAAI Press.
David K. Weiser. 1983. Shakespearean irony: The ’sonnets’. Neuphilologische Mitteilungen, 84(4):456–469. http://www.jstor.org/stable/43343552
David K. Weiser. 1987. Mind in Character – Shakespeare’s Speaker in the Sonnets. The University of Missouri Press.
APPENDIX.
Figures Of the Six Pragmatic Categories for Appraisal-Based Classification
Figure 1: Subdivision into six appraisal classes for sonnets with highest contrast
Figure 2: Subdivision into six appraisal classes for sonnets with lowest contrast
Figure 3: Subdivision into six appraisal classes for sonnets with no contrast
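The per-class sums behind the three figures can be cross-checked against the polarity totals quoted in the text. The sketch below is only an illustration under stated assumptions: the dictionary TABLES, the group keys and the helpers polarity_totals and dominant_category are names introduced here, while the numbers are the Sum columns of Tables 2, 3 and 4.

```python
# Sum columns of Tables 2-4, keyed by the class labels used in the tables.
# The group keys ("highest", "lowest", "none") are hypothetical shorthand.
TABLES = {
    "highest": {"Appr.Pos": 56, "Appr.Neg": 25, "Affct.Pos": 53,
                "Affct.Neg": 77, "Judgm.Pos": 32, "Judgm.Neg": 122},
    "lowest": {"Appr.Pos": 139, "Appr.Neg": 65, "Affct.Pos": 64,
               "Affct.Neg": 81, "Judgm.Pos": 59, "Judgm.Neg": 37},
    "none": {"Appr.Pos": 88, "Appr.Neg": 59, "Affct.Pos": 89,
             "Affct.Neg": 109, "Judgm.Pos": 49, "Judgm.Neg": 8},
}


def polarity_totals(sums: dict) -> tuple:
    """Return (positive, negative) totals across the three categories."""
    positive = sum(v for k, v in sums.items() if k.endswith(".Pos"))
    negative = sum(v for k, v in sums.items() if k.endswith(".Neg"))
    return positive, negative


def dominant_category(sums: dict) -> str:
    """Return the category prefix (Appr, Affct, Judgm) with the most annotations."""
    per_category = {}
    for label, value in sums.items():
        category = label.split(".")[0]
        per_category[category] = per_category.get(category, 0) + value
    return max(per_category, key=per_category.get)


if __name__ == "__main__":
    for group, sums in TABLES.items():
        pos, neg = polarity_totals(sums)
        print(f"{group} contrast: positive={pos}, negative={neg}, "
              f"dominant={dominant_category(sums)}")
```

Under these assumptions the totals agree with the figures given in Section 2.1 (224 negative over 141 positive for the highest-contrast group, 262 positive over 183 negative for the lowest-contrast group, 226 over 176 for the no-contrast group), and the dominant categories come out as Judgement, Appreciation and Affect respectively.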