=Paper= {{Paper |id=Vol-2481/paper11 |storemode=property |title=Annotating Shakespeare’s Sonnets with Appraisal Theory to Detect Irony |pdfUrl=https://ceur-ws.org/Vol-2481/paper11.pdf |volume=Vol-2481 |authors=Nicolò Busetto,Rodolfo Delmonte |dblpUrl=https://dblp.org/rec/conf/clic-it/BusettoD19 }} ==Annotating Shakespeare’s Sonnets with Appraisal Theory to Detect Irony== https://ceur-ws.org/Vol-2481/paper11.pdf
Annotating Shakespeare’s Sonnets with Appraisal Theory to Detect Irony

                      Nicolò Busetto                                 Rodolfo Delmonte
              Department of Linguistic Studies                 Department of Linguistic Studies
                   Ca’ Foscari University                           Ca’ Foscari University
                   Ca’ Bembo - Venezia                              Ca’ Bembo - Venezia
               830070@stud.unive.it                               delmont@unive.it


Abstract

English. In this paper we propose an approach to irony detection based on Appraisal Theory (Martin and White, 2005) in Shakespeare’s Sonnets, a well-known data set that is statistically valuable. In order to produce meaningful experiments, we created a gold standard by collecting opinions from famous literary critics on Shakespeare’s Sonnets, focusing on irony. We started by manually annotating the data using Appraisal Theory as a reference theory. This choice is motivated by the fact that Appraisal annotation schemes allow smooth evaluation of highly elaborated texts like political commentaries. The annotation is then automatically compiled and checked against the gold standard in order to verify the persistence of certain schemes that can be identified as ironic, satiric or sarcastic. Upon observation, irony detection reaches a final match of 80%.¹

Italian abstract, in English translation. In this article we propose an approach based on Appraisal Theory for detecting irony in Shakespeare’s Sonnets, a statistically valid dataset. In order to produce meaningful experiments, we created a gold standard by collecting the opinions of famous literary critics on the same corpus, with irony as the theme. We then manually annotated the sonnets using the tools and features of Appraisal Theory, which make it possible to evaluate highly elaborated texts such as political articles. The annotation was then collected automatically and compared with the gold standard to verify the persistence of certain schemes that can be identified as ironic, satirical or sarcastic, reaching a final match of 80%.

¹ Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1    Introduction

Shakespeare’s Sonnets are a collection of 154 poems renowned for being full of ironic content (Weiser, 1983; Weiser, 1987) and for an ambiguity that sometimes reverses the overall interpretation of a sonnet. Lexical ambiguity, i.e. a word with several meanings, emanates from the way in which the author uses words that can be interpreted in more than one way, not only because they are inherently polysemous, but because the additional meaning they evoke can sometimes be derived on the basis of sound, i.e. homophony (see “eye” and “I” in sonnet 152). The sonnets are also full of metaphors, which often require contextualising the content within Elizabethan life and society. Furthermore, the sonnets abound in words related to specific language domains. For instance, there are words related to the language of economy, war, nature and the discoveries of the modern age, and each of these words may be used as a metaphor of love. Many of the sonnets are organized around a conceptual contrast, an opposition that runs parallel and then diverges, sometimes with the use of the rhetorical figure of the chiasmus. It is just this contrast that generates irony, sometimes satire, sarcasm, and even parody. Irony may be considered in turn as: the expression of what one means using language that normally signifies the opposite, typically for humorous or emphatic effect; or a state of affairs or an event that seems contrary to what one expects and is amusing as a result. As to sarcasm, this may be regarded as the
use of irony to mock or convey contempt. Parody is obtained by using the words or thoughts of a person but adapting them to a ridiculously inappropriate subject. There are several types of irony; we select verbal irony which, in the strict sense, is saying the opposite of what you mean in order to achieve an effect, and which depends on the extra-linguistic context (Attardo, 1994). As a result, Satire and Irony are slightly overlapping but constitute two separate techniques, while Sarcasm can be regarded as a specialization or a subset of Irony. It is important to remark that in many cases these linguistic structures may require the use of nonliteral or figurative language, i.e. the use of metaphors. This has been carefully taken into account when annotating the sonnets by means of the Appraisal Theory Framework (henceforth ATF). In our approach we follow the so-called incongruity presumption or incongruity-resolution presumption. Theories connected to the incongruity presumption are mostly cognitive-based and related to concepts highlighted, for instance, in Attardo (2000). The focus of theorization under this presumption is that in humorous texts, or broadly speaking in any humorous situation, there is an opposition between two alternative dimensions. As a result, in our study of the sonnets we look for contrast as it emerges from the manual classification. The purpose of this study is to show how ATF can be useful for detecting irony, considering its ambiguity and its elusive traits.

2    Producing the Gold Standard

In order to produce a gold standard that may encompass strong hints to classification in terms of humour as explained above, we collected literary critics’ reviews of the sonnets. We used criticism from a set of authors including Frye (1957), Calimani (2009), Melchiori (1971), Eagle (1916), Marelli (2015), Schoenfeldt (2010), Weiser (1987) and Serpieri (2002), all listed in the reference section. The gold standard classification has been produced by the second author and checked by the first author. It is organized into a number of separate fields in a sequence, to allow the reader to get a better picture of the sonnet in the collection. All classifications are reported in a supplementary file in the Appendix. Here below are the classifications for two sonnets:

    • SONNET 8
      SEQUENCE: 1-17 Procreation
      MAIN THEME: One against many
      ACTION: Young man urged to reproduce
      METAPHOR: Through progeny the young man will not be alone
      NEG.EVAL: The young man seems to be disinterested
      POS.EVAL: Young man positive aesthetic evaluation
      CONTRAST: Between one and many

    • SONNET 21
      SEQUENCE: 18-86 Time and Immortality
      MAIN THEME: Love
      ACTION: The Young man must understand the sincerity of poet’s love
      METAPHOR: True love is sincere
      NEG.EVAL: The young man listens to the false praise made by others
      POS.EVAL: Young Man positive aesthetic evaluation
      CONTRAST: Between true and fictitious love

As can be seen, we indicate SEQUENCE for the thematic sequence in which the sonnet is included; this is followed by MAIN THEME, which is the theme the sonnet deals with; ACTION reports the possible action proposed by the poet to the protagonist of the poem; METAPHOR is the main metaphor introduced in the poem, sometimes using words from a specialized domain; NEG.EVAL and POS.EVAL stand for the Negative Evaluation and Positive Evaluation contained in the poem in relation to the theme and the protagonist(s); finally, CONTRAST is the key that signals the presence of opposing concrete or abstract concepts used by Shakespeare to reinforce the arguments put forward in the poem. Many sonnets have received more than one possible pragmatic category. This is due to the difficulty in choosing one category over another. In particular, it has been particularly hard to distinguish Irony from Satire, and Irony from Sarcasm. Overall, we ended up with 54 sonnets receiving a double marking out of 98, the total number of sonnets given some kind of pragmatic label by the literary critics; this corresponds to a ratio of 98/154, or 63.64%. The resulting count of annotated sonnets is reported in Table 1.

Eventually, as commented in the section below, the introduction of annotations based on Appraisal Theory has helped in choosing the best pragmatic classification. In fact, literary critics were simply hinting at "irony" or "satire", but the annotation gave us a precise measure of the level of contrast present in each of the sonnets regarded generically as "ironic".
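The gold-standard fields described above can be sketched as a simple record type. This is only an illustrative sketch, not the authors' supplementary file format: the class name `GoldEntry`, its attribute names and the helper `coverage` are assumptions introduced here, while the Sonnet 8 values and the counts 98, 154 and 54 are copied from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoldEntry:
    """One gold-standard record; attributes mirror the field labels in the text."""
    sonnet: int
    sequence: str
    main_theme: str
    action: str
    metaphor: str
    neg_eval: str
    pos_eval: str
    contrast: str


# Values copied from the SONNET 8 classification shown in the paper.
SONNET_8 = GoldEntry(
    sonnet=8,
    sequence="1-17 Procreation",
    main_theme="One against many",
    action="Young man urged to reproduce",
    metaphor="Through progeny the young man will not be alone",
    neg_eval="The young man seems to be disinterested",
    pos_eval="Young man positive aesthetic evaluation",
    contrast="Between one and many",
)


def coverage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of the 154 sonnets that received some pragmatic label from the critics.
labelled_share = coverage(98, 154)
# Share of the labelled sonnets that received a double marking (derived here).
double_share = coverage(54, 98)
print(labelled_share, double_share)
```

The first value reproduces the 63.64% quoted in the text; the second is a derived figure that the paper does not state explicitly.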
Table 1: Final distribution of sonnets in the 5 pragmatic categories

              Type            Quantity
              Blank           57
              Irony           73
              Satire          20
              Parody          4
              Sarcasm         47
              Duplicated      54

2.1    Appraisal Theory for Poetry and Literary Texts

The experiment we have been working on is an attempt to describe irony, parody and sarcasm in terms of a strict, scientifically viable linguistic theory, the Appraisal Framework Theory (Martin and White, 2005), as has already been done in the past by other authors (see Taboada and Grieve (2004) and Read and Carrol (2012), but also Stingo and Delmonte (2016) and Delmonte and Marchesini (2017)). The idea is as follows: produce a complete annotation of the sonnets using the tools made available by the theory and then verify how well it fits the gold standard produced. The primary purpose of the Appraisal Framework Theory (henceforth AFT) is to delineate the interpersonal dimension of communication, supplying schemes by which it is possible to recognize evaluative sequences within texts and information about the positioning of the author in relation to evaluated targets.²

The annotation has been organized around only one category, Attitude, and its direct subcategories, in order to keep the annotation at a more workable level and to optimize time and space in the XML annotation. Attitude includes different options for expressing positive or negative evaluation, and expresses the author’s feelings. The main category is divided into three primary fields, each with its positive or negative polarity, namely:

  • Affect is every emotional evaluation of things, processes or states of affairs (e.g. like/dislike); it describes proper feelings and any emotional reaction within the text aimed towards human behaviour/process and phenomena.

  • Judgement is any kind of ethical evaluation of human behaviour (e.g. good/bad), and considers the ethical evaluation of people and their behaviours.

  • Appreciation is every aesthetic or functional evaluation of things, processes and states of affairs (e.g. beautiful/ugly; useful/useless), and represents any aesthetic evaluation of things, both man-made and natural phenomena.

² Further information can be found on the website dedicated to the Appraisal Framework Theory: http://www.languageofevaluation.info/appraisal/

Eventually, we end up with six different classes: Affect Positive, Affect Negative, Judgement Positive, Judgement Negative, Appreciation Positive, Appreciation Negative. Overall, the annotation shows a majority of positive polarities, with a ratio of 0.511, against a ratio of 0.488 for negative annotations. In short, the positive poles total 607 and the negative poles total 579, for a total of 1186 annotations. Judgement is the most interesting category because it allows social moral sanction, in that it refers to two subfields, Social Esteem and Social Sanction, which however we decided not to mark. In particular, whereas the positive polarity annotation of Judgement extends to Admiration and Praise, the negative polarity annotation deals with Criticism and Condemnation, or Social Esteem and Social Sanction (see Martin and White (2005), p. 52). Judgement is found mainly in the final couplet of the sonnets.

The annotation work on the texts has been accomplished by the first author and checked by the second author. Given the level of objective difficulty in understanding the semantic content of the sonnets, we decided not to resort to additional annotators: the first author produced the annotation as part of his Master’s thesis work. So far, we have not been able to produce a measure of inter-annotator agreement; however, since the second author had to correct 35% of all annotations, that measure can be approximated as 65% agreement. The tags we used for the annotation include a tag [...] that contains the whole text of the sonnet, [...] to mark stanzas, and [...] to mark lines. Focusing instead on the annotation of the evaluative sequences, every time we found an evaluative word (or sequence of words), we delimited the item/phrase within the tags [...]. Subsequently, following the general indications provided by Martin and White (2005) and mentioned above, we assigned one of the three subcategories (affect, judgement and appreciation) as an attribute of the tag [...], also providing the positive/negative sentiment orientation as the value of the attribute. Here below we show the annotation for Sonnet 40, which is highly contrasted:

    Take all my loves, my love, yea take them all,
    What hast thou then more than thou hadst before?
    No love, my love, that thou mayst true love call,
    All mine was thine, before thou hadst this more:

    Then if for my love, thou my love receivest,
    I cannot blame thee, for my love thou usest,
    But yet be blamed, if thou thy self deceivest
    By wilful taste of what thy self refusest.

    I do forgive thy robbery gentle thief
    Although thou steal thee all my poverty:
    And yet love knows it is a greater grief
    To bear love’s wrong, than hate’s known injury.

    Lascivious grace, in whom all ill well shows,
    Kill me with spites yet we must not be foes.

In the choice of which and how many items to annotate, we adopted the following linguistic criteria to enhance the notational analysis.

  • Semantic criteria:
    Anytime one or more verb/noun modifiers are found, when they do not represent meaningful evaluation by themselves, they are annotated [...] they contribute to modify. Any instance of evaluation of a multiword expression is annotated as a single appraisal unit. Any instance of evaluation of rhetorical or figurative language is annotated as a single appraisal unit. When possible, the evaluations are embedded so as to include appraisal units into a [...]goges, rhetorical questions, interjections and [...]

  • Syntactic criteria:
    Without exceeding the length of the propo[...] single appraisal unit up until a clause level, [...]tions. Additionally, for those cases where [...] limited ourselves to the annotation of the [...]quence, so as to avoid overproduction of [...] long annotation. Again, when possible, [...] the evaluation on a clause level in greater de[...] [...]quences on a clause level even beyond the punctuation marks limits. However, these an[...] of items, whenever they share the same attribute and the same polarity orientation, they [...] case of more than three items in a row that [...] orientation, they were annotated separately.

As to interpretation criteria, we assumed that [...] justified by the fact that a high level of Negative

Judgements accompanied by Positive Appreciations or Affect is by itself interpretable as the intention to provoke a sarcastic mood. As a final result, there are 44 sonnets that present the highest contrast and are specifically classified according to the six classes above (see Figure 1 in the Appendix). There is also a group that contains ambiguous sonnets which have been classified with a double class, mainly Irony and Sarcasm. As a first remark, in all these sonnets negative polarity is higher than positive polarity, with the exception of sonnet 106. In other words, if we consider this annotation as the one containing the highest levels of Judgement, we come to the conclusion that a possible Sarcasm reading is mostly associated with the presence of Judgement Negative and, in general, with high Negative polarity annotations (see Table 2). As a first result, we may notice a very high convergence between the critics’ opinions, classified by us with the label highest contrast, and the output of manual annotation by Appraisal classes.

Table 2: Quantitative data for six appraisal classes for sonnets with highest contrast

              Classes      Sum     Mean     St.Dev.
              Appr.Pos     56      2.534    8.199
              Appr.Neg     25      1.134    3.691
              Affct.Pos    53      2.4      7.733
              Affct.Neg    77      3.467    11.202
              Judgm.Pos    32      1.445    4.721
              Judgm.Neg    122     5.467    17.611

In the group of 50 sonnets classified, mainly or exclusively, with Irony, the presence of Judgement Negative is much lower than in the previous table for Sarcasm (see Figure 2 in the Appendix). In fact only half of them (25) have annotations for that class; the remaining half introduce two other negative classes, mainly Affect Negative but also Appreciation Negative (see Table 3). As to the main Positive class, we can see that it is no longer Judgement Positive but Appreciation Positive, which is present in 33 sonnets. This is followed by Affect Positive, which is better distributed.

Table 3: Quantitative data for six appraisal classes for sonnets with lowest contrast

              Classes      Sum     Mean     St.Dev.
              Appr.Pos     139     5.346    18.821
              Appr.Neg     65      2.5      8.844
              Affct.Pos    64      2.462    8.708
              Affct.Neg    81      3.115    11.009
              Judgm.Pos    59      2.269    8.029
              Judgm.Neg    37      1.423    5.047

In other words, we can now consider that Sarcasm is characterized by a majority of negative evaluations, 224 over 141, while Irony is characterized by a majority of Positive evaluations, 262 over 183, and that the values are sparse and unequally distributed.

The final table concerns the sonnets with blank evaluation by critics, which amount to 60. As a rule, this group of sonnets looks different from the two groups we have already analysed. The prevailing trait is Affect Negative; Judgement Negative is only occasionally present; the second most prominent trait is Affect Positive. To see how large the difference is, we can judge from the quantities shown in Table 4 (but see also Figure 3 in the Appendix). In particular, in this case the balance between the two polarities is closer, 226 Positive over 176 Negative, with a majority of Positive annotations as happened with Irony but with a smaller gap. The appraisal category with the highest number of annotations is now Affect, whereas in the case of Irony it was Appreciation, and in Sarcasm it was Judgement. So eventually we have been able to differentiate the three main and most frequent pragmatic categories by means of Appraisal Framework features: they are characterized by a different distribution of positive vs. negative evaluations and also by a prominent presence of one of the three main subcategories into which Appraisal has been subdivided, that is, Appreciation for Irony, Judgement for Sarcasm, and Affect where no evaluation has been expressed.

Table 4: Quantitative data for six appraisal classes for sonnets with no contrast

              Classes      Sum     Mean     St.Dev.
              Appr.Pos     88      3.034    1.269
              Appr.Neg     59      2.034    7.638
              Affct.Pos    89      3.069    11.483
              Affct.Neg    109     3.759    14.052
              Judgm.Pos    49      1.689    6.367
              Judgm.Neg    8       0.276    1.079

3    Conclusion

In this paper we have presented work carried out to annotate and experiment with the theme of irony in Shakespeare’s Sonnets. The gold standard for the experiment has been created by collecting comments produced by literary critics on the presence of some kind of thematic, semantic and syntactic opposition in the sonnets that produces some sort of irony. The sonnets were first annotated using the framework of Appraisal Theory, and we then checked the results: we obtained a very high level of matching with the critics’ opinions, at 80%. Eventually, the Appraisal framework has shown its ability to classify and differentiate levels of irony effectively.

References

Salvatore Attardo. 1994. Linguistic Theories of Humor. Mouton de Gruyter, Berlin – New York.

Salvatore Attardo. 2000. Irony as relevant inappropriateness. Journal of Pragmatics, 84(32).

Dario Calimani. 2009. William Shakespeare, I sonetti della menzogna. Carrocci, Roma.

Rodolfo Delmonte and Giulia Marchesini. 2017. A semantically-based approach to the annotation of narrative style. In Proceedings of the 13th Joint ISO-ACL Workshop on Interoperable Semantic Annotation (ISA-13), pages 14–25, Stroudsburg, PA, USA. ACL.

R.L. Eagle. 1916. New light on the enigmas of Shakespeare’s Sonnets. John Long Limited, London.

Northrop Frye. 1957. Anatomy of Criticism: Four Essays. Princeton University Press.

Maria Antonietta Marelli. 2015. William Shakespeare, I Sonetti – con testo a fronte. Garzanti.

J. Martin and P.R. White. 2005. Language of Evaluation, Appraisal in English. Palgrave Macmillan, London and New York.

Giorgio Melchiori. 1971. Shakespeare’s Sonnets. Adriatica Editrice, Bari.

J. Read and J. Carrol. 2012. Annotating expressions of appraisal in English. Language Resources and Evaluation, 46:421–447.

Michael Schoenfeldt. 2010. Cambridge introduction to Shakespeare’s poetry. Cambridge University Press, Cambridge.

Alessandro Serpieri. 2002. Polifonia Shakespeariana. Bulzoni, Roma.

Michele Stingo and Rodolfo Delmonte. 2016. Annotating satire in Italian political commentaries with appraisal theory. In Natural Language Processing meets Journalism - Proceedings of the Workshop, NLPMJ-2016, pages 74–79, Stroudsburg, PA, USA. ACL.

M. Taboada and J. Grieve. 2004. Analyzing appraisal automatically. In Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text: Theories and Applications, pages 158–161. AAAI Press.

David K. Weiser. 1983. Shakespearean irony: The ’sonnets’. Neuphilologische Mitteilungen, 84(4):456–469. http://www.jstor.org/stable/43343552

David K. Weiser. 1987. Mind in Character – Shakespeare’s Speaker in the Sonnets. The University of Missouri Press.

APPENDIX. Figures of the Six Pragmatic Categories for Appraisal-Based Classification

Figure 1: Subdivision into six appraisal classes for sonnets with highest contrast

Figure 2: Subdivision into six appraisal classes for sonnets with lowest contrast

Figure 3: Subdivision into six appraisal classes for sonnets with no contrast
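
The polarity totals and the dominant subcategory quoted for each group of sonnets can be re-derived from the Sum columns of Tables 2 to 4. The following is a minimal sketch under stated assumptions, not the authors' tooling: the dictionary layout and the names `TABLES`, `summarize` and `SUMMARY` are introduced here for illustration, and only the Sum values are taken from the tables.

```python
# Sum values copied from Tables 2-4 as (positive, negative) pairs
# for each Attitude subcategory.
TABLES = {
    "highest contrast": {
        "Appreciation": (56, 25),
        "Affect": (53, 77),
        "Judgement": (32, 122),
    },
    "lowest contrast": {
        "Appreciation": (139, 65),
        "Affect": (64, 81),
        "Judgement": (59, 37),
    },
    "no contrast": {
        "Appreciation": (88, 59),
        "Affect": (89, 109),
        "Judgement": (49, 8),
    },
}


def summarize(group):
    """Return (positive total, negative total, subcategory with most annotations)."""
    positive = sum(pos for pos, _ in group.values())
    negative = sum(neg for _, neg in group.values())
    dominant = max(group, key=lambda name: sum(group[name]))
    return positive, negative, dominant


SUMMARY = {label: summarize(group) for label, group in TABLES.items()}

for label, (positive, negative, dominant) in SUMMARY.items():
    print(f"{label}: positive={positive}, negative={negative}, dominant={dominant}")
```

Under these assumptions the sketch yields 141 positive against 224 negative annotations for the highest-contrast group, 262 against 183 for the lowest-contrast group, and 226 against 176 for the no-contrast group, with Judgement, Appreciation and Affect respectively as the most frequent subcategory, in line with the figures discussed in the text.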