<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Does Anyone see the Irony here? Analysis of Perspective-aware Model Predictions in Irony Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Simona Frenda</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Soda Marem Lo</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Silvia Casola</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Bianca Scarlini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cristina Marco</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Valerio Basile</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Davide Bernardi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Alexa AI, Amazon Development Centre Italy</institution>
          ,
          <addr-line>Turin</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Computer Science Department, University of Turin</institution>
          ,
          <addr-line>Turin</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>aequa-tech srl</institution>
          ,
          <addr-line>Turin</addr-line>
        </aff>
      </contrib-group>
      <abstract>
<p>In the framework of perspectivism, analyzing how people perceive pragmatic phenomena, like irony, is relevant for deeply understanding the different points of view, and for creating more robust perspective-aware models. This paper presents a linguistic analysis of irony perception in 11 perspectivist models. Each model is trained on annotations by crowd-sourcing workers different in gender, age, and nationality. Due to the sparsity of the dataset, we examine the texts classified as ironic and not-ironic by these perspectivist models, and identify linguistic patterns that all perspectives associate with irony. To our knowledge, we are the first to also provide evidence for the different linguistic patterns perceived as ironic by a specific perspective. For example, models trained on data annotated by American and Australian annotators are more inclined to classify a text as ironic when it includes a negative sentiment, while models trained on data annotated by the youngest annotators are particularly influenced by words related to immoral behaviors. Warning: This paper could contain content that is offensive or upsetting for the reader.</p>
      </abstract>
      <kwd-group>
        <kwd>Irony Detection</kwd>
        <kwd>Irony Interpretation</kwd>
        <kwd>Perspectivism</kwd>
        <kwd>Linguistic Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The use of supervised learning is at the core of several areas of Artificial Intelligence, including Natural Language Processing (NLP). Models that leverage this learning paradigm are strictly dependent on either automatically produced datasets, i.e., silver data, or manually curated ones, i.e., gold standards. In the context of human-made annotations, the standard approach determines the final annotation by resolving the disagreement of multiple annotators, e.g., through majority voting. Recent research trends offer an alternative take and show that flattening the disagreement of several annotators can discard valuable information [1, 2].</p>
      <p>Some of these trends go by the name of perspectivist approaches. According to these lines of research, the discrepancies of different annotators can be exploited to model different points of view (perspectives) on a specific task [3]. This is especially important when the task is highly subjective, such as that of identifying irony [4]. While some linguistic patterns are linked to this phenomenon by a majority of people [5], irony tends to be closely related to the cultural and personal background of those who interpret it [6, 7].</p>
      <p>In this paper, we investigate the perception of irony in different segments of the English-speaking population. We focus, in particular, on two research questions (RQ):</p>
      <list list-type="bullet">
        <list-item><p>RQ1: what are the common linguistic triggers for irony interpretation, regardless of perspectives?</p></list-item>
        <list-item><p>RQ2: what are the linguistic patterns typical of each perspective?</p></list-item>
      </list>
    </sec>
    <sec id="sec-2">
      <p>To answer these questions, we exploited EPIC (English Perspectivist Irony Corpus) [8], a disaggregated English corpus for irony detection, containing 3,000 Post-Reply pairs from Twitter and Reddit, along with the demographic information of each annotator.</p>
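      <p>Since EPIC's annotations are disaggregated, a gold label for one perspective can only be derived by aggregating the votes of the annotators belonging to that group. A minimal sketch of such majority voting (the annotations and group names below are illustrative; as described later in Section 3, entries for which no majority can be computed are discarded):</p>

```python
from collections import Counter

# Hypothetical disaggregated annotations for one Post-Reply pair:
# (annotator_id, group, label) with 1 = ironic, 0 = not ironic
annotations = [("a1", "GenY", 1), ("a2", "GenY", 1), ("a3", "GenZ", 0),
               ("a4", "Boomer", 0), ("a5", "GenY", 0)]

def majority_label(annotations, group):
    """Majority vote over the annotators belonging to one perspective."""
    votes = [lab for _, gen, lab in annotations if gen == group]
    if not votes:
        return None          # no annotator of this group saw the pair
    top = Counter(votes).most_common()
    if len(top) > 1 and top[0][1] == top[1][1]:
        return None          # tie: no majority can be computed, entry discarded
    return top[0][0]

print(majority_label(annotations, "GenY"), majority_label(annotations, "GenX"))
```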
      <p>Inspired by [9], and in continuity with [8], we grouped annotators in 11 different perspectives: self-identified female and male, age-based groups (boomers, generation X, generation Y and generation Z), and country-based groups. Then, reproducing the experiments of [8], we created 11 perspective-aware models and obtained their predictions on the same set of instances.</p>
      <p>We do so to perform a quantitative and qualitative analysis of the common and specific linguistic patterns (affective, offensive, syntactic, and lexical) that activate the ironic interpretation of a text for each population segment. We leveraged the models' knowledge to predict the labels on the test set, and performed a linguistic analysis on this portion of the corpus to compare the predicted perception of each social group on the same content. In fact, since instances are annotated on average by 5 annotators, they do not necessarily contain labels for all demographic traits and perspectives. For example, an instance can be annotated by workers from GenY and GenZ only, and lack labels from annotators of the older generations.</p>
      <p>By comparing the relevance of different linguistic features for the perspectivist models, we are able, firstly, to confirm the importance – for all perspectives – of some specific features known to be of high impact in previous works [5]; secondly, we show that some patterns are perspective-specific.</p>
      <p>For instance, we found that the models trained on the female, generation Y, Australian, and American perspectives tend to recognize irony especially when the texts express negative sentiment. The Irish perspective seems to be amused by the emotional contrast in the texts. The male perspectivist model, instead, seems to be more sensitive to the recognition of irony when texts contain insults explicitly related to crimes or immoral behaviors, professions, and animals. A similar difference is also visible in the dimension of age, where words related to female genitalia appear relevant in the decision for Generation X; in contrast, the youngest generations (i.e., Y and Z) are more influenced by words related to crimes and immoral behaviors. Models trained on the perspectives of boomers and Indians are sensitive to specific syntactic patterns.</p>
      <p>These analyses shed light on the different perceptions of irony by different population segments. While we found common patterns that are independent of languages and perspectives, attention to different points of view is needed especially for creating user-centered applications and for making them explainable.</p>
      <p>This paper is organized as follows. In Section 2, we present an overview of previous works related to the analysis of linguistic features and strategies for expressing irony, focusing on a multilingual and multiperspective approach to the phenomenon. In Section 3 we describe the EPIC corpus, used to perform the source-independent (Section 4.1) and source-dependent (Section 4.2) analyses on the patterns that drive the interpretation of our perspective-aware models. Finally, Section 5 is dedicated to the discussion and conclusive observations on our results.</p>
      <p>2nd Workshop on Perspectivist Approaches to NLP. * Corresponding author: simona.frenda@unito.it (S. Frenda); sodamarem.lo@unito.it (S. M. Lo); silvia.casola@unito.it (S. Casola); scarlini@amazon.it (B. Scarlini); valerio.basile@unito.it (V. Basile). © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org, ISSN 1613-0073).</p>
      <sec id="sec-rw">
        <title>2. Related Work</title>
        <p>Literature about irony detection has explored the contribution of several linguistic features within classical and neural architectures (using gold standard datasets): syntactic [10], stylistic [11], pragmatic [12], semantic [13], and affective [14, 15, 16] ones. Despite the clear impact of some of these features on irony detection, the general cognitive mechanisms that activate irony regardless of language and domain are still being studied [17, 5, 18, 19].</p>
        <p>The authors of [5] conducted an exhaustive linguistic analysis on three Twitter datasets annotated for the irony detection task in French, Italian, and English. They looked for specific linguistic strategies used for expressing irony: analogy, metaphor, hyperbole/exaggeration, euphemism, rhetorical question, oxymoron, paradox, and other elements such as false assertion, context shift, situational irony, or specific markers (emoticons, negations, patterns of discourse, hashtags labelling the presence of humour, intensifiers, punctuation, false propositions, elements of surprise, modality, quotations, opposition, capital letters, personal pronouns, interjections, comparison, named entities, report verbs, expression of opinion, urls). Oxymorons, false assertions, and situational irony have been confirmed as triggers for irony in Italian tweets also by the authors of [20], who analysed the predictions obtained in the context of the IronITA shared task [21]. Unlike in other languages, ellipsis and apostrophes stand out for Spanish [22].</p>
        <p>Another common trait of irony detection from the multilingual perspective is the role played by affective information. For example, the authors of [14] showed how pleasantness, imagery, activation, and negative sentiment have a discriminative power in classifying ironic and non-ironic English tweets. Negative emotions, in particular, were identified primarily in English #ironic self-labelled tweets [23], in different ironic texts in Spanish [22], and in Italian ironic tweets [20]. These works show that, among the linguistic strategies that can be used for the activation of irony, some are language-independent, while others seem related to specific languages and cultures. Irony, as a subjective phenomenon, is strongly influenced by individual perception.</p>
        <p>The perspectivist framework [3] aims at modelling these aspects by incorporating the different points of view represented in the annotations. The new multi-faceted annotation process is then exploited for model training, interpretation, and analysis of the predictions [4]. Perspectivist works on irony are very few. To our knowledge, only two disaggregated datasets for English exist, on humour [24] and irony [8]. The first was used as a benchmark in the first edition of the LeWiDi (Learning with disagreement) shared task at SemEval 2021, whereas the second was used to build, with a strongly perspectivist approach, demographic-based models to encode annotators' perspectives. Results demonstrated both a variation in the perception of irony based on annotators' social group, and an increase in confidence for perspective-aware models compared to the non-perspectivist ones.</p>
      </sec>
      <p>Inspired by their work, and focusing especially on the perception of irony, we propose a linguistic analysis of the predictions of different perspectivist models, which contributes to this emerging framework by examining the most impactful linguistic features for interpreting irony.</p>
      <sec id="sec-dataset">
        <title>3. Dataset and Perspectivist Models</title>
        <p>To answer the research questions RQ1 and RQ2, we exploit EPIC, the English Perspectivist Irony Corpus released by [8]. This corpus comprises 3,000 Post-Reply pairs extracted from social media, evenly retrieved from Twitter and Reddit, and was annotated for the irony detection task by crowdsourcing workers with different demographic traits. EPIC was qualitatively examined by [8], who inspected the different demographic-based perspectives encoded in the dataset. They exploited this information to create perspectivist models trained on subsets of data annotated by workers with the same demographic trait. With the aim of examining the perception of irony, we reproduced their perspectivist models and used their predictions for the linguistic analysis.</p>
        <p>In more detail, following [8] we trained 11 perspective-aware classifiers. Each of these models was trained on data labeled by a specific subset of annotators, who were separated according to their demographic traits as shown in Table 1: gender (female, male), age (boomers, Generation X, Generation Y, Generation Z), and nationality (British, Indian, Irish, American, and Australian). As in [8], we created: i) a unique test set featuring 20% of the instances of EPIC's corpus (246 from Reddit and 307 from Twitter), used for the analyses described in Section 4; and ii) the perspective-specific datasets (see Table 1), obtained by grouping the remaining instance-annotation pairs according to the age, gender, and nationality of their annotators and used, in an 80/20 split, to train and test the perspectivist models<sup>1</sup>.</p>
        <p>Each perspective-specific training set was used to fine-tune a pre-trained BERT model [25]. In particular, similar to [8], we fine-tuned the uncased version of BERT<sup>2</sup> for Sequence Classification, with a binary (ironic and not-ironic) label. Each BERT model was trained by taking as input the representation of the Post-Reply pair. The learning rate was set in a range of 6e-5 and 5e-5, the batch size to 16, and the maximum number of epochs to 10, with an early-stopping strategy.</p>
        <p>These models have been tested on perspective-specific test sets, computing the binary label and the confidence score of each model by following [26]'s formula, based on the normalized difference between the logits of the two classes, i.e., ironic and not-ironic. The average of the confidence scores over instances and the f1-score of each model are reported in Table 1. As we can notice, the f1-scores are fair enough considering the notable imbalance between the positive (iro) and negative (non-iro) classes in each dataset.</p>
        <p>Once we validated these models, we applied them to the test set (iro: 110, non-iro: 443), obtaining the predictions (and the confidence scores of the predictions) of the perspectivist models for each instance, as in Table 2<sup>3</sup>.</p>
        <p>1. We note that, to label each instance in our perspective-specific datasets, we applied the majority voting strategy to each Post-Reply pair, given the annotations of the selected subsets of annotators. We then discarded all the entries for which we could not compute a majority vote with the available annotations. 2. https://huggingface.co/bert-base-uncased 3. For the sake of clarity, we report only the maximum and minimum confidence scores for each instance.</p>
      </sec>
      <sec id="sec-analysis">
        <title>4. Analysis on Perspectives</title>
        <p>In this Section, we focus on the analysis of the common and specific patterns that trigger the interpretation of irony in the 11 perspective-aware models across the 553 instances of the test set. As commented above, EPIC contains Post-Reply pairs extracted from two sources: Twitter and Reddit.</p>
      </sec>
      <p>Table 2 reports example Post-Reply pairs from Reddit and Twitter, together with the binary predictions and the (maximum and minimum) confidence scores of the perspectivist models.</p>
      <p>Therefore, we describe two types of analysis: firstly, a source-independent analysis (Section 4.1) and, secondly, a source-based analysis (Section 4.2). The former focuses on capturing the linguistic features that trigger the ironic interpretation of a text regardless of its source, exploring the common and diverse features among the predictions of the different perspective-based models. The latter aims at identifying in which source these models tend to predict irony, exploring the possible causes and whether there are linguistic patterns specific to a source, looking especially at the use of the strategies and markers identified by [5] in multilingual datasets.</p>
      <p>For both analyses, we took into account the predictions of the perspectivist models on the test set (Table 2). For each instance, therefore, we have the labels of all the 11 perspectives and the confidence score of each model, computed as described in Section 3. We leveraged the models' knowledge to predict the labels on the test set since – by design – not all instances of our corpus feature manual annotations covering all demographic traits/perspectives.</p>
      <p>To examine the features that are actually discriminative for the detection of irony, we selected for each model only the texts from the test set predicted with a very high confidence score. The threshold used for this selection is unique for each perspectivist model (Table 3), and it was obtained by computing the median of the list of confidence scores resulting from the predictions of the positive class (ironic texts) on the specific perspective-based test sets of EPIC (Table 1). This choice is motivated by one of the findings of [8], who proved that perspectivist models are more confident and precise when predicting labels in test sets that encode their perspectives; it also depends on our purpose of examining the perception of irony: we want to be sure that the analysed texts, especially the ones recognized as ironic, have been predicted with a very high confidence by the models.</p>
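      <p>This selection step can be sketched as follows. The exact formula of [26] is not reproduced in the text; one plausible reading of "normalized difference between the logits" is the absolute gap between the two softmax probabilities, and all logit values below are illustrative:</p>

```python
import math
import statistics

def confidence(logit_ironic, logit_not_ironic):
    # Probability of the ironic class via a two-class softmax (a sigmoid)
    p_ironic = 1.0 / (1.0 + math.exp(logit_not_ironic - logit_ironic))
    # Normalized difference between the two classes: probability gap in [0, 1]
    return abs(2.0 * p_ironic - 1.0)

# Hypothetical (ironic, not-ironic) logits for texts predicted as ironic
logits = [(2.0, -1.0), (0.3, 0.1), (1.5, -0.5), (3.0, -2.0), (0.6, 0.2)]
scores = [confidence(a, b) for a, b in logits]

# Per-model threshold: the median confidence of the positive predictions
threshold = statistics.median(scores)
high_confidence = [s for s in scores if s >= threshold]
print(round(threshold, 3), len(high_confidence))
```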
      <sec id="sec-2-1">
        <title>4.1. Source-independent Analysis</title>
        <p>To observe the commonalities and differences among the interpretations of irony by the various perspectivist models, we extracted a set of linguistic features from the texts of the test set, computed their χ² value for each model, and plotted these values in heatmaps. Since we observed that the distribution of the χ² values of the features is non-linear, we employed the logarithmic function of PowerTransformer to normalize the data.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <p>The selection of the set of features was inspired by the existing literature about multilingual and multigenre ironic texts (Section 2), and includes: 1) affective features, i.e., the sentiment, emotions, and feelings expressed in the texts (Section 4.1.1); 2) the presence of offensive language (Section 4.1.2); and 3) syntactic features (Section 4.1.3). We also performed a lexical analysis (Section 4.1.4).</p>
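      <p>A minimal sketch of this feature-scoring step, using scikit-learn's chi2 on a toy feature matrix; the data are invented, and np.log1p is a simplified stand-in for the logarithmic normalization the paper applies via PowerTransformer:</p>

```python
import numpy as np
from sklearn.feature_selection import chi2

# Toy matrix: rows are texts, columns are two linguistic features
# (e.g. summed TF-IDF weights of words from two emotion categories)
X = np.array([[0.9, 0.0],
              [0.8, 0.1],
              [0.1, 0.7],
              [0.0, 0.9]])
y = np.array([1, 1, 0, 0])     # one model's predictions: 1 ironic, 0 not ironic

scores, _ = chi2(X, y)         # one chi-squared value per feature
log_scores = np.log1p(scores)  # compress the non-linear range before plotting
print(scores.round(3), log_scores.round(3))
```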
      <sec id="sec-3-1">
        <title>4.1.1. Affective analysis</title>
        <p>We used the EmoLex dictionary [27] to extract emotions
and expressed feelings (Figure 1). EmoLex is based on
the wheel of emotions theorized by [28], which includes
8 main emotions (anger, anticipation, disgust, fear, joy,
sadness, surprise, trust) and the primary dyads or
feelings (aggressiveness, optimism, love, submission, awe,
disapproval, remorse, contempt).</p>
        <p>Favored by the design of the wheel of emotions, we also computed the variability of opposite emotions and contrary feelings by means of the standard deviation (σ). The weights of the emotional features are obtained by summing the TF-IDF<sup>5</sup> values of the words belonging to the specific emotions/feelings. We also computed the sentiment scores (positive and negative) by using SentiWordNet 3.0 [29] (Figure 2).</p>
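        <p>The emotional-contrast feature can be illustrated with a small sketch; the weights below are hypothetical TF-IDF sums, and the opposite pairs follow Plutchik's wheel:</p>

```python
import statistics

# Hypothetical per-text emotion weights (sums of TF-IDF of EmoLex words)
weights = {"joy": 0.8, "sadness": 0.6, "trust": 0.1, "disgust": 0.7}

# Opposite emotions on Plutchik's wheel; a large spread signals contrast
opposites = [("joy", "sadness"), ("trust", "disgust")]
contrast = {pair: statistics.stdev([weights[pair[0]], weights[pair[1]]])
            for pair in opposites}
print(contrast)
```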
        <p>As Figure 1 shows, negative emotions and feelings (Example 1) like disgust, contempt, and remorse report the highest χ² values for the majority of the perspectivist models. Thus, we can confirm the findings of previous analyses of English tweets [23, 14], where negative emotions were identified primarily in #ironic self-labelled tweets. Another common discriminative feature is the contrast between negative emotions and feelings and their positive counterparts (Example 2).</p>
        <p>(1) [Post] TLDR: senior positions and management get
paid more.
[Reply] And are generally the most useless pricks out
there, all talk and no action.
(2) [Post] Fuck carlow they beat me in the feile when I
was 13. They all looked like 30 year old men.
[Reply] We have to win a match in football some how.</p>
        <p>By looking at the perspective-specific models, we noticed some interesting findings. For instance, when considering the gender dimension, we can notice a higher χ² for the Fem-persp model on the presence of negative sentiment and on negative emotions/feelings (fear, sadness, disapproval, and awe) with respect to the Male-persp model (Figures 2 and 1). These values suggest the idea that female annotators tend to recognize irony in texts that express a certain negativity.</p>
        <p>A similar finding is noticed in the GenY, AU, and particularly US-persp models. All these models, indeed, appear to be more confident in detecting irony when the text is characterized by a negative sentiment, differently from their counterparts (especially the GenX, GenZ, IN, and IR-persp models). The analysis of emotions brings to light an interesting difference between the IR-persp model and all the 4 models built taking into account the provenance: the IR-persp model shows a marked and higher χ² score especially in the presence of emotional contradictions in the texts.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <p>5. To compute the TF-IDF, we cleaned the text from URLs and other non-alphanumeric symbols, tokenized it and removed the stopwords, and finally lemmatized it using the spaCy large model for English.</p>
    </sec>
    <sec id="sec-5">
      <title>4.1.2. Offensive language</title>
      <p>The authors of [20] proved that irony, especially in its sarcastic form, can be used to reinforce a negative message. For this reason, the presence of offensive language could be considered a trigger for the ironic interpretation of a text.</p>
      <p>To this purpose, we exploited HurtLex, a multilingual lexicon of offensive words. The entries in the lexicon are categorized into 17 types of offences (related to the economic and social spheres, professions, animals, and so on) (Table 4), enclosed in two macro-categories: conservative (words with a literally offensive sense) and inclusive (all the words regardless of the explicitness of the offences).</p>
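      <p>As an illustration of how such a lexicon can be applied, the sketch below counts offensive words per category; the lexicon entries and category names are invented stand-ins, not actual HurtLex data:</p>

```python
# Toy HurtLex-style lexicon: offensive lemmas mapped to categories
# (hypothetical assignments; the real lexicon has 17 categories)
lexicon = {
    "pricks": "male genitalia",
    "useless": "moral defects",
}

def offense_profile(tokens):
    """Count offensive words per lexicon category in a token list."""
    profile = {}
    for tok in tokens:
        cat = lexicon.get(tok.lower())
        if cat:
            profile[cat] = profile.get(cat, 0) + 1
    return profile

print(offense_profile("And are generally the most useless pricks out there".split()))
```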
      <p>Figure 3 shows that some categories of offensive language report the highest χ² values for the majority of the perspectivist models. These categories are related in particular to male genitalia, moral behaviours/defects, and, even in its conservative sense, to the category of physical disabilities and diversity.</p>
      <p>We can also point out interesting differences when considering the perspective-specific models. Looking at gender, we can notice higher values in the Male-persp model when the texts contain words related to crimes/immoral behaviours, professions, and animals, differently from the Fem-persp model.</p>
      <p>Observing the dimension of age, instead, the differences are not so marked, except for the offensive words related to female genitalia, which appear discriminant for the GenX-persp model, and the words related to crimes/immoral behaviours for the youngest generations (i.e., Y and Z). In the dimension of nationality, it is clear that the presence of offensive words related especially to moral behaviours/defects has some impact on the detection of irony for the AU and US-persp models, while words related to male genitalia report a higher score only for the AU and IR-persp models.</p>
      <sec id="sec-5-2">
        <title>4.1.3. Syntactic features</title>
        <p>As shown in previous work [30], syntactic features are proven to be useful to detect ironic language in social media. In particular, we captured syntactic dependencies that could reveal pragmatic information, such as: intensifiers (intens), discourse connections (disc_conn), adverbial locutions (adv_loc), mentions (mention), and nominal phrases, together with the number of nominal phrases in the tweet (nom_phrase and num_nom_phrase). As Figure 4 shows, only the adverbial locutions appear relevant for the majority of models.</p>
        <p>However, we noticed that syntactic features have a higher χ² score in a few models, such as the Boomer and IN-persp models. If the former seems to be triggered by different syntactic features (i.e., the presence of intensifiers and nominal utterances), the latter appears to discriminate irony especially in the presence of discursive connections.</p>
      </sec>
      <sec id="sec-5-1">
        <title>4.1.4. Lexical analysis</title>
        <p>To perform a lexical analysis on the test set, we extracted the top 100 unigrams, bigrams, and trigrams weighted by their TF-IDF<sup>6</sup>, applied separately for each model on the texts labelled as ironic. In order to examine the lexical patterns that may influence their choices, we manually analysed both the features that were common to at least 6 models and the ones that occurred in an individual model only.</p>
        <p>Focusing on the n-grams common to at least 5 models, we individuated a total of 18 features that recur in 5 to 7 models. Ten of them are unigrams frequent across the texts, such as family, think, feel, know, while the other 8 lexical features are bigrams and trigrams linked to the same 4 texts predicted as ironic by at least 5 models and reported in Table 5.</p>
        <p>To highlight whether some lexical features were model-specific, we filtered the data by removing all the features that recurred in more than one model of the same dimension (age, gender, and nationality). By manually inspecting these unique features per model, we noticed that, for the majority of them, the bigrams and trigrams represented a different combination of the same texts (e.g., common lannister aside, family common lannister, lannister aside obsession, aside obsession, common lannister, family common, lannister aside). The Boomer-persp and GenY-persp models were the only ones that behaved differently: their bigrams and trigrams rarely show the systematic repetition of the same lexical items described above, and they both present a higher number of unigrams compared to other models.</p>
        <p>Specifically, considering the features associated with the model based on boomers' perspective, there is a high presence of non-English words (such as usernames or foreign words, especially from Hindi), and few verbs. In fact, it relies more on nominal n-grams, which in some cases correspond to the entire text, as in Examples 3 and 4. This result is further confirmed in the analysis above (Figure 4).</p>
        <p>(3) [Post] That's damn shitty of Hugo Boss, what on earth will the chaps in the corner shop and the kebab shop call us now? [Reply] Ma man</p>
        <p>(4) [Post] Election Predictions: Republicans will win the House! Stacey Abrams will lose in Georgia! Any takers? [Reply] @USER Yo crazy dude</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>4.2. Source-based Analysis</title>
      <p>In this section, we present a quantitative and qualitative analysis of the characteristics of ironic texts on Twitter and Reddit, showing analogies and differences.</p>
      <sec id="sec-6-1">
        <title>4.2.1. Irony on Twitter is more contextual</title>
        <p>Observing the predicted texts, we noticed that the perspectivist models tend to identify irony more in posts from Reddit (63% of the cases in Table 6), even if the two sources are balanced in the creation process of our corpus.</p>
        <p>We hypothesized that this difference was due to the different level of complexity and need for context of the instances in the two sources. To measure these characteristics, we computed the length in characters and tokens<sup>7</sup> and the lexical richness of the Post-Reply pairs, in terms of type-token ratio (TTR)<sup>8</sup>. We also computed the number of named entities<sup>9</sup> and external elements<sup>10</sup> that could amplify the contextual information in each source (Table 7). We used spaCy and spaCy-udpipe, loading the available models for English, in particular to extract the interjections and the named entities. For the emoticons and emojis, we exploited the available lists in the emoji library, while all the other characteristics have been extracted using specific regexes.</p>
        <p>7. For computing the length in tokens, the texts have been cleaned and tokenized, removing urls, punctuation, emoji, and emoticons.</p>
        <p>8. TTR is the number of distinct words over the overall words in the text. We took into account token and type lists without urls, punctuation, emoji, and emoticons. Here, the texts have been cleaned and tokenized as described in the previous footnote.</p>
        <p>9. The list of named entities considered in this study includes: works of art, organizations, persons, geopolitical entities, locations, events, names of products, dates, languages, laws, and nationalities or religious or political groups.</p>
        <p>10. External elements include: hashtags, emoji, emoticons, and urls.</p>
      </sec>
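      <p>Footnote 8's type-token ratio can be sketched as follows; the cleaning here is a simplified regex version (the paper also strips emoji and emoticons and uses proper tokenization):</p>

```python
import re

def ttr(text):
    """Type-token ratio: distinct words over total words, after light cleaning."""
    # strip urls, lowercase, and keep word tokens only (punctuation dropped)
    text = re.sub(r"https?://\S+", " ", text.lower())
    tokens = re.findall(r"[a-z']+", text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

print(ttr("Wind power my arse, wind power!"))
```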
    </sec>
    <sec id="sec-7">
      <p>6. We used the TfidfVectorizer from Scikit-learn.</p>
      <p>Table 5 lists the Post-Reply texts predicted as ironic by at least 5 models, together with their recurrent bigrams/trigrams (e.g., trump family common, meat chip bread, shame leave bar, prefer burn stuff) and the number of models (from 6 to 8) that labelled each text as ironic.</p>
      <p>As expected, posts from Reddit are longer than tweets, but the values of the lexical richness and the number of named entities suggest that the content on Twitter is more varied than that from Reddit (Table 7). This is also confirmed by the number of external elements. A similar trend is also observed in the human annotations of the texts of the test set: most annotators recognized more irony in posts from Reddit (27%) than in tweets (14%).</p>
      <p>To analyze this trend further, we explored how each model behaves with respect to the source. In general, the models identify texts from Reddit as ironic more often than tweets; the only exception is the model trained on the Boomers’ perspective, which classified instances as ironic almost equally across the two sources (52% from Reddit and 48% from Twitter).</p>
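      <p>The per-source breakdown reported above (e.g., the 52% Reddit / 48% Twitter split for the Boomers’ model) is a simple proportion over the texts a model labels as ironic. A minimal sketch, with hypothetical text ids and source labels:

```python
from collections import Counter

def ironic_share_by_source(predictions, sources):
    # predictions: {text_id: 0/1 irony label from one model};
    # sources: {text_id: "reddit" or "twitter"}.
    # Returns the share of each source among texts predicted as ironic.
    ironic = [t for t, label in predictions.items() if label == 1]
    counts = Counter(sources[t] for t in ironic)
    total = sum(counts.values())
    return {src: n / total for src, n in counts.items()}

preds = {"t1": 1, "t2": 1, "t3": 0, "t4": 1, "t5": 1}
srcs = {"t1": "reddit", "t2": "reddit", "t3": "reddit",
        "t4": "twitter", "t5": "twitter"}
print(ironic_share_by_source(preds, srcs))
# -> {'reddit': 0.5, 'twitter': 0.5}
```
</p>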
      <sec id="sec-7-1">
        <title>4.2.2. Linguistic strategies and markers</title>
        <p>We carried out a qualitative analysis of the texts predicted
as ironic by at least 5 models, which amounts to a total of
26 texts, 24 from Reddit and 2 from Twitter (Table 6). To
these, we added 22 tweets from those identified as ironic
by at least 3 models in order to conduct a comparative
linguistic analysis of the two sources. For this analysis, we
also took into account the irony strategies and markers
proposed in the schema of [5] (Section 2).</p>
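        <p>The selection above (texts predicted as ironic by at least 5 of the 11 perspective models, plus tweets flagged by at least 3) amounts to thresholding a vote count across models. A minimal sketch, with hypothetical text ids (the model names echo those used in the paper):

```python
from collections import Counter

def texts_flagged_by_at_least(predictions, k):
    # predictions: {model_name: {text_id: 0/1 irony label}}.
    # Returns the ids of texts labelled ironic by >= k models.
    votes = Counter()
    for model_preds in predictions.values():
        for text_id, label in model_preds.items():
            votes[text_id] += label
    return sorted(t for t, v in votes.items() if v >= k)

preds = {
    "Boomer-persp": {"t1": 1, "t2": 0, "t3": 1},
    "GenY-persp":   {"t1": 1, "t2": 1, "t3": 0},
    "Female-persp": {"t1": 1, "t2": 0, "t3": 1},
}
print(texts_flagged_by_at_least(preds, 2))  # -> ['t1', 't3']
```
</p>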
        <p>We found that in both sources, users tend to use similar linguistic strategies to express irony, such as paradox/oxymoron and false assertions, confirming the results presented in [5]. We also observed other interesting features, such as context shift (Example 5) and hyperbole/exaggeration (Example 6).</p>
        <p>(5) [Post] How many roads must a man walk down?
[Reply] The only word I know is grunt and I can’t
spell it.
(6) [Post] Apparently Reece Mogg will be making a
statement within the hour. It’s not going to be his
resignation is it
[Reply] @USER We can only hope! Perhaps we’ve
declared war on Russia or put a man on Mars overnight.</p>
        <p>However, some diferences are evident. Twitter users
often convey contradictions that characterize irony
through unexpected answers (Example 4) and
euphemisms (Example 7), while Reddit communities lean
towards the use of rhetorical questions (Example 8) and respect to their counterparts (respectively, generations X
metaphors. and Z, and Indian and Irish perspectives). Moreover,
dif(7) [Post] Lindsey Hoyle spent £7,500 of taxpayers money ferently from other models of the provenance dimension,
on a mattress and sheets for his bed in the speakers the Irish perspective shows to recognize irony especially
residence. in presence of emotional contradictions. In turn, the
[Reply] @USER @USER Very Toriesque male perspective model seems more sensitive to irony
(8) [Post] wind power my arse when the text reports ofences related to crimes/immoral
[Reply] so....what you think this is false? Or you prefer behaviors, professions, or animals.
burning stuf? Similar diferences are visible in the dimension of age,
where texts including female genitalia are considered
From a stylistic point of view, both Reddit and Twitter ironic by Generation X, while the youngest generations
texts contain question marks, exclamation points, and (i.e., Y and Z) are more influenced by words related to
ellipsis. Full stops are common to the two sources, but crimes/immoral behaviors. Finally, only boomers and
they are more frequent in tweets, while Reddit users are Indian perspectives are sensible to syntactical patterns,
more prone to employ swear words. such as intensifiers, nominal utterances, and discursive</p>
        <p>Tweets also contain nominal utterances more fre- connectors [RQ2]. We also noticed that all models detect
quently than Reddit posts; this is coherent with the statis- irony in Reddit posts more often than in tweets.
tics shown in Table 7, which highlight how texts from The findings of these analyses reveal the perception of
Reddit are longer and thus include verbal expressions irony of diferent segments of people. These observations,
to fulfil complete sentences. In general, in both sources, therefore, could help to create models for irony detection
texts are short and composed of straight answers. with diferent degrees of “subjectivity”: models that take
into account the most common features to detect irony,
5. Discussion and Conclusion or models that target distinct perspectives. In both cases,
this study provides the ingredients to make their decisions
explainable. In line with this purpose, we would like, in
the future, to enrich these analyses looking also at the
topic of the texts, and extend them to diferent languages,
capturing also the understanding of irony in diferent
countries.</p>
        <p>To the best of our knowledge, this work is the first to
approach the analysis of the perceptions of irony in specific
segments. Specifically, we base our analysis on the age,
gender, and nationality dimension from the EPIC dataset
[8]. To examine these patterns in a specific set of texts,
we modelled 11 perspectives (self-identified female and
male, boomers, generation X, generation Y and genera- Limitations
tion Z, British, Indian, Irish, American, and Australian),
and comparatively analysed the impact of various lin- This work is the first attempt to explore the perception
guistic features in each of them. of irony, looking at diferent perspectives. Given the</p>
        <p>The contribution of this paper is twofold. Firstly, our early stages of this framework, we are aware there are
analysis confirms most of the observations made in the some limitations, which we aim to tackle in subsequent
literature about the similar ironic patterns featured in research. In particular, the perspectives are based on a
texts of diferent languages [ 23, 14, 5]. Secondly, our small subset of characteristics (self-identified gender, age,
analysis provides evidence for the diferent perceptions and nationality), and the analysis is conducted using a
of irony experienced by people with distinct demographic limited number of data instances (553). To overcome this
traits. As a subjective task, irony identification is indeed problem, in the future, we plan to extend these analyses
impacted by experience and background. to a larger corpus that includes texts in several languages.</p>
        <p>Through this analysis exercise, we noticed that the
patterns that often trigger ironic interpretation in most
perspectivist models are negative emotions (i.e., disgust, Acknowledgments
contempt, remorse) and contrasting expressions with
their counterparts in the wheel of emotions of Plutchik The work of S. Frenda, S. Casola and V. Basile was
par(trust, submission, and love); ofensive language (related tially funded by the Multilingual Perspective-Aware NLU
in particular to male genitalia), moral behaviors or de- project in partnership with Amazon Alexa. This research
fects, physical disabilities, and diversities also play a role was funded through a donation from Amazon.
[RQ1].</p>
        <p>In addition, looking at the diferences among
perspectives, we noticed that models trained on female,
generation Y, Australian, and American perspectives, often
recognize irony when texts convey negative sentiment with
TUT, IEEE intelligent systems 28 (2013) 55–63. [26] A. A. Taha, L. Hennig, P. Knoth, Confidence
estiURL: https://www.computer.org/csdl/magazine/ex/ mation of classification based on the distribution
2013/02/mex2013020055/13rRUxAAT3i. of the neural network output layer, arXiv preprint
[18] C. Van Hee, E. Lefever, V. Hoste, SemEval-2018 arXiv:2210.07745 (2022).</p>
        <p>task 3: Irony detection in English tweets, in: Pro- [27] S. M. Mohammad, P. D. Turney, Crowdsourcing a
ceedings of the 12th International Workshop on word-emotion association lexicon, Computational
Semantic Evaluation, Association for Computa- Intelligence 29 (2013) 436–465. URL: https://doi.org/
tional Linguistics, New Orleans, Louisiana, 2018, 10.1111/j.1467-8640.2012.00460.x.
pp. 39–50. URL: https://aclanthology.org/S18-1005. [28] R. Plutchik, H. Kellerman, Theories of emotion,
voldoi:10.18653/v1/S18-1005. ume 1, Academic Press, 1980. URL: https://books.
[19] R. Giora, I. Jafe, I. Becker, O. Fein, Strongly at- google.it/books?id=TV99AAAAMAAJ.
tenuating highly positive concepts. The case of de- [29] S. Baccianella, A. Esuli, F. Sebastiani,
SentiWordfault sarcastic interpretations, Review of Cognitive Net 3.0: An enhanced lexical resource for
sentiLinguistics. Published under the auspices of the ment analysis and opinion mining, in:
ProceedSpanish Cognitive Linguistics Association 16 (2018) ings of the Seventh International Conference on
19–47. URL: https://doi.org/10.1075/rcl.00002.gio. Language Resources and Evaluation (LREC’10),
Eu[20] S. Frenda, A. T. Cignarella, V. Basile, C. Bosco, ropean Language Resources Association (ELRA),
V. Patti, P. Rosso, The unbearable hurtfulness of Valletta, Malta, 2010. URL: http://www.lrec-conf.
sarcasm, Expert Systems with Applications 193 org/proceedings/lrec2010/pdf/769_Paper.pdf .
(2022) 116398. [30] A. T. Cignarella, V. Basile, M. Sanguinetti, C. Bosco,
[21] A. T. Cignarella, S. Frenda, V. Basile, C. Bosco, P. Rosso, F. Benamara, Multilingual irony detection
V. Patti, P. Rosso, Overview of the EVALITA 2018 with dependency syntax and neural models, in:
task on irony detection in Italian tweets (IronITA), Proceedings of the 28th International Conference
in: Sixth Evaluation Campaign of Natural Language on Computational Linguistics, International
ComProcessing and Speech Tools for Italian (EVALITA mittee on Computational Linguistics, Barcelona,
2018), volume 2263, CEUR-WS, 2018, pp. 1–6. Spain (Online), 2020, pp. 1346–1358. URL: https:
[22] S. Frenda, V. Patti, Computational models for irony //aclanthology.org/2020.coling-main.116.
detection in three spanish variants, in: CEUR
Workshop Proceedings, volume 2421, CEUR-WS, 2019,
pp. 297–309.
[23] E. Sulis, D. I. H. Farías, P. Rosso, V. Patti, G. Rufo,</p>
        <p>Figurative messages and afect in Twitter:
Differences between #irony, #sarcasm and #not,
Knowledge-Based Systems 108 (2016) 132–143. URL:
https://doi.org/10.1016/j.knosys.2016.05.035, new
Avenues in Knowledge Bases for Natural Language</p>
        <p>Processing.
[24] E. Simpson, E.-L. Do Dinh, T. Miller, I. Gurevych,</p>
        <p>Predicting humorousness and metaphor novelty
with Gaussian process preference learning, in:
Proceedings of the 57th Annual Meeting of the
Association for Computational Linguistics, Association for
Computational Linguistics, Florence, Italy, 2019, pp.
5716–5728. URL: https://aclanthology.org/P19-1572.</p>
        <p>doi:10.18653/v1/P19-1572.
[25] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT:</p>
        <p>Pre-training of deep bidirectional transformers for
language understanding, in: Proceedings of the
2019 Conference of the North American
Chapter of the Association for Computational
Linguistics: Human Language Technologies, Volume 1
(Long and Short Papers), Association for
Computational Linguistics, Minneapolis, Minnesota,
2019, pp. 4171–4186. URL: https://aclanthology.org/
N19-1423. doi:10.18653/v1/N19-1423.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>