Overview of the CLEF 2022 JOKER Task 1:
Classify and Explain Instances of Wordplay
Liana Ermakova1,2 , Fabio Regattin3 , Tristan Miller4 , Anne-Gwenn Bosser5 ,
Sílvia Araújo6 , Claudine Borg7 , Gaëlle Le Corre8 , Julien Boccou1 ,
Albin Digue1 , Aurianne Damoy1 , Paul Campen1 and Orlane Puchalski1
1 Université de Bretagne Occidentale, HCTI, 29200 Brest, France
2 Maison des sciences de l’homme en Bretagne, 35043 Rennes, France
3 Dipartimento DILL, Università degli Studi di Udine, 33100 Udine, Italy
4 Austrian Research Institute for Artificial Intelligence, Vienna, Austria
5 École Nationale d’Ingénieurs de Brest, Lab-STICC CNRS UMR 6285, France
6 Universidade do Minho, CEHUM, 4710-057 Braga, Portugal
7 University of Malta, Msida MSD 2020, Malta
8 Université de Bretagne Occidentale, CRBC, 29200 Brest, France
9 Université de Bretagne Sud, HCTI, 56321 Lorient, France


Abstract
As a multidisciplinary field of study, humour remains one of the most difficult aspects of
intercultural communication. Understanding humour often involves understanding implicit
cultural references and/or double meanings, which raises the question of how to detect
and classify instances of this complex phenomenon. This paper provides an overview of
Pilot Task 1 of the CLEF 2022 JOKER track, where participants had to classify and explain
instances of wordplay. We introduce a new classification of wordplay and a new annotation
scheme for wordplay interpretation, suitable both for phrase-based wordplay and wordplay in
named entities. We describe the collection of our data, our task setup, and the evaluation
procedure, and we give a brief overview of the participating teams’ approaches and results.

Keywords
wordplay, computational humour, pun, classification, wordplay interpretation, wordplay
location, deep learning




1. Introduction
Creative language, such as humour and wordplay, is all around us: from entertain-
ment to advertisements to business relationships. Internet humour flourishes on
social networks, special humour-dedicated websites, and on web pages focusing on
edutainment or infotainment [1]. As a multidisciplinary research area, humour has

CLEF 2022: Conference and Labs of the Evaluation Forum, September 5–8, 2022, Bologna, Italy
liana.ermakova@univ-brest.fr (L. Ermakova)
https://www.joker-project.com/ (L. Ermakova)
ORCID: 0000-0002-7598-7474 (L. Ermakova); 0000-0003-3000-3360 (F. Regattin); 0000-0002-0749-1100
(T. Miller); 0000-0003-4321-4511 (S. Araújo); 0000-0003-3858-5502 (C. Borg)
been a focus of interest to many academics from different theoretical backgrounds.
Many humour-related academic studies have been centred on culture-specific hu-
mour [2, 3] and the (un)translatability of this universal phenomenon [4, 5], which
poses genuine challenges for subtitlers and other translators. In order to better
understand and process this creative use of language, we have to recognise that it
requires special treatment, not only insofar as linguistic mechanisms are concerned,
but also regarding the universe of paralinguistic elements [6]. From a metalinguist-
ic/metadiscursive point of view, wordplay includes a wide variety of dimensions that
exploit or subvert the phonological, orthographic, morphological, and semantic con-
ventions of a language [7, 8]. It is therefore vitally important that natural language
processing applications be capable of recognising and appropriately dealing with
instances of wordplay. And indeed, numerous studies have already been conducted
on related tasks such as humour and pun detection, interpretation, and generation
[9, 10, 11, 12, 13, 14].
  To take a step toward automating humour and wordplay analysis, we introduced
the JOKER track at CLEF 2022. Its goal is to bring together translators and
computer scientists to further the computational analysis of humour. Our
workshop proposed three pilot tasks [15]:

Pilot Task 1: classify and explain instances of wordplay,

Pilot Task 2: translate wordplay in named entities, and

Pilot Task 3: translate entire phrases containing wordplay (puns).

This paper covers Pilot Task 1. We introduce a new classification of wordplay and we
discuss the shortcomings of previous classifications from the literature. Moreover,
we propose a new annotation scheme for wordplay interpretation. Our interpretation
scheme is applicable both for phrase-based wordplay, including puns, and wordplay
within named entities, including portmanteaux. We present an evaluation benchmark
of our own devising and we present and discuss the participating systems and their
results.
  The paper is organised as follows. In §2 we provide the definitions of wordplay
from the literature as well as existing classifications. In §3 we describe the initial
dataset created in the JOKER project. The data was annotated, at first, according
to the classifications from the literature. However, as we noticed that its classes
overlapped, we introduced our own classification system and used it for our final
annotation, described in §4. The corpus details can be found in §5. Results from our
own preliminary experiments on wordplay perception are found in §6, and alternative
classification methods proposed by our participants are covered in §7. The methods,
evaluation metrics, and results are described in §§8, 9, and 10, respectively. Some
concluding remarks are given in §11.
2. Related work
In this section, we first define the concepts of “wordplay” and “pun” and then present
different proposed classifications for these complex concepts.


2.1. Definitions
The definitions given by mass-market dictionaries are very similar:

Definition 2.1. Wordplay is the activity of joking about the meaning of words,
especially in a clever way. (Cambridge Learner’s Dictionary)

Definition 2.2. Wordplay is the clever or amusing use of words, especially involving
a word that has two meanings or different words that sound the same. (Oxford
Advanced Learner’s Dictionary)

Definition 2.3. Pun is the clever or humorous use of a word that has more than
one meaning, or of words that have different meanings but sound the same. (Oxford
Advanced Learner’s Dictionary)

Definition 2.4. Pun is a humorous use of a word or phrase that has several meanings
or that sounds like another word. (Cambridge Learner’s Dictionary)

Delabastita [16] and Gottlieb [17] consider these terms synonymous and generally
interchangeable. According to Delabastita [16], a wordplay or pun

     is the general name for the various textual phenomena in which structural
     features of the language(s) used are exploited in order to bring about
     a communicatively significant confrontation of two (or more) linguistic
     structures with more or less similar forms and more or less different
     meanings.

  Chiaro [18] and Giorgadze [19] make a distinction between the terms “wordplay”
and “pun”. According to Giorgadze [19], “wordplay can be discussed in its narrow
and broad senses”. In the broad sense, the term is a hypernym which may subsume
(without limitation) the following phenomena:

puns e.g., “Just as a poached egg isn’t a poached egg unless it’s been stolen from
    the woods in the dead of the night.” (Roald Dahl, Charlie and the Chocolate
    Factory)

wellerisms e.g., “ ‘Don’t move, I’ve got you covered,’ said the wallpaper to the
     wall.”; “ ‘We’ll have to rehearse that,’ said the undertaker, as the coffin fell out
     of the car.”

spoonerisms e.g., “Time wounds all heels” instead of “Time heals all wounds”; “a
    well-boiled icicle” instead of “a well-oiled bicycle”
anagrams e.g., “genuine class” for “Alec Guinness” (Dick Cavett)

palindromes e.g., “No lemons, no melon!”; “Straw? No, too stupid a fad! I put soot
     on warts.”

onomatopoeia e.g., “Water plops into pond, splish-splash downhill warbling mag-
    pies in tree trilling, melodic thrill. . . ” (Lee Emmet, “Running Water”)

mondegreens e.g., “Lady Mondegreen” instead of “layd him on the green”

malapropisms e.g., “the very pineapple of politeness”; “Medieval victims of the
    Bluebonnet plague”

neologisms and portmanteaux e.g., “meringued: the act of being changed into,
     or being trapped inside, a large meringue” (Jasper Fforde, The Eyre Affair);
    “ChameleoCar: a multi-coloured car that can change hue at the flick of a switch”
    (ibid.); “frumious” from “fuming” and “furious” (Lewis Carroll, “The Hunting of
     the Snark”)

alliteration, assonance, and consonance e.g., “Peter Piper picked a peck of
      pickled peppers”

  In a narrower sense, a wordplay is a pun when it is based on the ambiguity that
occurs when a word or phrase has more than one meaning and can be understood in
different ways.
  Giorgadze [19] claims that the terms “pun” and “wordplay” are synonymous when
the latter is understood in its narrow sense. Redfern [20] considers that “to pun is to
treat homonyms as synonyms.” This ambiguity can be based on

   • lexis and semantics, the lexical ambiguity of a word or phrase pertaining to its
     having more than one meaning in the language to which the word belongs; or
   • syntax, where a sentence may have two (or more) different interpretations
     according to how it is parsed.

  Delabastita [16] argues that puns are textual phenomena: although they depend
on the structural characteristics of language as an abstract system, the context
should be taken into account when analysing them. The context may be situational
or textual. As Gottlieb [21, p. 210] writes,

     The intended effect of wordplay can accordingly be conveyed through
     dialogue (including intonation and other prosodic features), combined
     with non-verbal visual information, or through written text. . .


2.2. Classification of Wordplay
Delabastita [16] distinguishes the following categories:
Phonological and graphological structure This is wordplay based on sound or
    spelling – e.g., “Love at first bite.” This category includes:
     Homonymy (identical sounds and spelling)
     Homophony (identical sounds but different spellings)
     Homography (different sounds but identical spelling)
      Paronymy (slight differences in both spelling and sound)

Lexical structure (polysemy and idioms) This concerns the distance between
     the idiomatic and literal reading of idioms – e.g., “Britain going metric: give
     them an inch and they’ll take our mile.”

Morphological structure This type of wordplay is based on the distinction between
    the accepted meaning of the words and the interpretation of the components –
    e.g., “ ‘I can’t find the oranges,’ said Tom fruitlessly.”

Syntactic structure Here, grammar generates puns through sentences or phrases
    that can be understood in more than one way. For example, “How do you stop a
    fish from smelling? Cut off its nose.” [19, p. 274]

  Delabastita [16] also distinguishes horizontal from vertical puns. In horizontal
puns, the secondary meaning is expressed concretely within the text: “The mere
nearness of the pun components may suffice to bring about the semantic confronta-
tion; in addition, grammatical and other devices are used to highlight the pun.” [16,
p. 129] In vertical wordplay, the pivotal element is mentioned only once: “one of the
pun’s components is materially absent from the text and has to be triggered into
semantic action by contextual constraints.” [16, p. 129] Some examples are given in
Figure 1.
  Gottlieb [17] considers wordplay and puns synonymous linguistic units and
divides them into three categories:

Lexical homonymy The central feature is single-word ambiguity.

Collocational homonymy The central feature is word-in-context ambiguity.

Phrasal homonymy The central feature is clause ambiguity.

Giorgadze [19], on the other hand, divides puns into three categories:

Lexical-Semantic Pun (homonyms, homophones, polysemous words) – e.g., “I like
     kids, but I don’t think I could eat a whole one.” (the polysemous word “kid”
     creating the pun); “Where do fish learn to swim? They learn from a school.”
     (Lewis Carroll, Alice’s Adventures in Wonderland)

Structural-Syntactic Pun Here, a complex phrase or sentence can be understood
     in different ways. For example, “ ‘I rushed out and killed a huge lion in my
     pajamas.’ ‘How did the lion get in your pajamas?’ ”
Homonymy
    Vertical: There was a brass plate screwed on the wall beside the door. It said:
    “C V Cheesewaller, DM (Unseen) B. Thau, BF.” It was the first time Susan had
    ever heard metal speak. (T. Pratchett, Soul Music, quoted in [22, p. 29])
    Horizontal: Well, yah, dey lose members in there. Their members lose members.
    (T. Pratchett, Soul Music, quoted in [22, p. 19])

Homophony
    Vertical: Beleave in Britain. (The Sun)
    Horizontal: Why can a man never starve in the Great Desert? Because he can eat
    the sand which is there. But what brought the sandwiches there? Why, Noah sent
    Ham, and his descendants mustered and bred.

Homography
    Vertical: You can tune a guitar, but you can’t tuna fish. Unless you play bass.
    Horizontal: How the US put US to shame. [23]

Paronymy
    Vertical: Landen Parke-Laine (i.e., “land on Park Lane”, referring to the
    Monopoly board game) (Jasper Fforde, Thursday Next series)
    Horizontal: “That’s a bodacious audience,” said Jimbo. “Yeah, that’s right,
    bodacious,” said Scum. “Er. What’s bodacious mean?” “Means. . . means it bodes,”
    said Jimbo. “Right. It looks like it’s boding all right.” (Terry Pratchett,
    Soul Music, quoted in [22, p. 19])

Figure 1: Examples of horizontal and vertical puns.



Structural-Semantic Pun Per Giorgadze [19, p. 274], “Structural-semantic ambi-
     guity arises when a word or concept has an inherently diffuse meaning based on
     its widespread or informal usage.” This is mainly found with idiomatic expres-
     sions: “ ‘Did you take a bath?’ ‘No, only towels, is there one missing?’ ” [19,
     p. 275]

  Chuandao [24] believes that a pun cannot be reduced to play on a word’s meaning
or homophony, and considers that the context, the logic, and the way the pun is
formulated should also be taken into account. Chuandao defines the following
categories:

Homonymic pun (identical sounds and spelling)

Lexical meaning pun (polysemous words)

Understanding pun (absence of any pun word, but the context enables the ad-
    dressee to understand the implied meaning of a sentence) – for example:
          My sister Mrs. Joe Gargery, was more than twenty years older than
          I, and had established a great reputation with herself and the neigh-
          bours because she had brought me up "by hand". Having at that
          time to find out for myself what the expression meant, and knowing
          her to have a hard and heavy hand, and to be much in the habit
          of laying it upon her husband as well as upon me, I supposed
          that Joe Gargery and I were both brought up by hand. (Charles
          Dickens, Great Expectations, quoted in [24])

Figurative pun (opposition between the surface and figurative meaning of a simile
     or metaphor) – examples:

         In reply, Dr. Zunin would claim that a little practice can help us
         feel comfortable about changing our social habits. We can become
         accustomed to any changes we choose to make in our personality.
         “It’s like getting used to a new car. It may be unfamiliar at first, but it
         goes much better than the old one.” [24]

     Here, becoming accustomed to changes in our social habits is compared, via
     simile, to getting used to a new car.

Logic pun (a kind of implication in a given context) – for example:
          Lady Capulet: . . . Some grief shows much of love;
          But much of grief shows still some want of wit.
          Juliet: Yet let me weep for such a feeling loss.
          Lady Capulet: So shall you feel the loss, but not the friend.
          Which you weep for.
          Juliet: Feeling so the loss.
          I cannot choose but ever weep the friend.
          Lady Capulet: Well, girl, thou weep’st not so much for this death,
          As that the villain lives which slaughter’d him.
          Juliet: What villain, madam?
          Lady Capulet: That same villain, Romeo.
          Juliet: Villain and he are many miles asunder. —
          God pardon him! I do with all my heart;
          And yet no man like he doth grieve my heart.
          Lady Capulet: That is, because the traitor murderer lives.
          Juliet: Ay, madam, from the reach of these my hands.
          Would none but I might venge my cousin’s death!
          Lady Capulet: We will have vengeance for it, fear thou not:
          Then weep no more. I’ll send to one in Mantua,
          Where that same banish’d runagate doth live,
          Shall give him such an unaccustom’d dram,
          That he shall soon keep Tybalt company:
          An then, I hope, thou wilt be satisfied.
          Juliet: Indeed, I never shall be satisfied
          With Romeo, till I behold him — dead —
          (William Shakespeare, Romeo and Juliet)
     In this excerpt, Lady Capulet uses the phrase “feel the loss” to refer to Juliet’s
     grief at losing her cousin, while Juliet goes on to express her grief at
     losing Romeo.


3. Initial data annotation
At the very beginning of the project, we collected more than 1000 translated wordplay
instances in English and French from various sources: video games (Enter the
Gungeon, Undertale, South Park, League of Legends, Phoenix Wright, Pokémon,
etc.), advertising slogans, and literature (Shakespeare, Alice’s Adventures in
Wonderland, Asterix, How to Train Your Dragon, Harry Potter, etc.). Some slogans
and punning tweets were translated by our experts.
   As previously mentioned, in its broad sense, wordplay includes sub-categories
such as puns, wellerisms, spoonerisms, anagrams, palindromes, onomatopoeia,
mondegreens, malapropisms, neologisms and portmanteaux, alliteration, assonance,
and consonance. In its narrow sense, a wordplay is a pun based on the ambiguity that
occurs when a word or phrase has more than one meaning and can be understood in
different ways.
   Our research project considers wordplay in its broad sense. Following Delabastita’s
classification, our corpus mainly includes two types of wordplay: wordplay based
on sound and spelling (phonological and graphological structure) and wordplay
based on lexical structure (polysemy and idioms). Only a few puns play on
syntactic or morphological structures.
   The collected data mainly contains punning named entities, in many cases neolo-
gisms. Each pun in each language was classified in several classes according to a
well-defined multi-label classification and explained with respect to how the pun is
constructed. For example, the punning joke “Why is music so painful? Because it
hertz” was annotated as “Paronymy” under the Structure classification (there being
slight differences in both spelling and sound) and explained simply as “hertz/hurts”.
   The total numbers of English wordplay instances, classified according to
Type (type of humour) and/or Structure (wordplay based on phonological and
graphological structure), are given by frequency in Tables 1 and 2. For each category,
we provide an example and its translation.
   The “others” category in Table 1 refers to anagrams, mondegreens, onomatopoeia,
spoonerisms, and wellerisms. Some of the collected entries (342 entries in the
English corpus) employed a type of humour not directly related to wordplay (e.g.,
Table 1
Statistics and examples for the initial classification of wordplay type.

Neologisms and portmanteaux (1353 entries)
    English: Cat lovers will only drink their kit-TEA. (Lipton ad)
    French: Les amoureux des chats ne boivent que du thé Mat-chat.
Puns (586 entries)
    English: Hellmann’s makes chicken so juicy, all the competition is squawking. (Hellmann ad)
    French: Avec Hellmann’s, le poulet est si juteux que la concurrence en perd ses plumes.
Alliteration, assonance, and consonance (107 entries)
    English: Weasleys’ Wildfire Whiz-bangs (Harry Potter)
    French: Feuxfous Fuseboum
Malapropisms (35 entries)
    English: I’m Redd White, CEO of Bluecorp. You know, Corporate Expansion Official? (Phoenix Wright)
    French: Je suis Redd White, le PDG de Bluecorp. Vous savez, Présence, Distinction et Grâce !
Others (25 entries)
    English: Snipperwhapper! (Phoenix Wright)
    French: Vetit aporton !



absurd humour, humour related to the visual or historical context, humour related
to a cultural or historical reference). We classified these instances according to
their properties and annotated them as such, as the translation of absurd humour
and cultural references is to be studied with different tools. These entries are not
included in the tables.
  We observed that there was significant overlap among the classes. The most
problematic category was neologism, as almost any transformation of a common
word was considered by our annotators to be a neologism. To address this issue, we
decided to introduce our own classification and wordplay interpretation annotation
scheme, which is described in §4.


4. Final annotation guidelines
For Task 1, we annotated both phrase-based (puns) and term-based instances of
wordplay in English and French (see §5). Following the SemEval-2017 pun task [25],
we annotated each instance of wordplay according to its LOCATION and INTER-
PRETATION. By LOCATION, we mean the precise word(s) in the instance forming
the wordplay, such as the ambiguous words of a punning joke.1 By INTERPRETATION,
we mean the explanation of the wordplay, which we give, for example, by providing the
secondary meaning of a pun. To facilitate preprocessing, we do not use WordNet as
in SemEval-2017 but rather introduce the notation described in Table 3; a small
parsing sketch follows the typology list below.
  We further annotated the data according to the following typologies:

   1
     Unlike in the SemEval-2017 task, we simply list the word(s) in question rather than indicating
their position within the instance.
Table 2
Statistics and examples for the initial classification of wordplay structure.

Lexical structure (polysemy and idioms) (337 entries)
    English: I used to be a train driver but I got sidetracked. (punning tweet)
    French: Avant, j’étais conducteur de train, mais j’ai changé de voie.
Paronymy (314 entries)
    English: I guess Lotta’s in “lotta” trouble. . . (Phoenix Wright)
    French: À mon avis, Eva, “eva” avoir des ennuis. . .
Homophony (161 entries)
    English: Weasleys’ Wildfire Whiz-bangs (Harry Potter)
    French: Feuxfous Fuseboum
Homonymy (54 entries)
    English: There’s a large mustard-mine near here. And the moral of that is—The
    more there is of mine, the less there is of yours. (Alice’s Adventures in Wonderland)
    French: Il y a une bonne mine de moutarde près d’ici ; la morale en est qu’il
    faut faire bonne mine à tout le monde !
Syntactic or morphological structure (18 entries)
    English: I can’t remember how to write 1, 1000, 51, 6 and 500 in Roman
    Numerals. IM LIVID. (punning tweet)
    French: J’ai oublié comment écrire 100, 1, 6 et 50 en chiffres romains, et
    même si cela m’agace fortement, je reste CIVIL.



   • HORIZONTAL/VERTICAL concerns the co-presence of source and target of
     the wordplay. In horizontal wordplay, both the source and the target of the
     wordplay are given:

      Example 4.1. They’re called lessons because they lessen from day to day.

      In vertical wordplay, source and target are collapsed in a single occurrence:

      Example 4.2. How do you make a cat drink? Easy: put it in a liquidizer.

   • MANIPULATION TYPE:
         – Identity: source and target are formally identical, as in Example 4.2.
         – Similarity: as in Example 4.1: source and target are not perfectly identical,
           but the resemblance is obvious.
         – Permutation: the textual material is given a new order, as in anagrams or
           spoonerisms:
            Example 4.3. Dormitory = dirty room
         – Abbreviation: an ad-hoc category for textual material where the initials
           form another meaning, as in acrostics or “funny” acronyms:
Table 3
Wordplay interpretation notation

 𝑎/𝑏             Distinguishes the location from the interpretation and the different meanings of
                 a wordplay:
                 meaning 1 (location) / meaning 2 (second meaning)
 𝑎|𝑏             Separates the wordplay instances and their respective interpretations. An
                 expression can contain several wordplay instances:
                 location 1 | location 2
 𝑎(𝑏)            Specifies definitions or synonyms for each interpretation when location and
                 interpretation are homographs:
                 meaning (synonym, hyperonym or brief definition)
 𝑎[𝑏]            Specifies comments like foreign language, anagram, palindrome etc.:
                 interpretation [anagram]
 𝑎{𝑏}            Specifies the frame that activates the ambiguous word when a synonym or a
                 short definition is not available:
                 meaning {frame activated by meaning}
 < 𝑎; . . . >    Groups words from the same lexical field:
                  <word 1; word 2; . . . >
 “𝑎”             Indicates presence of an idiom:
                 “idiom”
 𝑎∼𝑏             Indicates several possible interpretations for an ambiguous word:
                 meaning 1 (interpretation 1) ∼ meaning 2 (interpretation 2)
 𝑎+𝑏             Indicates that several words or syllables have been combined:
                 meaning 1 / meaning 1a + meaning 1b
 𝐴 /𝑏            Defines acronyms:
                 OWL /Ordinary Wizarding Level
 𝑎&𝑏             Shows when the wordplay relies on opposition:
                 location 1 & location 2


                Example 4.4. BRAINS: Biobehavioral Research Awards for Innovative
                New Scientists
          – Opposition: covers wordplay such as the antonyms hot & ice | warms &
            freezing in the following:
                Example 4.5. Hot ice cream warms you up no end in freezing weather.
   • MANIPULATION LEVEL: Most wordplay involves some kind of phonological
     manipulation, making SOUND our default category. Examples 4.1 and 4.2
     involve a clear sound similarity or identity, respectively. Only if this category
     cannot be applied to the wordplay is the instance tagged with another level of
     manipulation. The next level to be considered is WRITING (as in Examples 4.3
     and 4.4). If neither SOUND nor WRITING are manipulated, the level of manipu-
     lation is specified as OTHER. This level of manipulation may arise, for instance,
         in chiasmi:

        Example 4.6. We shape our buildings, and afterwards our buildings shape us.

   • CULTURAL REFERENCE: This is a binary (true/false) category. In order to
     understand some instances of wordplay, one has to be aware of some extra-
     linguistic factors.
   • CONVENTIONAL FORM: Another binary category, this time indicating whether
     the wordplay occurs in a fixed form, such as a Tom Swifty (a type of wellerism).
   • OFFENSIVE: Another binary category, this time indicating whether the word-
     play could be considered offensive. (This category was not evaluated in the
     pilot tasks.)
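
To make the notation in Table 3 concrete, the following minimal Python sketch shows
one way an interpretation string could be parsed. It is our own illustration rather
than released task tooling; it handles only the “|”, “/”, and “∼” operators (written
here as the ASCII tilde) and ignores bracketed qualifiers:

import re

def parse_interpretation(annotation):
    """Split a JOKER interpretation string into per-instance meaning lists.

    "|" separates co-occurring wordplay instances; "/" separates the
    location from further meanings; "~" separates alternative readings
    of a single ambiguous word.
    """
    instances = []
    for instance in annotation.split("|"):
        meanings = [m.strip() for m in re.split(r"[/~]", instance) if m.strip()]
        instances.append(meanings)
    return instances

# e.g., for the pun "Geologists can be sedimental about their work.":
print(parse_interpretation("sentimental/sediment"))  # [['sentimental', 'sediment']]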


5. Corpus details
We constructed a parallel corpus of wordplay in English and French. Our data is
twofold, containing phrase-based wordplay (puns) and term-based wordplay (mainly
named entities).


5.1. Parallel corpus of puns
Our English corpus of puns is mainly based on that of the SemEval-2017 shared task
on pun identification [25]. The original annotated dataset contains 3387 standalone
English-language punning jokes, between 2 and 69 words in length, sourced from
offline and online joke collections. Roughly half of the puns in the collection are
“weakly” homographic (meaning that the lexical units corresponding to the two senses
of the pun, disregarding inflections and particles, are spelled identically) while the
other half are heterographic (that is, with lemmas spelled differently). The original
annotation scheme is rather simple, indicating only the pun’s location within the joke,
whether it is homographic or heterographic, and the two meanings of the pun (with
reference to senses in WordNet [26]).
  In order to translate this subcorpus from English into French, we applied a
gamification strategy: we organised a translation contest2 . The contest was open
to students, but we also received multiple out-of-competition translations from
professional translators and academics in translation studies. The results were
submitted via a Google Form. 47 participants submitted 3,950 translations of 500
puns from the SemEval-2017 dataset. We took the first 250 English puns from each
of the homographic and heterographic subsets; in the Google Form, homographic
and heterographic puns were alternated, and each page contained 100 puns.
  Besides this SemEval-derived data, we sourced further translation pairs from
published literature and from puns translated by Master’s students in translation.

   2
       https://www.joker-project.com/pun-translation-contest/
Figure 2: Wordplay location normalised by text length for English (left) and French (right)



   We annotated our dataset according to the classification introduced in §4. The
final annotated training set contains a total of 1772 distinct instances in English with
4753 corresponding French translations.


5.2. Parallel corpus of term-based wordplay
For this part of the corpus, we collected 1409 single terms in English containing
wordplay from video games, advertising slogans, literature, and other sources [15]
along with 1420 translations into French. Almost all translations are official ones,
but eleven additional translations were proposed by our interns, Master’s students
in translation.
  Statistics on the annotated data are given in Tables 4 and 5. We furthermore
noticed that the LOCATION is usually the last word in wordplay, as evidenced in
Figure 2.
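
For illustration, the normalisation behind Figure 2 can be computed as in the
following sketch; normalised_location is a hypothetical helper of our own, and we
assume the annotated LOCATION word occurs in the wordplay text:

def normalised_location(wordplay, location):
    """Return the position of the LOCATION token, normalised to [0, 1]."""
    tokens = wordplay.lower().split()
    # index of the first token containing the annotated location word
    idx = next(i for i, tok in enumerate(tokens) if location.lower() in tok)
    return idx / max(len(tokens) - 1, 1)

print(normalised_location(
    "Geologists can be sedimental about their work.", "sedimental"))  # 0.5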


5.3. Training data
Our training data consists of 2078 wordplay instances in English and 2550 in French,
in the form of a list of translated wordplay instances. This data was provided as
a JSON or CSV file with one field for the unique ID of the instance, one for the
text of the instance, and one each for the LOCATION, INTERPRETATION, HORI-
ZONTAL/VERTICAL, MANIPULATION_TYPE, MANIPULATION_LEVEL, and CUL-
TURAL_REFERENCE annotations. Figure 3 shows an excerpt from the JSON file.


5.4. Test data
Our test data contains 3255 instances of wordplay in English from the SemEval-2017
pun task [25] and 4291 instances in French that we did not use for the training set.
The test data was provided as a JSON or CSV file with only two fields – one of them a
unique ID and the other the text of the instance. Figure 4 shows an excerpt of the
JSON test data.
Table 4
Annotation statistics of puns.

 English                                    French


     • 1772 annotated instances                 • 4753 annotated instances
           – Vertical 1382                           – Vertical 4400
           – Horizontal 212                          – Horizontal 320
     • MANIPULATION TYPE                        • MANIPULATION TYPE
           – Identity 894                            – Identity 2970
           – Similarity 639                          – Similarity 1672
           – Opposition 42                           – Opposition 51
           – Abbreviation 12                         – Permutation 17
           – Permutation 7                           – Abbreviation 9
     • MANIPULATION LEVEL                       • MANIPULATION LEVEL
           – Sound 1551                              – Sound 4540
           – Writing 46                              – Writing 179
           – Other 2                                 – Other 4
     • CULTURAL REFERENCE                       • CULTURAL REFERENCE
           – False 1689                              – False 4665
           – True 82                                 – True 88
     • CONVENTIONAL FORM                        • CONVENTIONAL FORM
           – False 1604                              – False 4665
           – True 167                                – True 88
     • OFFENSIVE                                • OFFENSIVE
           – Sexist 9                                – Sexist 21
           – Possibly 7                              – Possibly 6
           – Racist 2                                – Racist 4
           – Other 1                                 – Other 1




  The prescribed output format is similar to the training data format, but with the
addition of the fields RUN_ID (to uniquely identify the participating team, pilot task,
and run number), MANUAL (to indicate whether the output annotations are produced
by a human or a machine), and OFFENSIVE (per our annotation scheme).
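
The sketch below shows how a run file with these fields might be produced from the
test data. The file names and the exact encoding of the MANUAL flag are our
assumptions, and the constant annotation values are placeholders standing in for a
real system’s predictions:

import json

with open("joker_task1_test.json", encoding="utf-8") as f:  # assumed file name
    test_instances = json.load(f)

run = []
for instance in test_instances:
    run.append({
        "RUN_ID": "TEAM_task1_run1",  # identifies team, pilot task, and run number
        "MANUAL": 0,                  # assumed encoding: 0 = machine-produced
        "ID": instance["ID"],
        "WORDPLAY": instance["WORDPLAY"],
        # naive baseline exploiting the last-word tendency shown in Figure 2:
        "LOCATION": instance["WORDPLAY"].split()[-1],
        "INTERPRETATION": None,
        "HORIZONTAL/VERTICAL": "vertical",
        "MANIPULATION_TYPE": "Similarity",
        "MANIPULATION_LEVEL": "Sound",
        "CULTURAL_REFERENCE": False,
        "CONVENTIONAL_FORM": False,
        "OFFENSIVE": None,
    })

with open("TEAM_task1_run1.json", "w", encoding="utf-8") as f:
    json.dump(run, f, ensure_ascii=False, indent=2)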


6. Preliminary results on wordplay perception
We carried out a preliminary analysis of wordplay perception based on the French
data resulting from the translation contest. A student in linguistics, a native French
Table 5
Annotation statistics of wordplay in named entities.

 English                                        French


     • 1409 annotated instances                        • 1420 annotated instances
           – Vertical 1408                                  – Vertical 1419
           – Horizontal 1                                   – Horizontal 1
     • MANIPULATION TYPE                               • MANIPULATION TYPE
           – Similarity 606                                 – Similarity 775
           – Identity 441                                   – Identity 415
           – Abbreviation 340                               – Abbreviation 211
           – Permutation 17                                 – Permutation 15
           – Opposition 1                                   – Opposition 1
     • MANIPULATION LEVEL                              • MANIPULATION LEVEL
           – Sound 1402                                     – Sound 1411
           – Writing 7                                      – Writing 9
     • CULTURAL REFERENCE                              • CULTURAL REFERENCE
           – False 1361                                     – False 1344
           – True 48                                        – True 76
     • CONVENTIONAL FORM                               • CONVENTIONAL FORM
           – NOT APPLICABLE                                 – NOT APPLICABLE
     • OFFENSIVE                                       • OFFENSIVE
           – NOT IDENTIFIED                                 – NOT IDENTIFIED




speaker, assigned each joke a humorousness score between 0 and 5 on a Likert
scale [27]. For the 149 annotated wordplay instances in French, the average
humorousness score was 4.6. Among the annotated examples, there were several
wellerisms:

Question–answer (25 in total). This type of wellerism refers to bipartite jokes with
    the form of a short dialogue: a question followed by an answer.

      Example 6.1. Qu’est-ce que tu fais sur une île déserte ? Tu trouves une cuillère
      et tu l’attaques. (Literally: “What do you do on a desert island? You find a
      spoon and attack it”, punning on déserte/dessert.)

Old soldiers never die (7 in total) These wellerisms are variations on a catch-
     phrase, with the original version being “Old soldiers never die, they simply fade
     away.”
[
    {
      "ID": "noun_1063",
      "WORDPLAY": "Elimentaler",
      "LOCATION": "Elimentaler",
      "INTERPRETATION": "Emmental (cheese) + Eliminator",
      "HORIZONTAL/VERTICAL": "vertical",
      "MANIPULATION_TYPE": "Similarity",
      "MANIPULATION_LEVEL": "Sound",
      "CULTURAL_REFERENCE": false,
      "CONVENTIONAL_FORM": false,
      "OFFENSIVE": null
    },
    {
      "ID": "pun_341",
      "WORDPLAY": "Geologists can be sedimental about their work.",
      "LOCATION": "sedimental",
      "INTERPRETATION": "sentimental/sediment",
      "HORIZONTAL/VERTICAL":"vertical",
      "MANIPULATION_TYPE":"Similarity",
      "MANIPULATION_LEVEL":"Sound",
      "CULTURAL_REFERENCE":false,
      "CONVENTIONAL_FORM":false,
      "OFFENSIVE": null
    }
]

Figure 3: Excerpt of training data (JSON format)


[
    {
      "ID": "noun_1",
      "WORDPLAY": "Ambipom"
    },
    {
      "ID": "het_1011",
      "WORDPLAY": "These are my parents, said Einstein relatively"
    }
]

Figure 4: Excerpt of test data (JSON format)



        Example 6.2. Les vieux adeptes de saut à l’élastique ne meurent jamais : ils
        savent toujours rebondir. (Literally: “Old bungee jumpers never die: they
        always know how to bounce back.”)

Tom Swifty (18 in total) These are wellerisms in which a quoted sentence is
     linked by a pun to the manner in which it is attributed. The standard form
     is for the quoted sentence to come first, followed by a description of the
     act of speaking by the conventional speaker, Tom.

     Example 6.3. “Pourquoi est-ce que je ne me vois pas dans ce miroir ?” fit Tom
     sans réfléchir. (Literally: “Why can’t I see myself in this mirror?” said Tom
     without thinking – réfléchir meaning both “to think” and “to reflect”.)

[Figure: histogram of humorousness scores (1.0–5.0) against frequency for the
three wellerism types: Old, Tom, and QA]

Figure 5: Histogram of wellerism humorousness

Figure 5 presents the histogram of wellerism humorousness; statistics on the
free-text comments given for 73 jokes are reproduced in Table 6. As is clear from
the figure, Tom Swifties were generally not considered funny, while the highest
scores were given to question–answer wellerisms. These results are somewhat at
odds with the humorousness of the generated jokes in [28]. Although this opposition
may seem unsurprising given the method used for wellerism generation in [28],
further analysis is needed, as the annotators were different and humour perception
depends on multiple social factors, including gender and age.
  Looking at the manually constructed data, we noticed that in a few instances, style
shift in the translation of the pun could pose an issue. Consider the following pair:

Example 6.4. I phoned the zoo but the lion was busy.
 J’ai appelé le zoo mais on m’a dit phoque you.

The French translation includes a vulgarism, with a pun across languages (fuck/
phoque). This was considered a very successful translation, but would clearly be an
inappropriate translation in many contexts. A number of other examples we spotted
introduced strong stereotyping that could be construed as offensive, in
contrast to the original.
  We decided to annotate the data for style shifts that introduced into the
translation a form of humour relying on vulgarity or stereotyping. In doing so,
Table 6
Free comment statistics
                               Free comment       # instances
                                              ?   2
                                            fun   3
                                      boosting    1
                                          funny   25
                                           hard   2
                                      dynamic     3
                                   intellectual   11
                                       literary   1
                                           cute   12
                                     not funny    3
                                      no more     1
                                         sexist   1
                           sexist and sexual?     1
                                        sexual    3
                                       serious    4



another issue became evident: an additional bias may be introduced in the data due
to the French language. Consider the following pair:

Example 6.5. Old Quilters never die, they just go under cover.
 Les vieilles tricoteuses ne meurent jamais, elles recousent les morceaux.
 (Literally: “Old knitters never die, they sew the pieces back together.”)

  French is more strongly gendered than English. As many French speakers still
consider the masculine the default, this translation introduces a stereotype by
using a feminine translation for the word knitter (tricoteuse). However, using only
the masculine form, as a default gender, also raises questions in a context where
the current evolution of the language seems to go against that usage [29].


7. Classifications proposed by participants
The JOKER participants suggested new classifications of wordplay in an attempt to
overcome issues with the existing classifications.
  Delarche [30] distinguishes polysemic constructs from letter-based constructs – e.g.,
wordplay based on the selection, permutation, repetition, or suppression of letters,
such as acronyms, acrostics, lipograms, palindromes, and pangrams. We must admit
that this distinction, although not tested, seems promising for wordplay generation
and translation tasks. He describes acronyms and acrostics in detail; these types
of wordplay are missing from our corpus.
  Delarche [30] also differentiates polysemic constructs with a single pivotal keyword
from those with a repeated keyword bearing different meanings, which is essentially
similar to the HORIZONTAL/VERTICAL categories that we ourselves defined [15].
  A. Digue and P. Campen tried to make the JOKER classification more precise by
introducing a clear distinction between Sound/Writing/Both for VERTICAL wordplay
and Sound/Writing/Both/Other for HORIZONTAL wordplay. They also demonstrated
the non-existence of certain combinations of the JOKER categories.


8. Methods used by the participants
Five teams participated in Pilot Task 1: FAST_MT [31], eBIHAR [32], Cecilia [33],
Agnieszka, and Hakima [34].
   The Cecilia and Agnieszka teams applied the Google T5 model [35] via the SimpleT5
library3 . T5 (Text-To-Text Transfer Transformer) is a unified text-to-text
transformer built on transfer learning [35]. Agnieszka submitted a run without a
paper, though the team notified the JOKER organisers of the method they used.
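
As a rough sketch of this approach (reconstructed from SimpleT5’s public API, not
from the teams’ code; the prompt format and hyperparameters here are our
assumptions), wordplay location can be cast as text-to-text generation:

import pandas as pd
from simplet5 import SimpleT5

# SimpleT5 expects a dataframe with "source_text" and "target_text" columns
train_df = pd.DataFrame({
    "source_text": ["locate wordplay: Geologists can be sedimental about their work."],
    "target_text": ["sedimental"],
})

model = SimpleT5()
model.from_pretrained(model_type="t5", model_name="t5-base")
model.train(train_df=train_df, eval_df=train_df,
            source_max_token_len=128, target_max_token_len=32,
            batch_size=8, max_epochs=3, use_gpu=False)
print(model.predict("locate wordplay: These are my parents, said Einstein relatively"))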
   eBIHAR applied a multinomial naive Bayes classifier and logistic regression to
classify the texts (with and without preprocessing) using bag-of-words and
TF–IDF representations.
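
A minimal sketch of such a pipeline (illustrative only; the team’s actual
preprocessing and hyperparameters are not reproduced here) might look as follows:

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# toy examples from §4, labelled with the HORIZONTAL/VERTICAL class
texts = ["They're called lessons because they lessen from day to day.",
         "How do you make a cat drink? Easy: put it in a liquidizer."]
labels = ["horizontal", "vertical"]

nb_tfidf = make_pipeline(TfidfVectorizer(), MultinomialNB())                  # TF-IDF features
lr_bow = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))  # bag-of-words

for clf in (nb_tfidf, lr_bow):
    clf.fit(texts, labels)
    print(clf.predict(["Hot ice cream warms you up no end in freezing weather."]))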
   Hakima applied Jurassic-1, the first generation in a series of large language
models trained and made widely accessible by AI21 Labs4 [36]. Jurassic-1 is an
auto-regressive language model based on the decoder module of the Transformer
architecture [37], with modifications similar to those of the GPT models of
Radford et al. [38].
   One team submitted a run after the official deadline and so its results are not
presented in our workshop overview paper [15].
   Delarche [30] suggested interesting heuristic-based deterministic algorithmic
filters which might be useful for specific types of wordplay, though these algorithms
were not implemented and therefore not tested.


9. Evaluation metrics
We preprocessed runs by lowercasing and trimming the values. For the English
subcorpus, the labels for LOCATION and INTERPRETATION were provided for puns
from the original dataset [25]. All wordplay instances from this dataset were
considered to be VERTICAL with manipulation level SOUND; HOMOGRAPHIC puns
were attributed the IDENTITY manipulation type, while HETEROGRAPHIC puns
were classified as SIMILARITY. We report the absolute numbers of true labels
submitted by the participants.
  We discarded all INTERPRETATION values that were equal to the LOCATION field,
as we considered this insufficient.


   3
       https://github.com/Shivanandroy/simpleT5
   4
       https://studio.ai21.com/
Table 7
Scores of participants’ runs for Pilot Task 1

                                         LOCATION     MANIP. TYPE    MANIP. LEVEL
                 FAST_MT                                      1035          2437
                 FAST_MT_updated              1455            1667          2437
                 Cecilia_task_1_run5          1484            1541          2437
                 Agnieszka_task1_t5           1554
                 eBIHAR_en                                    1392          2437
                 eBIHAR_en_tfidf_wp                           1083          2437
                 eBIHAR_en_tfidf_wp_preprocessed               536          2437



  In recognition of the fact that there may be slightly different but equally valid
INTERPRETATION annotations, for evaluation we retained only the high-level an-
notation (by removing everything in brackets, parentheses, etc.). We downcased,
tokenised, and lemmatised this high-level annotation with the aid of regular expres-
sions and the NLTK WordNetLemmatizer.5 We then compared the set of lemmas
generated by participants with our own annotations.
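
This matching step can be sketched as follows; the exact regular expressions we
used are not reproduced here, so the pattern below is an illustrative approximation:

import re
import nltk
from nltk.stem import WordNetLemmatizer

nltk.download("wordnet", quiet=True)
lemmatizer = WordNetLemmatizer()

def lemma_set(annotation):
    # keep only the high-level annotation: drop (...), [...] and {...} qualifiers
    high_level = re.sub(r"[(\[{][^)\]}]*[)\]}]", " ", annotation)
    tokens = re.findall(r"[a-z]+", high_level.lower())
    return {lemmatizer.lemmatize(tok) for tok in tokens}

# a submitted interpretation counts as a match if its lemma set equals ours:
print(lemma_set("sentimental/sediment (rock material)")
      == lemma_set("sentimental / sediments"))  # True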


10. Results
Altogether, four teams submitted eight runs for the English dataset. The eBIHAR
team also submitted one run in French. The release of the French dataset was delayed,
and we also updated the English dataset during the competition. The FAST_MT team
submitted runs both for the first release of the English dataset and for the updated
one. The Agnieszka team submitted only partial runs for LOCATION. The results for
the participants are given in Table 7.
  All participants, except the Agnieszka team, which did not submit predictions for
MANIPULATION LEVEL, successfully predicted all classes. However, this success
might be explained by the nature of our data, as the only class present in the test
set was SOUND.
  The teams Cecilia, FAST_MT, and Agnieszka demonstrated fairly good results for
LOCATION. However, as previously noted, the majority of instances in our dataset
have the wordplay located at the last word.
  Only the FAST_MT team succeeded in INTERPRETATION prediction for the first
data release. For this first run, our annotation coincides with that of the submission
in 597 cases and differs in 61. These differences are, in most cases, not errors but
differences in presentation or in human interpretation. The first dataset contained
many named entities from popular anime, movies, and video games (e.g., Pokémon),
unlike the updated dataset. FAST_MT had gathered raw data from various websites
explaining puns in Pokémon names and trained their model on it. We should
acknowledge that some annotations provided by FAST_MT were more detailed than
ours. For the updated dataset, FAST_MT’s predictions for LOCATION were identical
to those for INTERPRETATION. Only one run, Cecilia’s run 5, was successful on this
dataset, with 441 correct results.

   5
       https://www.nltk.org/_modules/nltk/stem/wordnet.html
  We do not provide results for the other binary classes: since our data was
unbalanced with regard to these categories, the submitted results always provided
negative labels.


11. Conclusion
We introduced the JOKER track at CLEF 2022, consisting of a workshop and associ-
ated pilot tasks on automatic wordplay analysis and translation. Our primary goal is
to build parallel data and evaluation metrics for detecting, locating, interpreting, and
translating wordplay in order to woze a step forward to the automation of wordplay
analysis.
  We surveyed existing classifications and presented the data we initially annotated
according to two well-known classifications from the literature. However, we were
obliged to abandon these classifications, as the first contains overlapping classes
while the other is not expressive enough. We therefore introduced a new classification
of wordplay which aims to resolve the issues of the classifications from the literature.
  We manually classified wordplay in English and French according to our categories.
Our data was used to organise Pilot Task 1: Classify and Explain Instances of
Wordplay. Four teams submitted official runs for Pilot Task 1; one team submitted
a run after the deadline.
  Participants succeeded in wordplay location, but the interpretation task raised
difficulties. The binary classes HORIZONTAL/VERTICAL, CONVENTIONAL_FORM,
CULTURAL_REFERENCE, OFFENSIVE, and MANIPULATION_LEVEL were unbalanced,
provoking very high but uninformative scores. However, these binary classifications
were not the focus of our research. We plan to perform a more detailed study of
wordplay perception, including humorousness and offensiveness, as well as a
free-category study.
  It should be kept in mind that our data consists mainly of puns and portmanteaux,
which may make our classification insufficiently expressive for other types of
wordplay. Participants proposed new wordplay classifications or tried to improve
upon ours. In the future, we will use this feedback to revise our classification
in order to improve its expressiveness.
  Further details on the other pilot tasks and the submitted runs can be found in the
CLEF CEUR proceedings [39]. The overview of the entire JOKER track can be found
in the LNCS proceedings [15]. Additional information on the track is available on the
JOKER website: http://www.joker-project.com/
12. Authors’ contribution
The general framework was proposed by L. Ermakova. The initial annotation guide
was proposed by L. Ermakova and G. Le Corre. The new classification was introduced
by F. Regattin and adjusted by L. Ermakova, T. Miller, J. Boccou, A. Digue, A. Damoy
with participation of C. Borg and S. Araújo. O. Puchalski worked on the initial data
annotation. The interpretation annotation is an extension of the work of T. Miller and
was proposed by L. Ermakova and adjusted J. Boccou, A. Digue, A. Damoy, and Paul
Campen. J. Boccou, A. Digue, and A. Damoy annotated data. A.-G. Bosser worked
on the perception aspects and general organisation of the evaluation campaign.
Evaluation results were obtained and described by L. Ermakova. J. Boccou wrote the
first draft of the annotation guidelines. S. Araújo, G. Le Corre and F. Regattin wrote
the state of the art.


Acknowledgments
This work has been funded in part by the National Research Agency under the pro-
gram Investissements d’avenir (Reference ANR-19-GURE-0001) and by the Austrian
Science Fund under project M 2625-N31. JOKER is supported by La Maison des
sciences de l’homme en Bretagne.
  We thank Adrien Couaillet and Ludivine Grégoire for their participation in data
collection, annotation and adjustment of classification guidelines. We also thank
Elise Mathurin for co-supervising interns in translation as well as Alain Kerhervé
who supported the project.
  We would also like to thank the other JOKER organisers: Anne-Gwenn Bosser,
Claudine Borg, Fabio Regattin, Gaëlle Le Corre, Elise Mathurin, Sílvia Araújo,
Monika Bokiniec, Ġorġ Mallia, Gordan Matas, Mohamed Saki, Benoît Jeanjean,
Radia Hannachi, Danica Škara, and the other PC members: Grigori Sidorov,
Victor Manuel Palma Preciado, and Fabrice Antoine.
  We thank Eric SanJuan, who provided resources for data management.


References
 [1] L. Laineste, P. Voolaid, Laughing across borders: Intertextuality of internet
     memes, The European Journal of Humour Research 4 (2017) 26–49.
 [2] R. A. Martin, The Psychology of Humor: An Integrative Approach, Academic
     Press, 2006. doi:10.5860/choice.45-2902.
 [3] T. Jiang, H. Li, Y. Hou, Cultural differences in humor perception, usage, and
     implications, Frontiers in Psychology 10 (2019) 141–156. URL: https://www.
     ncbi.nlm.nih.gov/pmc/articles/PMC6361813/. doi:10.3389/fpsyg.2019.00123.
 [4] S. Attardo (Ed.), The Linguistics of Humor: An Introduction, Oxford Scholarship
     Online, 2020. doi:10.1093/oso/9780198791270.001.0001.
 [5] D. Chiaro, Humor and Translation, Routledge, 2017, p. 16.
 [6] M. J. Veiga, Linguistic mechanisms of humour subtitling, in: IV Forum for
     Linguistic Sharing, 2009, pp. 1–14. URL: https://clunl.fcsh.unl.pt/wp-content/
     uploads/sites/12/2017/07/linguistic-mechanisms-of-humour-subtitling.pdf.
 [7] L. Ermakova, T. Miller, O. Puchalski, F. Regattin, É. Mathurin, S. Araújo, A.-G.
      Bosser, C. Borg, M. Bokiniec, G. L. Corre, B. Jeanjean, R. Hannachi, Ġ. Mallia,
     G. Matas, M. Saki, CLEF Workshop JOKER: Automatic Wordplay and Humour
     Translation, in: M. Hagen, S. Verberne, C. Macdonald, C. Seifert, K. Balog,
     K. Nørvåg, V. Setty (Eds.), Advances in Information Retrieval, volume 13186 of
     Lecture Notes in Computer Science, Springer International Publishing, Cham,
     2022, pp. 355–363. doi:10.1007/978-3-030-99739-7_45.
 [8] A. Zirker, E. Winter-Froemel (Eds.), Wordplay and Metalinguistic/Metadiscursive
     Reflection: Authors, Contexts, Techniques, and Meta-Reflection, volume 2015
     of English and American Studies in German, 2015.
 [9] Z. Yu, J. Tan, X. Wan, A neural approach to pun generation, in: Proceedings
     of the 56th Annual Meeting of the Association for Computational Linguistics,
     volume 1, Association for Computational Linguistics, 2018, pp. 1650–1660. URL:
     https://aclanthology.org/P18-1153. doi:10.18653/v1/P18-1153.
[10] A. Jaiswal, Monika, A. Mathur, Prachi, S. Mattu, Automatic humour detection
     in tweets using soft computing paradigms, 2019 International Conference on
     Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon) (2019)
     172–176.
[11] T. Miller, C. F. Hempelmann, I. Gurevych, SemEval-2017 Task 7: Detection
     and interpretation of English puns, in: Proceedings of the 11th International
     Workshop on Semantic Evaluation (SemEval-2017), 2017, pp. 58–68. doi:10.
     18653/v1/S17-2005.
[12] T. Miller, M. Turkovic, Towards the automatic detection and identification
      of English puns, The European Journal of Humour Research 4 (2016) 59–75.
     doi:10.7592/EJHR2016.4.1.miller.
[13] A. Mittal, Y. Tian, N. Peng, AmbiPun: Generating humorous puns with ambiguous
     context, ArXiv abs/2205.01825 (2022).
[14] R. Sharma, S. Shekhar, An automatic pun word identification framework for code
     mixed text, in: Proceedings of the 5th International Conference on Information
     Systems and Computer Networks (ISCON), 2021, pp. 1–5.
[15] L. Ermakova, T. Miller, F. Regattin, A.-G. Bosser, E. Mathurin, G. L. Corre,
     S. Araújo, J. Boccou, A. Digue, A. Damoy, B. Jeanjean, Overview of JOKER@CLEF
     2022: Automatic Wordplay and Humour Translation workshop, in: A. Barrón-
     Cedeño, G. Da San Martino, M. Degli Esposti, F. Sebastiani, C. Macdonald,
     G. Pasi, A. Hanbury, M. Potthast, G. Faggioli, N. Ferro (Eds.), Experimental
     IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the
     Thirteenth International Conference of the CLEF Association (CLEF 2022),
     volume 13390 of LNCS, 2022.
[16] D. Delabastita, Introduction to the special issue on wordplay and translation,
     The Translator: Studies in Intercultural Communication 2 (1996) 1–22. doi:10.
     1080/13556509.1996.10798970.
[17] H. Gottlieb, Anglicisms and translation, in: G. Anderman, M. Rogers (Eds.),
     In and Out of English: For Better, For Worse, Multilingual Matters, 2005, pp.
     161–184. doi:10.21832/9781853597893-014.
[18] D. Chiaro, The Language of Jokes. Analyzing Verbal Play, Routledge, London,
     1992.
[19] M. Giorgadze, Linguistic features of pun, its typology and classification,
     European Scientific Journal 10 (2014). URL: https://eujournal.org/index.php/esj/
     article/view/4819.
[20] W. Redfern, Puns, Blackwell, 1985. doi:10.1177/007542428702000114.
[21] H. Gottlieb, "You got the picture?". On the Polysemiotics of subtitling wordplay,
     St. Jerome Publishing, 1997, pp. 207–232.
[22] M. Mustonen, Translating Wordplay: a Case Study on the Translation of Word-
     play in Terry Pratchett’s Soul Music, Master’s thesis, School of Languages and
     Translation Studies, Faculty of Humanities, University of Turku, 2016. URL:
     https://www.utupub.fi/bitstream/handle/10024/146151/MustonenMarjo.pdf.
[23] D. Delabastita, There’s a Double Tongue: an Investigation into the Translation of
     Shakespeare’s Wordplay, with Special Reference to Hamlet, Rodopi, Amsterdam,
     1993.
[24] Y. Chuandao, English pun and its classification, Language in India 5 (2005).
     URL: http://www.languageinindia.com/april2005/englishpun1.html.
[25] T. Miller, C. F. Hempelmann, I. Gurevych, SemEval-2017 Task 7: Detection and
     interpretation of English puns, in: Proceedings of the 11th International Work-
     shop on Semantic Evaluation, 2017, pp. 58–68. doi:10.18653/v1/S17-2005.
[26] C. Fellbaum (Ed.), WordNet: An Electronic Lexical Database, MIT Press, Cam-
     bridge, MA, 1998.
[27] R. Likert, A technique for the measurement of attitudes, Archives of Psychology
     22 (140) (1932) 5–55.
[28] L. Glémarec, A.-G. Bosser, L. Ermakova, Generating Humourous Puns in French,
     in: Proceedings of the Working Notes of CLEF 2022 – Conference and Labs
     of the Evaluation Forum, Bologna, Italy, September 5th to 8th, 2022, CEUR
     Workshop Proceedings, CEUR-WS.org, Bologna, Italy, 2022, p. 8.
[29] E. Viennot, Le langage inclusif: pourquoi, comment., Les Éditions iXe, 2020.
[30] M. Delarche, A translation-oriented categorisation of wordplays, in: Pro-
     ceedings of the Working Notes of CLEF 2022 – Conference and Labs of the
     Evaluation Forum, Bologna, Italy, September 5th to 8th, 2022, CEUR Workshop
     Proceedings, CEUR-WS.org, Bologna, Italy, 2022, p. 6.
[31] F. Dhanani, M. Rafi, M. A. Tahir, FAST_MT participation for the JOKER CLEF-
     2022 automatic pun and human translation tasks, in: Proceedings of the
     Working Notes of CLEF 2022 – Conference and Labs of the Evaluation Forum,
     Bologna, Italy, September 5th to 8th, 2022, CEUR Workshop Proceedings,
     CEUR-WS.org, Bologna, Italy, 2022, p. 14.
[32] A. Epimakhova, Using machine learning to classify and interpret wordplay, in:
     Proceedings of the Working Notes of CLEF 2022 – Conference and Labs of the
     Evaluation Forum, Bologna, Italy, September 5th to 8th, 2022, CEUR Workshop
     Proceedings, CEUR-WS.org, Bologna, Italy, 2022, p. 6.
[33] L. Glemarec, Use of SimpleT5 for the CLEF workshop JokeR: Automatic Pun
     and Humor Translation, in: Proceedings of the Working Notes of CLEF 2022 –
     Conference and Labs of the Evaluation Forum, Bologna, Italy, September 5th to
     8th, 2022, CEUR Workshop Proceedings, CEUR-WS.org, Bologna, Italy, 2022,
     p. 11.
[34] H. Arroubat, CLEF Workshop: Automatic Pun and Humour Translation Task, in:
     Proceedings of the Working Notes of CLEF 2022 – Conference and Labs of the
     Evaluation Forum, Bologna, Italy, September 5th to 8th, 2022, CEUR Workshop
     Proceedings, CEUR-WS.org, Bologna, Italy, 2022.
[35] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li,
     P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text
     transformer, Journal of Machine Learning Research 21 (2020) 1–67. URL:
     http://jmlr.org/papers/v21/20-074.html.
[36] O. Lieber, O. Sharir, B. Lentz, Y. Shoham, Jurassic-1: Technical Details and
     Evaluation, White paper, AI21 Labs, 2021. URL: https://uploads-ssl.webflow.
     com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_
     tech_paper.pdf.
[37] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser,
     I. Polosukhin, Attention is all you need, arXiv:1706.03762 [cs] (2017). URL:
     http://arxiv.org/abs/1706.03762.
[38] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever, Language Mod-
     els Are Unsupervised Multitask Learners, Technical report, OpenAI, 2019.
     URL: https://cdn.openai.com/better-language-models/language_models_are_
     unsupervised_multitask_learners.pdf.
[39] G. Faggioli, N. Ferro, A. Hanbury, M. Potthast (Eds.), Proceedings of the Working
     Notes of CLEF 2022: Conference and Labs of the Evaluation Forum, CEUR
     Workshop Proceedings, CEUR-WS.org, 2022.