<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Overview of the CLEF 2025 JOKER Task 3: Onomastic Wordplay Translation</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Liana Ermakova</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tristan Miller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yaël Naud</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anne-Gwenn Bosser</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ricardo Campos</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff5">5</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Austrian Research Institute for Artificial Intelligence (OFAI)</institution>
          ,
          <addr-line>Vienna</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Bretagne INP - ENIB, Lab-STICC CNRS UMR 6285</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Computer Science, University of Manitoba</institution>
          ,
          <addr-line>Winnipeg</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>INESC TEC</institution>
          ,
          <addr-line>Porto</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Université de Bretagne Occidentale</institution>
          ,
          <addr-line>HCTI</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>University of Beira Interior</institution>
          ,
          <addr-line>Covilhã</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>This paper summarises the setup and results of the shared task on onomastic wordplay translation at the CLEF 2025 JOKER Lab, an earlier version of which was run at JOKER 2022 as a pilot task. The objective of the task is to translate wordplay concerned with proper names from English to French. Such wordplay, widespread in classic and modern creative writing, is particularly challenging to translate due to its idiosyncratic nature and cultural references. Four teams participated in this year's task, submitting 20 runs. We describe our construction of the dataset used for training and testing, the methods employed by the participating teams, and the results obtained for the runs and a naïve baseline in terms of various manually and automatically applied measures of translation quality. Despite notable advances, we find that translation of onomastic wordplay remains highly challenging, with fewer than 10% of manually evaluated translations judged as acceptable alternatives. Recurrent errors included untranslated source wordplay, overfitting to the training data, omission of surnames, and nonsensical generations.</p>
      </abstract>
      <kwd-group>
        <kwd>wordplay</kwd>
        <kwd>computational humour</kwd>
        <kwd>named entities</kwd>
        <kwd>neologisms</kwd>
        <kwd>machine translation</kwd>
        <kwd>LLM</kwd>
        <kwd>transformers</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        This paper describes Task 3 of the JOKER-2025 Track [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], which aims to benchmark automatic translation
of onomastic (i.e., name-related) wordplay from English to French. A pilot version of the task, on machine
translation of wordplay in named entities, was run in JOKER’s 2022 edition [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ] and employed a
parallel corpus of onomastic wordplay in English and French that we constructed from video games,
advertising slogans, literature, and other sources. This year we extended the corpus with new onomastic
wordplay instances; we also provided short contexts for the names, which are often necessary to
recognise, understand, and translate the wordplay they contain. This year, Task 3 complements two
other tasks, one on humour-aware information retrieval [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and the other on translation of English
puns into French [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Wordplay is often used for its attention-getting or mnemonic qualities in headlines, toponyms,
company names, and advertising. Onomastic wordplay is used as a rhetorical device by novelists, poets
and playwrights. It is widespread in classic literature [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], such as in Shakespeare’s characters’ names [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ],
but also in names found in modern-day works such as Pokémon, Harry Potter, Asterix, and video games.
Proper nouns with an extra semantic load are used as a meaningful element in literary texts and can
be considered as wordplay [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The translation of such names is problematic, raising the questions of
whether the transposition of such names into a given target language is technically possible and, if so,
what method might be appropriate for doing this [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Common approaches to translating names include
transliterating them [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ] or keeping them unchanged in the target text. However, these approaches
rarely preserve wordplay in a meaningful way, which may harm the text’s pragmatic force.
      </p>
      <p>
        This year, the JOKER track ran its shared tasks through Codabench (https://www.codabench.org/competitions/8746/) [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], a Free Software online
platform for organising AI benchmarks. (See Figure 1.) This greatly facilitated running shared tasks
and attracted many new participants. Although we are continuing to receive new registrations and
post-competition submissions, this paper presents only those runs submitted before our official results
were communicated to the participants. As summarised in Table 1, four teams participated in JOKER
Task 3, submitting a total of 20 official runs via the Codabench platform.
      </p>
      <p>In the remainder of this paper, we present related work (Section 2), the data (Section 3), the evaluation
measures (Section 4), participants’ approaches (Section 5), and an analysis of their results for both
training and test data (Section 6). Section 7 concludes the paper.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>As natural language processing continues to develop, its relationship with translation studies remains
an active – and debated – area of research. Studies highlight both the opportunities and challenges that
AI faces in translation, the new possibilities that AI brings, as well as the issues posed by the subtlety of
language, especially in humorous translation.</p>
      <p>The combination of humour studies and translation has become increasingly relevant, particularly
with the accelerating progress of AI technologies such as GPT models. The translation of humour – which
frequently relies on linguistic nuance, cultural context, and wordplay – poses significant challenges
to both human translators and AI models. These challenges are compounded when humour involves
neologisms, as newly coined terms often rely on specific cultural or temporal contexts that may not yet
have direct equivalents in the target language. Research in this field aims to identify these difficulties
by studying how traditional translation models often struggle to preserve the intended humour of the
source text in the target language.</p>
      <sec id="sec-2-1">
        <p>
          Theoretical models of translation differ in terms of their relevance and sensitivities to humour,
including their recognition of humour, their treatment of its unique features, and how they identify and
solve various humour-related translation problems [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ]. These differences are highly relevant for the
case of neologisms, which are embedded in humorous contexts and as such may require both semantic
interpretation and creative adaptation for an accurate translation.
        </p>
        <p>
          The first challenge related to humour detection is the requirement to understand subtleties such as
irony, tone, register, and various physical cues that contribute to humour [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. This challenge is even
greater when humour is expressed through wordplay and neologism, which are common in humorous
writing and often cannot be expressed through a word-for-word translation.
        </p>
        <p>
          The second challenge is the means by which translators navigate wordplay, cultural references, and
issues related to censorship, sensitivity, and agency. For instance, translating humour that depends
on archaic language or specific dialects can be highly challenging; however, there are indications that
humour can be accurately translated by adopting a “bidirectional” sense of humour, which focuses on
interpretation and re-creation rather than seeking an exact match in the target language [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. This
approach facilitates a more complex and creative translation of humour, and is particularly useful in
the translation of neologisms, which often carry not just linguistic innovation but cultural commentary
as well.
        </p>
        <p>
          The emergence of AI as a publicly accessible technology – especially with models like those of
ChatGPT, Claude, Llama or Gemini – has significantly impacted translation studies, including humour
translation. A systematic review of research on GPT-based translation reveals that AI-generated
translations often match human translations and can even surpass traditional machine translation in
the treatment of complex language such as humour, wordplay, and even poetry [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. However, as a
study [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] of humorous translations generated by GPT-4 points out, AI still struggles to grasp the full
depth of humour, particularly with respect to cultural context and subtlety. Although AI translations
can provide comparable or even improved experiences in certain situations, there is manifestly room
for improvement. This likely applies to neologisms, which have no standard definitions and require a
flexible, context-sensitive approach that AI has yet to fully master.
        </p>
        <p>The use of AI in humorous translation goes beyond traditional written texts. The use of AI in creative
writing – especially in comedy – has been examined through work [e.g., 20] that views AI tools as
collaborators for writers, not competitors. This cooperative approach creates new opportunities for
humorous literature writers, allowing them to engage with AI as a ‘new toy’ in the creative process.
Exploring AI’s ability to generate and adapt neologisms could be particularly valuable in this process.
Giving writers new linguistic tools to experiment with naturally raises broader questions about the
quality of AI-generated content, but also helps reframe AI as a tool for creation rather than a threat to
human creativity.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <sec id="sec-3-1">
        <title>3.1. Construction</title>
        <p>Task 3 uses a parallel corpus of wordplay in named entities in English and French, drawn from video
games, advertising slogans, literature, and other sources. This corpus is based in part on one used for
the 2022 edition of this task; for that task we had sourced 1,398 names in English along with 1,450
translations into French. The vast majority of those translations are the official, published ones, and
as such may already be included in the training data of popular large language models (LLMs). Some
alternative translations had been provided by Master’s students in translation, all native speakers of
French. For some sources, such as Pokémon names, we included both official and unofficial translations,
as newer generations and Fakémon (fan-created Pokémon) often lack official localised names. Most
of the names in the corpus are portmanteau words – i.e., words formed by merging the sounds and
meanings of two different words.</p>
        <p>For this year’s task, we doubled the size of the corpus and added explanatory descriptions in English
for each instance of wordplay, sourced variously from Wikipedia, the Web, and Master’s students in
translation.</p>
        <p>For training purposes, we released 353 onomastic wordplay instances in English with corresponding
French translations and descriptions, all drawn from Asterix and Harry Potter. These sources are well
known and well documented on Web sources such as Wikipedia. For testing purposes, we compiled our
own dataset of instances manually translated by trained professionals, as well as instances of official
translations. We used 2,333 instances of onomastic wordplay in English with corresponding French
translations as our test set.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Format</title>
        <p>Input. We provide the training data as JSON files with the following fields:
• id_en: an identifier from the input file. Note that this identifier is not unique in the file, as
the same English pun might have multiple French translations.
• en: the text of the instance of source onomastic wordplay in English. Note that the texts in English
are not unique in the file, as the same English pun might have multiple French translations.
• description: short contexts for the names and objects, which are often necessary to recognise,
understand, and translate the wordplay
• fr: translation of the onomastic wordplay into French
For example:</p>
        <p>"id_en":"en_1",
"en":"Asterix",
"description":"Asterix is the small but clever hero of the Asterix comic series.
Known for his sharp wit and courage, he outsmarts the Roman invaders with the
help of a magical potion that grants him superhuman strength. Alongside his
loyal friend Obelix, Asterix defends his village and embodies bravery and
cleverness.",
"fr":"Astérix"</p>
        <p>The test data format is identical to that of the training data, except that the field for the target text is
omitted.</p>
        <p>Output. Participants were asked to submit to Codabench a ZIP archive containing a file named
prediction.json in the root directory. This JSON-formatted file was to contain the following fields:
• run_id: run ID starting with &lt;team_id&gt;_&lt;task_id&gt;_&lt;method_used&gt; – e.g., UBO_task_3_BLOOM
• manual: 0 if the run is automatic, or 1 if manual
• id_en: a unique identifier from the input file
• en: the text of the instance of source onomastic wordplay in English
• fr: translation of the onomastic wordplay into French
Example:</p>
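        <p>As a minimal illustration of the input and output formats described above (our own sketch, not the official task tooling; the function name make_submission and the assumption that the input file holds a top-level JSON list are ours), a prediction.json archive could be assembled as follows:</p>
        <preformat>
```python
import json
import zipfile

def make_submission(input_path, output_zip, run_id, translate, manual=0):
    """Build a Codabench-style ZIP archive with prediction.json in its root.

    translate: any callable mapping (en, description) to a French string.
    Assumes the input file contains a top-level JSON list of instances.
    """
    with open(input_path, encoding="utf-8") as f:
        instances = json.load(f)
    predictions = []
    for inst in instances:
        predictions.append({
            "run_id": run_id,      # e.g. "UBO_task_3_BLOOM"
            "manual": manual,      # 0 for automatic runs, 1 for manual
            "id_en": inst["id_en"],
            "en": inst["en"],
            "fr": translate(inst["en"], inst.get("description", "")),
        })
    with zipfile.ZipFile(output_zip, "w") as zf:
        zf.writestr("prediction.json",
                    json.dumps(predictions, ensure_ascii=False, indent=2))
```
        </preformat>
        <p>Passing an identity function as translate would reproduce the naïve baseline that leaves names untranslated.</p>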
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Evaluation</title>
      <p>
        For wordplay translation, there do not yet exist any universally accepted metrics of translation
quality [
        <xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>
        ]. Machine translation is traditionally measured with the BLEU (Bilingual Evaluation
Understudy) metric, which calculates vocabulary overlap between the candidate translation and a
reference translation [23]. However, this metric is clearly inappropriate for single-term wordplay
translation evaluation, as overlap measures operate only on larger text spans and not on individual
words, the morphological analysis of which can be crucial for neologisms [
        <xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>
        ]. We therefore
evaluated participants’ translations by automatically checking them for case-insensitive exact matches
against the manual reference translations; we report these scores under the label “automatic”.
      </p>
      <p>We hypothesised that the majority of proper nouns would not be translated automatically, so we also
checked the target translations for identity with the source texts, and report these scores under the
label “identical”.</p>
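      <p>The two automatic checks described above can be sketched as follows (our own reconstruction, not the official evaluation script; the function names and the input structure are assumptions): “automatic” counts case-insensitive exact matches against the references, and “identical” counts targets left identical to the English source.</p>
      <preformat>
```python
def normalise(text):
    """Lower-case and strip, mirroring the case-insensitive comparison."""
    return text.strip().lower()

def score_run(predictions, references):
    """predictions: list of dicts with keys "id_en", "en", "fr".
    references: dict mapping id_en to a list of reference French strings."""
    automatic = identical = 0
    for p in predictions:
        refs = [normalise(r) for r in references.get(p["id_en"], [])]
        if normalise(p["fr"]) in refs:
            automatic += 1   # reported under the "automatic" label
        if normalise(p["fr"]) == normalise(p["en"]):
            identical += 1   # reported under the "identical" label
    n = len(predictions) or 1
    return {"automatic": automatic / n, "identical": identical / n}
```
      </preformat>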
      <p>Finally, we performed a manual evaluation of 1,737 translations of 203 distinct source wordplay
instances sampled from the participants’ runs. The annotation was carried out by Master’s students in
translation, who evaluated whether the generated translations were valid alternatives – i.e., whether
they preserved the wordplay and conveyed a meaningful name for the character or object in context.
Descriptions and reference translations were also provided to support this task. Although we tried to
remove translations matching the references and the English sources to reduce the annotators’ workload,
some were retained due to slight formatting differences. We added the reference translations to allow
calculating the percentage of successful translations per run, resulting in 1,833 distinct lower-cased,
stripped translations. These manual evaluation scores are reported under the label “manual”.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Participants’ approaches</title>
      <p>
        Four teams participated in this task, submitting a total of 20 runs:
mariapazr20 [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] This team used chain-of-thought prompting techniques with several large language
models, adding constraints identified from recurring translation patterns in each literary
work of the provided corpus (such as favouring puns in a given semantic field over meaning preservation,
for instance).
arampageos [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] This participant started by manually or semi-automatically classifying the names
in the training data into four categories (alliteration, wordplay, realistic names, and unclassified). They
then used the same strategy as for Task 2: a two-stage approach in which a record of manually
defined translations was consulted before resorting to large language models. They applied different machine translation
models (e.g., MarianMT, Helsinki-NLP-opus, facebook-nllb, T5).
sarath_kumar [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] This participant prepared a dataset containing named entity recognition
annotations and used it to source translations from the T5-base model. They used beam search to prioritise
phonetic matches between source and target names, and then ranked the translations according to their
creativity, phonetic fidelity, and cultural relevance.
pjmathematician [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] This team used zero-shot prompting with Qwen models. The prompt consists
of about 50 lines and includes guidance on how to translate wordplay (when not to translate, when
to use a literal translation, and creative constraints such as considering characters’ traits or
relying on the story universe’s vocabulary).
      </p>
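      <p>The two-stage strategy described for arampageos can be sketched as follows (a minimal reconstruction under our own assumptions: the record entries are illustrative, and the model call is a placeholder for the MT models the team used, such as MarianMT):</p>
      <preformat>
```python
# Illustrative entries only, not the team's actual record.
MANUAL_RECORD = {
    "Asterix": "Astérix",  # a known official translation, used as an example
}

def translate_name(en, model_translate):
    """Two-stage lookup: manual record first, then a model-based fallback.

    model_translate: any callable mapping an English name to a French string
    (standing in for the machine translation models used in the second stage).
    """
    if en in MANUAL_RECORD:
        return MANUAL_RECORD[en]
    return model_translate(en)
```
      </preformat>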
      <p>All participants who submitted runs also submitted system description papers to the Working Notes
volume [24]. Despite the requirement to include the team ID in the run name, participants’ submissions
often differed in their run names, registration details, and Codabench IDs. We manually matched the
Working Notes with the submitted runs and report the results using the team names provided in those
submissions.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Results</title>
      <sec id="sec-6-1">
        <title>6.1. Test data</title>
        <p>
          Translation analysis. About 12% of reference translations are identical to the English source wordplay.
The manual evaluation shows only 2.55% of untranslated wordplay instances judged as appropriate translations;
we had tried to remove translations matching the references and the English sources in order to reduce the
annotators’ workload, and these 2.55% correspond to translations identical to the source that were
not filtered out due to minor differences in formatting or typography. The identity baseline remains
a strong one, outperforming more than half of the submitted runs in terms of matching the references.
Half of the runs have more than 40% of French translations identical to the English, while 30% of runs keep
half of the onomastic wordplay instances untranslated. This proportion is much higher than in the references but
aligns with traditional approaches to translating named entities that omit wordplay [
          <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
          ], which often
fail to preserve the intended humorous or pragmatic meaning for the target audience.
        </p>
        <p>Among the 1,737 manually evaluated translations, 172 (10%) were considered successful. Among
these, 17 were nearly identical to the reference translations, with the only differences manifesting in
diacritics, capitalisation, and/or punctuation – for example, “Oreilles de Soie” (run) vs. “Oreilles-De-Soie”
(reference) for “Ears of Silk” (source). Fewer than 10% of manually evaluated translations (155 instances)
were genuinely alternative translations, suggesting that translating onomastic wordplay remains a
challenge despite the impressive capacities of LLMs. Among recurrent errors, 226 generations were
identical to the English source. In 102 cases, we found the suffix “-ix”, as in Celtic names, which might
be a result of overfitting on the training set containing names from the Asterix comics. In 11 cases,
the translations lack the character’s surname. Twenty-nine generations were blank or consisted only
of punctuation (e.g., “???”), and in 13 cases we found spurious overgeneration such as “l’aide de” or
seemingly random translations such as “l’intention des autorités fédérales, il” for “Chimchar”. There
were 226 translations containing extraneous articles (le, l’, la, les), as in “Le Munchlax” for “Munchlax”
or “Le Shinx” for “Shinx”. The Pokémon name “Pidove” was inexplicably translated as “pédophile” in
one run.</p>
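        <p>The recurrent-error categories above lend themselves to simple surface checks. The following sketch (our own code, not the analysis script we actually used; “Soyeurix” in the usage note is a hypothetical output) flags untranslated output, Asterix-style “-ix” suffixes, extraneous French articles, and blank or punctuation-only generations:</p>
        <preformat>
```python
import re
import string

def categorise(en, fr):
    """Return the recurrent-error labels that apply to a candidate translation."""
    en_norm = en.strip().lower()
    fr_norm = fr.strip().lower()
    errors = []
    if not fr_norm or all(c in string.punctuation for c in fr_norm):
        errors.append("blank_or_punctuation")
    elif fr_norm == en_norm:
        errors.append("identical_to_source")
    if fr_norm.endswith("ix") and not en_norm.endswith("ix"):
        # possible overfitting to the Asterix names in the training data
        errors.append("ix_suffix")
    if re.match(r"^(le |la |les |l['\u2019])", fr_norm):
        errors.append("extraneous_article")
    return errors
```
        </preformat>
        <p>For example, categorise("Munchlax", "Le Munchlax") flags an extraneous article, and a hypothetical output such as “Soyeurix” for “Ears of Silk” would be flagged for its “-ix” suffix.</p>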
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Training data</title>
        <p>As for the other tasks, runs were submitted for both the training and test datasets in order to analyse
potential overfitting and related effects. The tables in this section report the Task 3 results on the training data, showing
the percentage of translations matching the references and the percentage of translation instances
identical to the source text. Unlike the test data, the training data was not manually evaluated due to
cost constraints.</p>
        <p>
          Of the 21 runs, 14 (33%) submitted by teams duth [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] and pjmathematician [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] achieved nearly
identical scores by both exactly matching the reference translations (56% of successful translations)
and retaining the English source text as the translation (3.4%). These runs implement very different
models and show varied performance on the test data. Submitted translations on the training data
show a much higher rate of exact matches with the reference translations compared to the test set,
while the number of untranslated names is markedly lower. The best-scoring run on the training data,
pjmathematician_Q332 [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], shows a drop in exact matches with the reference translations from 56%
to 23%, while the proportion of untranslated names increases from 3.4% to 22% on the test data. The
training data was sourced from Asterix and Harry Potter, which have well-known official translations;
for the test data, we created completely new data. The higher rate of exact matches and lower number
of untranslated names in the training data may be explained by the potential inclusion of this data in
the training sets of AI models.
        </p>
        <p>Surprisingly, VerbaNex_gpt4o does not follow this trend; it achieved only 12% exact matches on the
training data while reaching 39% on the test data. Its percentage of untranslated names does not
differ much between the training and test data. Note that, according to the manual evaluation, 63% of
its translations were successful. This might be explained by more creative alternative translations.
Further analysis is needed to test this hypothesis.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>For Task 3, we constructed a parallel corpus of English and French onomastic wordplay collected
from video games, slogans, literature, and other sources with translations and context descriptions,
which are often necessary to recognise, understand, and translate the wordplay. We released 353
training instances of English onomastic wordplay with corresponding translations into French and
descriptions. For testing, we assembled 2,333 English wordplay instances paired with professional or
official French translations and descriptions providing the necessary context. Participants’ submissions
were evaluated both automatically by exact matching and manually on 1,737 translations, with some
near-identical outputs retained due to minor formatting or typographical diferences, resulting in 1,833
distinct normalised translations for analysis.</p>
      <p>
        Four teams submitted a total of 20 runs to Codabench. Fourteen runs (33%) from teams
duth [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and pjmathematician [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] achieved nearly identical scores, with high exact matches (56%)
and few untranslated names (3.4%) on the training data. However, the best run, pjmathematician_Q332,
dropped to 23% exact matches and 22% untranslated names on the test set. This likely reflects differences
in data composition: the training set drew on well-known, officially translated sources like Asterix
and Harry Potter, whereas the test set included partially new data, reducing potential overlap with AI
training data.
      </p>
      <p>Surprisingly, VerbaNex_gpt4o achieved only 12% exact matches on the training data but 39% on the
test, with a stable rate of untranslated names and 63% successful translations by manual evaluation,
possibly due to more creative alternative translations—a hypothesis requiring further analysis.</p>
      <p>
        Half the runs left more than 40% of the instances untranslated. This proportion is much higher than
in the reference corpus, which has 12% of translations identical to the source, but aligns with traditional
approaches to translating named entities that omit wordplay [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ]. This traditional approach may
nonetheless fail to preserve the intended humorous or pragmatic meaning for the target audience.
      </p>
      <p>Despite notable advances, translating onomastic wordplay remains highly challenging, with fewer
than 10% of manually evaluated translations judged as genuine alternatives. While VerbaNex_gpt4o
achieved the highest performance overall, a significant portion of the outputs of runs still relied on
identity to the English source or near-verbatim adaptations, illustrating the limitations of current models.
Recurrent errors – such as untranslated names, overfitting to training data, omission of surnames, and
occasional nonsensical generations – highlight that further progress is needed to produce creative,
culturally adapted translations at scale.</p>
      <p>In the future, we plan to perform more detailed analysis of the alternative generated translations and
compare human and machine strategies for neologism creation as well.</p>
      <p>
        For more information about the JOKER lab this year, please refer to the overview paper [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and to the
task description papers for Task 1: Humour-aware Information Retrieval [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] and Task 2: Translation of
Puns from English to French [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Visit the JOKER website at https://joker-project.com for any other
information related to the track.
      </p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work has received a government grant managed by the National Research Agency under the
program Investissements d’avenir integrated into France 2030, with the Reference ANR-19-GURE-0001.
It was also financed by National Funds through the Portuguese funding agency FCT through the project
LA/P/0063/2020 (DOI 10.54499/LA/P/0063/2020). Ricardo Campos would also like to acknowledge
project StorySense, with reference 2022.09312.PTDC (DOI 10.54499/2022.09312.PTDC). We thank all
other colleagues and students who participated in data construction, the translation contests, and the
CLEF JOKER track.</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT and Grammarly in order to check
grammar and spelling and to paraphrase and reword. Further, the authors used Gemini in order to
generate images. After using these tools/services, the authors reviewed and edited the content as
needed and take full responsibility for the publication’s content.</p>
      <p>International Conference of the CLEF Association (CLEF 2022), volume 13390 of Lecture Notes in
Computer Science, Springer, Cham, 2022, pp. 447–469. doi:10.1007/978-3-031-13643-6_27.
[23] K. Papineni, S. Roukos, T. Ward, W.-J. Zhu, BLEU: A method for automatic evaluation of machine
translation, in: Proceedings of the 40th Annual Meeting of the Association for Computational
Linguistics, 2002, pp. 311–318. doi:10.3115/1073083.1073135.
[24] G. Faggioli, N. Ferro, P. Rosso, D. Spina (Eds.), Working Notes of CLEF 2025: Conference and Labs
of the Evaluation Forum, CEUR Workshop Proceedings, CEUR-WS.org, 2025.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] L. Ermakova, A.-G. Bosser, T. Miller, R. Campos, <article-title>Overview of JOKER: Humour in the machine</article-title>, in: J. C. de Albornoz, J. Gonzalo, L. Plaza, A. G. S. de Herrera, J. Mothe, F. Piroi, P. Rosso, D. Spina, G. Faggioli, N. Ferro (Eds.), <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Sixteenth International Conference of the CLEF Association (CLEF 2025)</source>, Lecture Notes in Computer Science, Springer, <year>2025</year>.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] L. Ermakova, T. Miller, F. Regattin, A.-G. Bosser, C. Borg, É. Mathurin, G. L. Corre, S. Araújo, R. Hannachi, J. Boccou, A. Digue, A. Damoy, B. Jeanjean, <article-title>Overview of JOKER@CLEF 2022: Automatic wordplay and humour translation workshop</article-title>, in: A. Barrón-Cedeño, G. D. S. Martino, M. D. Esposti, F. Sebastiani, C. Macdonald, G. Pasi, A. Hanbury, M. Potthast, G. Faggioli, N. Ferro (Eds.), <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction: Proceedings of the Thirteenth International Conference of the CLEF Association (CLEF 2022)</source>, volume <volume>13390</volume> of Lecture Notes in Computer Science, Springer, Cham, <year>2022</year>, pp. <fpage>447</fpage>–<lpage>469</lpage>. doi:10.1007/978-3-031-13643-6_27.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] L. Ermakova, T. Miller, J. Boccou, A. Digue, A. Damoy, P. Campen, <article-title>Overview of the CLEF 2022 JOKER task 2: Translate wordplay in named entities</article-title>, in: G. Faggioli, N. Ferro, A. Hanbury, M. Potthast (Eds.), <source>Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 5th to 8th, 2022</source>, volume <volume>3180</volume> of CEUR Workshop Proceedings, CEUR-WS.org, <year>2022</year>, pp. <fpage>1666</fpage>–<lpage>1680</lpage>. URL: https://ceur-ws.org/Vol-3180/paper-127.pdf.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Campos</surname>
          </string-name>
          , A.-G. Bosser,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF 2025 JOKER Task 1: Humour-aware Information Retrieval</article-title>
          , in: [24],
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          , A.-G. Bosser,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Campos</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF 2025 JOKER Task 2: Wordplay Translation from English into French</article-title>
          , in: [24],
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J. J.</given-names>
            <surname>O'Hara</surname>
          </string-name>
          ,
          <article-title>True names: Vergil and the Alexandrian tradition of etymological wordplay</article-title>
          , University of Michigan Press,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>R. C.</given-names>
            <surname>Beshere</surname>
          </string-name>
          ,
          <article-title>“What's in a name?”: Theorizing an etymological dictionary of Shakespearean characters</article-title>
          , The University of North Carolina at Greensboro,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Manini</surname>
          </string-name>
          ,
          <article-title>Meaningful literary names: Their forms and functions, and their translation</article-title>
          ,
          <source>The Translator</source>
          <volume>2</volume>
          (
          <year>1996</year>
          )
          <fpage>161</fpage>
          -
          <lpage>178</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Banchs</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Duan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Report of NEWS 2018 named entity transliteration shared task</article-title>
          , in:
          <source>Proceedings of the Seventh Named Entities Workshop</source>
          , Association for Computational Linguistics,
          <year>2018</year>
          , pp.
          <fpage>55</fpage>
          –
          <lpage>73</lpage>
          . doi:10.18653/v1/W18-2409.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Duan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , R. E. Banchs,
          <string-name>
            <given-names>H.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>NEWS 2018 whitepaper</article-title>
          , in:
          <source>Proceedings of the Seventh Named Entities Workshop</source>
          , Association for Computational Linguistics,
          <year>2018</year>
          , pp.
          <fpage>47</fpage>
          –
          <lpage>54</lpage>
          . doi:10.18653/v1/W18-2408.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11] Z. Xu, S. Escalera, A. Pavão, M. Richard, W.-W. Tu, Q. Yao, H. Zhao, I. Guyon,
          <article-title>Codabench: Flexible, easy-to-use, and reproducible meta-benchmark platform</article-title>
          ,
          <source>Patterns</source>
          <volume>3</volume>
          (
          <year>2022</year>
          ) 100543. doi:10.1016/j.patter.2022.100543.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>G.</given-names>
            <surname>Arampatzis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Arampatzis</surname>
          </string-name>
          ,
          <article-title>DUTH at CLEF JOKER 2025 Tasks 2 and 3: Translating Puns and Proper Names with Neural Approaches</article-title>
          , in: [24],
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          M. P. R. Atencio, J. D. Jimenez, D. Gómez, J. E. Serrano, E. Puertas,
          <article-title>VerbaNexAI at CLEF 2025 JOKER Task 3: Multi-Model LLM Approach for Onomastic Wordplay Translation</article-title>
          , in: [24],
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>P.</given-names>
            <surname>Vachharajani</surname>
          </string-name>
          ,
          <article-title>pjmathematician at the CLEF 2025 JOKER Lab Tasks 1, 2 &amp; 3: A Unified Approach to Humour Retrieval and Translation using the Qwen LLM Family</article-title>
          , in: [24],
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          S. K. P, B. A, S. M, T. S,
          <article-title>REC_Cryptix at JOKER CLEF 2025: Teaching Machines to Laugh: Multilingual Humor Detection and Translation</article-title>
          , in: [24],
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Zabalbeascoa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Attardo</surname>
          </string-name>
          ,
          <article-title>Humour translation theories and strategies</article-title>
          , in: L.
          <string-name>
            <surname>Kostopoulou</surname>
          </string-name>
          , V. Misiou (Eds.),
          <source>Transmedial perspectives on humour and translation: from page to screen to stage</source>, Advances in Translation and Interpreting Studies
          , Routledge, New York/London,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>J.</given-names>
            <surname>Vandaele</surname>
          </string-name>
          ,
          <article-title>Translating literary humour: Aspects of detection and analogy making</article-title>
          , in: L.
          <string-name>
            <surname>Kostopoulou</surname>
          </string-name>
          , V. Misiou (Eds.),
          <source>Transmedial perspectives on humour and translation: from page to screen to stage</source>, Advances in Translation and Interpreting Studies
          , Routledge, New York/London,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>V.</given-names>
            <surname>Chan</surname>
          </string-name>
          , W. K.-W. Tang,
          <article-title>GPT and translation: A systematic review</article-title>
          , in: 2024
          <source>International Symposium on Educational Technology (ISET)</source>
          , IEEE, Macau, Macao,
          <year>2024</year>
          , pp.
          <fpage>59</fpage>
          -
          <lpage>63</lpage>
          . doi:10.1109/ISET61814.2024.00021.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>H.</given-names>
            <surname>Abu-Rayyash</surname>
          </string-name>
          ,
          <article-title>AI meets comedy: Viewers' reactions to GPT-4 generated humor translation</article-title>
          ,
          <source>Ampersand</source>
          <volume>12</volume>
          (
          <year>2024</year>
          ) 100162. doi:10.1016/j.amper.2023.100162.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>R.</given-names>
            <surname>Hamilton</surname>
          </string-name>
          ,
          <article-title>Artificially funny: collaborative play at the intersection of AI, literature and humour</article-title>
          , in: W. Slocombe, G. Liveley (Eds.),
          <source>The Routledge handbook of AI and literature</source>, Routledge literature handbooks
          , Routledge, New York,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Puchalski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Regattin</surname>
          </string-name>
          , É. Mathurin,
          <string-name>
            <given-names>S.</given-names>
            <surname>Araújo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Borg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bokiniec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Corre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Jeanjean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hannachi</surname>
          </string-name>
          ,
          Ġ. Mallia, G. Matas,
          <string-name>
            <given-names>M.</given-names>
            <surname>Saki</surname>
          </string-name>
          ,
          <article-title>CLEF Workshop JOKER: Automatic Wordplay and Humour Translation</article-title>
          , in: M. Hagen, S. Verberne, C. Macdonald, C. Seifert, K. Balog, K. Nørvåg, V. Setty (Eds.),
          <source>Advances in Information Retrieval</source>
          , volume
          <volume>13186</volume>
          of Lecture Notes in Computer Science, Springer International Publishing, Cham,
          <year>2022</year>
          , pp.
          <fpage>355</fpage>
          -
          <lpage>363</lpage>
          . doi:10.1007/978-3-030-99739-7_45.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Regattin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Borg</surname>
          </string-name>
          , É. Mathurin,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Corre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Araújo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hannachi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Boccou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Digue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Damoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Jeanjean</surname>
          </string-name>
          ,
          <article-title>Overview of JOKER@CLEF 2022: Automatic wordplay and humour translation workshop</article-title>
          , in: A. Barrón-Cedeño, G. D. S. Martino, M. D. Esposti, F. Sebastiani, C. Macdonald, G. Pasi, A. Hanbury, M. Potthast, G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction: Proceedings of the Thirteenth International Conference of the CLEF Association (CLEF 2022)</source>
          , volume
          <volume>13390</volume>
          of Lecture Notes in Computer Science, Springer, Cham,
          <year>2022</year>
          , pp.
          <fpage>447</fpage>
          –
          <lpage>469</lpage>
          . doi:10.1007/978-3-031-13643-6_27.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[23] K. Papineni, S. Roukos, T. Ward, W.-J. Zhu, <article-title>BLEU: A method for automatic evaluation of machine translation</article-title>, in: <source>Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics</source>, <year>2002</year>, pp. <fpage>311</fpage>–<lpage>318</lpage>. doi:10.3115/1073083.1073135.</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[24] G. Faggioli, N. Ferro, P. Rosso, D. Spina (Eds.), <source>Working Notes of CLEF 2025: Conference and Labs of the Evaluation Forum</source>, CEUR Workshop Proceedings, CEUR-WS.org, <year>2025</year>.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>