<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
<journal-title>Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Overview of the CLEF 2024 JOKER Task 1: Humour-aware Information Retrieval</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Liana Ermakova</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anne-Gwenn Bosser</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tristan Miller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adam Jatowt</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Austrian Research Institute for Artificial Intelligence (OFAI)</institution>
          ,
          <addr-line>Vienna</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of Manitoba</institution>
          ,
          <addr-line>Winnipeg</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>École Nationale d'Ingénieurs de Brest, Lab-STICC CNRS UMR 6285</institution>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Université de Bretagne Occidentale</institution>
          ,
          <addr-line>HCTI</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>University of Innsbruck</institution>
          ,
          <country country="AT">Austria</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <abstract>
        <p>This paper presents the details of Task 1 of the JOKER-2024 Track, where the aim is to retrieve short humorous texts from an underlying document collection. The intended use case for this task is to search for a joke on a specific topic. This can be useful for humour researchers in the humanities, for second-language learners as a learning aid, for professional comedians as a writing aid, and for translators who might need to adapt certain jokes to other cultures. For this task, we provided a collection consisting of 61,268 documents, where 4,492 texts were humorous. Ten teams submitted 26 runs in total for this task.</p>
      </abstract>
      <kwd-group>
        <kwd>information retrieval</kwd>
        <kwd>wordplay</kwd>
        <kwd>puns</kwd>
        <kwd>computational humour</kwd>
        <kwd>wordplay detection</kwd>
        <kwd>test collection</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>This paper presents Task 1 of the CLEF 2024 JOKER track, which addresses the retrieval of short humorous texts and, in particular, instances of wordplay. The intended use case is to search for a joke on a specific topic. For example, a search query of “math” would mean that the goal is to find math jokes, while the query “Tom” would mean that the goal is to find jokes about some person or entity named Tom.</p>
      <p>
        The test collection was built based on the English corpora constructed within the previous edition of
the CLEF JOKER track:
• JOKER 2023 Task 1 - pun detection [
        <xref ref-type="bibr" rid="ref2 ref8 ref9">2, 8, 9</xref>
        ];
• JOKER 2023 Task 2 - pun location and interpretation [
        <xref ref-type="bibr" rid="ref10 ref2 ref9">2, 10, 9</xref>
        ];
• JOKER 2023 Task 3 - pun translation [
        <xref ref-type="bibr" rid="ref11 ref2 ref9">2, 11, 9</xref>
        ].
      </p>
      <p>This year, ten of the 22 active JOKER teams submitted 26 runs for Task 1, out of the 103 runs submitted to the track overall (see run statistics in Table 1).</p>
      <p>This paper presents an overview of our data preparation process in Section 2. In Section 3, we
describe the participants’ runs, and we present the analysis of their results in Section 4. We provide
some concluding remarks in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Dataset</title>
      <p>
        The data for this task extends that which was originally used for JOKER-2023’s tasks on wordplay detection in English [
        <xref ref-type="bibr" rid="ref2 ref8 ref9">8, 2, 9</xref>
        ]. Those texts were annotated according to whether they are humorous; we supplement this data with texts from Task 3 of JOKER-2023 [
        <xref ref-type="bibr" rid="ref11 ref2 ref9">11, 2, 9</xref>
        ], used for humour translation, and with some new wordplay instances. We further extended the data with text passages collected from non-humorous sources, as well as with data that was automatically generated in relation to the queries. Specifically, the non-humorous data related to the queries was obtained from the following sources:
• negative examples from the JOKER corpus;
• Wikipedia extracts returned for the queries, retrieved with the Wikipedia Python package (https://pypi.org/project/wikipedia/) and split into sentences to form non-humorous text instances;
      </p>
      <p>• Descriptions of queries generated by Meta’s Llama 2 with 7B parameters [18].</p>
      <p>In total, we provided our participants with a collection consisting of 61,268 documents, of which 4,492 texts are humorous. The latter encompass 3,507 texts from JOKER 2023 and 985 new wordplay instances. The remaining 56,776 texts are non-humorous: 4,954 negative examples taken from the JOKER 2023 wordplay detection corpus, 12,523 texts generated using Llama 2, and 39,299 sentences from Wikipedia extracts. The texts were typically one or two sentences long and were released in the form of JSON files.</p>
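      <p>To illustrate the collection-building step described above, the following sketch splits a Wikipedia extract into short non-humorous document instances. This is an illustrative approximation only: the actual pipeline fetched extracts with the Wikipedia Python package, and the docid numbering and naive sentence splitter here are hypothetical.

```python
import re

def extract_to_documents(extract, start_id):
    """Split a Wikipedia extract into one-sentence non-humorous
    document instances. Naive splitter: break on ., !, or ?
    followed by whitespace (hypothetical, not the track's exact code)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", extract) if s.strip()]
    return [{"docid": str(start_id + i), "text": s}
            for i, s in enumerate(sentences)]

docs = extract_to_documents(
    "Muscone is an organic compound. It is primarily responsible "
    "for the characteristic odor of musk.", start_id=2)
```
</p>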
      <p>
        For creating the set of queries, we harnessed data from CLEF 2023 JOKER Task 2 – Pun Location
and Interpretation [
        <xref ref-type="bibr" rid="ref10 ref2 ref9">10, 2, 9</xref>
        ], and in particular, the locations of wordplay in texts, i.e. words or phrases
carrying multiple meanings. In CLEF 2023 JOKER Task 2, puns were either homographic (identical
spelling, as in I used to be a banker but I lost interest) or heterographic (i.e. exploiting paronymy, as propane/profane in When the church bought gas for their annual barbecue, proceeds went from the sacred to the propane). To expand the queries, we used the semantic annotations of pun locations (pun
interpretation), i.e. pairs of lemmatized word sets, containing the synonyms (or, if absent, hypernyms)
of the two words involved in the pun, excluding any that share the same spelling as the pun. The lists
of query expansions were manually checked. The document was deemed humorous and relevant to the
query if it came from the positive examples of the JOKER corpus and included the query term or its
expansions.
      </p>
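      <p>The labelling rule above can be sketched as follows. This is a simplified illustration: the function name and tokenisation are ours, and the real annotation worked on lemmatized word sets with manually checked expansion lists.

```python
def qrel_label(text, is_joker_positive, query, expansions):
    """Return 1 if the document counts as a relevant wordplay instance
    for the query: it must come from the positive (humorous) JOKER
    examples and contain the query term or one of its expansions."""
    terms = {query.lower()} | {e.lower() for e in expansions}
    tokens = set(text.lower().replace(",", " ").replace(".", " ").split())
    return int(is_joker_positive and bool(terms & tokens))

# Matches via the expansion "banker"; the second text is a negative example.
a = qrel_label("I used to be a banker but I lost interest", True, "bank", {"banker"})
b = qrel_label("Good laws have sprung from bad customs.", False, "law", {"laws"})
```
</p>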
      <p>Twelve queries with their judgments (qrels) were created for training or validating participants’ systems. Then, another 45 queries were created as a test set. (We also included all the training-set queries in the test input file; however, they are excluded from the resulting scores.) For all 57 queries (combined test and training), 11,831 documents were deemed topically relevant. We considered a document to be topically relevant to a given query if it contained the term from this query, or its synonyms or hypernyms. Among the topically relevant documents, 1,730 were considered to be humorous. The descriptive statistics of relevant humorous texts are given in Table 2, while Figure 1 presents the histogram of the number of relevant humorous texts per query. The average number of relevant humorous texts per query is 30, while the median is 18 texts.</p>
      <sec id="sec-2-1">
        <title>2.1. Evaluation Measures</title>
        <p>When it comes to evaluation measures, a set of standard information retrieval metrics was used:
map – mean average precision, i.e. the mean of the average precision scores over all queries;
ndcg – normalised discounted cumulative gain: the gain of each document is based on its relevance, discounted logarithmically by its position in the ranking and normalised over the ideal ranking;
P1, P5, P10 – precision, i.e. the ability of a system to present only relevant items, at different numbers of top-ranked results;
R5, R10, R100, R1000 – recall, measuring the ability of systems to find all or many relevant items, at different numbers of top-ranked results;
bpref – binary preference, a sum-based metric showing how many relevant documents are ranked before irrelevant documents;</p>
        <p>MRR – mean reciprocal rank, the average of the multiplicative inverse of the rank of the first correct answer, over a sample of queries.
We used the PyTerrier platform [19, 20] implementation of these metrics.</p>
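        <p>For intuition, the headline measures can be computed directly; the sketch below implements average precision, P@k, and a binary-gain nDCG for a single query. This is a minimal illustration only: the official scores were produced with PyTerrier, and the example run and relevance set are invented.

```python
import math

def average_precision(ranking, relevant):
    # Mean of the precision values at each rank where a relevant
    # document is retrieved; 0 if the query has no relevant documents.
    hits, precisions = 0, []
    for rank, d in enumerate(ranking, start=1):
        if d in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

def precision_at_k(ranking, relevant, k):
    return sum(d in relevant for d in ranking[:k]) / k

def ndcg(ranking, relevant):
    # Binary gain, log2 position discount, normalised by the ideal DCG.
    dcg = sum(1.0 / math.log2(r + 1)
              for r, d in enumerate(ranking, start=1) if d in relevant)
    ideal = sum(1.0 / math.log2(r + 1)
                for r in range(1, min(len(relevant), len(ranking)) + 1))
    return dcg / ideal if ideal else 0.0

run = ["51135", "27260", "591"]   # one system's ranking for one query
rel = {"51135", "591"}            # relevant (humorous) docids
```
</p>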
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Input Format</title>
        <p>2.2.1. Document collection
We provide the training and test data in a JSON format with the following fields:
docid a unique document identifier
text the text of the instance, which may or may not contain wordplay</p>
        <p>Input example:
[
  {
    "docid": "51135",
    "text": "I’ve inherited a fortune, said Tom, willfully"
  },
  {
    "docid": "591",
    "text": "My name is Will, I’m a lawyer."
  },
  {
    "docid": "1",
    "text": "Good laws have sprung from bad customs."
  },
  {
    "docid": "2",
    "text": "The musical score to Topsyturveydom does not survive, but amateur productions in recent decades have used newly composed scores or performed the work as a non-musical play."
  },
  {
    "docid": "3",
    "text": "The organic compound primarily responsible for the characteristic odor of musk is muscone."
  }
]
2.2.2. Queries
qid a unique query identifier from the input file
query the search query</p>
        <p>The train and test queries are also JSON files. Input example:
[
  {"qid": "qid_train_1", "query": "steps"},
  {"qid": "qid_train_3", "query": "math"},
  {"qid": "qid_train_4", "query": "Tom"}
]
2.2.3. Qrels
Finally, we provide training/validation data in the format of JSON qrels files with the following fields:
qid a unique query identifier from the query input file
docid a unique document identifier from the corpus
qrel an indication that the document docid is relevant to the query qid and is a wordplay instance
Example of a qrel file:
[
  {
    "qid": "qid_train_0",
    "docid": "27260",
    "qrel": 0
  },
  {
    "qid": "qid_train_0",
    "docid": "591",
    "qrel": 1
  },
  {
    "qid": "qid_train_0",
    "docid": "51135",
    "qrel": 1
  }
]</p>
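        <p>Since all three files are plain JSON arrays of records, they can be loaded and sanity-checked with the standard library alone. This is a sketch: the field checks are our own, not part of the track tooling.

```python
import json

def load_records(json_text, required_fields):
    """Parse a JSON array of records and verify that every record
    carries the expected fields."""
    records = json.loads(json_text)
    for rec in records:
        missing = required_fields - rec.keys()
        if missing:
            raise ValueError(f"record {rec} lacks fields {missing}")
    return records

queries = load_records(
    '[{"qid": "qid_train_1", "query": "steps"},'
    ' {"qid": "qid_train_4", "query": "Tom"}]',
    {"qid", "query"})
qrels = load_records(
    '[{"qid": "qid_train_0", "docid": "591", "qrel": 1}]',
    {"qid", "docid", "qrel"})
```
</p>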
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Output Format</title>
        <p>
          We required results to be provided in a JSON format with the following fields:
run_id run ID starting with &lt;team_id&gt;_&lt;task_id&gt;_&lt;method_used&gt;, e.g. UBO_task_1_TFIDF
manual flag indicating whether the run is manual (0 or 1)
qid a unique query identifier from the input file
docid an identifier of the document retrieved from the corpus for the query qid
rank the rank of the retrieved document
score the normalised document relevance score (on a [0, 1] scale)
For each query, the maximum allowed number of distinct documents (docid field) is 1000. A sample output file is as follows:
[
  {
    "run_id": "team1_task_1_TFIDF",
    "manual": 0,
    "qid": "qid_train_0",
    "docid": "27260",
    "rank": 1,
    "score": 0.97
  },
  {
    "run_id": "team1_task_1_TFIDF",
    "manual": 0,
    "qid": "qid_train_0",
    "docid": "591",
    "rank": 2,
    "score": 0.8
  },
  {
    "run_id": "team1_task_1_TFIDF",
    "manual": 0,
    "qid": "qid_train_1",
    "docid": "27261",
    "rank": 1,
    "score": 0.7
  }
]
        </p>
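        <p>A conforming run file can be produced mechanically from raw retrieval scores; the sketch below normalises scores into [0, 1], assigns ranks starting from 1, and truncates to 1000 documents per query. The min–max normalisation is one simple choice, not a track requirement.

```python
def to_run(run_id, scored, max_docs=1000):
    """Convert {qid: [(docid, raw_score), ...]} into output records:
    ranks start at 1, raw scores are min-max normalised into [0, 1],
    and at most max_docs documents are kept per query."""
    records = []
    for qid, pairs in scored.items():
        pairs = sorted(pairs, key=lambda p: p[1], reverse=True)[:max_docs]
        lo, hi = min(s for _, s in pairs), max(s for _, s in pairs)
        for rank, (docid, s) in enumerate(pairs, start=1):
            score = (s - lo) / (hi - lo) if hi > lo else 1.0
            records.append({"run_id": run_id, "manual": 0, "qid": qid,
                            "docid": docid, "rank": rank,
                            "score": round(score, 4)})
    return records

run = to_run("team1_task_1_TFIDF",
             {"qid_train_0": [("27260", 7.2), ("591", 3.1)]})
```
</p>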
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Participants’ Approaches</title>
      <p>In total, ten teams submitted 26 runs (see run statistics in Table 1). The approaches used by the
participating teams are as follows:
• The jokester team [12] provided a single run based on an approach that uses TF–IDF for feature
weighting and a Logistic Regression classifier.
• The Arampatzis team (which submitted no system description paper) provided ten runs, testing a range of models such as TF–IDF, LSTM, Random Forest, XGBoost, LightGBM (Light Gradient-Boosting Machine), SVM, Decision Tree, Gaussian Naive Bayes, KNN, and neural nets.
• The run submitted by the LIS team [13] was based on the T5 transformer model, query processing, expanding terms with their synonyms collected from WordNet, choosing the optimal tokenisation method for queries and documents, and then selecting the best threshold for the similarity score.</p>
      <p>Finally, a pre-trained model was applied to filter texts with puns.
• The Frane team used fine-tuned BERT models for estimating humorousness, together with well-known retrieval models such as BM25. The team submitted one run.
• The Dajana&amp;Kathy team processed text using stemming, lemmatisation, and stop-word removal, and employed TF–IDF and BM25, together with a fine-tuned BERT model, to produce their run.
• The AB&amp;DPV team [14] used TF–IDF to rank humorous texts within the collection for their run.
• The RubyAiYoungTeam’s run was submitted without any description of the employed method.
• The Petra&amp;Regina team [15] submitted a single run employing logistic regression with TF–IDF
vectorised documents and queries and iterative relevance scoring.
• The Tomislav&amp;Rowan team [16] employed logistic regression with TF–IDF vectorised
documents to create a single run.
• The UAms team [17] provided two runs based on BM25 and BM25+RM3 using default settings.</p>
      <p>Two further UAms runs employ neural cross-encoder reranking of the latter runs, based on zero-shot application of an MS MARCO-trained ranker. The last four UAms runs are based on two trained versions of the SimpleT5 model, one with a batch size of 6 and the other with a batch size of 8, and on a BERT model trained using LoRA.</p>
      <p>Note that we do not detail the zero-scored runs, nor the runs with problems that we could not resolve.</p>
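      <p>Several of the submissions above start from plain TF–IDF ranking; the sketch below shows that baseline in its simplest form, TF–IDF weighting with cosine similarity. This is illustrative only: it is not any participant’s actual code, and it omits the classifiers and humour filtering the teams added on top.

```python
import math
from collections import Counter

def tfidf_rank(query, docs):
    """Rank documents by TF-IDF cosine similarity to the query."""
    tokenised = {d["docid"]: d["text"].lower().split() for d in docs}
    n = len(docs)
    df = Counter(t for toks in tokenised.values() for t in set(toks))
    idf = {t: math.log(n / df[t]) for t in df}

    def vec(tokens):
        tf = Counter(tokens)
        return {t: tf[t] * idf.get(t, 0.0) for t in tf}

    qv = vec(query.lower().split())
    scores = []
    for docid, toks in tokenised.items():
        dv = vec(toks)
        dot = sum(qv[t] * dv.get(t, 0.0) for t in qv)
        norm = (math.sqrt(sum(v * v for v in qv.values()))
                * math.sqrt(sum(v * v for v in dv.values())))
        scores.append((docid, dot / norm if norm else 0.0))
    return sorted(scores, key=lambda p: p[1], reverse=True)

docs = [{"docid": "1", "text": "good laws have sprung from bad customs"},
        {"docid": "2", "text": "my name is will and i am a lawyer"},
        {"docid": "3", "text": "math jokes add up"}]
ranking = tfidf_rank("math jokes", docs)
```
</p>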
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <sec id="sec-4-1">
        <title>4.1. Evaluation on Test Data</title>
        <p>The majority of submitted runs had some issues – for example, some runs were submitted on only part
of the data, and there were runs for the training data only. We tried to solve these problems whenever
possible.</p>
        <p>In Table 3 we show the main results for participants’ runs on test data. We make the following observations based on the results:
• First, in general, both precision and recall are extremely low. Low precision is due to the presence of the query terms in non-humorous texts, which the retrieval systems treat as topical relevance. The low recall is probably related to the length of the texts and the fact that in many texts, both humorous and topically relevant, the query terms do not appear.
• The runs based on RM3 pseudo-relevance-feedback query expansion outperform the BM25 baselines.
• Cross-encoder rerankers do not exhibit better performance than the baseline models.
• Filtering trained on the wordplay detection task improved systems’ results considerably.
• Simple solutions, such as those combining TF–IDF and Logistic Regression, remain quite competitive.
• Using T5 and BERT language models with RM3 is one of the best approaches in terms of both precision and recall.</p>
        <p>To evaluate the errors produced by the rankers, we compared the results with those obtained using topical relevance alone, disregarding the humorousness of the texts. Topical relevance results on the test data are given in Table 4. Traditional models without filtering, such as RM3, TF–IDF, and BM25, showed high performance on topical relevance alone, with map &gt; 0.35 and ndcg &gt; 0.55, but the official results, which take into account the humorousness of the texts, drop to map &lt; 0.1 and ndcg &lt; 0.3. Post-filtering applied with different ranking models improved MAP by up to 50% (cf. UAms_rm3_T5_Filter2 and UAms_Anserini_rm3) according to the official results, but lowered topical relevance. UAms_bm25_BERT_Filter demonstrated high scores according to both the official results and topical relevance alone.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Evaluation on Training Data</title>
        <p>Here we also report the results on the training data in order to provide additional insights into the performance and characteristics of the different approaches. Table 5 shows the performance based on the submitted runs.</p>
        <p>Looking at the results, we can make the following observations:</p>
        <p>• While precision is quite high, recall still poses many challenges, even on training data. This may also support the hypothesis that low recall arises from the absence of query terms in texts, which are relatively short.
• The approaches using decision trees, and relatively standard approaches like SVM and kNN, achieve the best results. However, these results are from a team (Arampatzis) that has not submitted a system description paper, so they should be treated with caution.
• Considering the remaining results, the ordering of the best runs is similar to that of the test data.
• Considering topical relevance alone on the training data (see Table 6), in general, we observe similar trends as in the test set, where unfiltered runs tend to have higher topical relevance alone but a significant drop according to the official ranking.
• The filtered runs exhibited identical scores in both the official ranking and for topical relevance alone, indicating that they retrieved (almost) exclusively humorous documents. However, the relative ranking between filtered and unfiltered runs differs, except for the Arampatzis runs, which are at the top of both tables.
• The topical relevance scores between the test and training data are similar, but the ranking that considers both topical relevance and humour is nearly twice as low on the test data, indicating potential overfitting in humour classification (cf. the UAms runs).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>
        This paper has given an overview and discussed the results of Task 1 of the JOKER-2024 challenge on
the retrieval of humorous texts. Based on the data for wordplay detection and interpretation previously
constructed within the CLEF JOKER track [
        <xref ref-type="bibr" rid="ref10 ref2 ref8 ref9">8, 10, 2, 9</xref>
        ], we constructed a unique reusable test collection
for wordplay retrieval in English.
      </p>
      <p>Ten participating teams submitted 26 runs in total for Task 1. The teams applied diverse methods, ranging from traditional rankers such as TF–IDF, BM25, and RM3, through cross-encoders with and without post-filtering based on classical machine learning methods (logistic regression and SVMs), to more modern models, including SimpleT5 and BERT.</p>
      <p>The participants’ results confirm that humour-oriented information retrieval remains a rather challenging task, with both precision and recall being extremely low. Filtering trained on the wordplay detection task significantly improved the systems’ results. However, while topical relevance scores between the test and training data are similar, the ranking that considers both topical relevance and humour is nearly twice as low on the test data, suggesting potential overfitting in humour classification.</p>
      <p>In general, our results confirm that retrieval models are humour-agnostic and humour detection is
still a challenge for machine learning models and LLMs. Developing new test collections, including
those for non-English languages, could help address this issue.</p>
      <p>Additional information on the track is available on the JOKER website: https://www.joker-project.com/</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This project has received a government grant managed by the National Research Agency under the program “Investissements d’avenir”, integrated into France 2030, with the reference ANR-19-GURE-0001. This track would not have been possible without the great support of numerous individuals. We want to thank in particular the colleagues and the students who participated in data construction and evaluation, especially the students of the Université de Bretagne Occidentale. Please visit the JOKER website for more details on the track.</p>
      <p>[12] H. Baguian, H. N. Ashley, JOKER Track @ CLEF 2024: The Jokesters’ approaches for retrieving, classifying, and translating wordplay, in: G. Faggioli, N. Ferro, P. Galuscakova, A. García Seco de Herrera (Eds.), Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum, 2024.
[13] A. Gepalova, A.-G. Chifu, S. Fournier, CLEF 2024 JOKER Task 1: Exploring pun detection using the T5 transformer model, in: G. Faggioli, N. Ferro, P. Galuscakova, A. García Seco de Herrera (Eds.), Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum, 2024.
[14] D. P. Varadi, A. Bartulović, JOKER 2024 by AB&amp;DPV: From ‘LOL’ to ‘MDR’ using AI models to retrieve and translate puns, in: G. Faggioli, N. Ferro, P. Galuscakova, A. García Seco de Herrera (Eds.), Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum, 2024.
[15] R. Elagina, P. Vučić, Convergential approach in machine learning for effective humour analysis and translation, in: G. Faggioli, N. Ferro, P. Galuscakova, A. García Seco de Herrera (Eds.), Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum, 2024.
[16] R. Mann, T. Mikulandric, CLEF 2024 JOKER Tasks 1–3: Humour identification and classification, in: G. Faggioli, N. Ferro, P. Galuscakova, A. García Seco de Herrera (Eds.), Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum, 2024.
[17] L. Buijs, M. Cazemier, E. Schuurman, J. Kamps, University of Amsterdam at the CLEF 2024 Joker Track, in: G. Faggioli, N. Ferro, P. Galuscakova, A. García Seco de Herrera (Eds.), Working Notes of CLEF 2024 – Conference and Labs of the Evaluation Forum, 2024.
[18] H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale, et al., Llama 2: Open foundation and fine-tuned chat models, arXiv preprint arXiv:2307.09288 (2023).
[19] C. Macdonald, N. Tonellotto, Declarative experimentation in information retrieval using PyTerrier, in: Proceedings of ICTIR 2020, 2020.
[20] C. Van Gysel, M. de Rijke, Pytrec_eval: An extremely fast Python interface to trec_eval, in: SIGIR, ACM, 2018.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. M.</given-names>
            <surname>Palma-Preciado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sidorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>Overview of CLEF 2024 JOKER track on automatic humor analysis</article-title>
          , in:
          <string-name><given-names>L.</given-names> <surname>Goeuriot</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Mulhem</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Quénot</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Schwab</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Soulier</surname></string-name>,
          <string-name><given-names>G. M.</given-names> <surname>Di Nunzio</surname></string-name>,
          <string-name><given-names>P.</given-names> <surname>Galuščáková</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>García Seco de Herrera</surname></string-name>,
          G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF</source>
          <year>2024</year>
          ), Lecture Notes in Computer Science, Springer,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. M. P.</given-names>
            <surname>Preciado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sidorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          , Overview of JOKER - CLEF-2023
          <source>Track on Automatic Wordplay Analysis</source>
          , in:
          <string-name><given-names>A.</given-names> <surname>Arampatzis</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Kanoulas</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Tsikrika</surname></string-name>,
          <string-name><given-names>S.</given-names> <surname>Vrochidis</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Giachanou</surname></string-name>,
          <string-name><given-names>D.</given-names> <surname>Li</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Aliannejadi</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Vlachos</surname></string-name>,
          G. Faggioli, N. Ferro (Eds.),
          <source>CLEF'23: Proceedings of the Fourteenth International Conference of the CLEF Association, Lecture Notes in Computer Science</source>
          , Springer,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Regattin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Borg</surname>
          </string-name>
          , <string-name><given-names>É.</given-names> <surname>Mathurin</surname></string-name>,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Corre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Araújo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hannachi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Boccou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Digue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Damoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Jeanjean</surname>
          </string-name>
          , Overview of JOKER@CLEF 2022:
          <article-title>Automatic wordplay and humour translation workshop</article-title>
          , in:
          <string-name><given-names>A.</given-names> <surname>Barrón-Cedeño</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Da San Martino</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Degli Esposti</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Sebastiani</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Macdonald</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Pasi</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Hanbury</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Potthast</surname></string-name>,
          G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction: Proceedings of the Thirteenth International Conference of the CLEF Association (CLEF</source>
          <year>2022</year>
          ), volume
          <volume>13390</volume>
          of Lecture Notes in Computer Science, Springer, Cham,
          <year>2022</year>
          , pp.
          <fpage>447</fpage>
          -
          <lpage>469</lpage>
          . doi:10.1007/978-3-031-13643-6_27.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>V. M.</given-names>
            <surname>Palma Preciado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sidorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF 2024 JOKER Task 2: Humour classification according to genre and technique</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuscakova</surname>
          </string-name>
          , A. G. Seco de Herrera (Eds.),
          <source>Working Notes of the Conference and Labs of the Evaluation Forum (CLEF</source>
          <year>2024</year>
          ), CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF 2024 JOKER Task 3: Translate puns from English to French</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuscakova</surname>
          </string-name>
          , A. G. Seco de Herrera (Eds.),
          <source>Working Notes of the Conference and Labs of the Evaluation Forum (CLEF</source>
          <year>2024</year>
          ), CEUR Workshop Proceedings, CEUR-WS.org,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Digiovanni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Narita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          ,
          <article-title>Jester 2.0 (demonstration abstract): Collaborative filtering to retrieve jokes</article-title>
          , in:
          <source>Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , SIGIR '99, Association for Computing Machinery, New York, NY, USA,
          <year>1999</year>
          , p.
          <fpage>333</fpage>
          . URL: https://doi.org/10.1145/312624.312770. doi:10.1145/312624.312770.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.</given-names>
            <surname>Friedland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Allan</surname>
          </string-name>
          ,
          <article-title>Joke retrieval: Recognizing the same joke told differently</article-title>
          , in:
          <source>Proceedings of the 17th ACM Conference on Information and Knowledge Management</source>
          , CIKM '08, Association for Computing Machinery, New York, NY, USA,
          <year>2008</year>
          , pp.
          <fpage>883</fpage>
          -
          <lpage>892</lpage>
          . URL: https://doi.org/10.1145/1458082.1458199. doi:10.1145/1458082.1458199.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. M.</given-names>
            <surname>Palma Preciado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sidorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>Overview of JOKER 2023 Automatic Wordplay Analysis Task 1 - pun detection</article-title>
          , in:
          <string-name>
            <given-names>M.</given-names>
            <surname>Aliannejadi</surname>
          </string-name>
          , G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          , M. Vlachos (Eds.),
          <source>Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum</source>
          , volume
          <volume>3497</volume>
          <source>of CEUR Workshop Proceedings</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1785</fpage>
          -
          <lpage>1803</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>The JOKER Corpus: English-French parallel data for multilingual wordplay recognition</article-title>
          , in:
          <source>SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , Association for Computing Machinery, New York, NY,
          <year>2023</year>
          . doi:10.1145/3539618.3591885, to appear.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. M.</given-names>
            <surname>Palma Preciado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sidorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>Overview of JOKER 2023 Automatic Wordplay Analysis Task 2 - pun location and interpretation</article-title>
          , in:
          <string-name>
            <given-names>M.</given-names>
            <surname>Aliannejadi</surname>
          </string-name>
          , G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          , M. Vlachos (Eds.),
          <source>Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum</source>
          , volume
          <volume>3497</volume>
          <source>of CEUR Workshop Proceedings</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>1804</fpage>
          -
          <lpage>1817</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. M.</given-names>
            <surname>Palma Preciado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sidorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>Overview of JOKER 2023 Automatic Wordplay Analysis Task 3 - pun translation</article-title>
          , in:
          <string-name>
            <given-names>M.</given-names>
            <surname>Aliannejadi</surname>
          </string-name>
          , G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          , M. Vlachos (Eds.),
          <source>Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum</source>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>