<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Humour Classification by Fine-tuning LLMs: CYUT at CLEF 2024 JOKER Lab Subtask Humour Classification According to Genre and Technique</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shih-Hung Wu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yu-Feng Huang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tsz-Yeung Lau</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Chaoyang University of Technology</institution>
          ,
          <addr-line>Taichung</addr-line>
          ,
          <country country="TW">Taiwan, R.O.C</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <fpage>3</fpage>
      <lpage>8</lpage>
      <abstract>
        <p>This paper reports our participation in the CLEF 2024 JOKER lab subtask Humour Classification According to Genre and Technique. The system classifies short humorous texts into six classes: irony, sarcasm, exaggeration, incongruity-absurdity, self-deprecating, and wit-surprise. This year, the CYUT team submitted three runs based on three deep learning models: run 1 is based on a fine-tuned Llama 3 model, run 2 on a fine-tuned RoBERTa model, and run 3 uses the GPT-4 API provided by OpenAI with a zero-shot Chain-of-Thought (CoT) prompt. During the system development phase, our Llama 3 model achieved an accuracy of 89.68%; however, the official result is 69.78%.</p>
      </abstract>
      <kwd-group>
        <kwd>Deep Learning</kwd>
        <kwd>Humour Classification</kwd>
        <kwd>Large Language Models (LLMs)</kwd>
        <kwd>Llama 3</kwd>
        <kwd>GPT-4</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The subtask Humour Classification According to Genre and Technique of JOKER Track @ CLEF 2024
is a multiclass classification task [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The system automatically classifies each given sentence into the
following classes: irony, sarcasm, exaggeration, incongruity-absurdity, self-deprecating and wit-surprise.
      </p>
      <p>The organizers provide manually annotated training and test data from existing corpora, including
the positive examples of the JOKER-2023 pun detection corpus as well as new data.</p>
      <p>
        Humor is a complex and ambiguous emotional concept unique to natural language[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Humorous
language cannot exist independently, as language gains meaning only when accompanied by context,
situation, and cultural background[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Discourse analysis has the capability to interpret humor.
Language itself becomes the subject of humor[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Humor recognition is a challenging issue in natural
language processing (NLP) for several reasons. Firstly, humor often stems from the use of figurative
language, such as irony and sarcasm. Additionally, the sense of humor varies across different cultural
and geographical groups. For instance, someone disinterested in political issues may find it difficult to
understand political jokes. People with different background knowledge will react differently to the
same joke. This variability makes it challenging for NLP researchers to detect humorous content[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        Humor emotion analysis is an intriguing area of study as it reveals alternative ways of expressing
human emotions. When people convey various emotions through their words and actions, it is often
not straightforward but filled with humorous elements. This is where humor emotion analysis becomes
valuable. Previous research has primarily focused on categorizing emotions as positive, negative, or
neutral [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. However, now we aim to delve deeper into the meanings behind humorous emotions in text.
Such research not only helps us better understand the diversity of human emotional expression but also
provides useful insights for the development of natural language processing and emotional intelligence.
      </p>
      <p>Thus, humor emotion analysis is not merely a study of textual emotions but an adventurous journey
into the nature of human humor. This exploration will help us comprehensively understand the
psychological mechanisms behind human speech and behavior, while also bringing more enjoyment
and challenges to our technological advancements.</p>
      <p>In this study, we employ RoBERTa, GPT-4, and Llama 3-8B for humor classification. Llama 3-8B performed the best, achieving a self-test accuracy of 89.68%.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Large language models</title>
        <p>
          Large language models (LLMs), such as GPT-4 [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] and Llama 3 [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ], have garnered attention due to
their outstanding performance on various tasks. These models possess a vast number of parameters
and can adapt to new tasks without additional training, a capability known as "in-context learning."
[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] Recently, the emergence of ChatGPT, particularly its basis on GPT-3.5 [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] and further refinement
through reinforcement learning from human feedback, has drawn significant attention[
          <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
          ].
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Prompt Engineering</title>
        <p>
          Prompt engineering plays a significant role in the fields of artificial intelligence and machine learning[
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
It acts as a communication bridge, especially when using large language models like GPT-3 or GPT-4.
We perform fine-tuning [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] to achieve better results, aiming for more accurate and targeted outputs
[
          <xref ref-type="bibr" rid="ref14 ref15">14, 15</xref>
          ]. This concept is crucial in natural language processing (NLP) as it directly impacts the model’s
performance and output quality. The basic idea of prompt engineering is to guide the model to provide
the desired information or execute complex specific tasks through carefully designed prompts [
          <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
          ].
Without clear instructions, the model might generate inaccurate or completely irrelevant responses. We
can enhance the accuracy of prompts through several known practices, such as precise instructions, role
assignment, giving examples (one-shot, few-shot)[
          <xref ref-type="bibr" rid="ref9">9</xref>
          ], iterative refinement and Chain of Thought (CoT)[
          <xref ref-type="bibr" rid="ref18">18</xref>
          ].
        </p>
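        <p>For illustration, the sketch below combines role assignment, precise instructions, and a CoT cue in a single prompt; the wording and the build_prompt helper are illustrative assumptions, not a prompt taken from our runs.</p>
        <preformat>
# Illustrative sketch: a prompt combining role assignment, precise
# instructions, and a Chain-of-Thought cue, as described above.
ROLE = "You are a Humor Master specialized in classifying humorous text."
TASK = ("Classify the following text into one of the classes: irony, "
        "sarcasm, exaggeration, incongruity-absurdity, self-deprecating, "
        "wit-surprise.")
COT_CUE = "Let's think step by step."

def build_prompt(text):
    # Concatenate role, task, input, and the CoT cue into one prompt.
    return f"{ROLE}\n{TASK}\nText: {text}\n{COT_CUE}"
</preformat>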
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Dataset</title>
      <p>The training dataset for this study was provided by the JOKER organizer and consists of a total of 1,742
entries. The humorous content is categorized into six types: IR (irony) with 210 entries, SC (sarcasm)
with 356 entries, EX (exaggeration) with 125 entries, AID (incongruity-absurdity) with 231 entries, SD
(self-deprecating) with 169 entries, and WS (wit-surprise) with 651 entries. The distribution of the
training data is shown in Figure 1. The test set comprises a total of 722 entries, as illustrated in Figure 2.
The results were evaluated by the JOKER organizer.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Method</title>
      <sec id="sec-4-1">
        <title>4.1. Deep Learning Models</title>
        <sec id="sec-4-1-1">
          <title>4.1.1. RoBERTa</title>
          <p>
            We utilize the enhanced BERT[
            <xref ref-type="bibr" rid="ref19">19</xref>
            ] model, RoBERTa[
            <xref ref-type="bibr" rid="ref20">20</xref>
            ], as our baseline. BERT, which stands for Bidirectional Encoder Representations from Transformers[
            <xref ref-type="bibr" rid="ref19">19</xref>
            ], was originally introduced by Google as an encoder-only transformer[
            <xref ref-type="bibr" rid="ref21">21</xref>
            ]-based model for natural language processing (NLP) tasks. BERT is pre-trained using the Masked Language Model (MLM) and Next Sentence Prediction (NSP) techniques. Unlike word2vec[
            <xref ref-type="bibr" rid="ref22">22</xref>
            ] and GloVe[
            <xref ref-type="bibr" rid="ref23">23</xref>
            ], which do not consider context, BERT leverages contextual information during inference, leading to superior performance[
            <xref ref-type="bibr" rid="ref19">19</xref>
            ]. The RoBERTa paper showed that the BERT model was significantly undertrained[
            <xref ref-type="bibr" rid="ref20">20</xref>
            ]. To address this, its authors implemented several modifications: using larger batches, training the model for a longer duration, dropping the NSP training, training on longer sequences, and dynamically changing the masking pattern applied to the training data[
            <xref ref-type="bibr" rid="ref20">20</xref>
            ]. With the RoBERTa baseline model, we achieve an accuracy of 72.49%.
          </p>
        </sec>
        <sec id="sec-4-1-2">
          <title>4.1.2. GPT-4</title>
          <p>
            In this study, we utilized the GPT-4.0 model with zero-shot prompting and Chain-of-Thought (CoT) prompting to assist with the task. GPT-4, developed by OpenAI, is an advanced natural language processing model built upon its predecessor, GPT-3, with a significantly increased parameter count. This enhancement facilitates a deeper understanding and generation of complex sentence structures, enabling more nuanced responses and better handling of contextual language features such as irony and humor. GPT-4 is trained using autoregressive language modeling on a diverse dataset, allowing it to perform exceptionally across various NLP tasks like translation, summarization, and question answering[
            <xref ref-type="bibr" rid="ref24">24</xref>
            ].
          </p>
          <p>
            The model operates by first converting input text into tokens, which are processed by transformer layers using attention mechanisms to evaluate relevance and context. These mechanisms generate intermediate representations of the data, which are then decoded into human-readable text. GPT-4 incorporates a randomness function influenced by temperature settings and top-k sampling, which dictate the randomness and determinism of the output, enhancing the model's ability to produce contextually appropriate content. This process represents a significant evolution in language model capabilities, setting new benchmarks in language understanding and generation[
            <xref ref-type="bibr" rid="ref24">24</xref>
            ].
          </p>
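          <p>A minimal sketch of issuing such a zero-shot CoT classification call through the OpenAI Python client is shown below; the model name, temperature, and prompt wording are illustrative assumptions rather than the exact settings of our runs.</p>
          <preformat>
# Sketch only: zero-shot CoT classification via the OpenAI chat API.
# Model name, temperature, and prompt wording are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def classify(text):
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # low temperature for near-deterministic labels
        messages=[
            {"role": "system", "content": "You are a Humor Master."},
            {"role": "user", "content": (
                "Classify the text into one of: irony, sarcasm, "
                "exaggeration, incongruity-absurdity, self-deprecating, "
                "wit-surprise. Let's think step by step, then answer "
                "with the class name only.\n" + text)},
        ],
    )
    return response.choices[0].message.content.strip()
</preformat>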
        </sec>
        <sec id="sec-4-1-3">
          <title>4.1.3. Llama 3</title>
          <p>
            Large Language Models (LLMs) are highly capable AI assistants that excel in complex reasoning tasks. They enable interaction with humans through intuitive chat interfaces, which has led to rapid and widespread adoption among the general public[
            <xref ref-type="bibr" rid="ref25">25</xref>
            ]. Many different LLMs are publicly available, such as GPT-4[
            <xref ref-type="bibr" rid="ref7">7</xref>
            ], Mistral 7B[
            <xref ref-type="bibr" rid="ref26">26</xref>
            ], Gemma 7B[
            <xref ref-type="bibr" rid="ref27">27</xref>
            ], and the LLM we utilize in this study, Llama 3.
          </p>
          <p>
            Llama 3[
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] is an open-source LLM utilizing the Transformer[
            <xref ref-type="bibr" rid="ref21">21</xref>
            ] architecture, developed by Meta. The Llama 3 model is available in configurations with 8 billion and 70 billion parameters. Llama 3 models have achieved state-of-the-art (SOTA) performance across a broad range of tasks thanks to extensive pre-training on over 15 trillion tokens, making Llama 3 one of the best-performing open-source models. In this study, we fine-tuned the Llama 3-8B model on a single GPU, utilizing 4-bit quantization with QLoRA[
            <xref ref-type="bibr" rid="ref28">28</xref>
            ] and Unsloth[
            <xref ref-type="bibr" rid="ref29">29</xref>
            ] to reduce GPU RAM usage during training. As a result, the model achieved 89.68% accuracy in self-testing.
          </p>
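          <p>A minimal sketch of this 4-bit QLoRA setup with Unsloth is given below; the LoRA rank, target modules, and sequence length are illustrative placeholders, not the exact hyperparameters of our runs (those are listed in Table 4).</p>
          <preformat>
# Sketch only: 4-bit QLoRA fine-tuning of Llama 3-8B with Unsloth.
# All hyperparameter values here are illustrative placeholders.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # pre-quantized weights
    max_seq_length=2048,
    load_in_4bit=True,  # QLoRA: train LoRA adapters over 4-bit weights
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,             # LoRA rank (placeholder)
    lora_alpha=16,    # LoRA scaling factor (placeholder)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
# The adapters can then be trained with, e.g., trl's SFTTrainer on
# Alpaca-formatted prompts (see Section 5.4.1).
</preformat>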
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. System Development</title>
      <sec id="sec-5-1">
        <title>5.1. Environment</title>
        <p>In our experiment, we utilized an NVIDIA GeForce RTX 3090 GPU with 24 GB of memory. The versions of all packages employed in the experiment are listed in Table 1.</p>
        <p>[Table 1 lists the versions of the Python, PyTorch, CUDA Toolkit, CUDA, and Unsloth packages used in the experiment.]</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. RoBERTa</title>
        <p>To fine-tune the RoBERTa model, we use 80% of the dataset as the training set and 20% as the test set. The hyperparameters we used for fine-tuning are shown in Table 2.</p>
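        <p>A minimal sketch of this fine-tuning setup with the Hugging Face transformers library is shown below; the hyperparameter values are placeholders (Table 2 lists the values we actually used), and train_ds and test_ds stand for the pre-tokenized 80%/20% splits.</p>
        <preformat>
# Sketch only: fine-tuning RoBERTa for six-way humour classification.
# Hyperparameters are placeholders; train_ds/test_ds are assumed to be
# the pre-tokenized 80%/20% splits of the JOKER training data.
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

LABELS = ["IR", "SC", "EX", "AID", "SD", "WS"]
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=len(LABELS))

args = TrainingArguments(
    output_dir="roberta-humour",
    learning_rate=2e-5,              # placeholder
    num_train_epochs=3,              # placeholder
    per_device_train_batch_size=16,  # placeholder
)
trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=test_ds)
trainer.train()
</preformat>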
      </sec>
      <sec id="sec-5-3">
        <title>5.3. GPT-4</title>
        <p>To evaluate the GPT-4 model, we use the entire dataset for self-testing. We found that direct classification did not yield satisfactory results, so we grouped similar types into broader categories before conducting finer classifications. First, we grouped IR and SC into Category C. Then, we divided the remaining five types (C, SD, EX, AID, WS) into two categories: Category A (AID, WS) and Category B (C, SD, EX). We then performed a binary classification within Category A to distinguish between AID and WS. For Category B, we conducted a three-way classification to separate C, SD, and EX. Finally, we performed a binary classification within Category C to differentiate between IR and SC. This approach allowed us to consolidate the results for all six types. The flowchart displayed in Figure 3 illustrates this classification process. The prompts we applied for each step are listed in Tables 13 and 14 in the appendix.</p>
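        <p>The cascade can be expressed as a few nested steps. The schematic sketch below assumes a classify(text, prompt) helper that wraps the GPT-4 call; PROMPT_AB, PROMPT_A, PROMPT_B, and PROMPT_C are hypothetical names for the step prompts reproduced in Appendix A.2.</p>
        <preformat>
# Schematic sketch of the clustering cascade in Figure 3.
# classify() is an assumed helper wrapping the GPT-4 call; the
# PROMPT_* constants stand for the step prompts in Appendix A.2.
def classify_humour(text):
    if classify(text, PROMPT_AB) == "A":  # A = {AID, WS}; B = {C, SD, EX}
        return classify(text, PROMPT_A)   # binary: AID vs. WS
    label = classify(text, PROMPT_B)      # three-way: IR (i.e. C), SD, EX
    if label == "IR":                     # "IR" here denotes category C
        return classify(text, PROMPT_C)   # binary: IR vs. SC
    return label                          # SD or EX
</preformat>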
        <sec id="sec-5-3-1">
          <title>5.3.1. Prompt Design</title>
          <p>First, we assign the model a specialized role to enhance its performance in handling complex tasks within a specific domain. Next, we utilize chain-of-thought (CoT) prompting to reduce model hallucinations and increase the probability of generating reasonable responses. We specify the task clearly, provide category names and definitions, and set output constraints. For example, we limit the output to no more than three tokens and restrict the model from producing responses outside the given requirements or repeating the question.</p>
        </sec>
      </sec>
      <sec id="sec-5-4">
        <title>5.4. Llama 3</title>
        <p>To fine-tune the Llama 3-8B model, we use 80% of the dataset as the training set and 20% as the test set. The hyperparameters we used for fine-tuning are shown in Table 4.</p>
        <sec id="sec-5-4-1">
          <title>5.4.1. Prompt Design</title>
          <p>
            To fine-tune Llama 3, we utilize the Stanford Alpaca format[
            <xref ref-type="bibr" rid="ref30">30</xref>
            ]. The Alpaca format is shown in Table 3. For the instruction, we first tell the model what to do: "Classify the following text into one of the classes." Then, we provide the six classes for classification with explanations: irony, sarcasm, exaggeration, incongruity-absurdity, self-deprecating humor, and wit-surprise. We simply utilize the explanations provided in the official JOKER guideline document here. Based on results from RoBERTa and GPT-4, we discovered that the model struggled to accurately classify irony and sarcasm. Therefore, we added the sentence: "You ought to focus more on classifying irony and sarcasm." Finally, we applied Chain-of-Thought (CoT) prompting[
            <xref ref-type="bibr" rid="ref18">18</xref>
            ] by adding the sentence: "Let's think step by step."
          </p>
          <p>Meanwhile, the sequence following "### Input:" denotes the text in need of classification, while "### Response:" is followed by one of the six classes: irony, sarcasm, exaggeration, incongruity-absurdity, self-deprecating humor, and wit-surprise. During evaluation, we employ the same prompting technique; the only difference is that we refrain from adding any text after "### Response:", allowing the model to generate the response. The prompt elements are shown in Table 5, and a sketch of the prompt construction is given below.</p>
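          <p>A minimal sketch of this prompt construction follows; the instruction text is abbreviated, and the full class definitions are the ones reproduced in Table 5.</p>
          <preformat>
# Sketch: assembling the Alpaca-format prompt used for fine-tuning.
INSTRUCTION = (
    "Classify the following text into one of the classes. "
    "Here are the six types of classes: ... "  # full definitions in Table 5
    "You ought to focus more on classifying irony and sarcasm. "
    "Let's think step by step.")

def alpaca_prompt(text, label=""):
    # During evaluation, label stays empty so the model generates it.
    return (f"### Instruction:\n{INSTRUCTION}\n\n"
            f"### Input:\n{text}\n\n"
            f"### Response:\n{label}")
</preformat>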
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Experiment Result</title>
      <sec id="sec-6-1">
        <title>6.1. Self-Test Result</title>
        <p>Table 5 lists the prompt elements used for fine-tuning Llama 3-8B; the Instruction, Input, and Response fields are reproduced below.</p>
        <p>Instruction: Classify the following text into one of the classes. Here are the six types of classes: Irony - Irony relies on a gap between the literal meaning and the intended meaning, creating a humorous twist or reversal. Sarcasm - Sarcasm involves using irony to mock, criticize, or convey contempt. Exaggeration - Exaggeration involves magnifying or overstating something beyond its normal or realistic proportions. Incongruity-Absurdity - Incongruity refers to unexpected or contradictory elements that are combined in a humorous way, and Absurdity involves presenting situations, events, or ideas that are inherently illogical, irrational, or nonsensical. Self-deprecating - Self-deprecating humor involves making fun of oneself or highlighting one's own flaws, weaknesses, or embarrassing situations in a lighthearted manner. Wit-Surprise - Wit refers to clever, quick, and intelligent humor, and Surprise in humor involves introducing unexpected elements, twists, or punchlines that catch the audience off guard. You ought to focus more on classifying irony and sarcasm. Let's think step by step.</p>
        <p>Input: { text in need of classification from dataset. }</p>
        <p>Response: { one of six classes from dataset: irony, sarcasm, exaggeration, incongruity-absurdity, self-deprecating humor, and wit-surprise. }</p>
        <p>*During the evaluation, leave the "Response" empty.</p>
      </sec>
      <sec id="sec-6-2">
        <title>6.2. Official Result</title>
        <p>
          All of the models were evaluated by the JOKER organizer [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ]. Table 10 presents the official results of each model. The RoBERTa model achieved an accuracy of 18.56%, 0.19 Macro Average Precision (MAP), 0.24 Macro Average Recall (MAR), and 0.21 Macro Average F1-Score (MA-F1). The RoBERTa model showed a significant drop in accuracy compared to our self-test results due to a mistake in our code: the run uploaded for the official evaluation was fine-tuned on extra data containing IR and SC, leading to lower performance than expected. The GPT-4 model with clustering achieved an accuracy of 35.53%, 0.39 MAP, 0.40 MAR, and 0.34 MA-F1, producing results similar to our self-testing.
        </p>
        <p>
          The Llama 3-8B model used for evaluation is the same model fine-tuned with 80% of the dataset. It achieved an accuracy of 69.78%, 0.64 MAP, 0.65 MAR, and 0.64 MA-F1. The Llama 3-8B model exhibited a significant drop in accuracy compared to our self-test results, potentially due to differences in the data distribution between the training and test sets, as shown in Figure 4. However, as seen in Table 9, the model performed exceptionally well on the class AID, even with a small amount of training data. It appears that AID has distinctive features that the model can learn effectively. The model likely overfitted to the training set, impairing its performance on the test set. Balancing the data in the training set may help improve the model's robustness. From the official results, it is evident that the Mistral-7B model from team ORPAILLEUR performed the best overall in humor classification, achieving an accuracy of 76%[
          <xref ref-type="bibr" rid="ref31">31</xref>
          ].
        </p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>7. Discussion &amp; Error Analysis</title>
      <sec id="sec-7-1">
        <title>7.1. Discussion</title>
        <p>
          From the confusion matrices of RoBERTa and GPT-4 (Figure 7 and Figure 6), it is evident that the models struggled to distinguish between the categories AID and WS, as well as IR and SC. One reason for this difficulty is the existence of two distinct types of irony: verbal irony and situational irony. Verbal irony, often referred to as sarcasm, implies that IR includes SC[
          <xref ref-type="bibr" rid="ref32">32</xref>
          ]. Another reason is that identifying sarcasm in a sentence often requires contextual information[
          <xref ref-type="bibr" rid="ref32">32</xref>
          ]. Meanwhile, Llama 3-8B demonstrated significantly better performance in the areas where RoBERTa and GPT-4 exhibited weaknesses, as shown in Figure 8. GPT-4 with clustering shows a slight improvement compared to the vanilla GPT-4 model, as seen in Figure 6.
        </p>
        <p>From Table 7, it is not hard to see that fine-tuning LLMs is an effective method for humor classification. Llama 3 has significantly better performance compared to GPT-4, with substantial improvements in precision, recall, and F1-score for each class. Although non-tuned LLMs have strong general performance, they might not excel in specialized tasks. Even when fine-tuning LLMs is not feasible, using smaller models like RoBERTa can still achieve acceptable performance.</p>
      </sec>
      <sec id="sec-7-2">
        <title>7.2. Error Analysis</title>
        <sec id="sec-7-2-1">
          <title>7.2.1. Llama 3</title>
          <p>The model may occasionally produce unexpected responses, which can be attributed to the pre-training data. For instance, if the input text is "When negotiating whether to share your french fries, you have quite a few bargaining chips.", the model might respond with "lunch." In self-testing, 12 samples produced unexpected outputs. Additional examples are provided in Table 11.</p>
        <p>
          This is a limitation of fine-tuning generative models such as LLMs. When employing the BERT
model for classification, the [CLS] token is fed into a Multilayer Perceptron (MLP) [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ]. The
model ensures the absence of unexpected output by maintaining a fixed output layer size and employing
the softmax function[
          <xref ref-type="bibr" rid="ref34">34</xref>
          ] to determine the probability of each output.
        </p>
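          <p>The contrast can be made concrete with a small sketch: an encoder classifier maps the [CLS] representation to exactly six logits, so softmax followed by argmax can never leave the label set. The mlp argument below stands for the trained classification head and is an illustrative assumption.</p>
          <preformat>
# Sketch: why an encoder-style classifier cannot emit out-of-set labels.
# The [CLS] embedding is mapped to exactly six logits; softmax + argmax
# therefore always selects one of the six predefined classes.
import torch

LABELS = ["IR", "SC", "EX", "AID", "SD", "WS"]

def predict(cls_embedding, mlp):
    logits = mlp(cls_embedding)            # shape (6,): fixed output size
    probs = torch.softmax(logits, dim=-1)  # probability for each class
    return LABELS[int(torch.argmax(probs))]
</preformat>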
          <p>We take an additional step and retry these errors ten more times each. Some of these errors can then be classified into one of the six classes. For instance, consider input text 1, "No longer a female as I refuse to wear heels ever again.": Llama 3-8B gave the unexpected response "twitter", but 1 out of 10 times it gave the response "sarcasm". The same phenomenon occurred with input text 2, "The leopard tried creeping up on the tigers using its camouflage but it was seen.", which received a "wit-surprise" response 1 out of 10 times. Additionally, input text 8, "Doppelherz. The power of the two hearts.", elicited a "wit-surprise" response 8 out of 10 times. These examples are shown in Table 12. Meanwhile, the responses to the other input texts remained unchanged.</p>
          <p>[Table 12: Inputs that produced unexpected responses and the classes obtained on retry: "No longer a female as I refuse to wear heels ever again." (sarcasm); "The leopard tried creeping up on the tigers using its camouflage but it was seen." (wit-surprise); "Doppelherz. The power of the two hearts." (wit-surprise).]</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>8. Conclusion &amp; Future Work</title>
      <sec id="sec-8-1">
        <title>8.1. Conclusion</title>
        <p>In this study, we conducted humor classification using deep learning models (RoBERTa) and LLMs such as Llama 3-8B and GPT-4. The best-performing model was Llama 3-8B, achieving an accuracy of 89.68% in self-testing and 69.78% in the official evaluation through fine-tuning and prompt engineering. We also analyzed some unexpected responses from the LLMs to understand why they occurred.</p>
        <p>In summary, we found that fine-tuning LLMs can be very effective for humor classification. Additionally, we discovered that clustering similar classes allows LLMs to achieve better performance.</p>
      </sec>
      <sec id="sec-8-2">
        <title>8.2. Future Work</title>
        <p>For future work, we observe from the confusion matrix that EX and SD could be grouped into a single category, which may improve overall accuracy. Additionally, the LLM could first score the humor type present in the sentences and then classify based on a set threshold. Furthermore, the clustering method could be applied to Llama 3-8B, which might also result in better performance.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Acknowledgments</title>
      <p>This study was supported by the National Science and Technology Council under the grant number
NSTC 113-2221-E-324-009.</p>
    </sec>
    <sec id="sec-10">
      <title>A. Appendix</title>
      <sec id="sec-10-1">
        <title>A.1. Our fine-tuned Llama 3-8B model for Humor Classification</title>
        <p>The fine-tuned Llama 3-8B model is available on Hugging Face.</p>
      </sec>
      <sec id="sec-10-2">
        <title>A.2. Prompts of GPT-4 with clustering for each step.</title>
        <sec id="sec-10-2-1">
          <title>Step 1: Category A vs. Category B</title>
          <p>As a Humor Master, your task is to identify the type of humor from the following two categories. Take it step by step. This is a multi-category classification task. The aim is to automatically classify text according to the following classes: A, B. There are two humor types. Here are the two types of humour:</p>
          <p>A: These genres are primarily based on unexpected elements or clever twists for humorous effect.</p>
          <p>B: Usually involves exaggerating or distorting reality, or achieving humorous effects by teasing oneself or others.</p>
          <p>###Limit number of words: no more than 3 tokens### ###Please answer directly without restating the question### ###Instructions: For each question, respond using only one of the following abbreviations: A, B. Do not reply with answers other than A, B.###</p>
        </sec>
        <sec id="sec-10-2-2">
          <title>Step 2: WS vs. AID (within Category A)</title>
          <p>As a Humor Master, your task is to identify the type of humor from the following two categories. Take it step by step. This is a multi-classification task. The aim is to automatically classify text according to the following classes: WS, AID. There are two humor types. Here are the two types of humour:</p>
          <p>WS: Includes humor that uses intelligence and wit to elicit laughter through clever language or thought patterns. This type of humor may involve puns, quips, or logical deductions, allowing people to appreciate the author's intelligence and creativity.</p>
          <p>AID: Includes humor that utilizes elements that defy common sense or logic, or combines unrelated things to create a sense of absurdity or incongruity. This type of humor often surprises and confuses people because it goes against our expectations.</p>
          <p>###Limit number of words: no more than 3 tokens### ###Please answer directly without restating the question### ###Instructions: For each question, respond using only one of the following abbreviations: WS, AID. Do not reply with answers other than WS, AID.###</p>
        </sec>
        <sec id="sec-10-2-3">
          <title>Step 3: IR vs. SD vs. EX (within Category B)</title>
          <p>As a Humor Master, your task is to identify the type of humor from the following three categories. Take it step by step. This is a multi-classification task. The aim is to automatically classify text according to the following classes: IR, EX, SD. There are three humor types. Here are the three types of humour:</p>
          <p>IR: Includes irony, which relies on the gap between literal meaning and actual intent to create humor, and sarcasm, which is used specifically to mock, criticize, or express contempt.</p>
          <p>SD: Covers self-deprecating humor that amuses audiences by highlighting personal flaws, weaknesses, or embarrassing situations in a light-hearted way.</p>
          <p>EX: Involves exaggerating something, exaggerating certain features beyond normal or realistic proportions to create a humorous effect.</p>
          <p>###Limit number of words: no more than 3 tokens### ###Please answer directly without restating the question### ###Instructions: For each question, respond using only one of the following abbreviations: IR, SD, EX. Do not reply with answers other than IR, SD, EX.###</p>
        </sec>
        <sec id="sec-10-2-4">
          <title>Step 4: IR vs. SC (within Category C)</title>
          <p>As a Humor Master, your task is to identify the type of humor from the following two categories. Take it step by step. This is a multi-classification task. The aim is to automatically classify text according to the following classes: IR, SC. There are two humor types. Here are the two types of humour:</p>
          <p>IR: Focuses on exploiting the discrepancy between literal meaning and actual intent to create humor, often by reversing or twisting expectations.</p>
          <p>SC: Focuses on the use of irony to ridicule, criticize, or express contempt, often with a certain sharpness or criticalness.</p>
          <p>###Limit number of words: no more than 3 tokens### ###Please answer directly without restating the question### ###Instructions: For each question, respond using only one of the following abbreviations: IR, SC. Do not reply with answers other than IR, SC.###</p>
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          , A.-G. Bosser,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Thomas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. M. P.</given-names>
            <surname>Preciado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sidorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>Clef 2024 joker lab: Automatic humour analysis</article-title>
          , in: N.
          <string-name>
            <surname>Goharian</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Tonellotto</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>He</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Lipani</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>McDonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Macdonald</surname>
          </string-name>
          , I. Ounis (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>36</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>Performance analysis on deep learning models in humor detection task</article-title>
          ,
          <source>in: 2022 International Conference on Machine Learning and Knowledge Engineering (MLKE)</source>
          , IEEE,
          <year>2022</year>
          . URL: http://dx.doi.org/10.1109/MLKE55170.2022.00023. doi:10.1109/mlke55170.2022.00023.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <article-title>Discourse analysis on humor</article-title>
          ,
          <source>in: 2011 2nd International Conference on Artificial Intelligence, Management Science and Electronic Commerce (AIMSEC)</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>5002</fpage>
          -
          <lpage>5005</lpage>
          . doi:10.1109/AIMSEC.2011.6011180.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <article-title>Discourse analysis on humor</article-title>
          ,
          <source>in: 2011 2nd International Conference on Artificial Intelligence, Management Science and Electronic Commerce (AIMSEC)</source>
          , IEEE,
          <year>2011</year>
          . URL: http://dx.doi.org/10.1109/AIMSEC.2011.6011180. doi:10.1109/aimsec.2011.6011180.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kong</surname>
          </string-name>
          ,
          <article-title>Classification and regression combined model on accessing humor score with explanatory feature</article-title>
          ,
          <source>in: 2022 International Conference on Machine Learning and Knowledge Engineering (MLKE)</source>
          , IEEE,
          <year>2022</year>
          . URL: http://dx.doi.org/10.1109/MLKE55170.2022.00050. doi:10.1109/mlke55170.2022.00050.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Sayyed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. Rushikesh</given-names>
            <surname>Sugave</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Paygude</surname>
          </string-name>
          ,
          <string-name>
            <surname>B. N Jazdale</surname>
          </string-name>
          ,
          <article-title>Study and analysis of emotion classification on textual data</article-title>
          ,
          <source>in: 2021 6th International Conference on Communication and Electronics Systems (ICCES)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1128</fpage>
          -
          <lpage>1132</lpage>
          . doi:10.1109/ICCES51350.2021.9489204.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7] OpenAI,
          <source>GPT-4 technical report</source>
          ,
          <year>2024</year>
          . arXiv:2303.08774.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Meta</surname>
          </string-name>
          ,
          <article-title>Introducing Meta Llama 3: The most capable openly available LLM to date - ai</article-title>
          .meta.com, https://ai.meta.com/blog/meta-llama-3/,
          <year>2024</year>
          . [Accessed 29-05-2024].
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <article-title>[9] OpenAI, Language models are few-shot learners</article-title>
          ,
          <year>2020</year>
          . arXiv:2005.14165.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Pitis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ba</surname>
          </string-name>
          ,
          <article-title>Boosted prompt ensembles for large language models</article-title>
          ,
          <year>2023</year>
          . arXiv:2304.05970.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tan</surname>
          </string-name>
          , G. Liu,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Connecting large language models with evolutionary algorithms yields powerful prompt optimizers</article-title>
          ,
          <year>2024</year>
          . arXiv:2309.08532.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Sahoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. K.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mondal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chadha</surname>
          </string-name>
          ,
          <article-title>A systematic survey of prompt engineering in large language models: Techniques and applications</article-title>
          ,
          <year>2024</year>
          . arXiv:2402.07927.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Hayashi</surname>
          </string-name>
          , G. Neubig,
          <article-title>Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing</article-title>
          ,
          <year>2021</year>
          . arXiv:2107.13586.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Axmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pryzant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Khani</surname>
          </string-name>
          , Prompt engineering a prompt engineer,
          <year>2024</year>
          . arXiv:2311.05661.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ekin</surname>
          </string-name>
          ,
          <article-title>Prompt engineering for ChatGPT: A quick guide to techniques, tips, and best practices</article-title>
          (
          <year>2023</year>
          ). URL: http://dx.doi.org/10.36227/techrxiv.22683919. doi:10.36227/techrxiv.22683919.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>White</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Fu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hays</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sandborn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Olea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gilbert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Elnashar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Spencer-Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <article-title>A prompt pattern catalog to enhance prompt engineering with chatgpt</article-title>
          ,
          <year>2023</year>
          . arXiv:2302.11382.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>X.</given-names>
            <surname>Amatriain</surname>
          </string-name>
          ,
          <article-title>Prompt design and engineering: Introduction and advanced methods</article-title>
          ,
          <year>2024</year>
          . arXiv:2401.14423.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schuurmans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bosma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ichter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Chi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Chain-of-thought prompting elicits reasoning in large language models</article-title>
          ,
          <year>2023</year>
          . arXiv:2201.11903.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          , M.-
          <string-name>
            <given-names>W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , Bert:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <year>2019</year>
          . arXiv:1810.04805.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>Roberta: A robustly optimized bert pretraining approach</article-title>
          ,
          <year>2019</year>
          . arXiv:1907.11692.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Polosukhin</surname>
          </string-name>
          , Attention is all you need,
          <year>2023</year>
          . arXiv:1706.03762.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          , G. Corrado,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <article-title>Efficient estimation of word representations in vector space</article-title>
          ,
          <year>2013</year>
          . arXiv:1301.3781.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          , C. Manning, GloVe:
          <article-title>Global vectors for word representation</article-title>
          , in: A.
          <string-name>
            <surname>Moschitti</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Pang</surname>
          </string-name>
          , W. Daelemans (Eds.),
          <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          ,
          <article-title>Association for Computational Linguistics</article-title>
          , Doha, Qatar,
          <year>2014</year>
          , pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          . URL: https://aclanthology.org/D14-1162. doi:10.3115/v1/D14-1162.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>B.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Langrené</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>Unleashing the potential of prompt engineering: a comprehensive review</article-title>
          ,
          <year>2024</year>
          . arXiv:2310.14735.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>H.</given-names>
            <surname>Touvron</surname>
          </string-name>
          , et al.,
          <article-title>Llama 2: Open foundation and fine-tuned chat models</article-title>
          ,
          <year>2023</year>
          . arXiv:2307.09288.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>A. Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sablayrolles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mensch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bamford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Chaplot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>de las Casas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bressand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lengyel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Saulnier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Lavaud</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-A.</given-names>
            <surname>Lachaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Stock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. L.</given-names>
            <surname>Scao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lavril</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lacroix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. E.</given-names>
            <surname>Sayed</surname>
          </string-name>
          ,
          <article-title>Mistral 7B</article-title>
          ,
          <year>2023</year>
          . arXiv:2310.06825.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <surname>Gemma Team</surname>
          </string-name>
          ,
          <article-title>Gemma: Open models based on Gemini research and technology</article-title>
          ,
          <year>2024</year>
          . arXiv:2403.08295.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>T.</given-names>
            <surname>Dettmers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pagnoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Holtzman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <article-title>QLoRA: Efficient finetuning of quantized LLMs</article-title>
          ,
          <year>2023</year>
          . arXiv:2305.14314.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>D.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. H.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          , Qubitium,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Belkada</surname>
          </string-name>
          ,
          <string-name>
            <surname>Z</surname>
          </string-name>
          , unslothai/unsloth,
          <year>2024</year>
          . URL: https://github.com/unslothai/unsloth.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>R.</given-names>
            <surname>Taori</surname>
          </string-name>
          *,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gulrajani</surname>
          </string-name>
          *,
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          *,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Dubois</surname>
          </string-name>
          *,
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          *,
          <string-name>
            <given-names>C.</given-names>
            <surname>Guestrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. B.</given-names>
            <surname>Hashimoto</surname>
          </string-name>
          ,
          <article-title>Alpaca: A strong, replicable instruction-following model</article-title>
          , https://crfm.stanford.edu/2023/03/13/alpaca.html,
          <year>2023</year>
          . [Accessed 29-05-2024].
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>L.</given-names>
            <surname>Ermakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-G.</given-names>
            <surname>Bosser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. M. P.</given-names>
            <surname>Preciado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Sidorov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jatowt</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF 2024 JOKER track: automatic humour analysis</article-title>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>E.</given-names>
            <surname>Filatova</surname>
          </string-name>
          ,
          <article-title>Irony and sarcasm: Corpus generation and analysis using crowdsourcing</article-title>
          , in:
          <string-name>
            <given-names>N.</given-names>
            <surname>Calzolari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Choukri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Declerck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. U.</given-names>
            <surname>Doğan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Maegaard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mariani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moreno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Odijk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Piperidis</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)</source>
          ,
          European Language Resources Association (ELRA)
          , Istanbul, Turkey,
          <year>2012</year>
          , pp.
          <fpage>392</fpage>
          -
          <lpage>398</lpage>
          . URL: http://www.lrec-conf.org/proceedings/lrec2012/pdf/661_Paper.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>M.-C.</given-names>
            <surname>Popescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Balas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Perescu-Popescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mastorakis</surname>
          </string-name>
          ,
          <article-title>Multilayer perceptron and neural networks</article-title>
          ,
          <source>WSEAS Transactions on Circuits and Systems</source>
          <volume>8</volume>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bridle</surname>
          </string-name>
          ,
          <article-title>Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimation of parameters</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Touretzky</surname>
          </string-name>
          (Ed.),
          <source>Advances in Neural Information Processing Systems</source>
          , volume
          <volume>2</volume>
          , Morgan-Kaufmann,
          <year>1989</year>
          . URL: https://proceedings.neurips.cc/paper_files/paper/1989/file/0336dcbab05b9d5ad24f4333c7658a0e-Paper.pdf.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>