<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SINAI at CheckThat! 2024: Stealthy Character-Level Adversarial Attacks Using Homoglyphs and Iterative Search</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>José Valle-Aguilera</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto J. Gutiérrez-Megías</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Salud María Jiménez-Zafra</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>L. Alfonso Ureña-López</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eugenio Martínez-Cámara</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Department, SINAI, CEATIC Universidad de Jaén</institution>
          ,
          <addr-line>Campus Las Lagunillas, 23071, Jaén</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper describes the participation of the SINAI research group in Task 6 of the CheckThat! Lab 2024 at CLEF. For this task, a language model adversarial attack scenario using the BODEGA framework is presented. Given this scenario, participants have to develop an Adversarial Attack Model Class, that is, a system that attacks a given text by modifying it, confusing the model and causing an incorrect prediction. We propose an adversarial attack experiment using search algorithms based on iterative search. Specifically, we propose a stealthy heuristic based on a character-level attack, replacing a letter (character) with its homoglyph. Our system obtained 99% meaning preservation in the manual evaluation phase of this task, making it the stealthiest proposal presented.</p>
      </abstract>
      <kwd-group>
        <kwd>Adversarial Attack</kwd>
        <kwd>Homoglyph</kwd>
        <kwd>Iterative Search</kwd>
        <kwd>Stealth Attack</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        With the rise of language models and their application to different tasks and tools, new needs have
arisen regarding the security, robustness and reliability of these models. In a scenario where language
models are used to classify and label given data, for example, spam detection, a malicious actor could
modify the model’s input data, which is known as an untargeted adversarial attack, or its internal
architecture, which is called a targeted adversarial attack, to modify the behavior of the model stealthily,
without the user noticing. These attacks reduce the trustworthiness of the model, since its performance
is significantly degraded [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        This paper presents the details and results of different types of adversarial attacks developed by
the SINAI team for Task 6 of the CheckThat! lab [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] at CLEF 2024 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ][
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The goal of this task is to
develop an Adversarial Attack Model Class for attacking the input data of various language models
trained for a binary classification task, changing the label of the input data via transformations. These
transformations must be general attacks, not tailored to one model, because there are three
different victim models, BERT [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] , BiLSTM [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], and RoBERTa-base [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], the last one
being unknown to the participants. In addition, these adversarial attacks need to be stealthy [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ],
        that is, they have to maintain semantic and character similarity between the provided data and
the result of the attack, called an adversarial example. These attacks should be carried out with as few
transformations as possible while keeping the text as intact as possible. This is measured using the
BODEGA framework [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], a system that provides metrics for semantic and character similarity. The
data for this task consists of five datasets on different classification tasks: Propaganda Detection (PR),
Fact-Checking (FC), Covid-19 (C19), Style-based News Bias Assessment (HN) and Rumour Detection
(RD).
      </p>
      <p>For the development of our Adversarial Attack Class, we use different search algorithms along with
character-based attacks, called heuristics, to determine which one (or which combination) offers better
performance at reversing the label of the input data. We define character-based attacks as modifications,
deletions, or additions of a character in a given string. Among them, we define heuristics that
replace a letter with its homoglyph and that delete simple characters such as punctuation marks or symbols.
Additionally, to discover the tokens we need to attack to change the label, we have developed two
search algorithms based on brute force. These search algorithms select each token iteratively and attack
it with a specified heuristic; in our case, we select a homoglyph attack because it is the stealthiest to
the human eye. Once we have the n most important tokens, we attack those strings with a search space,
in this case, using character-level homoglyph modifications. With this approach, we obtained 99%
meaning preservation in the manual evaluation phase of the competition, making ours the stealthiest
proposal presented in this task.</p>
      <p>The rest of the paper is organized as follows: In Section 2 we describe the data used in this task. In
Section 3 we define our solution, explaining all the contemplated proposals and the final attack used.
We present the results of our proposal in Section 4. We conduct an error analysis in Section 5. Finally,
the conclusion can be found in Section 6.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Datasets</title>
      <p>The datasets provided by the BODEGA framework are the ones used for this task. The BODEGA
framework provides a set of 5 datasets from different contexts:
• Fact-Checking (FC): This is the most advanced way in which human experts can verify the credibility
of a given text, by assessing the veracity of the claims it includes against a knowledge
base. It deals with Natural Language Inference (NLI) in the field of encyclopedic knowledge and
newsworthy events [10].
• Style-based News Bias Assessment (HN): Assessing news credibility is usually based on three
factors: writing style, veracity, and context. Training data includes statistical information to
recognize sources with known credibility [11].
• Propaganda Detection (PR): Detecting propaganda in a text does not necessarily imply that
something is false; it also involves detecting the persuasion techniques used in this type of journalism.</p>
      <p>This corpus belongs to SemEval 2020 Task 11, annotated by practitioners on 371 articles [12].
• Rumour Detection (RD): A rumour is information that spreads among people even if it does
not originate from a reliable source. Not all rumours are false; some are later
confirmed by legitimate sources [13].
• Covid-19 (C19): With the COVID-19 pandemic, the creation of conspiracy theories and
disinformation became an increasingly critical problem. Conspiracy theories about COVID-19
find more support among the general public than misinformation about treatment and
transmissibility [14].</p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposal description</title>
      <p>The proposal presented for this task consists of different search algorithms, an attack heuristic and
several constraints:
• Search Algorithms: Search systems that find tokens in a target text to attack. These
algorithms are based on brute force, with two variants: one stores information about previous
attacks and the other does not.
• Heuristics: Attack algorithms that modify one or multiple elements of the input data. For this
task, we have focused on homoglyph attacks.
• Constraints: Restrictions to take into account when attacking. These constraints can change or
limit the attacks to make them less time-consuming or improve their performance.</p>
      <p>Each of these approaches is explained in detail below, and the proposal that combines them is
presented at the end.</p>
      <sec id="sec-3-1">
        <title>3.1. Search Algorithms</title>
        <p>A search algorithm is defined as the method to select the tokens to attack in order to change the
predicted label. In our proposal, we base these search algorithms on brute force, trying every
single token from the start of the text to its end. Two types of brute force attack are designed,
one keeping “memory” of the previously attacked tokens, and the other without “memory”. Figure 1
shows how the search algorithm and attack heuristic work together.</p>
        <p>In the case of the Non-Memory Search Algorithm, the system selects the first token of the text and applies the
attack heuristic. Then, the system predicts the label of the modified text, called the adversarial example. If
the label remains the same, the attack was unsuccessful and the search algorithm selects the next token
to attack from the unmodified text, until the label changes to its goal or we reach the end of the text. If
the label changed to its goal, the attack was successful and we proceed to the next
input text, starting from the beginning. If we reach the end of the text without changing
the label, we have failed to create an adversarial example and move on to the next one.</p>
        <p>In the case of the Memory Search Algorithm, the process is similar to the previous one, with one
single change. When an attack is unsuccessful, the search algorithm selects the
next token while keeping the modification from the previous iteration, memorizing the changes made to the text. With this approach,
we “store” the changes made in the adversarial example and attack more tokens with every query we
make to the model.</p>
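        <p>The two search loops described above can be sketched as follows. This is a minimal illustration, not the competition code: `predict` stands in for a query to the victim model, `attack_token` for the heuristic of Section 3.2, and all names are assumptions.</p>
```python
from typing import Callable, List, Optional

def brute_force_search(tokens: List[str],
                       predict: Callable[[str], int],
                       attack_token: Callable[[str], str],
                       target_label: int,
                       keep_memory: bool) -> Optional[str]:
    """Iterate over tokens left to right, attacking one token per query.

    With keep_memory=False each attempt perturbs a single token of the
    ORIGINAL text; with keep_memory=True every unsuccessful perturbation
    is kept, so later queries carry all previous changes."""
    working = list(tokens)
    for i in range(len(tokens)):
        candidate = list(working if keep_memory else tokens)
        candidate[i] = attack_token(candidate[i])
        adversarial = " ".join(candidate)
        if predict(adversarial) == target_label:
            return adversarial          # label flipped: attack succeeded
        if keep_memory:
            working = candidate         # store the change for the next query
    return None                         # reached end of text: attack failed
```
        <p>The memory variant succeeds on inputs where no single-token change is enough, at the cost of accumulating more visible modifications.</p>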
        <p>Now that the method for selecting a token to attack has been defined, it is time to establish the
heuristics and attacks implemented in the Adversarial Attack class.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Heuristic</title>
        <p>
          We select a character-level technique as the type of perturbation to use in the search space. This
type of attack is more interesting because, for both the model and the human eye, it is more difficult to
identify these perturbations, through the use of imperceptible characters or small modifications [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>The heuristic implemented for this task is a homoglyph-based attack. It is a character-based
modification of the input string that does not take into account the meaning or structure of the target tokens.
For the development of our system, we focused on character-level attacks due to their stealthiness.</p>
        <p>The words to be attacked are selected by brute force. Modifications are made by sweeping the target
text, attacking each word individually, and checking the success of the modification. For each selected
word, a letter is randomly chosen to be replaced by one of the available homoglyphs, also randomly
selected, since the selection of the homoglyph does not change the final result. This heuristic has two
variations explained in Section 3.4.</p>
        <p>A homoglyph of a character is a symbol that resembles that character, usually a letter of the alphabet.
This homoglyph confuses the language model, which has not been trained with these symbols, causing
it to fail and change the label of the input data. In addition, since the symbol closely resembles the
real letter it replaces, the attack is very stealthy to the human eye.
In order to obtain the homoglyphs of a letter, we use the Homoglyphs [15] Python library. With this
library, we can obtain a list of homoglyphs of a given letter. For this approach, we randomly select
a letter from the token, obtain the list of homoglyphs for this letter, and exchange it for a randomly
selected homoglyph from the list.</p>
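        <p>The substitution step can be sketched as below. The actual system draws its candidates from the Homoglyphs [15] library; here a small hand-picked table of Unicode confusables is used purely for illustration, and the function name is an assumption.</p>
```python
import random

# Illustrative subset of Unicode look-alikes; the real system uses the full
# tables from the `homoglyphs` PyPI package instead of a hand-made dict.
HOMOGLYPHS = {
    "a": ["\u0430", "\u0251"],   # CYRILLIC SMALL A, LATIN SMALL ALPHA
    "e": ["\u0435"],             # CYRILLIC SMALL IE
    "o": ["\u043e", "\u03bf"],   # CYRILLIC SMALL O, GREEK SMALL OMICRON
    "p": ["\u0440"],             # CYRILLIC SMALL ER
}

def homoglyph_attack(token: str, rng: random.Random) -> str:
    """Replace one randomly chosen letter of `token` with a random homoglyph.

    Letters with no known homoglyph (e.g. 'm') are never drawn, mirroring
    the exception handling described in Section 3.3."""
    positions = [i for i, c in enumerate(token) if c.lower() in HOMOGLYPHS]
    if not positions:
        return token                     # nothing attackable: leave unchanged
    i = rng.choice(positions)
    glyph = rng.choice(HOMOGLYPHS[token[i].lower()])
    return token[:i] + glyph + token[i + 1:]
```
        <p>The result is a token of the same length whose rendered appearance is nearly identical, which is precisely what makes the perturbation hard to spot.</p>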
        <p>Another character-based heuristic is the use of invisible characters. Similar to the homoglyphs,
invisible characters are almost impossible to detect by the human eye and can reduce the model’s
precision. Finally, other character-based heuristics are deletion, modification or addition of arbitrary
characters. Normally these heuristics follow a rule-based algorithm or rely simply on random
modifications. However, these heuristics are easier to notice by the human eye than homoglyph attacks.
Additionally, homoglyph attacks do not worsen the character and semantic score as much as other
heuristics. Figure 2 shows examples of these attacks, with the homoglyph attack being the most stealthy
to the human eye.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Constraints</title>
        <p>We do small attack algorithms and restrictions that change the selection of tokens to attack or the
heuristic to be used. Within the constraints, we find:
• Remove simple characters: When attacking a selected token, regardless of the attack heuristic
applied, if the selected token has only one character, that token is removed from the text and it
is marked as attacked. On several occasions, simple characters are key to determining a label,
such as the beginning and end of a quote (" "). Finding a homoglyph is more complicated in these
cases, as it is not a letter. It is therefore considered necessary to remove this type of character.
• Exception management in attacks: When an exception occurs in an attack, the system tries to
recover, normally selecting another letter or word to attack. In the case of a homoglyph attack,
the letter “m” does not have any available homoglyph. In this case, the other letter is randomly
selected until the attack finalizes correctly.</p>
        <p>All these constraints are taken into account in the development of our system and can be easily
activated, deactivated or configured in the Adversarial Attack class variables.</p>
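        <p>As a sketch, the two constraints above might be combined at the attack entry point like this. All names and the retry budget are assumptions, not values from our implementation.</p>
```python
import random

def attack_with_constraints(token: str, homoglyphs: dict, rng: random.Random,
                            max_retries: int = 10) -> str:
    # Constraint 1: single-character tokens (e.g. quotation marks) have no
    # useful homoglyph, so they are deleted outright (empty string) and
    # marked as attacked by the caller.
    if len(token) == 1:
        return ""
    # Constraint 2: exception management. Letters with no homoglyph (such
    # as "m") raise a lookup error, so another letter is drawn at random
    # until the attack finalizes or the retry budget runs out.
    for _ in range(max_retries):
        i = rng.randrange(len(token))
        try:
            glyph = rng.choice(homoglyphs[token[i].lower()])
        except KeyError:
            continue            # no homoglyph for this letter: pick another
        return token[:i] + glyph + token[i + 1:]
    return token                # give up: leave the token unchanged
```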
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Final System</title>
        <p>Now that each approach has been defined, we can describe our final proposal. One of the most important
issues limiting our system is the execution time. Approaches such as using the Memory Search Algorithm
on large texts are too time-consuming. That is why we decided to limit and adapt some attack proposals,
sacrificing some performance in favor of more efficient approaches.</p>
        <p>For short texts (PR, C19 and FC datasets) we define our system as follows:
• Non-memory Search Algorithm: In the development phase, we detected that nearly 20% of the
attacks were successful modifying only one token of the text. In order to make the attack less
time-consuming, a first brute force attack without memory is made, that is, attacking each token
without maintaining the changes made to the text.
• Memory Search Algorithm: If the attack is unsuccessful, we start from the beginning of the text
and initiate a brute force attack with memory, that is, maintaining the previous changes to the
text until the attack succeeds or reaches the end of the text. In the latter case, the attack has been
unsuccessful and we move on to the next text in the dataset.</p>
        <p>For large texts (HN and RD), a different approach is taken, in order to reduce the time
and resources consumed. In this case, only a Non-memory brute force attack is made.</p>
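        <p>Putting the pieces together, the dispatch by text length could look like the following sketch. The 512-character threshold and all names are assumptions for illustration, not values from our system.</p>
```python
from typing import Callable, List, Optional

def attack_text(text: str,
                predict: Callable[[str], int],
                attack_token: Callable[[str], str],
                target_label: int,
                long_text_threshold: int = 512) -> Optional[str]:
    """Short texts get a cheap non-memory pass, then a memory pass; long
    texts get only the non-memory pass to bound victim-model queries."""
    tokens = text.split()
    passes = [False] if len(text) > long_text_threshold else [False, True]
    for keep_memory in passes:
        working = list(tokens)
        for i in range(len(tokens)):
            candidate = list(working if keep_memory else tokens)
            candidate[i] = attack_token(candidate[i])
            adversarial = " ".join(candidate)
            if predict(adversarial) == target_label:
                return adversarial      # label flipped: attack succeeded
            if keep_memory:
                working = candidate     # memory pass keeps the change
    return None                         # no adversarial example found
```
        <p>Skipping the memory pass on long texts trades attack success for a query budget linear, rather than quadratic, in the number of tokens.</p>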
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>We perform the experiments with the test sets provided in the competition, using our two search
algorithms over RoBERTa-base as the victim model. Considering the results of attacking this new model
with a character-level technique, we can conclude that we obtain a favorable result taking into account
the semantic score and the Levenshtein distance (character score). This search space performs best
on the PR2, FC and HN datasets. These datasets have a higher hit rate than the other datasets using a
homoglyph-based search space.</p>
      <p>In general, the search space using homoglyphs works best with small text entries,
although the HN dataset has also performed well compared to the other long-text datasets; this
may be due to the fact that datasets such as RD and PR2, being unbalanced, are more difficult to attack.
Due to the nature of long texts and the use of an iterative search that does not store information, it
cannot compromise the decision of the model on this type of dataset. In short texts, in many cases, by
removing punctuation marks or modifying a single word we are able to change the label of the victim
model in a short time without the need to store information. However, in large texts, such strategies
are not so successful, and using greedy techniques such as iterative search to store the information of
these changes is computationally time-consuming, which in a real detection environment would be
impractical.</p>
      <p>Our final result in the competition is the average BODEGA score of each attack performed on each
dataset, using a different victim model for each one. This means that the final score is the average of
15 executions, our result being an average BODEGA score of 0.3507. This result is low considering, as shown in
Table 2, that we have very disparate results, especially in the long-text datasets compared to the short
ones.</p>
      <p>On the other hand, besides the automatic evaluation, a manual evaluation of a subset of the data
was carried out. Each of the attacked texts was evaluated anonymously by two annotators,
obtaining an average pairwise annotator agreement of 0.59 in Cohen’s Kappa. The results are expressed
as a percentage indicating how much of the meaning of the text was preserved. Our work
achieved a 99% meaning score from the annotators, making us the team with the highest
score in the competition. We designed our attacks with stealth as the priority, which is why we chose character-level attacks.
Among these attacks we selected homoglyphs: as they are very small modifications of the text, they are
almost imperceptible to humans and thus do not change the meaning of the phrase. The results obtained
in the manual evaluation highlight that, when dealing only with short texts, or with the use of other
more sophisticated techniques, character-level attacks, especially homoglyphs, can be difficult
to detect in a real environment.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Error analysis</title>
      <p>After analyzing the results, we found that perturbing a single word can change the
outcome of the model, but in specific cases, such as longer input texts, it may be less successful. The
success ratio is also higher when the dataset is balanced, as is the case for FC and HN. Iterative
search in long texts has not been as successful as expected, so using advanced explainability methods
or employing word- or phrase-based techniques may be more effective in these cases. Many of the
successful modifications made in short texts are due to the removal of punctuation marks.</p>
      <p>We found that the use of search spaces based on brute-force techniques, without storing changes,
can be fast but not as successful. Therefore, it is necessary to identify tokens of importance to the
victim model and attack these targets using attacks such as homoglyphs, as they prove to be sufficiently
stealthy for both the model and humans, and powerful enough to perform label switching.</p>
      <p>One of the most relevant problems is the processing of long texts using these heuristics, because of
the large processing time. One possible solution could be to split the texts to be attacked, in order to
obtain better performance with the memory-based methodology. Another possible solution would
be to identify which words are more important to modify in these longer texts, in order to increase
efficiency.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this paper an Adversarial Attack Model Class is proposed, using iterative search algorithms with a
homoglyph attack. It is worth noting the stealthiness of this attack and the difficulty of detecting
it by an external agent, in addition to the fact that no advanced techniques are needed to attack
widely used language models. Our proposal obtained 99%
meaning preservation across all datasets, making it the most silent Attack Model presented in this
task. Finally, future work could involve using explainability methods to identify the words that most
influence the model's decision. Such methods could be applied to long texts, splitting them into smaller ones and applying
different heuristics, either the iterative ones mentioned above or a mixture of greedy iterative search
and an explainability-based search space. Additionally, we could further refine these attacks to make them
less time-consuming on large texts and try to apply explainability without a penalty in time and resource
consumption.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgements</title>
      <p>This paper has been partially supported by projects CONSENSO (PID2021-122263OB-C21),
MODERATES (TED2021-130145B-I00), SocialTOX (PDC2022-133146-C21) and FedDAP (PID2020-116118GA-I00)
funded by MCIN/AEI/ 10.13039/501100011033 and by the “European Union NextGenerationEU/PRTR”.
The research work conducted by Salud María Jiménez-Zafra has been supported by Action 7 from
Universidad de Jaén under the Operational Plan for Research Support 2023-2024.
[10] J. Thorne, A. Vlachos, C. Christodoulopoulos, A. Mittal, FEVER: a large-scale dataset for fact
extraction and VERification, in: NAACL-HLT, 2018.
[11] J. Kiesel, M. Mestre, R. Shukla, E. Vincent, D. Corney, P. Adineh, B. Stein, M. Potthast, Data for
pan at semeval 2019 task 4: Hyperpartisan news detection, November. type: dataset (2018).
[12] G. Da San Martino, A. Barrón-Cedeño, H. Wachsmuth, R. Petrov, P. Nakov, Semeval-2020 task 11:
Detection of propaganda techniques in news articles, in: Proceedings of the Fourteenth Workshop
on Semantic Evaluation, 2020, pp. 1377–1414.
[13] S. Han, J. Gao, F. Ciravegna, Augmented dataset of rumours and non-rumours for rumour detection,
2019. doi:10.5281/zenodo.3249977.
[14] A. M. Enders, J. E. Uscinski, C. Klofstad, J. Stoler, The different forms of covid-19 misinformation
and their consequences, The Harvard Kennedy School Misinformation Review (2020).
[15] Homoglyphs: get similar letters, convert to ascii, detect possible languages and utf-8 group.,
https://pypi.org/project/homoglyphs/#description, 2022. [Accessed 31-05-2024].</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>N.</given-names>
            <surname>Papernot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>McDaniel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sinha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Wellman</surname>
          </string-name>
          ,
          <article-title>Sok: Security and privacy in machine learning</article-title>
          ,
          <source>in: 2018 IEEE European Symposium on Security and Privacy (EuroSP)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>399</fpage>
          -
          <lpage>414</lpage>
          . doi:10.1109/EuroSP.2018.00035.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Przybyła</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shvets</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. C.</given-names>
            <surname>Sheang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Saggion</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF2024 CheckThat! lab task 6 on robustness of credibility assessment with adversarial examples (InCrediblAE)</article-title>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Przybyła</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          ,
          <article-title>The clef-2024 checkthat! lab: Check-worthiness, subjectivity, persuasion, roles, authorities, and adversarial robustness</article-title>
          , in: N.
          <string-name>
            <surname>Goharian</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Tonellotto</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>He</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Lipani</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>McDonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Macdonald</surname>
          </string-name>
          , I. Ounis (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>449</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Przybyła</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Caselli</surname>
          </string-name>
          , G. Da San Martino,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Piskorski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF-2024 CheckThat! Lab: Check-worthiness, subjectivity, persuasion, roles, authorities and adversarial robustness</article-title>
          , in:
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mulhem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Quénot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schwab</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Di Nunzio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>García Seco de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024)</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Burstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Doran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Solorio</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <publisher-name>Association for Computational Linguistics</publisher-name>
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>Bidirectional LSTM-CRF models for sequence tagging</article-title>
          ,
          <source>arXiv preprint arXiv:1508.01991</source>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Levy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>RoBERTa: A robustly optimized BERT pretraining approach</article-title>
          ,
          <source>CoRR abs/1907.11692</source>
          (
          <year>2019</year>
          ). URL: http://arxiv.org/abs/1907.11692. arXiv:1907.11692.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>N.</given-names>
            <surname>Boucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Shumailov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Anderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Papernot</surname>
          </string-name>
          ,
          <article-title>Bad characters: Imperceptible NLP attacks</article-title>
          , in:
          <source>2022 IEEE Symposium on Security and Privacy (SP)</source>
          , IEEE,
          <year>2022</year>
          , pp.
          <fpage>1987</fpage>
          -
          <lpage>2004</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Przybyła</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shvets</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Saggion</surname>
          </string-name>
          ,
          <article-title>BODEGA: Benchmark for adversarial example generation in credibility assessment</article-title>
          ,
          <source>arXiv preprint arXiv:2303.08032</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>