<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>MMU NLP at CheckThat! 2024: Homoglyphs are Adversarial Attacks</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Charlie Roadhouse</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthew Shardlow</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ashley Williams</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
<institution>Manchester Metropolitan University, All Saints Campus</institution>
          ,
          <addr-line>Manchester M15 6BH</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>Work on adversarial attacks is becoming more necessary with the growing amount of misinformation online, as it can bolster future defence techniques. In this paper, we present a character-based approach for 2024 CheckThat! Lab Task 6, InCrediblAE. We pair an importance-based search technique with a homoglyph attack to replace characters with glyphs that are imperceptible to human readers but elicit a change in classification. The model is tested against three separate victim models across five distinct misinformation domains: COVID-19 misinformation, fact-checking, rumour detection, propaganda detection and style-based news bias. Results show that our model has comparable performance to the current state of the art, even exceeding current results in certain cases.</p>
      </abstract>
      <kwd-group>
<kwd>Adversarial Attacks</kwd>
        <kwd>Homoglyphs</kwd>
        <kwd>Character-based Attacks</kwd>
        <kwd>Semantic Preservation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>There is a growing body of research focused on making classification models more robust against
adversarial attacks [5, 6]. Some researchers view the generation of adversarial examples as a data
augmentation task, aiming to train future models on adversarial data to improve their robustness [7].
Others focus on developing new defence methods based on evolving attack strategies [8].</p>
      <p>Adversarial attack strategies can be broadly categorized into three types: character-based, word-based,
and sentence-based attacks.</p>
      <p>Character-based attacks modify individual characters within words using techniques such as insertion,
deletion, replacement, and transposition. This method tends to preserve the original semantics well, as
demonstrated by models like HotFlip [9] and TextFooler [10], though it can often be detected by spell
checkers, making it less robust [11].</p>
<p>Word-based attacks affect entire words, employing techniques such as word removal, replacement, and
insertion. This approach typically preserves semantics by using synonyms of the original word, but the
resulting sentences can sometimes be easily identified by humans due to grammatical inconsistencies.
Examples include BERTattack [12] and BAE [13].</p>
<p>Sentence-based attacks modify sentences either wholly or partially, using techniques like paraphrasing
to generate reworded versions of given phrases [14, 15]. This level of attack can reduce the degree of
semantic preservation that is retained [8].</p>
      <p>A key aspect of adversarial attack methodologies is the search technique used to identify which
parts of the text to attack. Two main techniques are prevalent in research: importance-based and
gradient-based.</p>
<p>Importance-based search techniques generate perturbations only on words that are crucial for the
classifier’s decision. State-of-the-art models like BERTattack [12], BAE [13], and A2T [16] use the
BERT model to mask words in the sequence, ranking the most important words to attack based on
classification scores of these masked sequences.</p>
<p>Gradient-based search techniques, similar to the Fast Gradient Sign Method (FGSM) [17] used in
adversarial image generation, are adapted to text by identifying perturbations that best affect the
model’s gradient [18]. However, this method requires a white-box environment with full access to the
model.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Task Description</title>
      <p>
We participated in Task 6, InCrediblAE, as part of the CheckThat! 2024 lab [
        <xref ref-type="bibr" rid="ref4">19</xref>
]. Task 6 aimed
to generate high-quality adversarial examples of text across five domains: COVID-19 misinformation
(C19), Fact-Checking (FC), Rumour Detection (RD), Propaganda Detection (PR2), and Style-Based News
Bias (HN). The distinguishing feature of this task was its emphasis on producing attacks with minimal
changes to the original text while effectively altering the classification.
      </p>
      <p>
        To evaluate the quality of the generated attacks, the BODEGA framework [
        <xref ref-type="bibr" rid="ref5">20</xref>
        ] was employed. This
framework provided an overall score based on the success of the classification change while also
considering semantic preservation using BLEURT [
        <xref ref-type="bibr" rid="ref6">21</xref>
        ] and character-based similarities calculated using
the Levenshtein distance metric [
        <xref ref-type="bibr" rid="ref7">22</xref>
        ].
      </p>
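As a rough illustration of how these three components interact, the sketch below combines an attack-success indicator with semantic and character-level similarity. The exact weighting and formula are defined by the BODEGA framework itself, so this is an assumed simplification, not the official scoring code; `example_score` and `semantic_sim` are our own illustrative names.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def example_score(flipped, semantic_sim, original, adversarial):
    """Per-example sketch: non-zero only if the attack flipped the label,
    scaled by semantic similarity and normalised character similarity."""
    char_sim = 1 - levenshtein(original, adversarial) / max(len(original), len(adversarial))
    return (1.0 if flipped else 0.0) * semantic_sim * char_sim

score = example_score(True, 1.0, "abcd", "abed")
```

A failed attack scores zero regardless of similarity, which is why success percentage, semantic preservation, and character distance all matter to the overall result.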
<p>Additionally, participants engaged in a manual annotation stage where adversarial sentences were
manually annotated based on their semantic similarity to the original sentences. This stage aimed
to further assess the semantic preservation of the generated attacks and provide insights into the
effectiveness of the methods employed.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
<p>The task of generating adversarial examples can be split into two stages. Stage one involves a method
that searches or evaluates text elements (such as words or characters) to provide a target for
attack. Stage two attacks those identified elements by removing, adding, or replacing them. Both
stages are important to delivering an effective attack, and our approaches to each stage are outlined
below.</p>
      <sec id="sec-4-1">
        <title>4.1. Search Method</title>
<p>We used an importance-based approach to identify candidate words. Our implementation follows a similar
methodology to other popular state-of-the-art approaches [12, 13, 16]. Words are ranked based on their
importance to the model’s classification through the final logit output. No pre-processing was
performed on the text, in order to preserve the original meaning. Masking the words in the sentence is done
by iterating through each word and replacing it with the [MASK] token; each new masked sequence
is then run through the classifier. We then sort this list of words in descending order based on the
ranking and take the top K words that have a high impact on the final logit output compared to the
base classification output. Our implementation is outlined through pseudo-code in Algorithm 1.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Attack Method</title>
<p>We investigated two main attack methods: homoglyph and lexical replacement. Both methods
follow the same overall procedure. A vulnerable word in the sentence is replaced with a
corresponding adversarial word, and this new sentence is submitted to the classifier. If the adversarial
sentence elicits a change in classification, the goal is met and the sentence is returned. If not,
the next candidate word replaces the current one, and the sentence is run through the classifier
again. For each new adversarial word, the distance from the original classification logits to
the new classification logits is measured, and the word with the highest distance is stored. If no perturbation
of the vulnerable word succeeds in changing the classification, the word with the highest gap in
classification is permanently added to the sentence, and the next vulnerable word goes through the
same process. We found that without the addition of these words, the model did not perform
as well in terms of successful classification changes.</p>
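A minimal sketch of this replace-and-check loop, assuming hypothetical helpers: `classify` returns the predicted label, `margin` returns the original-class score, and `candidates` yields perturbed variants of a word. This is an illustration of the shared procedure, not the authors' exact code.

```python
def attack(words, vulnerable_indices, candidates, classify, original_label, margin):
    """Try perturbations per vulnerable word; if none flips the label,
    permanently keep the perturbation with the largest classification gap."""
    base = margin(" ".join(words))
    for i in vulnerable_indices:
        best_word, best_gap = None, 0.0
        for cand in candidates(words[i]):
            trial = words[:i] + [cand] + words[i + 1:]
            text = " ".join(trial)
            if classify(text) != original_label:
                return text                      # goal met: classification changed
            gap = base - margin(text)
            if gap > best_gap:
                best_word, best_gap = cand, gap
        if best_word is not None:
            words[i] = best_word                 # keep best perturbation permanently
    return None                                  # no successful adversarial example

# toy demonstration: the classifier keys on the literal word "good"
result = attack(
    "good movie".split(), [0], lambda w: ["g00d"],
    lambda t: "pos" if "good" in t else "neg", "pos",
    lambda t: 1.0 if "good" in t else 0.0,
)
```

Keeping the best unsuccessful perturbation mirrors the observation above that accumulating these words improves the eventual success rate.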
        <sec id="sec-4-2-1">
          <title>4.2.1. Homoglyphs</title>
<p>Homoglyphs are symbols (glyphs) representing a letter, word, or character that look very similar
to another letter or symbol but hold a different meaning. We can use these homoglyphs to generate
adversarial examples: the attacked sentence reads the same, but the tokeniser treats the altered words differently,
having to split them into multiple unknown tokens. We propose two experiments for the way
that homoglyphs are applied: HG1, randomly attacking characters, and HG2, attacking the starting
and ending characters of a word.</p>
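To make the substitution concrete, here is a small illustrative routine. The mapping below is a tiny assumed sample; the actual experiments drew on a much larger set from the `homoglyphs` library, and `random_homoglyph_word` is our own name for the HG1-style step.

```python
import random

# Tiny illustrative mapping (assumed values); the real experiments used the
# `homoglyphs` PyPI library for far broader coverage.
HOMOGLYPHS = {
    "a": ["\u0430", "\u0251"],  # Cyrillic a, Latin alpha
    "e": ["\u0435"],            # Cyrillic e
    "o": ["\u043e", "\u03bf"],  # Cyrillic o, Greek omicron
}

def random_homoglyph_word(word, rng):
    """HG1-style perturbation: swap a random subset of characters for look-alikes."""
    chars = list(word)
    positions = [i for i, c in enumerate(chars) if c in HOMOGLYPHS]
    if not positions:
        return word  # nothing in this word can be swapped
    for i in rng.sample(positions, rng.randint(1, len(positions))):
        chars[i] = rng.choice(HOMOGLYPHS[chars[i]])
    return "".join(chars)

example = random_homoglyph_word("attack", random.Random(0))
```

The perturbed word renders almost identically to a human reader but tokenises into unknown pieces, which is what drives the classification change.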
<p>For HG1, we attacked the vulnerable word by randomly selecting characters from the word, ensuring
that each combination of characters was generated only once, meaning that a word containing only
homoglyphs could be generated but would appear only once. A maximum of 50 homoglyph variants were
created for each word, with fewer being created if that threshold cannot be reached through combinations of
characters. The list of attacked words is then sorted ascending by the number of changes made. For
generation of homoglyphs, a dictionary was initially created mapping each character to a range of homoglyphs,
but there was not enough variation of homoglyphs to be effective, so we used a Python library2 to
generate a multitude of homoglyphs. The full attack methodology is illustrated in more detail in
Algorithm 2. HG2 followed the same procedure but with a focused strategy of attacking only
the beginning and end characters of the word: the first and second characters of each side, unless the
word was 4 characters or less, in which case only the first and last characters were attacked.
</p>
          <p>Algorithm 2: HG1 Algorithm
Aim: Adversarial attack framework to fool a neural text classifier
Input: text sequence X, label y, victim classifier F(·)
Output: adversarial text sequence X*
W ← Search(X, y) ◁ Search Method (See Algorithm 1)
for each w in W do
    C ← candidate words formed by random homoglyph replacement of characters in w
    C ← Sort(C) ascending by number of characters changed
    Initialise best ← None, bestGap ← 0
    for each c in C do
        X* ← X with w replaced by c
        if F(X*) ≠ y then
            return X*
        else
            gap ← F(X) − F(X*)
            if gap > bestGap then
                best ← c; bestGap ← gap
            end if
        end if
    end for
    if bestGap > 0 then
        X ← X with w replaced by best
    end if
end for
return None</p>
        </sec>
        <sec id="sec-4-2-2">
          <title>4.2.2. Lexical replacement</title>
          <p>
We used large language models (LLMs) to generate substitutions for vulnerable words. We implemented
a zero-shot prompting [
            <xref ref-type="bibr" rid="ref8">23</xref>
] approach, as prompts need to be able to work with multiple types of text.
For model selection, we tested our prompts on both base Llama 8B3 and base Mistral 7B4. The Llama 8B
model was selected for being one of the most capable open-source LLMs available during the
experimentation phase. Mistral was selected due to its good performance on NLP tasks [
            <xref ref-type="bibr" rid="ref9">24</xref>
]. Table 1
provides an outline of the prompts that were evaluated. Prompt 3 was used as a baseline but,
in the end, performed best in terms of both computational time and overall BODEGA score. One
explanation for this could be that the reduced amount of instruction leads to less noise for the model.
2https://pypi.org/project/homoglyphs/
3https://huggingface.co/meta-llama/Meta-Llama-3-8B
4https://huggingface.co/mistralai/Mistral-7B-v0.3
          </p>
<p>Prompt 1: Based on this sentence &lt;sentence&gt; identify the word &lt;word&gt; and replace it with
semantically similar words, picking the best matches for the context of the sentence.</p>
          <p>Prompt 2: Replace a &lt;word&gt; in the given &lt;sentence&gt; with another word that conveys the same
meaning. Understand the meaning of the sentence. Generate potential synonyms for the
word to be replaced. Select the most contextually appropriate synonym.</p>
          <p>Prompt 3: Generate a list of similar words to &lt;word&gt; in the context of this sentence &lt;sentence&gt;
without explanation.</p>
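For illustration, the shortest prompt can be assembled as below. The helper name `build_prompt` is ours, not part of the original pipeline, and the downstream LLM call is omitted since it depends on the serving setup.

```python
def build_prompt(sentence: str, word: str) -> str:
    """Fill the zero-shot substitution prompt (Prompt 3 style) with a
    target sentence and the vulnerable word to be replaced."""
    return (
        f"Generate a list of similar words to {word} in the context of "
        f"this sentence {sentence} without explanation."
    )

prompt = build_prompt("The vaccine was breaking news", "breaking")
```

The returned string would then be sent to the LLM, and each suggested word tried in turn by the attack loop.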
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
<p>Table 2 presents the overall BODEGA score of the experimental models across all domains on the BERT
victim model on testing data. The table provides details on all three aspects that make up the BODEGA
score: success percentage, BLEURT score for semantic preservation, and the Levenshtein distance for
the character score. The TextFooler model is included as a baseline model.</p>
<p>HG1 was the best-performing of all our experimental models. It achieved
the highest semantic similarity scores in four of the five domains (all bar PR2). Comparing HG1 and HG2
shows that the refined HG2 attack technique had a negative impact on performance,
achieving higher semantic preservation scores than HG1 only in the PR2 and C19 domains.
Both homoglyph attack methods have a high Levenshtein distance, which is expected when replacing
characters with potentially unknown characters through homoglyphs.</p>
<p>The Mistral lexical replacement model outperforms the Llama3 model in terms of the success
percentage of classification changes, beating the Llama3 model in all but the RD dataset. Both models’
overall BODEGA scores are hindered by high Levenshtein distance and, in some cases, a lower semantic
score. Both models are able to slightly outperform the HG2 model.</p>
      <sec id="sec-5-1">
        <title>5.1. Submission Results</title>
        <p>To evaluate the final model, we compare our results to other adversarial generation models provided in
the BODEGA framework. Table 3 illustrates five of these models, which achieved the highest BODEGA
score across the five domains and three victim models on the test dataset.</p>
<p>HG1’s performance mirrors that of the other models, exhibiting similar fluctuations across
various domains. In particular, HN and RD yield lower BODEGA scores compared to domains with
shorter sequence lengths, such as C19 and FC. This is evident in the average word count of HN and RD
being 605 and 253, respectively, in contrast to C19’s 27 and FC’s 38 for the testing dataset. It suggests a
common challenge among all models when processing longer content forms. One theory behind this is
that longer content requires more alterations to effect a successful classification change, consequently
diminishing both semantic and character scores as more words and characters are replaced.</p>
<p>HG1 performs well on BERT-based victim models, having the highest BODEGA score in C19, HN, and
FC for the BERT victim and C19, RD, and HN for RoBERTa. However, the model performs very poorly against
the BiLSTM model. The model also performs poorly on the PR2 data; one theory for this is the shorter
length of these sequences (average word count of 21). It could also show that the homoglyph
attack does not work well on propaganda text. Overall, HG1 is able to outperform TextFooler, another
character-based attack methodology, being outperformed only in PR2 and FC for the BiLSTM model.
Though this could show that homoglyphs are more effective in character-based attacks, it is more
likely to demonstrate that an importance-based search method is more effective for adversarial attacks.
Comparing HG1 to BERTattack (the other model using importance-based search), both models perform
similarly but in different domains.</p>
<p>Table 5 (see Appendix A) presents one example of each domain where the attack was successful on
the same piece of text across the three victim models from the testing dataset. This helps us understand
what the model is doing while also identifying other key features, such as discrepancies across
the different domains and victim models. Initial themes show that the models require different strengths
of attack in terms of the number of words that are attacked. Some words appear to be important for the
entire domain, showing up across all victim models. For example, "breaking" is attacked across all three
victim models for the RD domain, with "news" being attacked in just BiLSTM and RoBERTa. This shows that "breaking"
is an important word for rumour detection classification, which correlates with what rumour detection
is trying to do, as classifiers would classify based on language that draws people in.</p>
<p>Parts of this reflect the overall BODEGA score presented in Table 3. For instance, HG1’s FC
performance on the BiLSTM model was lower than both BERT-based models and also lower than the
other state-of-the-art models. When inspecting the number of changes required to change classification,
BiLSTM (14) requires 12 more than BERT (2) and RoBERTa (2). However, this pattern does not hold
throughout, as HG1’s best-performing victim model for the HN domain is the BERT model, yet
the text shows that it requires the most changes to bring about a change in classification.</p>
<p>Table 4 illustrates HG1’s performance in terms of runtime compared to the other state-of-the-art
models, running the testing data on the RoBERTa victim model. Overall, it shows that HG1 follows
the same patterns in performance, with the models taking longer to attack some domains than
others. It also shows that the importance-based search techniques employed in models such as
HG1 and BERTattack perform better on RD data than the others.</p>
        <p>
Finally, in the context of the other competing teams that submitted to InCrediblAE, our homoglyph
model placed 5th out of the 6 teams on the overall BODEGA score. However, in the
manual evaluation stage, our model scored the second highest for semantic preservation [
          <xref ref-type="bibr" rid="ref4">19</xref>
]. Overall,
this shows that the models that were better at semantic preservation for human readers were less effective in
terms of BODEGA score, though their sentences read closer to the original than those of the
high-scoring models.
        </p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and Future Work</title>
<p>With this contribution, our main focus was the use of homoglyphs as a viable technique for
successful adversarial attacks, with a specific emphasis on minimising the amount of semantic preservation
lost. We also show that these homoglyph replacement techniques can not only compete with but exceed
state-of-the-art adversarial attacks, showing that a randomised attack on characters can be more effective
than a targeted character attack in terms of success rates. Future work will consist of integrating more
character-based methodologies to identify a more efficient way of replacing characters. Even though
our random attack pattern was successful in changing classification, a more refined approach could
help lower the character distance score while increasing semantic preservation.
</p>
      <p>[4] N. Chattopadhyay, A. Goswami, A. Chattopadhyay, Adversarial attacks and dimensionality in text classifiers, 2024. arXiv:2404.02660.
[5] D. Pruthi, B. Dhingra, Z. C. Lipton, Combating adversarial misspellings with robust word recognition, in: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019, pp. 5582–5591.
[6] N. Papernot, P. McDaniel, X. Wu, S. Jha, A. Swami, Distillation as a defense to adversarial perturbations against deep neural networks, in: 2016 IEEE Symposium on Security and Privacy (SP), 2016, pp. 582–597. doi:10.1109/SP.2016.41.
[7] X. Liu, S. Dai, G. Fiumara, P. De Meo, An adversarial training method for text classification, Journal of King Saud University - Computer and Information Sciences 35 (2023) 101697. URL: https://www.sciencedirect.com/science/article/pii/S1319157823002513. doi:10.1016/j.jksuci.2023.101697.
[8] L. Huber, M. A. Kühn, E. Mosca, G. Groh, Detecting word-level adversarial text attacks via SHapley additive exPlanations, in: Proceedings of the 7th Workshop on Representation Learning for NLP, Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 156–166. URL: https://aclanthology.org/2022.repl4nlp-1.16. doi:10.18653/v1/2022.repl4nlp-1.16.
[9] J. Ebrahimi, A. Rao, D. Lowd, D. Dou, HotFlip: White-box adversarial examples for text classification, in: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2018, pp. 31–36.
[10] D. Jin, Z. Jin, J. T. Zhou, P. Szolovits, Is BERT really robust? A strong baseline for natural language attack on text classification and entailment, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, 2020, pp. 8018–8025.
[11] E. Jones, R. Jia, A. Raghunathan, P. Liang, Robust encodings: A framework for combating adversarial typos, in: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Online, 2020, pp. 2752–2765. URL: https://aclanthology.org/2020.acl-main.245. doi:10.18653/v1/2020.acl-main.245.
[12] L. Li, R. Ma, Q. Guo, X. Xue, X. Qiu, BERT-Attack: Adversarial attack against BERT using BERT, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020, pp. 6193–6202.
[13] S. Garg, G. Ramakrishnan, BAE: BERT-based adversarial examples for text classification, in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, 2020, pp. 6174–6181. URL: https://aclanthology.org/2020.emnlp-main.498. doi:10.18653/v1/2020.emnlp-main.498.
[14] M. Iyyer, J. Wieting, K. Gimpel, L. Zettlemoyer, Adversarial example generation with syntactically controlled paraphrase networks, in: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), Association for Computational Linguistics, New Orleans, Louisiana, 2018, pp. 1875–1885. URL: https://aclanthology.org/N18-1170. doi:10.18653/v1/N18-1170.
[15] Y. Zhang, J. Baldridge, L. He, PAWS: Paraphrase adversaries from word scrambling, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 1298–1308.
[16] J. Y. Yoo, Y. Qi, Towards improving adversarial training of NLP models, in: Findings of the Association for Computational Linguistics: EMNLP 2021, 2021, pp. 945–956.
[17] I. J. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, in: International Conference on Learning Representations (ICLR), 2015.
[18] C. Guo, A. Sablayrolles, H. Jégou, D. Kiela, Gradient-based adversarial attacks against text transformers, in: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 5747–5757.</p>
    </sec>
    <sec id="sec-7">
      <title>A. Additional Tables</title>
<p>(Tables with per-domain examples for the BERT, BiLSTM, and RoBERTa victim models; table content not recoverable from the extracted text.)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Del Vicario</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bessi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zollo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Scala</surname>
          </string-name>
          , G. Caldarelli,
          <string-name>
            <given-names>H. E.</given-names>
            <surname>Stanley</surname>
          </string-name>
          , W. Quattrociocchi,
          <article-title>The spreading of misinformation online</article-title>
          ,
          <source>Proceedings of the national academy of Sciences</source>
          <volume>113</volume>
          (
          <year>2016</year>
          )
          <fpage>554</fpage>
          -
          <lpage>559</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sharma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ruchansky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , Y. Liu,
          <article-title>Combating fake news: A survey on identification and mitigation techniques</article-title>
          ,
          <source>ACM Transactions on Intelligent Systems and Technology (TIST) 10</source>
          (
          <year>2019</year>
          )
          <fpage>1</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Kondamudi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. R.</given-names>
            <surname>Sahoo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Chouhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Yadav</surname>
          </string-name>
          ,
          <article-title>A comprehensive survey of fake news in social networks: Attributes, features, and detection approaches</article-title>
          ,
          <source>Journal of King Saud University - Computer and Information Sciences</source>
          <volume>35</volume>
          (
          <year>2023</year>
          )
101571. URL: https://www.sciencedirect.com/science/article/pii/S1319157823001258. doi:10.1016/j.jksuci.2023.101571.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>A.</given-names>
            <surname>Barrón-Cedeño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Alam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chakraborty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Elsayed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Przybyła</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Struß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Haouari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hasanain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ruggeri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Suwaileh</surname>
          </string-name>
          ,
<article-title>The CLEF-2024 CheckThat! lab: Check-worthiness, subjectivity, persuasion, roles, authorities, and adversarial robustness</article-title>
, in:
          <string-name><given-names>N.</given-names> <surname>Goharian</surname></string-name>,
          <string-name><given-names>N.</given-names> <surname>Tonellotto</surname></string-name>,
          <string-name><given-names>Y.</given-names> <surname>He</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Lipani</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>McDonald</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Macdonald</surname></string-name>,
          <string-name><given-names>I.</given-names> <surname>Ounis</surname></string-name> (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer Nature Switzerland, Cham,
          <year>2024</year>
          , pp.
          <fpage>449</fpage>
          -
          <lpage>458</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>P.</given-names>
            <surname>Przybyła</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shvets</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Saggion</surname>
          </string-name>
          ,
          <article-title>Verifying the robustness of automatic credibility assessment</article-title>
          ,
          <year>2023</year>
. arXiv:2303.08032.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>T.</given-names>
            <surname>Sellam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Parikh</surname>
          </string-name>
          ,
          <article-title>BLEURT: Learning robust metrics for text generation</article-title>
          , in:
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>7881</fpage>
          -
          <lpage>7892</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>V. I.</given-names>
            <surname>Levenshtein</surname>
          </string-name>
          , et al.,
          <article-title>Binary codes capable of correcting deletions, insertions, and reversals</article-title>
          ,
          in:
          <source>Soviet Physics Doklady</source>
          , volume
          <volume>10</volume>
          , Soviet Union,
          <year>1966</year>
          , pp.
          <fpage>707</fpage>
          -
          <lpage>710</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kojima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Reid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Matsuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Iwasawa</surname>
          </string-name>
          ,
          <article-title>Large language models are zero-shot reasoners</article-title>
          , arXiv e-prints (
          <year>2022</year>
          ) arXiv:2205.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>A. Q.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sablayrolles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Mensch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Bamford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. Singh</given-names>
            <surname>Chaplot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>de las Casas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Bressand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lengyel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Lample</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Saulnier</surname>
          </string-name>
          , et al.,
          <article-title>Mistral 7B</article-title>
          , arXiv e-prints (
          <year>2023</year>
          ) arXiv:2310.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>K. W.</given-names>
            <surname>Church</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kordoni</surname>
          </string-name>
          ,
          <article-title>Emerging trends: Sota-chasing</article-title>
          ,
          <source>Natural Language Engineering</source>
          <volume>28</volume>
          (
          <year>2022</year>
          )
          <fpage>249</fpage>
          -
          <lpage>269</lpage>
          . doi:10.1017/S1351324922000043.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>S.</given-names>
            <surname>Ren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Che</surname>
          </string-name>
          ,
          <article-title>Generating natural language adversarial examples through probability weighted word saliency</article-title>
          , in:
          <string-name>
            <given-names>A.</given-names>
            <surname>Korhonen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Traum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Màrquez</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics</source>
          , Association for Computational Linguistics
          , Florence, Italy,
          <year>2019</year>
          , pp.
          <fpage>1085</fpage>
          -
          <lpage>1097</lpage>
          . URL: https://aclanthology.org/P19-1103. doi:10.18653/v1/P19-1103.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>