<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">TurQUaz at CheckThat! 2024: Creating Adversarial Examples using Genetic Algorithm</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Basak</forename><surname>Demirok</surname></persName>
							<email>g.demirok@etu.edu.tr</email>
							<affiliation key="aff0">
								<orgName type="institution">TOBB University of Economics and Technology</orgName>
								<address>
									<settlement>Ankara</settlement>
									<country>Türkiye</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Selin</forename><surname>Mergen</surname></persName>
							<email>s.mergen@etu.edu.tr</email>
							<affiliation key="aff0">
								<orgName type="institution">TOBB University of Economics and Technology</orgName>
								<address>
									<settlement>Ankara</settlement>
									<country>Türkiye</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Bugra</forename><surname>Oz</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">TOBB University of Economics and Technology</orgName>
								<address>
									<settlement>Ankara</settlement>
									<country>Türkiye</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Mucahid</forename><surname>Kutlu</surname></persName>
							<email>mucahidkutlu@qu.edu.qa</email>
							<affiliation key="aff1">
								<orgName type="institution">Qatar University</orgName>
								<address>
									<settlement>Doha</settlement>
									<country key="QA">Qatar</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">TurQUaz at CheckThat! 2024: Creating Adversarial Examples using Genetic Algorithm</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">FB50FA7C0E201E700E50B28D2CB93700</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:52+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Adversarial Examples</term>
					<term>Credibility Assessment</term>
					<term>Natural Language Processing</term>
					<term>Robustness</term>
					<term>0000-0003-1620-2013 (B. Demirok)</term>
					<term>0009-0002-3284-9490 (S. Mergen)</term>
					<term>0009-0003-7145-6726 (B. Oz)</term>
					<term>0000-0002-5660-4992 (M. Kutlu)</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>As we increasingly integrate artificial intelligence into our daily tasks, it is crucial to ensure that these systems are reliable and robust against adversarial attacks. In this paper, we present our participation in Task 6 of the CLEF CheckThat! 2024 lab. We explore several methods, which can be grouped into two categories. The first category uses a genetic algorithm to detect important words and change them via several techniques, such as adding/deleting letters and using homoglyphs. In the second category, we use large language models to generate adversarial attacks. Based on our comprehensive experiments, we pick the genetic algorithm-based model that uses a combination of word splitting and homoglyphs as its text manipulation method as our primary model. We are ranked third based on both the BODEGA metric and manual evaluation.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>With the impressive developments in artificial intelligence (AI) technologies, AI tools have become part of various daily tasks. Therefore, any failure of these models might negatively affect our lives. While it is already challenging to develop robust models that work correctly in the real world, people might also intentionally attempt to deceive these models with adversarial examples <ref type="bibr" target="#b0">[1]</ref>. Thus, it is crucial to develop robust models that are not vulnerable to such attacks.</p><p>In this paper, we present our participation in Task 6 <ref type="bibr" target="#b1">[2]</ref> of the CLEF-2024 CheckThat! Lab <ref type="bibr" target="#b2">[3]</ref>. We explore several approaches to create adversarial examples. Our proposed methods can be grouped into two categories: i) genetic algorithm-based methods and ii) large language model (LLM) based methods. In the genetic algorithm <ref type="bibr" target="#b3">[4,</ref><ref type="bibr" target="#b4">5]</ref> based methods, we first identify the words that need to be manipulated. Next, we apply various text manipulation methods, including adding/removing a letter, shuffling the order of the letters, using homoglyphs of letters, and splitting words by inserting a space character. Regarding LLM-based approaches, we propose three different methods: i) paraphrasing the text via LLAMA3, ii) utilizing LLMs to identify words to be manipulated, and iii) using LLMs to directly create adversarial examples.</p><p>In our experiments, we observe that genetic algorithm-based methods outperform all LLM-based approaches. Among the genetic algorithm-based methods, the one that uses a combination of splitting words and using homoglyphs outperforms the others. Thus, we use this model as our primary model. In the official ranking, we are ranked third based on both the BODEGA score and manual evaluation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Researchers have investigated various adversarial attacks and defense mechanisms for NLP tasks <ref type="bibr" target="#b5">[6]</ref>. Chen et al. <ref type="bibr" target="#b6">[7]</ref> examined backdoor attacks where manipulated training data causes models to fail with specific triggers but perform normally otherwise. Yang et al. <ref type="bibr" target="#b7">[8]</ref> demonstrated that altering a single word embedding can fool sentiment analysis models without affecting clean data results. Kurita et al. <ref type="bibr" target="#b8">[9]</ref> compared various backdoor attacks for different NLP tasks, finding that attack success varies across tasks. Dai et al. <ref type="bibr" target="#b9">[10]</ref> showed that inserting trigger sentences into LSTM-based models is highly effective. In this study, we target already trained models.</p><p>Researchers have also examined the vulnerabilities of NLP models in a black-box setting. The methods explored in prior work can be categorized into three types: 1) character-level changes, where words are modified with different spelling errors <ref type="bibr" target="#b10">[11]</ref>, 2) word-level changes, involving the replacement, removal, or addition of words <ref type="bibr" target="#b11">[12]</ref>, and 3) sentence-level changes, where new sentences or phrases are added, or existing ones are removed or paraphrased <ref type="bibr" target="#b12">[13]</ref>. While our text manipulation techniques are similar to the ones in the literature, we use a genetic algorithm to decide the words to be changed and how to change them. In addition, we explore the utilization of LLMs for adversarial example generation.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Proposed Methods</head><p>In our work, we propose two families of methods: a genetic algorithm-based approach and an LLM-based approach. We explain each of them below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Genetic Algorithm Based Approach</head><p>Some words are more important than others in several NLP tasks. For instance, consider the following statement for the fact-checking task: The capital city of Turkiye is Ankara. If we change the word Ankara to any other city name, the statement becomes false. Therefore, if we can make the word Ankara unreadable for a model, the model is likely to be confused in its prediction. Once an important word is selected, the next question is how to modify it to fool the model. Therefore, our approach consists of two steps: i) detecting the words to be modified and ii) applying the modification method. We explain these two steps in detail below.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.1.">Selecting Words to be Modified</head><p>We develop a genetic algorithm for selecting the words to be modified. Algorithm 1 describes our genetic algorithm. As the first step, we tokenize the input text and create potential mutations to form an initial population [Line 1]. Based on our mutation strategy, we apply mutations to each word with a probability defined by the mutation_rate (0.1). Each candidate's fitness is evaluated based on its ability to deceive the target model [Line 3]. If a candidate changes the label of the input, it receives a fitness score reflecting the modifications made. Otherwise, the candidate receives a 1000-point penalty in addition to the modification cost. Through the selection phase, the most promising candidates are retained, and through genetic operations like crossover and mutation, a new generation of text variants is created [Lines 4-5]. The crossover operation is executed by selecting a random, appropriate point in the token list of the chosen parents, ensuring that the structural integrity of words is maintained. Offspring are then produced by merging segments from each parent up to and beyond this point [Line 5]. As for mutation, it alters each token based on the chosen mutation strategy/strategies, such as homoglyph replacement or various strategic word splits, with a probability defined by the mutation_rate [Line 5]. We explain the mutation strategies in detail in Section 3.1.2. This iterative process continues until a successful adversarial text is generated or the maximum number of generations is reached [Lines 2-7]. If no successful manipulation is achieved, the original text is returned as a fallback [Line 8]. We set the maximum number of generations to 10 and the population size to 20 in all our experiments. Evaluate Fitness: calculate fitness for each candidate.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>4:</head><p>Selection: select the top half of the population by fitness.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>5:</head><p>Crossover and Mutation: apply crossover and mutation on selected parents to create new population.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>6:</head><p>if any candidate changes the label then 7:</p><p>return adversarial text 8: return original_text as fallback</p></div>
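The loop of Algorithm 1 can be sketched in Python as below. The `victim_predict` and per-word `strategy` interfaces are illustrative assumptions, not the paper's actual code; the fitness with the 1000-point penalty, top-half selection, token-boundary crossover, mutation rate of 0.1, population size of 20, and 10 generations follow the description above:

```python
import random

MUTATION_RATE, POP_SIZE, MAX_GENERATIONS, PENALTY = 0.1, 20, 10, 1000

def mutate(tokens, strategy, rate=MUTATION_RATE):
    # Apply the chosen mutation strategy to each token with probability `rate`.
    return [strategy(t) if random.random() < rate else t for t in tokens]

def fitness(candidate, original, label_flipped):
    # Number of modified tokens, plus a large penalty if the label did not flip
    # (lower is better).
    changes = sum(a != b for a, b in zip(candidate, original))
    return changes + (0 if label_flipped else PENALTY)

def crossover(p1, p2):
    # Merge the parents at a random token boundary, keeping words intact.
    cut = random.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def attack(text, victim_predict, strategy, rate=MUTATION_RATE):
    original = text.split()
    target = victim_predict(text)
    population = [mutate(original, strategy, rate) for _ in range(POP_SIZE)]
    for _ in range(MAX_GENERATIONS):
        scored = sorted(
            population,
            key=lambda c: fitness(c, original, victim_predict(" ".join(c)) != target),
        )
        if victim_predict(" ".join(scored[0])) != target:
            return " ".join(scored[0])      # successful adversarial text
        parents = scored[: POP_SIZE // 2]   # selection: keep the top half
        population = [
            mutate(crossover(*random.sample(parents, 2)), strategy, rate)
            for _ in range(POP_SIZE)
        ]
    return text                              # fallback: original text
```

With a toy victim that checks for the literal token "ankara" and a homoglyph strategy, the attack returns a visually identical string that the victim misclassifies.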
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.2.">Mutation Methods</head><p>In this section, we explain the word modification techniques used in our study. Figure 2 shows an example modified sentence for each method. Homoglyph Replacement (HomoglyphRep). Some letters are visually similar, i.e., homoglyphs, but their encodings are different. In this approach, we replace characters with their visually similar counterparts. Figure <ref type="figure" target="#fig_0">1</ref> shows the letters we replaced and the letters used for the replacement. In case there are multiple homoglyphs for a letter, we randomly select one of them. Splitting Words Randomly (Split R ). If a word is split into two, we can still easily understand the meaning of the corresponding sentence/phrase<ref type="foot" target="#foot_0">1</ref> . However, models like BERT can be highly affected by this kind of typo because it leads to incorrect tokenization and may yield out-of-vocabulary representations for these words. Therefore, in this approach, we split words by adding a space character at a randomly selected index of the word. Splitting Words Meaningfully (Split M ). In this method, instead of splitting a word at a random position, we try to create meaningful words after splitting, e.g., "langu age". With this approach, the model receives a completely irrelevant but valid word, causing large changes in its representation. We use the NLTK word corpus to identify valid words and split accordingly. As an exceptional case, we avoid splitting at the first and last characters unless the word starts or ends with 'a'. Splitting Words Heuristically (Split H ). In this method, we split based on the first and the last character of the targeted word. In particular, if the targeted word starts or ends with the characters 'a' or 'i', we split it from the beginning or the end accordingly. Otherwise, we choose a random index to split. 
Combine HomoglyphRep&amp;Split. In this method, we combine the HomoglyphRep and Split 𝐻 methods. In particular, we randomly choose one of them and apply it.</p><p>Combine HADSSh . In this method, we randomly select one of the following methods: i) HomoglyphRep, ii) Split R , iii) adding random characters into words, iv) deleting a randomly selected letter, and v) shuffling the order of the letters within the targeted word. </p></div>
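The two primitives behind Combine HomoglyphRep&amp;Split can be sketched as follows; the homoglyph map here is a small illustrative Latin-to-Cyrillic subset, not the full table given in Figure 1:

```python
import random

# Illustrative subset of visually identical Latin -> Cyrillic pairs;
# the paper's full mapping is in Figure 1.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "c": "\u0441", "p": "\u0440"}

def homoglyph_rep(word):
    # HomoglyphRep: swap each character that has a look-alike counterpart.
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in word)

def split_random(word):
    # Split_R: insert a space at a random interior index of the word.
    if len(word) < 2:
        return word
    i = random.randrange(1, len(word))
    return word[:i] + " " + word[i:]

def combine_homoglyph_split(word):
    # Combine: randomly pick one of the two strategies and apply it.
    return random.choice([homoglyph_rep, split_random])(word)
```

For example, `homoglyph_rep("peace")` yields a string that renders identically but shares no code points with the original for the mapped letters.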
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">LLM Based Approach</head><p>Since LLMs show impressive performance in text generation and semantic analysis, we investigate how they can be utilized to create adversarial examples. We propose three different methods based on two LLMs, LLAMA 3<ref type="foot" target="#foot_1">2</ref> and Mistral<ref type="foot" target="#foot_2">3</ref>. The details of these methods are explained below. Figure 3 shows an example modified sentence for each LLM-based method.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Paraphrasing with LLMs (LLM Paraphrase )</head><p>In this method, we explore the impact of paraphrasing using LLMs. We use LLAMA3 to paraphrase the texts with the following prompt: "Paraphrase the following sentence with similar length T: S" where S is the input sentence and T represents the token count of the given text.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Identifying Words to be Changed Using LLMs (LLM Identify )</head><p>In this method, we use LLMs to identify the words that convey the most important information for the general meaning of the given text. We directly ask LLAMA 3 to identify the two most important words and then apply the HomoglyphRep method to those words. We use the following prompt for this task: "You are an information extractor and your task is to extract and return the two most important words that convey the meaning of the sentence. You should output the extracted words in the 'word1, word2' format. Sentence: {sentence}"</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Creating Adversarial Examples (LLM Adversarial )</head><p>In this method, we utilize LLMs to create adversarial examples and pre-evaluate their validity with another LLM. Figure <ref type="figure" target="#fig_4">4</ref> shows the process flow of our method. In particular, rather than asking only to paraphrase a given text, we ask LLAMA3 to create an adversarial example for it. Next, we ask Mistral to check whether the generated text is an adversarial attack for the corresponding task. If Mistral verifies it, we use the generated text. Otherwise, we generate another sample using LLAMA3. This generation and verification process continues for at most three iterations. After three attempts, we use LLAMA3's output even though Mistral did not verify it. </p></div>
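The generate-and-verify loop above can be sketched as below, where `generate` and `verify` are hypothetical stand-ins for the LLAMA3 and Mistral calls (the actual system prompts the models rather than calling Python functions):

```python
def llm_adversarial(text, generate, verify, max_attempts=3):
    # Generation/verification loop: ask the generator for an adversarial
    # rewrite and accept it only if the verifier confirms it; after
    # max_attempts, fall back to the last unverified output.
    candidate = text
    for _ in range(max_attempts):
        candidate = generate(text)
        if verify(text, candidate):
            return candidate
    return candidate
```

A stubbed run shows the two exit paths: a verified candidate is returned immediately, and after three rejections the last generation is used anyway.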
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments</head><p>In this section, we present the experimental setup and the results.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Experimental Setup</head><p>Datasets. The dataset shared by the organizers of the lab covers five different binary classification tasks: style-based news bias assessment (HN), propaganda detection (PR2), fact-checking (FC), rumor detection (RD), and COVID-19 misinformation detection (C19). Table <ref type="table" target="#tab_0">1</ref> provides statistics about the datasets. Evaluation Metrics. This task uses the BODEGA score <ref type="bibr" target="#b13">[14]</ref> to evaluate the systems, which is the product of the confusion score (i.e., success rate), semantic score, and character score. The score takes values between 0 and 1. A high score indicates that the model is deceived while the meaning and appearance of the text are preserved, while a low score indicates weak deception that alters the meaning and/or appearance.</p></div>
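As a sketch, the metric is a simple product of the three component scores; this assumes each component is already normalized to [0, 1] as described above:

```python
def bodega_score(success_rate, semantic_score, character_score):
    # BODEGA = confusion score (success rate) x semantic score x character score,
    # so the result also lies in [0, 1].
    for s in (success_rate, semantic_score, character_score):
        if not 0.0 <= s <= 1.0:
            raise ValueError("each component score must be in [0, 1]")
    return success_rate * semantic_score * character_score
```

Because it is a product, a failure on any single component (e.g., a success rate of 0) drives the whole score to 0.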
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Experimental Results</head><p>In this section, we present results for our proposed methods on the test set. Firstly, we present the results for the genetic algorithm-based approaches against the victim models, averaged over all problem domains (Section 4.2.1). Next, we report the results for both LLM-based and genetic algorithm-based approaches on the fact-checking task (Section 4.2.2). Lastly, we present our official results (Section 4.3).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.1.">Results for Genetic Algorithm Based Approach</head><p>We report the genetic algorithm-based results for three target models: BiLSTM, BERT, and Surprise. The results are shown in Table 2. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.2.">Results for LLM Based Approach</head><p>Since obtaining results for LLM-based approaches requires much more computation power and time, we obtained results only for the fact-checking task and for the BERT and BiLSTM models. Table 3 shows the results for the LLM-based approaches, along with the corresponding genetic algorithm-based approaches for comparison. All LLM-based approaches resulted in lower BODEGA scores than all genetic algorithm-based methods. Furthermore, the genetic algorithm-based methods achieve very high success rates and character scores. Combine HomoglyphRep&amp;Split achieves a perfect success rate in both cases, but HomoglyphRep slightly outperforms Combine HomoglyphRep&amp;Split in terms of average BODEGA score.</p><p>Among the LLM-based approaches, our results are mixed. LLM Identify achieves the lowest BODEGA score when the target model is BERT. However, it yields the highest BODEGA score when the target model is BiLSTM. This suggests that BiLSTM models are highly affected by homoglyph attacks. Interestingly, LLM Paraphrase outperforms LLM Adversarial on both target models, even though the prompt in LLM Paraphrase merely asks to paraphrase the text without any intention of creating an adversarial example, whereas LLM Adversarial explicitly asks for one. </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Official Ranking</head><p>We submitted the results of Combine HomoglyphRep&amp;Split as our official run because of its superior performance on average. We achieved a BODEGA score of 0.4859 on average, ranking third among participants.</p><p>Based on the manual annotations for meaning preservation, we achieved 0.62, again ranking third.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>In this paper, we present our participation in Task 6 of the CLEF 2024 CheckThat! Lab. In our study, we explore two different approaches to create adversarial examples. In the first approach, we use a genetic algorithm to detect the words to be changed and to identify text manipulation methods. We investigate various text manipulation methods, such as adding/deleting a letter, using homoglyphs, and shuffling the order of letters within a word. In the second approach, we utilize large language models to create adversarial examples. This involves three different methods: asking LLMs to paraphrase a given text, using LLMs to directly generate adversarial examples, and employing LLMs to identify the words that need to be changed to create adversarial examples.</p><p>In our comprehensive set of experiments, which involve five different tasks, three different target models, and a total of nine methods, we have the following observations. Firstly, genetic algorithm-based methods outperform all LLM-based approaches. Secondly, among the genetic algorithm-based methods, using the combination of homoglyphs and splitting as text manipulation outperforms the other methods. This suggests that we need to be more selective in text manipulation methods instead of using all possible methods. In the official ranking, our primary model is ranked third based on the BODEGA score and the semantic preservation scores, which are based on manual annotations.</p><p>In the future, we plan to extend this work in two different directions. Firstly, although the LLM-based methods did not achieve high performance in this task, we believe that their effectiveness can be improved through several strategies, such as using different prompts and fine-tuning them specifically for this task. 
Secondly, regarding the genetic algorithm-based methods, we plan to explore other text manipulation methods to enhance this model further.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Algorithm 1 :</head><label>1</label><figDesc>Genetic Algorithm Structure 1: Initialize Population: generate initial 𝑝𝑜𝑝𝑢𝑙𝑎𝑡𝑖𝑜𝑛_𝑠𝑖𝑧𝑒 many mutations. 2: for 𝑔𝑒𝑛𝑒𝑟𝑎𝑡𝑖𝑜𝑛 = 1 to 𝑚𝑎𝑥_𝑔𝑒𝑛𝑒𝑟𝑎𝑡𝑖𝑜𝑛𝑠 do 3:</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: The letters with their homoglyphs used in our study. While there are other homoglyph character options available, we select only those that are indistinguishable to the human eye.</figDesc><graphic coords="3,82.70,340.21,427.40,148.80" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Example outputs of genetic algorithm-based attacks. The texts written in red show the modified parts in original sentence: "Emma Watson. Emma Charlotte Duerre Watson ( born 15 April 1990 ) is a French-British actress , model , and activist . Ẽmma Watson is an Italian actress. "</figDesc><graphic coords="4,138.83,118.20,315.12,192.92" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Example outputs of LLM-based attacks. The red letter shows the modified letter by HomoglyphRep method.</figDesc><graphic coords="5,72.00,65.61,453.96,125.32" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: The process flow of our method LLM Adversarial .</figDesc><graphic coords="5,72.00,321.28,451.28,106.36" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Dataset size for each task.</figDesc><table><row><cell>Task</cell><cell cols="3">Train Development Test</cell></row><row><cell>Style-based news bias assessment</cell><cell>60,234</cell><cell>3,600</cell><cell>400</cell></row><row><cell>Propaganda Detection</cell><cell>11,546</cell><cell>3,186</cell><cell>407</cell></row><row><cell>Fact Checking</cell><cell>172,763</cell><cell>10,010</cell><cell>405</cell></row><row><cell>Rumor Detection</cell><cell>8,683</cell><cell>2,070</cell><cell>415</cell></row><row><cell>COVID-19 Misinformation Detection</cell><cell>1,130</cell><cell>-</cell><cell>595</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2 .</head><label>2</label><figDesc>Our observations based on the results are as follows. Firstly, among the split models, Split M has a significantly lower success rate and BODEGA score, but slightly higher semantic and character scores. While Split R and Split H have highly similar scores, Split R slightly outperforms Split H in terms of BODEGA. Secondly, HomoglyphRep achieves the highest semantic score in all cases and the highest character score in two cases, but its BODEGA and success rate are lower than our combination-based methods. Thirdly, Combine HomoglyphRep&amp;Split outperforms Combine HADSSh in all cases in terms of BODEGA, suggesting that we can focus on only a few text manipulation methods instead of covering all. Finally, among all models, Combine HomoglyphRep&amp;Split and Split R have the same average BODEGA score, but Combine HomoglyphRep&amp;Split slightly outperforms Split R in terms of success rate and semantic score.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2</head><label>2</label><figDesc>Average performance of approaches on all problem domains on the test set. Evaluation measures include BODEGA score (B.), success rate (SR), Semantic Score (SemSc), character score (CharSc) and number of queries to the attacked model (Q.)<ref type="bibr" target="#b13">[14]</ref>. The best performing score for each case is written in bold.</figDesc><table><row><cell cols="2">Target Model Method</cell><cell>BODEGA</cell><cell cols="3">SR SemSc CharSc</cell><cell>Q.</cell></row><row><cell></cell><cell>Split R</cell><cell>0.61</cell><cell>0.89</cell><cell>0.70</cell><cell>0.97</cell><cell>97.98</cell></row><row><cell></cell><cell>Split H</cell><cell>0.61</cell><cell>0.89</cell><cell>0.70</cell><cell cols="2">0.97 103.30</cell></row><row><cell>BiLSTM</cell><cell>Split M HomoglyphRep</cell><cell>0.50 0.59</cell><cell>0.69 0.83</cell><cell>0.73 0.74</cell><cell cols="2">0.98 172.11 0.95 149.41</cell></row><row><cell></cell><cell cols="2">Combine HomoglyphRep&amp;Split 0.62</cell><cell>0.91</cell><cell>0.71</cell><cell cols="2">0.95 116.80</cell></row><row><cell></cell><cell>Combine HADSSh</cell><cell>0.59</cell><cell>0.94</cell><cell>0.66</cell><cell cols="2">0.95 118.86</cell></row><row><cell></cell><cell>Split R</cell><cell>0.50</cell><cell>0.75</cell><cell>0.68</cell><cell cols="2">0.96 157.77</cell></row><row><cell></cell><cell>Split H</cell><cell>0.50</cell><cell>0.75</cell><cell>0.68</cell><cell cols="2">0.97 162.17</cell></row><row><cell>BERT</cell><cell>Split M HomoglyphRep</cell><cell>0.44 0.49</cell><cell>0.63 0.73</cell><cell>0.71 0.71</cell><cell cols="2">0.98 203.51 0.93 229.69</cell></row><row><cell></cell><cell cols="2">Combine HomoglyphRep&amp;Split 0.50</cell><cell>0.77</cell><cell>0.68</cell><cell cols="2">0.94 213.93</cell></row><row><cell></cell><cell>Combine HADSSh</cell><cell>0.47</cell><cell>0.77</cell><cell>0.64</cell><cell 
cols="2">0.94 227.60</cell></row><row><cell></cell><cell>Split R</cell><cell>0.37</cell><cell>0.56</cell><cell>0.68</cell><cell cols="2">0.96 236.94</cell></row><row><cell></cell><cell>Split H</cell><cell>0.36</cell><cell>0.55</cell><cell>0.67</cell><cell cols="2">0.96 241.56</cell></row><row><cell>Surprise</cell><cell>Split M HomoglyphRep</cell><cell>0.31 0.36</cell><cell>0.44 0.55</cell><cell>0.70 0.70</cell><cell cols="2">0.98 276.73 0.92 341.65</cell></row><row><cell></cell><cell cols="2">Combine HomoglyphRep&amp;Split 0.36</cell><cell>0.55</cell><cell>0.68</cell><cell cols="2">0.94 331.73</cell></row><row><cell></cell><cell>Combine HADSSh</cell><cell>0.34</cell><cell>0.57</cell><cell>0.64</cell><cell cols="2">0.94 336.89</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3</head><label>3</label><figDesc>Results of LLM-used and GA-used methods on Fact-Checking Task's attack dataset. Evaluation measures include BODEGA score, success rate (SR), semantic score (SemSc), character score (CharSC) and number of queries to the attacked model (Q.)<ref type="bibr" target="#b13">[14]</ref>. Bold scores indicate the highest score of the corresponding target model.</figDesc><table><row><cell cols="2">Target Model Method</cell><cell>BODEGA</cell><cell cols="3">SR SemSc CharSC</cell><cell>Q.</cell></row><row><cell></cell><cell>LLM Paraphrase</cell><cell>0.066</cell><cell>0.380</cell><cell>0.429</cell><cell>0.397</cell><cell>2.380</cell></row><row><cell></cell><cell>LLM Identify</cell><cell>0.027</cell><cell>0.032</cell><cell>0.86</cell><cell>0.96</cell><cell>2.02</cell></row><row><cell></cell><cell>LLM Adversarial</cell><cell>0.056</cell><cell>0.496</cell><cell>0.312</cell><cell>0.326</cell><cell>2.496</cell></row><row><cell></cell><cell>Split R</cell><cell>0.73</cell><cell>0.98</cell><cell>0.76</cell><cell>0.97</cell><cell>69.20</cell></row><row><cell>BERT</cell><cell>Split H</cell><cell>0.72</cell><cell>0.97</cell><cell>0.77</cell><cell>0.97</cell><cell>75.41</cell></row><row><cell></cell><cell>Split M</cell><cell>0.63</cell><cell>0.82</cell><cell>0.78</cell><cell cols="2">0.98 132.55</cell></row><row><cell></cell><cell>HomoglyphRep</cell><cell>0.75</cell><cell>0.97</cell><cell>0.82</cell><cell>0.94</cell><cell>79.85</cell></row><row><cell></cell><cell cols="2">Combine HomoglyphRep&amp;Split 0.74</cell><cell>1.00</cell><cell>0.78</cell><cell>0.95</cell><cell>70.70</cell></row><row><cell></cell><cell>Combine HADSSh</cell><cell>0.69</cell><cell>0.98</cell><cell>0.74</cell><cell>0.95</cell><cell>99.42</cell></row><row><cell></cell><cell>LLM 
Paraphrase</cell><cell>0.070</cell><cell>0.404</cell><cell>0.426</cell><cell>0.402</cell><cell>2.404</cell></row><row><cell></cell><cell>LLM Identify</cell><cell>0.110</cell><cell>0.130</cell><cell>0.867</cell><cell>0.969</cell><cell>2.128</cell></row><row><cell></cell><cell>LLM Adversarial</cell><cell>0.048</cell><cell>0.420</cell><cell>0.312</cell><cell>0.319</cell><cell>2.420</cell></row><row><cell></cell><cell>Split R</cell><cell>0.78</cell><cell>1.00</cell><cell>0.80</cell><cell>0.98</cell><cell>38.60</cell></row><row><cell>BiLSTM</cell><cell>Split H</cell><cell>0.77</cell><cell>1.00</cell><cell>0.79</cell><cell>0.98</cell><cell>42.55</cell></row><row><cell></cell><cell>Split M</cell><cell>0.65</cell><cell>0.83</cell><cell>0.79</cell><cell cols="2">0.98 121.60</cell></row><row><cell></cell><cell>HomoglyphRep</cell><cell>0.79</cell><cell>0.98</cell><cell>0.84</cell><cell>0.95</cell><cell>54.54</cell></row><row><cell></cell><cell cols="2">Combine HomoglyphRep&amp;Split 0.79</cell><cell>1.00</cell><cell>0.82</cell><cell>0.97</cell><cell>39.44</cell></row><row><cell></cell><cell>Combine HADSSh</cell><cell>0.75</cell><cell>1.00</cell><cell>0.77</cell><cell>0.97</cell><cell>45.67</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">e.g., "nat ural language processing"</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://ollama.com/library/llama3:8b</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://ollama.com/library/mistral v02</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Adversarial attacks on deep-learning models in natural language processing: A survey</title>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">E</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><forename type="middle">Z</forename><surname>Sheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Alhazmi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Transactions on Intelligent Systems and Technology (TIST)</title>
		<imprint>
			<biblScope unit="volume">11</biblScope>
			<biblScope unit="page" from="1" to="41" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of the CLEF-2024 CheckThat! lab task 6 on robustness of credibility assessment with adversarial examples (InCrediblAE)</title>
		<author>
			<persName><forename type="first">P</forename><surname>Przybyła</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Shvets</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Mu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">C</forename><surname>Sheang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Saggion</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum, CLEF 2024</title>
				<meeting><address><addrLine>Grenoble, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Overview of the CLEF-2024 CheckThat! Lab: Check-worthiness, subjectivity, persuasion, roles, authorities and adversarial robustness</title>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Alam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Struß</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Nakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Chakraborty</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Elsayed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Przybyła</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Caselli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Da San Martino</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Haouari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Piskorski</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ruggeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Song</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Suwaileh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association</title>
				<editor>
			<persName><forename type="first">L</forename><surname>Goeuriot</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Mulhem</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Quénot</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Schwab</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Soulier</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><forename type="middle">M</forename><surname>Di Nunzio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Galuščáková</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>García Seco De Herrera</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<meeting><address><addrLine>CLEF</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Generating natural language adversarial examples</title>
		<author>
			<persName><forename type="first">M</forename><surname>Alzantot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sharma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Elgohary</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B.-J</forename><surname>Ho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Srivastava</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K.-W</forename><surname>Chang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</title>
				<meeting>the 2018 Conference on Empirical Methods in Natural Language Processing</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="2890" to="2896" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Bad characters: Imperceptible NLP attacks</title>
		<author>
			<persName><forename type="first">N</forename><surname>Boucher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Shumailov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Anderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Papernot</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Symposium on Security and Privacy (SP), IEEE</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1987" to="2004" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Adversarial attacks and defenses in deep learning</title>
		<author>
			<persName><forename type="first">K</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zheng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Qin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Engineering</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="page" from="346" to="360" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">BadNL: Backdoor attacks against NLP models with semantic-preserving improvements</title>
		<author>
			<persName><forename type="first">X</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Salem</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Backes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 37th Annual Computer Security Applications Conference</title>
				<meeting>the 37th Annual Computer Security Applications Conference</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="554" to="569" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Be careful about poisoned word embeddings: Exploring the vulnerability of the embedding layers in NLP models</title>
		<author>
			<persName><forename type="first">W</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Ren</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>He</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
				<meeting>the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="2048" to="2058" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Weight poisoning attacks on pretrained models</title>
		<author>
			<persName><forename type="first">K</forename><surname>Kurita</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Michel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Neubig</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</title>
				<meeting>the 58th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="2793" to="2806" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">A backdoor attack against lstm-based text classification systems</title>
		<author>
			<persName><forename type="first">J</forename><surname>Dai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page" from="138872" to="138878" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Synthetic and natural noise both break neural machine translation</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Belinkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bisk</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Learning Representations</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Is BERT really robust? A strong baseline for natural language attack on text classification and entailment</title>
		<author>
			<persName><forename type="first">D</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">T</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Szolovits</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the AAAI Conference on Artificial Intelligence</title>
				<meeting>the AAAI Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="8018" to="8025" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Catch me if you can: Deceiving stance detection and geotagging models to protect privacy of users on Twitter</title>
		<author>
			<persName><forename type="first">D</forename><surname>Dogan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Altun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Zengin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Kutlu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Elsayed</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the International AAAI Conference on Web and Social Media</title>
				<meeting>the International AAAI Conference on Web and Social Media</meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="173" to="184" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<author>
			<persName><forename type="first">P</forename><surname>Przybyła</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Shvets</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Saggion</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2303.08032</idno>
		<title level="m">BODEGA: Benchmark for adversarial example generation in credibility assessment</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
